Summary
-------

The optimizer generates optimal query plans in several steps:

1) Take each relation in a query, and make a RelOptInfo structure for
it. Find each way of accessing the relation, called a Path, including
sequential and index scans, and add it to RelOptInfo.pathlist.

2) Join each RelOptInfo to each other RelOptInfo as specified in the
WHERE clause. At this point each RelOptInfo is a single relation, so
this pass joins every pair of relations that the WHERE clause links.

Each join takes two RelOptInfos: one is the outer, the other the inner.
The outer drives lookups of values in the inner. In a nested loop, each
lookup scans the inner to find the matching inner rows. In a mergejoin,
inner and outer rows are ordered and are accessed in order, so only one
scan of the inner is required to perform the entire join. In a hashjoin,
inner rows are hashed for lookups.
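
The three join methods described above can be illustrated with a small
Python sketch. It is not the executor's code: rows are modeled as plain
tuples, the join key is a tuple index, and the function names are
hypothetical. The merge join sorts its inputs itself, which stands in
for inputs that already arrive in order.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, key):
    # For every outer row, scan the whole inner input for matches.
    return [(o, i) for o in outer for i in inner if o[key] == i[key]]


def merge_join(outer, inner, key):
    # Both inputs are put in join-key order, then walked in step.
    outer = sorted(outer, key=lambda row: row[key])
    inner = sorted(inner, key=lambda row: row[key])
    result = []
    start = 0  # first inner row not yet known to sort before the outer key
    for o in outer:
        # Skip inner rows that sort before the current outer key.
        while start < len(inner) and inner[start][key] < o[key]:
            start += 1
        # Emit the run of inner rows equal to the outer key. "start" is
        # left at the beginning of the run so that a duplicate outer key
        # can read the same run again.
        pos = start
        while pos < len(inner) and inner[pos][key] == o[key]:
            result.append((o, inner[pos]))
            pos += 1
    return result


def hash_join(outer, inner, key):
    # Hash the inner rows once, then probe the table with each outer row.
    table = defaultdict(list)
    for i in inner:
        table[i[key]].append(i)
    return [(o, i) for o in outer for i in table.get(o[key], [])]


outer_rows = [(1, "x"), (2, "y"), (2, "z")]
inner_rows = [(2, "p"), (3, "q"), (2, "r")]
for join in (nested_loop_join, merge_join, hash_join):
    print(join.__name__, sorted(join(outer_rows, inner_rows, 0)))
```

All three functions return the same set of row pairs. They differ only
in how the inner input is searched, which is what the optimizer's cost
comparison between join paths is about.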

Each unique join combination becomes a new RelOptInfo, which now
represents the join of two relations. Its RelOptInfo.pathlist holds the
various paths that create the joined result, which have different
orderings depending on the join method used.

3) Every RelOptInfo is then joined again, with one new relation added
to each RelOptInfo. This continues until all relations have been joined
into one RelOptInfo, at which point the cheapest Path is chosen.

    SELECT  *
    FROM    tab1, tab2, tab3, tab4
    WHERE   tab1.col = tab2.col AND
            tab2.col = tab3.col AND
            tab3.col = tab4.col

    Tables 1, 2, 3, and 4 are joined as:
    {1 2},{2 3},{3 4}
    {1 2 3},{2 3 4}
    {1 2 3 4}

    SELECT  *
    FROM    tab1, tab2, tab3, tab4
    WHERE   tab1.col = tab2.col AND
            tab1.col = tab3.col AND
            tab1.col = tab4.col

    Tables 1, 2, 3, and 4 are joined as:
    {1 2},{1 3},{1 4}
    {1 2 3},{1 3 4},{1 2 4}
    {1 2 3 4}
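
The two join sequences above can be reproduced with a short Python
sketch of the pass-by-pass enumeration. This is an illustration only,
not the planner's code: relations are plain integers, each join clause
is a pair of relation numbers, and join_levels() is a hypothetical
name. It follows only join clauses, so relations that no clause
connects are never combined.

```python
def join_levels(relations, clauses):
    """Return, for each join pass, the set of relation combinations built."""
    # Before the first pass, every relation stands on its own.
    level = {frozenset([rel]) for rel in relations}
    levels = []
    while True:
        next_level = set()
        for joined in level:
            for a, b in clauses:
                # A clause extends a combination when it links one of
                # its members to a relation that is not yet a member.
                if a in joined and b not in joined:
                    next_level.add(joined | {b})
                elif b in joined and a not in joined:
                    next_level.add(joined | {a})
        if not next_level:
            break  # nothing left to add
        levels.append(next_level)
        level = next_level
    return levels


def show(levels):
    # Print each pass in the {1 2},{2 3} notation used above.
    for level in levels:
        groups = sorted(sorted(group) for group in level)
        print(",".join("{" + " ".join(map(str, g)) + "}" for g in groups))


show(join_levels([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)]))
show(join_levels([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4)]))
```

Using sets means that a combination reached by two different routes,
such as {1 2 3} from {1 2} plus 3 and from {2 3} plus 1, is kept only
once per pass.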

In the default left-handed joins, each RelOptInfo adds one
single-relation RelOptInfo in each join pass, and the added RelOptInfo
is always the inner relation in the join. In right-handed joins, the
added RelOptInfo is the outer relation in the join. In bushy plans,
multi-relation RelOptInfos can be joined to other multi-relation
RelOptInfos.
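
As a rough illustration of the difference between left-handed,
right-handed, and bushy joins, the sketch below lists which
(outer, inner) pairs one join pass would consider in each mode. It is a
simplification under stated assumptions: relations are frozensets of
integers, join clauses are ignored, and join_pairs() and its mode
strings are hypothetical names.

```python
def join_pairs(rels, mode):
    """List the (outer, inner) pairs one join pass would consider."""
    pairs = []
    for outer in rels:
        for inner in rels:
            if outer & inner:
                continue  # overlapping combinations cannot be joined
            if mode == "left" and len(inner) != 1:
                continue  # left-handed: the added relation is the inner
            if mode == "right" and len(outer) != 1:
                continue  # right-handed: the added relation is the outer
            pairs.append((outer, inner))  # "bushy" accepts any disjoint pair
    return pairs


rels = [frozenset({1, 2}), frozenset({3, 4}), frozenset({3}), frozenset({4})]
for mode in ("left", "right", "bushy"):
    print(mode, [(sorted(o), sorted(i)) for o, i in join_pairs(rels, mode)])
```

Only the bushy mode produces the pair ({1 2}, {3 4}), where both sides
are already joins.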

Optimizer Functions
-------------------

These directories take the Query structure returned by the parser, and
generate a plan used by the executor. The /plan directory generates the
plan, the /path directory generates all possible ways to join the
tables, and the /prep directory handles special cases like inheritance.
The /utils directory holds utility code.

planner()
 handle inheritance by processing separately
-init_query_planner()
  preprocess target list
  preprocess qualifications (WHERE)
--query_planner()
   cnfify()
    Summary:

     Simple cases with only ANDs are handled by removing the ANDs:

     convert:   a = 1 AND b = 2 AND c = 3
     to:        a = 1, b = 2, c = 3

     Qualifications with ORs are handled differently. ORs inside AND
     clauses are left largely as they are:

     convert:   a = 1 AND b = 2 AND (c = 3 OR d = 4)
     to:        a = 1, b = 2, c = 3 OR d = 4

     ORs at the top level are more complex to handle:

     convert:   (a = 1 AND b = 2) OR c = 3
     to:        (a = 1 OR c = 3) AND (b = 2 OR c = 3)
     finally:   (a = 1 OR c = 3), (b = 2 OR c = 3)

     These clauses all have to be true for a result to be returned,
     so the optimizer can choose the most restrictive clauses.

   pull out constants from target list
   get a target list that only contains column names, no expressions
   if none, then return
---subplanner()
    make list of relations in target
    make list of relations in where clause
    split up the qual into restrictions (a=1) and joins (b=c)
    find which clauses can be used for merge and hash joins
----make_one_rel()
     set_base_rel_pathlist()
      find sequential scan and all index paths for each relation
      find selectivity of columns used in joins
-----make_one_rel_by_joins()
      jump to geqo if needed
      again:
       make_rels_by_joins():
        for each joinrel:
         make_rels_by_clause_joins()
          for each rel's joininfo list:
           if a join from the join clause adds only one relation, do the join
         or make_rels_by_clauseless_joins()
       update_rels_pathlist_for_joins()
        generate nested-loop, merge, and hash join paths for the new rels created above
       merge_rels_with_same_relids()
        merge RelOptInfo paths that have the same relids because of joins
       rels_set_cheapest()
        set cheapest path
       if all relations in one RelOptInfo, return
   do group (GROUP BY)
   do aggregate
   put back constants
   re-flatten target list
  make unique (DISTINCT)
  make sort (ORDER BY)
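
The cnfify() conversions summarized in the call tree above can be
sketched in Python as follows. This is only an illustration of the
AND/OR rewriting, not the actual cnfify() code: expressions are modeled
as strings (simple comparisons) or ("and"/"or", left, right) tuples,
NOT is not handled, and cnf_clauses() is a hypothetical name. The
result is the implicit-AND list, where each entry is a list of
comparisons that are ORed together.

```python
def cnf_clauses(expr):
    """Return the implicit-AND list of OR-clauses for expr."""
    if isinstance(expr, str):
        return [[expr]]  # a single comparison is a one-element clause
    op, left, right = expr
    left_clauses = cnf_clauses(left)
    right_clauses = cnf_clauses(right)
    if op == "and":
        # AND simply concatenates the two clause lists.
        return left_clauses + right_clauses
    if op == "or":
        # OR is distributed over AND: every clause on the left is ORed
        # with every clause on the right.
        return [lc + rc for lc in left_clauses for rc in right_clauses]
    raise ValueError("unsupported operator: %r" % (op,))


def show(expr):
    print(", ".join("(" + " OR ".join(c) + ")" for c in cnf_clauses(expr)))


show(("and", ("and", "a = 1", "b = 2"), "c = 3"))
show(("and", ("and", "a = 1", "b = 2"), ("or", "c = 3", "d = 4")))
show(("or", ("and", "a = 1", "b = 2"), "c = 3"))
```

The third call corresponds to the top-level OR case: the single OR
becomes two clauses, (a = 1 OR c = 3) and (b = 2 OR c = 3), both of
which must be true.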

Optimizer Structures
--------------------

RelOptInfo      - a relation or joined relations

 RestrictInfo   - restriction clauses
 JoinInfo       - join clauses

 Path           - every way to generate a RelOptInfo (sequential, index, joins)
  IndexPath     - index scans
  NestPath      - nested-loop joins
  MergePath     - merge joins
  HashPath      - hash joins

 PathOrder      - every ordering type (sort, merge of relations)
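
To show how these structures relate, here is a minimal Python sketch.
It is not the layout of the real C structs. Only the names RelOptInfo,
Path, pathlist, relids, and joininfo come from the text above; the
restrictinfo, kind, cost, and ordering fields and the cheapest() method
are assumptions made for illustration, and the IndexPath, NestPath,
MergePath, and HashPath variants are collapsed into a "kind" string.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass
class Path:
    kind: str                       # e.g. "seqscan", "indexscan", "hashjoin"
    cost: float                     # assumed single-number cost estimate
    ordering: Tuple[str, ...] = ()  # sort columns; stands in for PathOrder


@dataclass
class RelOptInfo:
    relids: FrozenSet[int]                                  # relations covered
    pathlist: List[Path] = field(default_factory=list)      # ways to build it
    restrictinfo: List[str] = field(default_factory=list)   # e.g. "a = 1"
    joininfo: List[str] = field(default_factory=list)       # e.g. "b = c"

    def cheapest(self) -> Path:
        # Pick the path with the lowest estimated cost.
        return min(self.pathlist, key=lambda path: path.cost)


rel = RelOptInfo(relids=frozenset({1}), restrictinfo=["a = 1"])
rel.pathlist.append(Path("seqscan", cost=100.0))
rel.pathlist.append(Path("indexscan", cost=12.5, ordering=("a",)))
print(rel.cheapest().kind)
```

A joined RelOptInfo would carry more than one entry in relids, and its
pathlist would hold join paths built from the paths of its inputs.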