postgresql

Author	SHA1	Message	Date
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Peter Eisentraut	3f11971916	Remove extra newlines at end and beginning of files, add missing newlines at end of files.	2010-08-19 05:57:36 +00:00
Tom Lane	b78f6264eb	Rework join-removal logic as per recent discussion. In particular this fixes things so that it works for cases where nested removals are possible. The overhead of the optimization should be significantly less, as well.	2010-03-28 22:59:34 +00:00
Tom Lane	25549edb26	Fix equivclass.c's not-quite-right strategy for handling X=X clauses. The original coding correctly noted that these aren't just redundancies (they're effectively X IS NOT NULL, assuming = is strict). However, they got treated that way if X happened to be in a single-member EquivalenceClass already, which could happen if there was an ORDER BY X clause, for instance. The simplest and most reliable solution seems to be to not try to process such clauses through the EquivalenceClass machinery; just throw them back for traditional processing. The amount of work that'd be needed to be smarter than that seems out of proportion to the benefit. Per bug #5084 from Bernt Marius Johnsen, and analysis by Andrew Gierth.	2009-09-29 01:20:34 +00:00
Tom Lane	488d70ab46	Implement "join removal" for cases where the inner side of a left join is unique and is not referenced above the join. In this case the inner side doesn't affect the query result and can be thrown away entirely. Although perhaps nobody would ever write such a thing by hand, it's a reasonably common case in machine-generated SQL. The current implementation only recognizes the case where the inner side is a simple relation with a unique index matching the query conditions. This is enough for the use-cases that have been shown so far, but we might want to try to handle other cases later. Robert Haas, somewhat rewritten by Tom	2009-09-17 20:49:29 +00:00
Tom Lane	b2c51e6eba	Fix another semijoin-ordering bug. We already knew that we couldn't reorder a semijoin into or out of the righthand side of another semijoin, but actually it doesn't work to reorder it into or out of the righthand side of a left or antijoin, either. Per bug #4906 from Mathieu Fenniak. This was sloppy thinking on my part. This identity does work: ( A left join B on (Pab) ) semijoin C on (Pac) == ( A semijoin C on (Pac) ) left join B on (Pab) but I failed to see that that doesn't mean this does: ( A left join B on (Pab) ) semijoin C on (Pbc) != A left join ( B semijoin C on (Pbc) ) on (Pab)	2009-07-21 02:02:44 +00:00
Tom Lane	75c85bd199	Tighten up join ordering rules to account for recent more-careful analysis of the associativity of antijoins. Also improve optimizer/README discussion of outer join ordering rules.	2009-02-27 22:41:38 +00:00
Tom Lane	e006a24ad1	Implement SEMI and ANTI joins in the planner and executor. (Semijoins replace the old JOIN_IN code, but antijoins are new functionality.) Teach the planner to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL filters are recognized as being anti joins. Unify the InClauseInfo and OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change, it becomes possible to associate a SpecialJoinInfo with every join attempt, which permits some cleanup of join selectivity estimation. That needs to be taken much further than this patch does, but the next step is to change the API for oprjoin selectivity functions, which seems like material for a separate patch. So for the moment the output size estimates for semi and especially anti joins are quite bogus.	2008-08-14 18:48:00 +00:00
Tom Lane	9511304752	Rearrange the querytree representation of ORDER BY/GROUP BY/DISTINCT items as per my recent proposal: 1. Fold SortClause and GroupClause into a single node type SortGroupClause. We were already relying on them to be struct-equivalent, so using two node tags wasn't accomplishing much except to get in the way of comparing items with equal(). 2. Add an "eqop" field to SortGroupClause to carry the associated equality operator. This is cheap for the parser to get at the same time it's looking up the sort operator, and storing it eliminates the need for repeated not-so-cheap lookups during planning. In future this will also let us represent GROUP/DISTINCT operations on datatypes that have hash opclasses but no btree opclasses (ie, they have equality but no natural sort order). The previous representation simply didn't work for that, since its only indicator of comparison semantics was a sort operator. 3. Add a hasDistinctOn boolean to struct Query to explicitly record whether the distinctClause came from DISTINCT or DISTINCT ON. This allows removing some complicated and not 100% bulletproof code that attempted to figure that out from the distinctClause alone. This patch doesn't in itself create any new capability, but it's necessary infrastructure for future attempts to use hash-based grouping for DISTINCT and UNION/INTERSECT/EXCEPT.	2008-08-02 21:32:01 +00:00
Bruce Momjian	91509e6a87	Small wording improvements for source code READMEs.	2008-04-09 01:00:46 +00:00
Bruce Momjian	4d048b7b8b	Revert README cleanups.	2008-04-09 00:59:24 +00:00
Bruce Momjian	8cb3ad9f52	Revert sentence removal from nickname in FAQ.	2008-04-09 00:55:30 +00:00
Bruce Momjian	fca9fff41b	More README src cleanups.	2008-03-21 13:23:29 +00:00
Bruce Momjian	4e228447aa	Make source code READMEs more consistent. Add CVS tags to all README files.	2008-03-20 17:55:15 +00:00
Tom Lane	cd2a2ce904	Change have_join_order_restriction() so that we do not force a clauseless join if either of the input relations can legally be joined to any other rels using join clauses. This avoids uselessly (and expensively) considering a lot of really stupid join paths when there is a join restriction with a large footprint, that is, lots of relations inside its LHS or RHS. My patch of 15-Feb-2007 had been causing the code to consider joining every combination of rels inside such a group, which is exponentially bad :-(. With this behavior, clauseless bushy joins will be done if necessary, but they'll be put off as long as possible. Per report from Jakub Ouhrabka. Backpatch to 8.2. We might someday want to backpatch to 8.1 as well, but 8.1 does not have the problem for OUTER JOIN nests, only for IN-clauses, so it's not clear anyone's very likely to hit it in practice; and the current patch doesn't apply cleanly to 8.1.	2007-10-26 18:10:50 +00:00
Tom Lane	cdf0231c88	Create a function variable "join_search_hook" to let plugins override the join search order portion of the planner; this is specifically intended to simplify developing a replacement for GEQO planning. Patch by Julius Stroffek, editorialized on by me. I renamed make_one_rel_by_joins to standard_join_search and make_rels_by_joins to join_search_one_level to better reflect their place within this scheme.	2007-09-26 18:51:51 +00:00
Tom Lane	7c5e5439d2	Get rid of some old and crufty global variables in the planner. When this code was last gone over, there wasn't really any alternative to globals because we didn't have the PlannerInfo struct being passed all through the planner code. Now that we do, we can restructure things to avoid non-reentrancy. I'm fooling with this because otherwise I'd have had to add another global variable for the planned compact range table list.	2007-02-19 07:03:34 +00:00
Tom Lane	6bef118b01	Restructure code that is responsible for ensuring that clauseless joins are considered when it is necessary to do so because of a join-order restriction (that is, an outer-join or IN-subselect construct). The former coding was a bit ad-hoc and inconsistent, and it missed some cases, as exposed by Mario Weilguni's recent bug report. His specific problem was that an IN could be turned into a "clauseless" join due to constant-propagation removing the IN's joinclause, and if the IN's subselect involved more than one relation and there was more than one such IN linking to the same upper relation, then the only valid join orders involve "bushy" plans but we would fail to consider the specific paths needed to get there. (See the example case added to the join regression test.) On examining the code I wonder if there weren't some other problem cases too; in particular it seems that GEQO was defending against a different set of corner cases than the main planner was. There was also an efficiency problem, in that when we did realize we needed a clauseless join because of an IN, we'd consider clauseless joins against every other relation whether this was sensible or not. It seems a better design is to use the outer-join and in-clause lists as a backup heuristic, just as the rule of joining only where there are joinclauses is a heuristic: we'll join two relations if they have a usable joinclause or this might be necessary to satisfy an outer-join or IN-clause join order restriction. I refactored the code to have just one place considering this instead of three, and made sure that it covered all the cases that any of them had been considering. Backpatch as far as 8.1 (which has only the IN-clause form of the disease). By rights 8.0 and 7.4 should have the bug too, but they accidentally fail to fail, because the joininfo structure used in those releases preserves some memory of there having once been a joinclause between the inner and outer sides of an IN, and so it leads the code in the right direction anyway. I'll be conservative and not touch them.	2007-02-16 00:14:01 +00:00
Tom Lane	c17117649b	Repair bug in 8.2's new logic for planning outer joins: we have to allow joins that overlap an outer join's min_righthand but aren't fully contained in it, to support joining within the RHS after having performed an outer join that can commute with this one. Aside from the direct fix in make_join_rel(), fix has_join_restriction() and GEQO's desirable_join() to consider this possibility. Per report from Ian Harding.	2007-02-13 02:31:03 +00:00
Tom Lane	f41803bb39	Refactor planner's pathkeys data structure to create a separate, explicit representation of equivalence classes of variables. This is an extensive rewrite, but it brings a number of benefits: * planner no longer fails in the presence of "incomplete" operator families that don't offer operators for every possible combination of datatypes. * avoid generating and then discarding redundant equality clauses. * remove bogus assumption that derived equalities always use operators named "=". * mergejoins can work with a variety of sort orders (e.g., descending) now, instead of tying each mergejoinable operator to exactly one sort order. * better recognition of redundant sort columns. * can make use of equalities appearing underneath an outer join.	2007-01-20 20:45:41 +00:00
Tom Lane	cffd89ca73	Revise the planner's handling of "pseudoconstant" WHERE clauses, that is clauses containing no variables and no volatile functions. Such a clause can be used as a one-time qual in a gating Result plan node, to suppress plan execution entirely when it is false. Even when the clause is true, putting it in a gating node wins by avoiding repeated evaluation of the clause. In previous PG releases, query_planner() would do this for pseudoconstant clauses appearing at the top level of the jointree, but there was no ability to generate a gating Result deeper in the plan tree. To fix it, get rid of the special case in query_planner(), and instead process pseudoconstant clauses through the normal RestrictInfo qual distribution mechanism. When a pseudoconstant clause is found attached to a path node in create_plan(), pull it out and generate a gating Result at that point. This requires special-casing pseudoconstants in selectivity estimation and cost_qual_eval, but on the whole it's pretty clean. It probably even makes the planner a bit faster than before for the normal case of no pseudoconstants, since removing pull_constant_clauses saves one useless traversal of the qual tree. Per gripe from Phil Frost.	2006-07-01 18:38:33 +00:00
Tom Lane	e3b9852728	Teach planner how to rearrange join order for some classes of OUTER JOIN. Per my recent proposal. I ended up basing the implementation on the existing mechanism for enforcing valid join orders of IN joins --- the rules for valid outer-join orders are somewhat similar.	2005-12-20 02:30:36 +00:00
Tom Lane	a31ad27fc5	Simplify the planner's join clause management by storing join clauses of a relation in a flat 'joininfo' list. The former arrangement grouped the join clauses according to the set of unjoined relids used in each; however, profiling on test cases involving lots of joins proves that that data structure is a net loss. It takes more time to group the join clauses together than is saved by avoiding duplicate tests later. It doesn't help any that there are usually not more than one or two clauses per group ...	2005-06-09 04:19:00 +00:00
Tom Lane	9ab4d98168	Remove planner's private fields from Query struct, and put them into a new PlannerInfo struct, which is passed around instead of the bare Query in all the planning code. This commit is essentially just a code-beautification exercise, but it does open the door to making larger changes to the planner data structures without having to muck with the widely-known Query struct.	2005-06-05 22:32:58 +00:00
Tom Lane	14c7fba3f7	Rethink original decision to use AND/OR Expr nodes to represent bitmap logic operations during planning. Seems cleaner to create two new Path node types, instead --- this avoids duplication of cost-estimation code. Also, create an enable_bitmapscan GUC parameter to control use of bitmap plans.	2005-04-21 19:18:13 +00:00
Tom Lane	9888192fb7	Instead of trying to force WHERE clauses into CNF or DNF normal form, just look for common clauses that can be pulled out of ORs. Per recent discussion, extracting common clauses seems to be the only really useful effect of normalization, and if we do it explicitly then we can avoid cluttering the qual with partially-redundant duplicated expressions, which was an unpleasant side-effect of the old approach.	2003-12-30 21:49:19 +00:00
Tom Lane	bdfbfde1b1	IN clauses appearing at top level of WHERE can now be handled as joins. There are two implementation techniques: the executor understands a new JOIN_IN jointype, which emits at most one matching row per left-hand row, or the result of the IN's sub-select can be fed through a DISTINCT filter and then joined as an ordinary relation. Along the way, some minor code cleanup in the optimizer; notably, break out most of the jointree-rearrangement preprocessing in planner.c and put it in a new file prep/prepjointree.c.	2003-01-20 18:55:07 +00:00
Tom Lane	de97072e3c	Allow merge and hash joins to occur on arbitrary expressions (anything not containing a volatile function), rather than only on 'Var = Var' clauses as before. This makes it practical to do flatten_join_alias_vars at the start of planning, which in turn eliminates a bunch of klugery inside the planner to deal with alias vars. As a free side effect, we now detect implied equality of non-Var expressions; for example in SELECT ... WHERE a.x = b.y and b.y = 42 we will deduce a.x = 42 and use that as a restriction qual on a. Also, we can remove the restriction introduced 12/5/02 to prevent pullup of subqueries whose targetlists contain sublinks. Still TODO: make statistical estimation routines in selfuncs.c and costsize.c smarter about expressions that are more complex than plain Vars. The need for this is considerably greater now that we have to be able to estimate the suitability of merge and hash join techniques on such expressions.	2003-01-15 19:35:48 +00:00
Tom Lane	935969415a	Be more realistic about plans involving Materialize nodes: take their cost into account while planning.	2002-11-30 05:21:03 +00:00
Tom Lane	f6dba10e62	First phase of implementing hash-based grouping/aggregation. An AGG plan node now does its own grouping of the input rows, and has no need for a preceding GROUP node in the plan pipeline. This allows elimination of the misnamed tuplePerGroup option for GROUP, and actually saves more code in nodeGroup.c than it costs in nodeAgg.c, as well as being presumably faster. Restructure the API of query_planner so that we do not commit to using a sorted or unsorted plan in query_planner; instead grouping_planner makes the decision. (Right now it isn't any smarter than query_planner was, but that will change as soon as it has the option to select a hash-based aggregation step.) Despite all the hackery, no initdb needed since only in-memory node types changed.	2002-11-06 00:00:45 +00:00
Bruce Momjian	39e331be72	Add Bob Devine's name to the optimizer README.	2002-08-25 22:39:37 +00:00
Tom Lane	3389a110d4	Get rid of long-since-vestigial Iter node type, in favor of adding a returns-set boolean field in Func and Oper nodes. This allows cleaner, more reliable tests for expressions returning sets in the planner and parser. For example, a WHERE clause returning a set is now detected and complained of in the parser, not only at runtime.	2002-05-12 23:43:04 +00:00
Tom Lane	6254465d06	Extend code that deduces implied equality clauses to detect whether a clause being added to a particular restriction-clause list is redundant with those already in the list. This avoids useless work at runtime, and (perhaps more importantly) keeps the selectivity estimation routines from generating too-small estimates of numbers of output rows. Also some minor improvements in OPTIMIZER_DEBUG displays.	2001-10-18 16:11:42 +00:00
Bruce Momjian	26e0321191	Move structure comments from the top block down to the line entries for this file to match all the other files, and to be clearer.	2001-01-17 06:41:31 +00:00
Tom Lane	ea166f1146	Planner speedup hacking. Avoid saving useless pathkeys, so that path comparison does not consider paths different when they differ only in uninteresting aspects of sort order. (We had a special case of this consideration for indexscans already, but generalize it to apply to ordered join paths too.) Be stricter about what is a canonical pathkey to allow faster pathkey comparison. Cache canonical pathkeys and dispersion stats for left and right sides of a RestrictInfo's clause, to avoid repeated computation. Total speedup will depend on number of tables in a query, but I see about 4x speedup of planning phase for a sample seven-table query.	2000-12-14 22:30:45 +00:00
Tom Lane	6543d81d65	Restructure handling of inheritance queries so that they work with outer joins, and clean things up a good deal at the same time. Append plan node no longer hacks on rangetable at runtime --- instead, all child tables are given their own RT entries during planning. Concept of multiple target tables pushed up into execMain, replacing bug-prone implementation within nodeAppend. Planner now supports generating Append plans for inheritance sets either at the top of the plan (the old way) or at the bottom. Expanding at the bottom is appropriate for tables used as sources, since they may appear inside an outer join; but we must still expand at the top when the target of an UPDATE or DELETE is an inheritance set, because we actually need a different targetlist and junkfilter for each target table in that case. Fortunately a target table can't be inside an outer join... Bizarre mutual recursion between union_planner and prepunion.c is gone --- in fact, union_planner doesn't really have much to do with union queries anymore, so I renamed it grouping_planner.	2000-11-12 00:37:02 +00:00
Tom Lane	3a94e789f5	Subselects in FROM clause, per ISO syntax: FROM (SELECT ...) [AS] alias. (Don't forget that an alias is required.) Views reimplemented as expanding to subselect-in-FROM. Grouping, aggregates, DISTINCT in views actually work now (he says optimistically). No UNION support in subselects/views yet, but I have some ideas about that. Rule-related permissions checking moved out of rewriter and into executor. INITDB REQUIRED!	2000-09-29 18:21:41 +00:00
Tom Lane	ed5003c584	First cut at full support for OUTER JOINs. There are still a few loose ends to clean up (see my message of same date to pghackers), but mostly it works. INITDB REQUIRED!	2000-09-12 21:07:18 +00:00
Tom Lane	cd9f0ca545	Deduce equality constraints that are implied by transitivity of mergejoinable qual clauses, and add them to the query quals. For example, WHERE a = b AND b = c will cause us to add AND a = c. This is necessary to ensure that it's safe to use these variables as interchangeable sort keys, which is something 7.0 knows how to do. Should provide a useful improvement in planning ability, too.	2000-07-24 03:11:01 +00:00
Tom Lane	3ee8f7e207	Restructure planning code so that preprocessing of targetlist and quals to simplify constant expressions and expand SubLink nodes into SubPlans is done in a separate routine subquery_planner() that calls union_planner(). We formerly did most of this work in query_planner(), but that's the wrong place because it may never see the real targetlist. Splitting union_planner into two routines also allows us to avoid redundant work when union_planner is invoked recursively for UNION and inheritance cases. Upshot is that it is now possible to do something like select float8(count(*)) / (select count(*) from int4_tbl) from int4_tbl group by f1; which has never worked before.	2000-03-21 05:12:12 +00:00
Tom Lane	b1577a7c78	New cost model for planning, incorporating a penalty for random page accesses versus sequential accesses, a (very crude) estimate of the effects of caching on random page accesses, and cost to evaluate WHERE-clause expressions. Export critical parameters for this model as SET variables. Also, create SET variables for the planner's enable flags (enable_seqscan, enable_indexscan, etc) so that these can be controlled more conveniently than via PGOPTIONS. Planner now estimates both startup cost (cost before retrieving first tuple) and total cost of each path, so it can optimize queries with LIMIT on a reasonable basis by interpolating between these costs. Same facility is a win for EXISTS(...) subqueries and some other cases. Redesign pathkey representation to achieve a major speedup in planning (I saw as much as 5X on a 10-way join); also minor changes in planner to reduce memory consumption by recycling discarded Path nodes and not constructing unnecessary lists. Minor cleanups to display more-plausible costs in some cases in EXPLAIN output. Initdb forced by change in interface to index cost estimation functions.	2000-02-15 20:49:31 +00:00
Tom Lane	d8733ce674	Repair planning bugs caused by my misguided removal of restrictinfo link fields in JoinPaths --- turns out that we do need that after all :-(. Also, rearrange planner so that only one RelOptInfo is created for a particular set of joined base relations, no matter how many different subsets of relations it can be created from. This saves memory and processing time compared to the old method of making a bunch of RelOptInfos and then removing the duplicates. Clean up the jointree iteration logic; not sure if it's better, but I sure find it more readable and plausible now, particularly for the case of 'bushy plans'.	2000-02-07 04:41:04 +00:00
Tom Lane	e6381966c1	Major planner/optimizer revision: get rid of PathOrder node type, store all ordering information in pathkeys lists (which are now lists of lists of PathKeyItem nodes, not just lists of lists of vars). This was a big win --- the code is smaller and IMHO more understandable than it was, even though it handles more cases. I believe the node changes will not force an initdb for anyone; planner nodes don't show up in stored rules.	1999-08-16 02:17:58 +00:00
Bruce Momjian	612b8434e4	optimizer cleanup	1999-02-19 05:18:06 +00:00
Bruce Momjian	8ab72a38df	optimizer cleanup	1999-02-19 02:05:20 +00:00
Bruce Momjian	cd550c7672	Update optimizer readme.	1999-02-15 22:19:01 +00:00
Bruce Momjian	fe35ffe7e0	Major optimizer improvement for joining a large number of tables.	1999-02-09 03:51:42 +00:00
Bruce Momjian	54e5d25666	Optimizer cleanup.	1999-02-08 04:29:25 +00:00
Bruce Momjian	ce3afccf7f	More optimizer cleanups.	1999-02-04 03:19:11 +00:00
Bruce Momjian	18fbe4142f	More optimizer renaming HInfo -> HashInfo.	1999-02-04 01:47:02 +00:00
Bruce Momjian	8d9237d485	Optimizer rename ClauseInfo -> RestrictInfo. Update optimizer README.	1999-02-03 20:15:53 +00:00
Bruce Momjian	2d32d909b5	Cleanup optimizer function names and clarify code.	1998-08-10 02:26:40 +00:00
Bruce Momjian	e46df2ff6e	OPTIMIZER_DEBUG additions.	1998-08-07 05:02:32 +00:00
Bruce Momjian	d3f0e87d17	Cost cleanup.	1997-12-18 12:21:02 +00:00
Bruce Momjian	d158fce8eb	Add optimizer README file.	1997-12-17 18:02:33 +00:00

105 Commits