postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	1d97c19a0f	Fix estimate_num_groups() to not fail on PlaceHolderVars, per report from Stefan Kaltenbrunner. The most reasonable behavior (at least for the near term) seems to be to ignore the PlaceHolderVar and examine its argument instead. In support of this, change the API of pull_var_clause() to allow callers to request recursion into PlaceHolderVars. Currently estimate_num_groups() is the only customer for that behavior, but where there's one there may be others.	2009-04-19 19:46:33 +00:00
Tom Lane	d7a6a04dc7	Fix planner to restore its previous level of intelligence about pushing constants through full joins, as in select * from tenk1 a full join tenk1 b using (unique1) where unique1 = 42; which should generate a fairly cheap plan where we apply the constraint unique1 = 42 in each relation scan. This had been broken by my patch of 2008-06-27, which is now reverted in favor of a more invasive but hopefully less incorrect approach. That patch was meant to prevent incorrect extraction of OR'd indexclauses from OR conditions above an outer join. To do that correctly we need more information than the outerjoin_delay flag can provide, so add a nullable_relids field to RestrictInfo that records exactly which relations are nulled by outer joins that are underneath a particular qual clause. A side benefit is that we can make the test in create_or_index_quals more specific: it is now smart enough to extract an OR'd indexclause into the outer side of an outer join, even though it must not do so in the inner side. The old coding couldn't distinguish these cases so it could not do either.	2009-04-16 20:42:16 +00:00
Tom Lane	e549722a8b	Get rid of the rather fuzzily defined FlattenedSubLink node type in favor of making pull_up_sublinks() construct a full-blown JoinExpr tree representation of IN/EXISTS SubLinks that it is able to convert to semi or anti joins. This makes pull_up_sublinks() a shade more complex, but the gain in semantic clarity is worth it. I still have more to do in this area to address the previously-discussed problems, but this commit in itself fixes at least one bug in HEAD, as shown by added regression test case.	2009-02-25 03:30:38 +00:00
Tom Lane	d04db37072	Arrange for function default arguments to be processed properly in expressions that are set up for execution with ExecPrepareExpr rather than going through the full planner process. By introducing an explicit notion of "expression planning", this patch also lays a bit of groundwork for maybe someday allowing sub-selects in standalone expressions.	2009-01-09 15:46:11 +00:00
Tom Lane	445ce15702	Create a third option named "partition" for constraint_exclusion, and make it the default. This setting enables constraint exclusion checks only for appendrel members (ie, inheritance children and UNION ALL arms), which are the cases in which constraint exclusion is most likely to be useful. Avoiding the overhead for simple queries that are unlikely to benefit should bring the cost down to the point where this is a reasonable default setting. Per today's discussion.	2009-01-07 22:40:49 +00:00
Bruce Momjian	511db38ace	Update copyright for 2009.	2009-01-01 17:24:05 +00:00
Tom Lane	8e8854daa2	Add some basic support for window frame clauses to the window-functions patch. This includes the ability to force the frame to cover the whole partition, and the ability to make the frame end exactly on the current row rather than its last ORDER BY peer. Supporting any more of the full SQL frame-clause syntax will require nontrivial hacking on the window aggregate code, so it'll have to wait for 8.5 or beyond.	2008-12-31 00:08:39 +00:00
Tom Lane	95b07bc7f5	Support window functions a la SQL:2008. Hitoshi Harada, with some kibitzing from Heikki and Tom.	2008-12-28 18:54:01 +00:00
Tom Lane	0436679969	Get rid of adjust_appendrel_attr_needed(), which has been broken ever since we extended the appendrel mechanism to support UNION ALL optimization. The reason nobody noticed was that we are not actually using attr_needed data for appendrel children; hence it seems more reasonable to rip it out than fix it. Back-patch to 8.2 because an Assert failure is possible in corner cases. Per examination of an example from Jim Nasby. In HEAD, also get rid of AppendRelInfo.col_mappings, which is quite inadequate to represent UNION ALL situations; depend entirely on translated_vars instead.	2008-11-11 18:13:32 +00:00
Tom Lane	e6ae3b5dbf	Add a concept of "placeholder" variables to the planner. These are variables that represent some expression that we desire to compute below the top level of the plan, and then let that value "bubble up" as though it were a plain Var (ie, a column value). The immediate application is to allow sub-selects to be flattened even when they are below an outer join and have non-nullable output expressions. Formerly we couldn't flatten because such an expression wouldn't properly go to NULL when evaluated above the outer join. Now, we wrap it in a PlaceHolderVar and arrange for the actual evaluation to occur below the outer join. When the resulting Var bubbles up through the join, it will be set to NULL if necessary, yielding the correct results. This fixes a planner limitation that's existed since 7.1. In future we might want to use this mechanism to re-introduce some form of Hellerstein's "expensive functions" optimization, ie place the evaluation of an expensive function at the most suitable point in the plan tree.	2008-10-21 20:42:53 +00:00
Tom Lane	76e6602417	Improve the recently-added code for inlining set-returning functions so that it can handle functions returning setof record. The case was left undone originally, but it turns out to be simple to fix.	2008-10-09 19:27:40 +00:00
Tom Lane	0d115dde82	Extend CTE patch to support recursive UNION (ie, without ALL). The implementation uses an in-memory hash table, so it will poop out for very large recursive results ... but the performance characteristics of a sort-based implementation would be pretty unpleasant too.	2008-10-07 19:27:04 +00:00
Tom Lane	44d5be0e53	Implement SQL-standard WITH clauses, including WITH RECURSIVE. There are some unimplemented aspects: recursive queries must use UNION ALL (should allow UNION too), and we don't have SEARCH or CYCLE clauses. These might or might not get done for 8.4, but even without them it's a pretty useful feature. There are also a couple of small loose ends and definitional quibbles, which I'll send a memo about to pgsql-hackers shortly. But let's land the patch now so we can get on with other development. Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane	2008-10-04 21:56:55 +00:00
Tom Lane	ee33b95d9c	Improve the plan cache invalidation mechanism to make it invalidate plans when user-defined functions used in a plan are modified. Also invalidate plans when schemas, operators, or operator classes are modified; but for these cases we just invalidate everything rather than tracking exact dependencies, since these types of objects seldom change in a production database. Tom Lane; loosely based on a patch by Martin Pihlak.	2008-09-09 18:58:09 +00:00
Tom Lane	b153c09209	Add a bunch of new error location reports to parse-analysis error messages. There are still some weak spots around JOIN USING and relation alias lists, but most errors reported within backend/parser/ now have locations.	2008-09-01 20:42:46 +00:00
Tom Lane	e5536e77a5	Move exprType(), exprTypmod(), expression_tree_walker(), and related routines into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside the backend. There's probably more that should be done along this line, but this is a start anyway.	2008-08-25 22:42:34 +00:00
Tom Lane	bd3daddaf2	Arrange to convert EXISTS subqueries that are equivalent to hashable IN subqueries into the same thing you'd have gotten from IN (except always with unknownEqFalse = true, so as to get the proper semantics for an EXISTS). I believe this fixes the last case within CVS HEAD in which an EXISTS could give worse performance than an equivalent IN subquery. The tricky part of this is that if the upper query probes the EXISTS for only a few rows, the hashing implementation can actually be worse than the default, and therefore we need to make a cost-based decision about which way to use. But at the time when the planner generates plans for subqueries, it doesn't really know how many times the subquery will be executed. The least invasive solution seems to be to generate both plans and postpone the choice until execution. Therefore, in a query that has been optimized this way, EXPLAIN will show two subplans for the EXISTS, of which only one will actually get executed. There is a lot more that could be done based on this infrastructure: in particular it's interesting to consider switching to the hash plan if we start out using the non-hashed plan but find a lot more upper rows going by than we expected. I have therefore left some minor inefficiencies in place, such as initializing both subplans even though we will currently only use one.	2008-08-22 00:16:04 +00:00
Tom Lane	19e34b6239	Improve sublink pullup code to handle ANY/EXISTS sublinks that are at top level of a JOIN/ON clause, not only at top level of WHERE. (However, we can't do this in an outer join's ON clause, unless the ANY/EXISTS refers only to the nullable side of the outer join, so that it can effectively be pushed down into the nullable side.) Per request from Kevin Grittner. In passing, fix a bug in the initial implementation of EXISTS pullup: it would Assert if the EXIST's WHERE clause used a join alias variable. Since we haven't yet flattened join aliases when this transformation happens, it's necessary to include join relids in the computed set of RHS relids.	2008-08-17 01:20:00 +00:00
Tom Lane	d4af2a6481	Clean up the loose ends in selectivity estimation left by my patch for semi and anti joins. To do this, pass the SpecialJoinInfo struct for the current join as an additional optional argument to operator join selectivity estimation functions. This allows the estimator to tell not only what kind of join is being formed, but which variable is on which side of the join; a requirement long recognized but not dealt with till now. This also leaves the door open for future improvements in the estimators, such as accounting for the null-insertion effects of lower outer joins. I didn't do anything about that in the current patch but the information is in principle deducible from what's passed. The patch also clarifies the definition of join selectivity for semi/anti joins: it's the fraction of the left input that has (at least one) match in the right input. This allows getting rid of some very fuzzy thinking that I had committed in the original 7.4-era IN-optimization patch. There's probably room to estimate this better than the present patch does, but at least we know what to estimate. Since I had to touch CREATE OPERATOR anyway to allow a variant signature for join estimator functions, I took the opportunity to add a couple of additional checks that were missing, per my recent message to -hackers: * Check that estimator functions return float8; * Require execute permission at the time of CREATE OPERATOR on the operator's function as well as the estimator functions; * Require ownership of any pre-existing operator that's modified by the command. I also moved the lookup of the functions out of OperatorCreate() and into operatorcmds.c, since that seemed more consistent with most of the other catalog object creation processes, eg CREATE TYPE.	2008-08-16 00:01:38 +00:00
Tom Lane	e006a24ad1	Implement SEMI and ANTI joins in the planner and executor. (Semijoins replace the old JOIN_IN code, but antijoins are new functionality.) Teach the planner to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL filters are recognized as being anti joins. Unify the InClauseInfo and OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change, it becomes possible to associate a SpecialJoinInfo with every join attempt, which permits some cleanup of join selectivity estimation. That needs to be taken much further than this patch does, but the next step is to change the API for oprjoin selectivity functions, which seems like material for a separate patch. So for the moment the output size estimates for semi and especially anti joins are quite bogus.	2008-08-14 18:48:00 +00:00
Tom Lane	af95d7aa63	Improve INTERSECT/EXCEPT hashing by realizing that we don't need to make any hashtable entries for tuples that are found only in the second input: they can never contribute to the output. Furthermore, this implies that the planner should endeavor to put first the smaller (in number of groups) input relation for an INTERSECT. Implement that, and upgrade prepunion's estimation of the number of rows returned by setops so that there's some amount of sanity in the estimate of which one is smaller.	2008-08-07 19:35:02 +00:00
Tom Lane	368df30427	Support hashing for duplicate-elimination in INTERSECT and EXCEPT queries. This completes my project of improving usage of hashing for duplicate elimination (aggregate functions with DISTINCT remain undone, but that's for some other day). As with the previous patches, this means we can INTERSECT/EXCEPT on datatypes that can hash but not sort, and it means that INTERSECT/EXCEPT without ORDER BY are no longer certain to produce sorted output.	2008-08-07 03:04:04 +00:00
Tom Lane	2d1d96b1ce	Teach the system how to use hashing for UNION. (INTERSECT/EXCEPT will follow, but seem like a separate patch since most of the remaining work is on the executor side.) I took the opportunity to push selection of the grouping operators for set operations into the parser where it belongs. Otherwise this is just a small exercise in making prepunion.c consider both alternatives. As with the recent DISTINCT patch, this means we can UNION on datatypes that can hash but not sort, and it means that UNION without ORDER BY is no longer certain to produce sorted output.	2008-08-07 01:11:52 +00:00
Tom Lane	9511304752	Rearrange the querytree representation of ORDER BY/GROUP BY/DISTINCT items as per my recent proposal: 1. Fold SortClause and GroupClause into a single node type SortGroupClause. We were already relying on them to be struct-equivalent, so using two node tags wasn't accomplishing much except to get in the way of comparing items with equal(). 2. Add an "eqop" field to SortGroupClause to carry the associated equality operator. This is cheap for the parser to get at the same time it's looking up the sort operator, and storing it eliminates the need for repeated not-so-cheap lookups during planning. In future this will also let us represent GROUP/DISTINCT operations on datatypes that have hash opclasses but no btree opclasses (ie, they have equality but no natural sort order). The previous representation simply didn't work for that, since its only indicator of comparison semantics was a sort operator. 3. Add a hasDistinctOn boolean to struct Query to explicitly record whether the distinctClause came from DISTINCT or DISTINCT ON. This allows removing some complicated and not 100% bulletproof code that attempted to figure that out from the distinctClause alone. This patch doesn't in itself create any new capability, but it's necessary infrastructure for future attempts to use hash-based grouping for DISTINCT and UNION/INTERSECT/EXCEPT.	2008-08-02 21:32:01 +00:00
Tom Lane	eaf1b5d348	Tighten up SS_finalize_plan's computation of valid_params to exclude Params of the current query level that aren't in fact output parameters of the current initPlans. (This means, for example, output parameters of regular subplans.) To make this work correctly for output parameters coming from sibling initplans requires rejiggering the API of SS_finalize_plan just a bit: we need the siblings to be visible to it, rather than hidden as SS_make_initplan_from_plan had been doing. This is really part of my response to bug #4290, but I concluded this part probably shouldn't be back-patched, since all that it's doing is to make a debugging cross-check tighter.	2008-07-10 02:14:03 +00:00
Alvaro Herrera	a3540b0f65	Improve our #include situation by moving pointer types away from the corresponding struct definitions. This allows other headers to avoid including certain highly-loaded headers such as rel.h and relscan.h, instead using just relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less unnecessary dependencies.	2008-06-19 00:46:06 +00:00
Tom Lane	db147b3483	Allow the planner's estimate of the fraction of a cursor's rows that will be retrieved to be controlled through a GUC variable. Robert Hell	2008-05-02 21:26:10 +00:00
Tom Lane	25e46a504b	Fix a couple of oversights associated with the "physical tlist" optimization: we had several code paths where a physical tlist could be used for the input to a Sort node, which is a dumb idea because any unneeded table columns will increase the volume of data the sort has to push around. (Unfortunately the easy-looking fix of calling disuse_physical_tlist during make_sort_xxx doesn't work because in most cases we're already committed to the current input tlist --- it's been marked with sort column numbers, or we've built grouping column numbers using it, etc. The tlist has to be selected properly at the calling level before we start constructing sort-col information. This is easy enough to do, we were just failing to take the point into consideration.) Back-patch to 8.3. I believe the problem probably exists clear back to 7.4 when the physical tlist optimization was added, but I'm afraid to back-patch further than 8.3 without a great deal more study than I want to put into it. The code in this area has drifted a lot over time. The real-world importance of these code paths is uncertain anyway --- I think in many cases we'd probably prefer hash-based methods.	2008-04-17 21:22:14 +00:00
Tom Lane	6b73d7e567	Fix an oversight I made in a cleanup patch over a year ago: eval_const_expressions needs to be passed the PlannerInfo ("root") structure, because in some cases we want it to substitute values for Param nodes. (So "constant" is not so constant as all that ...) This mistake partially disabled optimization of unnamed extended-Query statements in 8.3: in particular the LIKE-to-indexscan optimization would never be applied if the LIKE pattern was passed as a parameter, and constraint exclusion depending on a parameter value didn't work either.	2008-04-01 00:48:33 +00:00
Tom Lane	d344115519	Apply my original fix for Taiki Yamaguchi's bug report about DISTINCT MAX(). Add some regression tests for plausible failures in this area.	2008-03-31 16:59:26 +00:00
Tom Lane	0d49838df6	Arrange to "inline" SQL functions that appear in a query's FROM clause, are declared to return set, and consist of just a single SELECT. We can replace the FROM-item with a sub-SELECT and then optimize much as if we were dealing with a view. Patch from Richard Rowell, cleaned up by me.	2008-03-18 22:04:14 +00:00
Tom Lane	c9a1cc694a	Change hash index creation so that rather than always establishing exactly two buckets at the start, we create a number of buckets appropriate for the estimated size of the table. This avoids a lot of expensive bucket-split actions during initial index build on an already-populated table. This is one of the two core ideas of Tom Raney and Shreya Bhargava's patch to reduce hash index build time. I'm committing it separately to make it easier for people to test the effects of this separately from the effects of their other core idea (pre-sorting the index entries by bucket number).	2008-03-15 20:46:31 +00:00
Bruce Momjian	9098ab9e32	Update copyrights in source tree to 2008.	2008-01-01 19:46:01 +00:00
Bruce Momjian	f6e8730d11	Re-run pgindent with updated list of typedefs. (Updated README should avoid this problem in the future.)	2007-11-15 22:25:18 +00:00
Bruce Momjian	fdf5a5efb7	pgindent run for 8.3.	2007-11-15 21:14:46 +00:00
Tom Lane	c291203ca3	Fix EquivalenceClass code to handle volatile sort expressions in a more predictable manner; in particular that if you say ORDER BY output-column-ref, it will in fact sort by that specific column even if there are multiple syntactic matches. An example is SELECT random() AS a, random() AS b FROM ... ORDER BY b, a; While the use-case for this might be a bit debatable, it worked as expected in earlier releases, so we should preserve the behavior for 8.3. Per my recent proposal. While at it, fix convert_subquery_pathkeys() to handle RelabelType stripping in both directions; it needs this for the same reasons make_sort_from_pathkeys does.	2007-11-08 21:49:48 +00:00
Tom Lane	1be0601681	Last week's patch for make_sort_from_pathkeys wasn't good enough: it has to be able to discard top-level RelabelType nodes on both sides of the equivalence-class-to-target-list comparison, since make_pathkey_from_sortinfo might either add or remove a RelabelType. Also fix the latter to do the removal case cleanly. Per example from Peter.	2007-11-08 19:25:37 +00:00
Tom Lane	82d8ab6fc4	Fix the plan-invalidation mechanism to treat regclass constants that refer to a relation as a reason to invalidate a plan when the relation changes. This handles scenarios such as dropping/recreating a sequence that is referenced by nextval('seq') in a cached plan. Rather than teach plancache.c all about digging through plan trees to find regclass Consts, we charge the planner's setrefs.c with making a list of the relation OIDs on which each plan depends. That way the list can be built cheaply during a plan tree traversal that has to happen anyway. Per bug #3662 and subsequent discussion.	2007-10-11 18:05:27 +00:00
Tom Lane	89db887b1e	Keep the planner from failing on "WHERE false AND something IN (SELECT ...)". eval_const_expressions simplifies this to just "WHERE false", but we have already done pull_up_IN_clauses so the IN join will be done, or at least planned, anyway. The trouble case comes when the sub-SELECT is itself a join and we decide to implement the IN by unique-ifying the sub-SELECT outputs: with no remaining reference to the output Vars in WHERE, we won't have propagated the Vars up to the upper join point, leading to "variable not found in subplan target lists" error. Fix by adding an extra scan of in_info_list and forcing all Vars mentioned therein to be propagated up to the IN join point. Per bug report from Miroslav Sulc.	2007-10-04 20:44:47 +00:00
Tom Lane	cdf0231c88	Create a function variable "join_search_hook" to let plugins override the join search order portion of the planner; this is specifically intended to simplify developing a replacement for GEQO planning. Patch by Julius Stroffek, editorialized on by me. I renamed make_one_rel_by_joins to standard_join_search and make_rels_by_joins to join_search_one_level to better reflect their place within this scheme.	2007-09-26 18:51:51 +00:00
Tom Lane	7125687511	Fix cost estimates for EXISTS subqueries that are evaluated as initPlans (because they are uncorrelated with the immediate parent query). We were charging the full run cost to the parent node, disregarding the fact that only one row need be fetched for EXISTS. While this would only be a cosmetic issue in most cases, it might possibly affect planning outcomes if the parent query were itself a subquery to some upper query. Per recent discussion with Steve Crawford.	2007-09-22 21:36:40 +00:00
Tom Lane	282d2a03dd	HOT updates. When we update a tuple without changing any of its indexed columns, and the new version can be stored on the same heap page, we no longer generate extra index entries for the new version. Instead, index searches follow the HOT-chain links to ensure they find the correct tuple version. In addition, this patch introduces the ability to "prune" dead tuples on a per-page basis, without having to do a complete VACUUM pass to recover space. VACUUM is still needed to clean up dead index entries, however. Pavan Deolasee, with help from a bunch of other people.	2007-09-20 17:56:33 +00:00
Magnus Hagander	906b2e1b37	Rename DLLIMPORT macro to PGDLLIMPORT to avoid conflict with third party includes (like tcl) that define DLLIMPORT.	2007-07-25 12:22:54 +00:00
Tom Lane	604ffd280b	Create hooks to let a loadable plugin monitor (or even replace) the planner and/or create plans for hypothetical situations; in particular, investigate plans that would be generated using hypothetical indexes. This is a heavily-rewritten version of the hooks proposed by Gurjeet Singh for his Index Advisor project. In this formulation, the index advisor can be entirely a loadable module instead of requiring a significant part to be in the core backend, and plans can be generated for hypothetical indexes without requiring the creation and rolling-back of system catalog entries. The index advisor patch as-submitted is not compatible with these hooks, but it needs significant work anyway due to other 8.2-to-8.3 planner changes. With these hooks in the core backend, development of the advisor can proceed as a pgfoundry project.	2007-05-25 17:54:25 +00:00
Tom Lane	d7153c5fad	Fix best_inner_indexscan to return both the cheapest-total-cost and cheapest-startup-cost innerjoin indexscans, and make joinpath.c consider both of these (when different) as the inside of a nestloop join. The original design was based on the assumption that indexscan paths always have negligible startup cost, and so total cost is the only important figure of merit; an assumption that's obviously broken by bitmap indexscans. This oversight could lead to choosing poor plans in cases where fast-start behavior is more important than total cost, such as LIMIT and IN queries. 8.1-vintage brain fade exposed by an example from Chuck D.	2007-05-22 01:40:33 +00:00
Tom Lane	2415ad9831	Teach tuplestore.c to throw away data before the "mark" point when the caller is using mark/restore but not rewind or backward-scan capability. Insert a materialize plan node between a mergejoin and its inner child if the inner child is a sort that is expected to spill to disk. The materialize shields the sort from the need to do mark/restore and thereby allows it to perform its final merge pass on-the-fly; while the materialize itself is normally cheap since it won't spill to disk unless the number of tuples with equal key values exceeds work_mem. Greg Stark, with some kibitzing from Tom Lane.	2007-05-21 17:57:35 +00:00
Tom Lane	d26559dbf3	Teach tuplesort.c about "top N" sorting, in which only the first N tuples need be returned. We keep a heap of the current best N tuples and sift-up new tuples into it as we scan the input. For M input tuples this means only about Mlog(N) comparisons instead of Mlog(M), not to mention a lot less workspace when N is small --- avoiding spill-to-disk for large M is actually the most attractive thing about it. Patch includes planner and executor support for invoking this facility in ORDER BY ... LIMIT queries. Greg Stark, with some editorialization by moi.	2007-05-04 01:13:45 +00:00
Tom Lane	66888f7424	Expose more cursor-related functionality in SPI: specifically, allow access to the planner's cursor-related planning options, and provide new FETCH/MOVE routines that allow access to the full power of those commands. Small refactoring of planner(), pg_plan_query(), and pg_plan_queries() APIs to make it convenient to pass the planning options down from SPI. This is the core-code portion of Pavel Stehule's patch for scrollable cursor support in plpgsql; I'll review and apply the plpgsql changes separately.	2007-04-16 01:14:58 +00:00
Tom Lane	fa92d21a48	Avoid running build_index_pathkeys() in situations where there cannot possibly be any useful pathkeys --- to wit, queries with neither any join clauses nor any ORDER BY request. It's nearly free to check for this case and it saves a useful fraction of the planning time for simple queries.	2007-04-15 20:09:28 +00:00
Tom Lane	eab6b8b27e	Turn the rangetable used by the executor into a flat list, and avoid storing useless substructure for its RangeTblEntry nodes. (I chose to keep using the same struct node type and just zero out the link fields for unneeded info, rather than making a separate ExecRangeTblEntry type --- it seemed too fragile to have two different rangetable representations.) Along the way, put subplans into a list in the toplevel PlannedStmt node, and have SubPlan nodes refer to them by list index instead of direct pointers. Vadim wanted to do that years ago, but I never understood what he was on about until now. It makes things a whole lot more robust, because we can stop worrying about duplicate processing of subplans during expression tree traversals. That's been a constant source of bugs, and it's finally gone. There are some consequent simplifications yet to be made, like not using a separate EState for subplans in the executor, but I'll tackle that later.	2007-02-22 22:00:26 +00:00
Tom Lane	9cbd0c155d	Remove the Query structure from the executor's API. This allows us to stop storing mostly-redundant Query trees in prepared statements, portals, etc. To replace Query, a new node type called PlannedStmt is inserted by the planner at the top of a completed plan tree; this carries just the fields of Query that are still needed at runtime. The statement lists kept in portals etc. now consist of intermixed PlannedStmt and bare utility-statement nodes --- no Query. This incidentally allows us to remove some fields from Query and Plan nodes that shouldn't have been there in the first place. Still to do: simplify the execution-time range table; at the moment the range table passed to the executor still contains Query trees for subqueries. initdb forced due to change of stored rules.	2007-02-20 17:32:18 +00:00
Tom Lane	7c5e5439d2	Get rid of some old and crufty global variables in the planner. When this code was last gone over, there wasn't really any alternative to globals because we didn't have the PlannerInfo struct being passed all through the planner code. Now that we do, we can restructure things to avoid non-reentrancy. I'm fooling with this because otherwise I'd have had to add another global variable for the planned compact range table list.	2007-02-19 07:03:34 +00:00
Tom Lane	6bef118b01	Restructure code that is responsible for ensuring that clauseless joins are considered when it is necessary to do so because of a join-order restriction (that is, an outer-join or IN-subselect construct). The former coding was a bit ad-hoc and inconsistent, and it missed some cases, as exposed by Mario Weilguni's recent bug report. His specific problem was that an IN could be turned into a "clauseless" join due to constant-propagation removing the IN's joinclause, and if the IN's subselect involved more than one relation and there was more than one such IN linking to the same upper relation, then the only valid join orders involve "bushy" plans but we would fail to consider the specific paths needed to get there. (See the example case added to the join regression test.) On examining the code I wonder if there weren't some other problem cases too; in particular it seems that GEQO was defending against a different set of corner cases than the main planner was. There was also an efficiency problem, in that when we did realize we needed a clauseless join because of an IN, we'd consider clauseless joins against every other relation whether this was sensible or not. It seems a better design is to use the outer-join and in-clause lists as a backup heuristic, just as the rule of joining only where there are joinclauses is a heuristic: we'll join two relations if they have a usable joinclause or this might be necessary to satisfy an outer-join or IN-clause join order restriction. I refactored the code to have just one place considering this instead of three, and made sure that it covered all the cases that any of them had been considering. Backpatch as far as 8.1 (which has only the IN-clause form of the disease). By rights 8.0 and 7.4 should have the bug too, but they accidentally fail to fail, because the joininfo structure used in those releases preserves some memory of there having once been a joinclause between the inner and outer sides of an IN, and so it leads the code in the right direction anyway. I'll be conservative and not touch them.	2007-02-16 00:14:01 +00:00
Tom Lane	5a7471c307	Add COST and ROWS options to CREATE/ALTER FUNCTION, plus underlying pg_proc columns procost and prorows, to allow simple user adjustment of the estimated cost of a function call, as well as control of the estimated number of rows returned by a set-returning function. We might eventually wish to extend this to allow function-specific estimation routines, but there seems to be consensus that we should try a simple constant estimate first. In particular this provides a relatively simple way to control the order in which different WHERE clauses are applied in a plan node, which is a Good Thing in view of the fact that the recent EquivalenceClass planner rewrite made that much less predictable than before.	2007-01-22 01:35:23 +00:00
Tom Lane	f41803bb39	Refactor planner's pathkeys data structure to create a separate, explicit representation of equivalence classes of variables. This is an extensive rewrite, but it brings a number of benefits: * planner no longer fails in the presence of "incomplete" operator families that don't offer operators for every possible combination of datatypes. * avoid generating and then discarding redundant equality clauses. * remove bogus assumption that derived equalities always use operators named "=". * mergejoins can work with a variety of sort orders (e.g., descending) now, instead of tying each mergejoinable operator to exactly one sort order. * better recognition of redundant sort columns. * can make use of equalities appearing underneath an outer join.	2007-01-20 20:45:41 +00:00
Tom Lane	a191a169d6	Change the planner-to-executor API so that the planner tells the executor which comparison operators to use for plan nodes involving tuple comparison (Agg, Group, Unique, SetOp). Formerly the executor looked up the default equality operator for the datatype, which was really pretty shaky, since it's possible that the data being fed to the node is sorted according to some nondefault operator class that could have an incompatible idea of equality. The planner knows what it has sorted by and therefore can provide the right equality operator to use. Also, this change moves a couple of catalog lookups out of the executor and into the planner, which should help startup time for pre-planned queries by some small amount. Modify the planner to remove some other cavalier assumptions about always being able to use the default operators. Also add "nulls first/last" info to the Plan node for a mergejoin --- neither the executor nor the planner can cope yet, but at least the API is in place.	2007-01-10 18:06:05 +00:00
Bruce Momjian	29dccf5fe0	Update CVS HEAD for 2007 copyright. Back branches are typically not back-stamped for this.	2007-01-05 22:20:05 +00:00
Tom Lane	a78fcfb512	Restructure operator classes to allow improved handling of cross-data-type cases. Operator classes now exist within "operator families". While most families are equivalent to a single class, related classes can be grouped into one family to represent the fact that they are semantically compatible. Cross-type operators are now naturally adjunct parts of a family, without having to wedge them into a particular opclass as we had done originally. This commit restructures the catalogs and cleans up enough of the fallout so that everything still works at least as well as before, but most of the work needed to actually improve the planner's behavior will come later. Also, there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way to create a new family right now is to allow CREATE OPERATOR CLASS to make one by default. I owe some more documentation work, too. But that can all be done in smaller pieces once this infrastructure is in place.	2006-12-23 00:43:13 +00:00
Tom Lane	f18c57fdf1	Fix planner to do the right thing when a degenerate outer join (one whose joinclause doesn't use any outer-side vars) requires a "bushy" plan to be created. The normal heuristic to avoid joins with no joinclause has to be overridden in that case. Problem is new in 8.2; before that we forced the outer join order anyway. Per example from Teodor.	2006-12-12 21:31:02 +00:00
Bruce Momjian	f99a569a2e	pgindent run for 8.2.	2006-10-04 00:30:14 +00:00
Tom Lane	0f8fc35a5a	Increase default value of effective_cache_size to 128MB, per discussion.	2006-09-25 22:12:24 +00:00
Tom Lane	b74c543685	Improve usage of effective_cache_size parameter by assuming that all the tables in the query compete for cache space, not just the one we are currently costing an indexscan for. This seems more realistic, and it definitely will help in examples recently exhibited by Stefan Kaltenbrunner. To get the total size of all the tables involved, we must tweak the handling of 'append relations' a bit --- formerly we looked up information about the child tables on-the-fly during set_append_rel_pathlist, but it needs to be done before we start doing any cost estimation, so push it into the add_base_rels_to_query scan.	2006-09-19 22:49:53 +00:00
Tom Lane	7a3e30e608	Add INSERT/UPDATE/DELETE RETURNING, with basic docs and regression tests. plpgsql support to come later. Along the way, convert execMain's SELECT INTO support into a DestReceiver, in order to eliminate some ugly special cases. Jonah Harris and Tom Lane	2006-08-12 02:52:06 +00:00
Joe Conway	9caafda579	Add support for multi-row VALUES clauses as part of INSERT statements (e.g. "INSERT ... VALUES (...), (...), ...") and elsewhere as allowed by the spec. (e.g. similar to a FROM clause subselect). initdb required. Joe Conway and Tom Lane.	2006-08-02 01:59:48 +00:00
Tom Lane	09d3670df3	Change the relation_open protocol so that we obtain lock on a relation (table or index) before trying to open its relcache entry. This fixes race conditions in which someone else commits a change to the relation's catalog entries while we are in process of doing relcache load. Problems of that ilk have been reported sporadically for years, but it was not really practical to fix until recently --- for instance, the recent addition of WAL-log support for in-place updates helped. Along the way, remove pg_am.amconcurrent: all AMs are now expected to support concurrent update.	2006-07-31 20:09:10 +00:00
Peter Eisentraut	79bc99a467	Convert effective_cache_size to an integer, for better integration with upcoming units feature.	2006-07-26 11:35:56 +00:00
Bruce Momjian	085e559654	Change LIMIT/OFFSET to use int8 Dhanaraj M	2006-07-26 00:34:48 +00:00
Tom Lane	98359c3e3f	In the recent changes to make the planner account better for cache effects in a nestloop inner indexscan, I had only dealt with plain index scans and the index portion of bitmap scans. But there will be cache benefits for the heap accesses of bitmap scans too, so fix cost_bitmap_heap_scan() to account for that.	2006-07-22 15:41:56 +00:00
Tom Lane	9b556322c5	Fix some missing inclusions identified with new pgcheckdefines tool.	2006-07-15 03:35:21 +00:00
Bruce Momjian	a22d76d96a	Allow include files to compile own their own. Strip unused include files out unused include files, and add needed includes to C files. The next step is to remove unused include files in C files.	2006-07-13 16:49:20 +00:00
Tom Lane	cffd89ca73	Revise the planner's handling of "pseudoconstant" WHERE clauses, that is clauses containing no variables and no volatile functions. Such a clause can be used as a one-time qual in a gating Result plan node, to suppress plan execution entirely when it is false. Even when the clause is true, putting it in a gating node wins by avoiding repeated evaluation of the clause. In previous PG releases, query_planner() would do this for pseudoconstant clauses appearing at the top level of the jointree, but there was no ability to generate a gating Result deeper in the plan tree. To fix it, get rid of the special case in query_planner(), and instead process pseudoconstant clauses through the normal RestrictInfo qual distribution mechanism. When a pseudoconstant clause is found attached to a path node in create_plan(), pull it out and generate a gating Result at that point. This requires special-casing pseudoconstants in selectivity estimation and cost_qual_eval, but on the whole it's pretty clean. It probably even makes the planner a bit faster than before for the normal case of no pseudoconstants, since removing pull_constant_clauses saves one useless traversal of the qual tree. Per gripe from Phil Frost.	2006-07-01 18:38:33 +00:00
Tom Lane	8a30cc2127	Make the planner estimate costs for nestloop inner indexscans on the basis that the Mackert-Lohmann formula applies across all the repetitions of the nestloop, not just each scan independently. We use the M-L formula to estimate the number of pages fetched from the index as well as from the table; that isn't what it was designed for, but it seems reasonably applicable anyway. This makes large numbers of repetitions look much cheaper than before, which accords with many reports we've received of overestimation of the cost of a nestloop. Also, change the index access cost model to charge random_page_cost per index leaf page touched, while explicitly not counting anything for access to metapage or upper tree pages. This may all need tweaking after we get some field experience, but in simple tests it seems to be giving saner results than before. The main thing is to get the infrastructure in place to let cost_index() and amcostestimate functions take repeated scans into account at all. Per my recent proposal. Note: this patch changes pg_proc.h, but I did not force initdb because the changes are basically cosmetic --- the system does not look into pg_proc to decide how to call an index amcostestimate function, and there's no way to call such a function from SQL at all.	2006-06-06 17:59:58 +00:00
Tom Lane	e4de635a2b	Increase the default value of cpu_index_tuple_cost from 0.001 to 0.005. This shouldn't affect simple indexscans much, while for bitmap scans that are touching a lot of index rows, this seems to bring the estimates more in line with reality. Per recent discussion.	2006-06-05 03:03:42 +00:00
Tom Lane	eed6c9ed7e	Add a GUC parameter seq_page_cost, and use that everywhere we formerly assumed that a sequential page fetch has cost 1.0. This patch doesn't in itself change the system's behavior at all, but it opens the door to people adopting other units of measurement for EXPLAIN costs. Also, if we ever decide it's worth inventing per-tablespace access cost settings, this change provides a workable intellectual framework for that.	2006-06-05 02:49:58 +00:00
Bruce Momjian	f2f5b05655	Update copyright for 2006. Update scripts.	2006-03-05 15:59:11 +00:00
Tom Lane	336a6491aa	Improve my initial, rather hacky implementation of joins to append relations: fix the executor so that we can have an Append plan on the inside of a nestloop and still pass down outer index keys to index scans within the Append, then generate such plans as if they were regular inner indexscans. This avoids the need to evaluate the outer relation multiple times.	2006-02-05 02:59:17 +00:00
Tom Lane	3893127431	Fix constraint exclusion to work in inherited UPDATE/DELETE queries ... in fact, it will be applied now in any query whatsoever. I'm still a bit concerned about the cycles that might be expended in failed proof attempts, but given that CE is turned off by default, it's the user's choice whether to expend those cycles or not. (Possibly we should change the simple bool constraint_exclusion parameter to something more fine-grained?)	2006-02-04 23:03:20 +00:00
Tom Lane	8b109ebf14	Teach planner to convert simple UNION ALL subqueries into append relations, thereby sharing code with the inheritance case. This puts the UNION-ALL-view approach to partitioned tables on par with inheritance, so far as constraint exclusion is concerned: it works either way. (Still need to update the docs to say so.) The definition of "simple UNION ALL" is a little simpler than I would like --- basically the union arms can only be SELECT * FROM foo --- but it's good enough for partitioned-table cases.	2006-02-03 21:08:49 +00:00
Bruce Momjian	59bb147353	Update random() usage so ranges are inclusive/exclusive as required.	2006-02-03 12:45:47 +00:00
Tom Lane	8a1468af4e	Restructure planner's handling of inheritance. Rather than processing inheritance trees on-the-fly, which pretty well constrained us to considering only one way of planning inheritance, expand inheritance sets during the planner prep phase, and build a side data structure that can be consulted later to find which RTEs are members of which inheritance sets. As proof of concept, use the data structure to plan joins against inheritance sets more efficiently: we can now use indexes on the set members in inner-indexscan joins. (The generated plans could be improved further, but it'll take some executor changes.) This data structure will also support handling UNION ALL subqueries in the same way as inheritance sets, but that aspect of it isn't finished yet.	2006-01-31 21:39:25 +00:00
Tom Lane	a1b7e70c5f	Fix code that checks to see if an index can be considered to match the query's requested sort order. It was assuming that build_index_pathkeys always generates a pathkey per index column, which was not true if implied equality deduction had determined that two index columns were effectively equated to each other. Simplest fix seems to be to install an option that causes build_index_pathkeys to support this behavior as well as the original one. Per report from Brian Hirt.	2006-01-29 17:27:42 +00:00
Tom Lane	3a0a16cb7e	Allow row comparisons to be used as indexscan qualifications. This completes the project to upgrade our handling of row comparisons.	2006-01-25 20:29:24 +00:00
Tom Lane	e3b9852728	Teach planner how to rearrange join order for some classes of OUTER JOIN. Per my recent proposal. I ended up basing the implementation on the existing mechanism for enforcing valid join orders of IN joins --- the rules for valid outer-join orders are somewhat similar.	2005-12-20 02:30:36 +00:00
Tom Lane	da27c0a1ef	Teach tid-scan code to make use of "ctid = ANY (array)" clauses, so that "ctid IN (list)" will still work after we convert IN to ScalarArrayOpExpr. Make some minor efficiency improvements while at it, such as ensuring that multiple TIDs are fetched in physical heap order. And fix EXPLAIN so that it shows what's really going on for a TID scan.	2005-11-26 22:14:57 +00:00
Tom Lane	290166f934	Teach planner and executor to handle ScalarArrayOpExpr as an indexable qualification when the underlying operator is indexable and useOr is true. That is, indexkey op ANY (ARRAY[...]) is effectively translated into an OR combination of one indexscan for each array element. This only works for bitmap index scans, of course, since regular indexscans no longer support OR'ing of scans. There are still some loose ends to clean up before changing 'x IN (list)' to translate as a ScalarArrayOpExpr; for instance predtest.c ought to be taught about it. But this gets the basic functionality in place.	2005-11-25 19:47:50 +00:00
Tom Lane	1bdf124b94	Restore the former RestrictInfo field valid_everywhere (but invert the flag sense and rename to "outerjoin_delayed" to more clearly reflect what it means). I had decided that it was redundant in 8.1, but the folly of this is exposed by a bug report from Sebastian Böck. The place where it's needed is to prevent orindxpath.c from cherry-picking arms of an outer-join OR clause to form a relation restriction that isn't actually legal to push down to the relation scan level. There may be some legal cases that this forbids optimizing, but we'd need much closer analysis to determine it.	2005-11-14 23:54:23 +00:00
Bruce Momjian	1dc3498251	Standard pgindent run for 8.1.	2005-10-15 02:49:52 +00:00
Tom Lane	2e1254e7fa	Repair planning bug introduced in 7.4: outer-join ON clauses that referenced only the inner-side relation would be considered as potential equijoin clauses, which is wrong because the condition doesn't necessarily hold above the point of the outer join. Per test case from Kevin Grittner (bug#1916).	2005-09-28 21:17:02 +00:00
Tom Lane	4e5fbb34b3	Change the division of labor between grouping_planner and query_planner so that the latter estimates the number of groups that grouping will produce. This is needed because it is primarily query_planner that makes the decision between fast-start and fast-finish plans, and in the original coding it was unable to make more than a crude rule-of-thumb choice when the query involved grouping. This revision helps us make saner choices for queries like SELECT ... GROUP BY ... LIMIT, as in a recent example from Mark Kirkwood. Also move the responsibility for canonicalizing sort_pathkeys and group_pathkeys into query_planner; this information has to be available anyway to support the first change, and doing it this way lets us get rid of compare_noncanonical_pathkeys entirely.	2005-08-27 22:13:44 +00:00
Bruce Momjian	a7f49252d2	enable_constraint_exclusion => constraint_exclusion Also improve wording.	2005-08-22 17:35:03 +00:00
Tom Lane	dfdf07aab1	Fix up LIMIT/OFFSET planning so that we cope with non-constant LIMIT or OFFSET clauses by using estimate_expression_value(). The main advantage of this is that if the expression is a Param and we have a value for the Param, we'll use that value rather than defaulting. Also, fix some thinkos in the logic for combining LIMIT/OFFSET with an externally supplied tuple fraction (this covers cases like EXISTS(...LIMIT...)). And make sure the results of all this are shown by EXPLAIN. Per a gripe from Merlin Moncure.	2005-08-18 17:51:12 +00:00
Tom Lane	a4ca842319	Fix a bunch of bad interactions between partial indexes and the new planning logic for bitmap indexscans. Partial indexes create corner cases in which a scan might be done with no explicit index qual conditions, and the code wasn't handling those cases nicely. Also be a little tenser about eliminating redundant clauses in the generated plan. Per report from Dmitry Karasik.	2005-07-28 20:26:22 +00:00
Tom Lane	d007a95055	Simple constraint exclusion. For now, only child tables of inheritance scans are candidates for exclusion; this should be fixed eventually. Simon Riggs, with some help from Tom Lane.	2005-07-23 21:05:48 +00:00
Tom Lane	cc5e80b8d1	Teach planner about some cases where a restriction clause can be propagated inside an outer join. In particular, given LEFT JOIN ON (A = B) WHERE A = constant, we cannot conclude that B = constant at the top level (B might be null instead), but we can nonetheless put a restriction B = constant into the quals for B's relation, since no inner-side rows not meeting that condition can contribute to the final result. Similarly, given FULL JOIN USING (J) WHERE J = constant, we can't directly conclude that either input J variable = constant, but it's OK to push such quals into each input rel. Per recent gripe from Kim Bisgaard. Along the way, remove 'valid_everywhere' flag from RestrictInfo, as on closer analysis it was not being used for anything, and was defined backwards anyway.	2005-07-02 23:00:42 +00:00
Tom Lane	2f1210629c	Separate predicate-testing code out of indxpath.c, making it a module in its own right. As proposed by Simon Riggs, but with some editorializing of my own.	2005-06-10 22:25:37 +00:00
Tom Lane	3b167a4099	If a LIMIT is applied to a UNION ALL query, plan each UNION arm as if the limit were directly applied to it. This does not actually add a LIMIT plan node to the generated subqueries --- that would be useless overhead --- but it does cause the planner to prefer fast- start plans when the limit is small. After an idea from Phil Endecott.	2005-06-10 02:21:05 +00:00
Tom Lane	a31ad27fc5	Simplify the planner's join clause management by storing join clauses of a relation in a flat 'joininfo' list. The former arrangement grouped the join clauses according to the set of unjoined relids used in each; however, profiling on test cases involving lots of joins proves that that data structure is a net loss. It takes more time to group the join clauses together than is saved by avoiding duplicate tests later. It doesn't help any that there are usually not more than one or two clauses per group ...	2005-06-09 04:19:00 +00:00
Tom Lane	9ab4d98168	Remove planner's private fields from Query struct, and put them into a new PlannerInfo struct, which is passed around instead of the bare Query in all the planning code. This commit is essentially just a code-beautification exercise, but it does open the door to making larger changes to the planner data structures without having to muck with the widely-known Query struct.	2005-06-05 22:32:58 +00:00
Tom Lane	e2159f3842	Teach the planner to remove SubqueryScan nodes from the plan if they aren't doing anything useful (ie, neither selection nor projection). Also, extend to SubqueryScan the hacks already in place to avoid unnecessary ExecProject calls when the result would just be the same tuple the subquery already delivered. This saves some overhead in UNION and other set operations, as well as avoiding overhead for unflatten-able subqueries. Per example from Sokolov Yura.	2005-05-22 22:30:20 +00:00
Tom Lane	79a1b00226	Replace slightly klugy create_bitmap_restriction() function with a more efficient routine in restrictinfo.c (which can make use of make_restrictinfo_internal).	2005-04-25 02:14:48 +00:00
Tom Lane	5b05185262	Remove support for OR'd indexscans internal to a single IndexScan plan node, as this behavior is now better done as a bitmap OR indexscan. This allows considerable simplification in nodeIndexscan.c itself as well as several planner modules concerned with indexscan plan generation. Also we can improve the sharing of code between regular and bitmap indexscans, since they are now working with nigh-identical Plan nodes.	2005-04-25 01:30:14 +00:00
Tom Lane	bc843d3960	First cut at planner support for bitmap index scans. Lots to do yet, but the code is basically working. Along the way, rewrite the entire approach to processing OR index conditions, and make it work in join cases for the first time ever. orindxpath.c is now basically obsolete, but I left it in for the time being to allow easy comparison testing against the old implementation.	2005-04-22 21:58:32 +00:00
Tom Lane	14c7fba3f7	Rethink original decision to use AND/OR Expr nodes to represent bitmap logic operations during planning. Seems cleaner to create two new Path node types, instead --- this avoids duplication of cost-estimation code. Also, create an enable_bitmapscan GUC parameter to control use of bitmap plans.	2005-04-21 19:18:13 +00:00
Tom Lane	e6f7edb9d5	Install some slightly realistic cost estimation for bitmap index scans.	2005-04-21 02:28:02 +00:00
Tom Lane	4a8c5d0375	Create executor and planner-backend support for decoupled heap and index scans, using in-memory tuple ID bitmaps as the intermediary. The planner frontend (path creation and cost estimation) is not there yet, so none of this code can be executed. I have tested it using some hacked planner code that is far too ugly to see the light of day, however. Committing now so that the bulk of the infrastructure changes go in before the tree drifts under me.	2005-04-19 22:35:18 +00:00
Tom Lane	7ace43e0c2	Fix oversight in MIN/MAX optimization: must not return NULL entries from index, since the aggregates ignore NULLs.	2005-04-12 05:11:28 +00:00
Tom Lane	addc42c339	Create the planner mechanism for optimizing simple MIN and MAX queries into indexscans on matching indexes. For the moment, it only handles int4 and text datatypes; next step is to add a column to pg_aggregate so that all MIN/MAX aggregates can be handled. Per my recent proposal.	2005-04-11 23:06:57 +00:00
Tom Lane	ad161bcc8a	Merge Resdom nodes into TargetEntry nodes to simplify code and save a few palloc's. I also chose to eliminate the restype and restypmod fields entirely, since they are redundant with information stored in the node's contained expression; re-examining the expression at need seems simpler and more reliable than trying to keep restype/restypmod up to date. initdb forced due to change in contents of stored rules.	2005-04-06 16:34:07 +00:00
Tom Lane	5db2e83852	Rethink the order of expression preprocessing: eval_const_expressions really ought to run before canonicalize_qual, because it can now produce forms that canonicalize_qual knows how to improve (eg, NOT clauses). Also, because eval_const_expressions already knows about flattening nested ANDs and ORs into N-argument form, the initial flatten_andors pass in canonicalize_qual is now completely redundant and can be removed. This doesn't save a whole lot of code, but the time and palloc traffic eliminated is a useful gain on large expression trees.	2005-03-28 00:58:26 +00:00
Tom Lane	926e8a00d3	Add a back-link from IndexOptInfo structs to their parent RelOptInfo structs. There are many places in the planner where we were passing both a rel and an index to subroutines, and now need only pass the index struct. Notationally simpler, and perhaps a tad faster.	2005-03-27 06:29:49 +00:00
Tom Lane	febc9a613c	Expand the 'special index operator' machinery to handle special cases for boolean indexes. Previously we would only use such an index with WHERE clauses like 'indexkey = true' or 'indexkey = false'. The new code transforms the cases 'indexkey', 'NOT indexkey', 'indexkey IS TRUE', and 'indexkey IS FALSE' into one of these. While this is only marginally useful in itself, I intend soon to change constant-expression simplification so that 'foo = true' and 'foo = false' are reduced to just 'foo' and 'NOT foo' ... which would lose the ability to use boolean indexes for such queries at all, if the indexscan machinery couldn't make the reverse transformation.	2005-03-26 23:29:20 +00:00
Neil Conway	d344505d1b	This patch moves some code for preprocessing FOR UPDATE from grouping_planner() to preprocess_targetlist(), according to a comment in grouping_planner(). I think the refactoring makes sense, and moves some extraneous details out of grouping_planner().	2005-03-17 23:45:09 +00:00
Tom Lane	595ed2a855	Make the behavior of HAVING without GROUP BY conform to the SQL spec. Formerly, if such a clause contained no aggregate functions we mistakenly treated it as equivalent to WHERE. Per spec it must cause the query to be treated as a grouped query of a single group, the same as appearance of aggregate functions would do. Also, the HAVING filter must execute after aggregate function computation even if it itself contains no aggregate functions.	2005-03-10 23:21:26 +00:00
Tom Lane	0bf2587df4	Improve planner's estimation of the space needed for HashAgg plans: look at the actual aggregate transition datatypes and the actual overhead needed by nodeAgg.c, instead of using pessimistic round numbers. Per a discussion with Michael Tiemann.	2005-01-28 19:34:28 +00:00
Tom Lane	94e4778a31	The result of a FULL or RIGHT join can't be assumed to be sorted by the left input's sorting, because null rows may be inserted at various points. Per report from Ferenc Lutischá¸n.	2005-01-23 02:21:36 +00:00
PostgreSQL Daemon	2ff501590b	Tag appropriate files for rc3 Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...	2004-12-31 22:04:05 +00:00
Tom Lane	9309d5f2ba	In ALTER COLUMN TYPE, strip any implicit coercion operations appearing at the top level of the column's old default expression before adding an implicit coercion to the new column type. This seems to satisfy the principle of least surprise, as per discussion of bug #1290.	2004-10-22 17:20:05 +00:00
Tom Lane	26112850ec	Fix OR-index-scan planner to recognize that a partial index is usable for scanning one term of an OR clause if the index's predicate is implied by that same OR clause term (possibly in conjunction with top-level WHERE clauses). Per recent example from Dawid Kuroczko, http://archives.postgresql.org/pgsql-performance/2004-10/msg00095.php Also, fix a very long-standing bug in index predicate testing, namely the bizarre ordering of decomposition of predicate and restriction clauses. AFAICS the correct way is to break down the predicate all the way, and then for each component term see if you can prove it from the entire restriction set. The original coding had a purely-implementation-artifact distinction between ANDing at the top level and ANDing below that, and proceeded to get the decomposition order wrong everywhere below the top level, with the result that even slightly complicated AND/OR predicates could not be proven. For instance, given create index foop on foo(f2) where f1=42 or f1=1 or (f1 = 11 and f2 = 55); the old code would fail to match this index to the query select * from foo where f1 = 11 and f2 = 55; when it obviously ought to match.	2004-10-11 22:57:00 +00:00
Tom Lane	47aa95e951	Clean up handling of inherited-table update queries, per bug report from Sebastian Böck. The fix involves being more consistent about when rangetable entries are copied or modified. Someday we really need to fix this stuff to not scribble on its input data structures in the first place...	2004-10-02 22:39:49 +00:00
Bruce Momjian	b6b71b85bc	Pgindent run for 8.0.	2004-08-29 05:07:03 +00:00
Bruce Momjian	da9a8649d8	Update copyright to 2004.	2004-08-29 04:13:13 +00:00
Tom Lane	7643bed58e	When using extended-query protocol, postpone planning of unnamed statements until Bind is received, so that actual parameter values are visible to the planner. Make use of the parameter values for estimation purposes (but don't fold them into the actual plan). This buys back most of the potential loss of plan quality that ensues from using out-of-line parameters instead of putting literal values right into the query text. This patch creates a notion of constant-folding expressions 'for estimation purposes only', in which case we can be more aggressive than the normal eval_const_expressions() logic can be. Right now the only difference in behavior is inserting bound values for Params, but it will be interesting to look at other possibilities. One that we've seen come up repeatedly is reducing now() and related functions to current values, so that queries like ... WHERE timestampcol > now() - '1 day' have some chance of being planned effectively. Oliver Jowett, with some kibitzing from Tom Lane.	2004-06-11 01:09:22 +00:00
Tom Lane	2f63232d30	Promote row expressions to full-fledged citizens of the expression syntax, rather than allowing them only in a few special cases as before. In particular you can now pass a ROW() construct to a function that accepts a rowtype parameter. Internal generation of RowExprs fixes a number of corner cases that used to not work very well, such as referencing the whole-row result of a JOIN or subquery. This represents a further step in the work I started a month or so back to make rowtype values into first-class citizens.	2004-05-10 22:44:49 +00:00
Tom Lane	989067bd22	Extend set-operation planning to keep track of the sort ordering induced by the set operation, so that redundant sorts at higher levels can be avoided. This was foreseen a good while back, but not done. Per request from Karel Zak.	2004-04-07 18:17:25 +00:00
Tom Lane	04226b6404	Tweak planner so that index expressions and predicates are matched to queries without regard to whether coercions are stated explicitly or implicitly. Per suggestion from Stephan Szabo.	2004-03-14 23:41:27 +00:00
Tom Lane	a536ed53bc	Make use of statistics on index expressions. There are still some corner cases that could stand improvement, but it does all the basic stuff. A byproduct is that the selectivity routines are no longer constrained to working on simple Vars; we might in future be able to improve the behavior for subexpressions that don't match indexes.	2004-02-17 00:52:53 +00:00
Tom Lane	3969f2924b	Revise GEQO planner to make use of some heuristic knowledge about SQL, namely that it's good to join where there are join clauses rather than where there are not. Also enable it to generate bushy plans at need, so that it doesn't fail in the presence of multiple IN clauses containing sub-joins. These changes appear to improve the behavior enough that we can substantially reduce the default pool size and generations count, thereby decreasing the runtime, and yet get as good or better plans as we were getting in 7.4. Consequently, adjust the default GEQO parameters. I also modified the way geqo_effort is used so that it affects both population size and number of generations; it's now useful as a single control to adjust the GEQO runtime-vs-plan-quality tradeoff. Bump geqo_threshold to 12, since even with these changes GEQO seems to be slower than the regular planner at 11 relations.	2004-01-23 23:54:21 +00:00
Tom Lane	672a807028	Repair error apparently introduced in the initial coding of GUC: the default value for geqo_effort is supposed to be 40, not 1. The actual 'genetic' component of the GEQO algorithm has been practically disabled since 7.1 because of this mistake. Improve documentation while at it.	2004-01-21 23:33:34 +00:00
Tom Lane	6bdfde9a77	When testing whether a sub-plan can do projection, use a general-purpose check instead of hardwiring assumptions that only certain plan node types can appear at the places where we are testing. This was always a pretty fragile assumption, and it turns out to be broken in 7.4 for certain cases involving IN-subselect tests that need type coercion. Also, modify code that builds finished Plan tree so that node types that don't do projection always copy their input node's targetlist, rather than having the tlist passed in from the caller. The old method makes it too easy to write broken code that thinks it can modify the tlist when it cannot.	2004-01-18 00:50:03 +00:00
Tom Lane	fa559a86ee	Adjust indexscan planning logic to keep RestrictInfo nodes associated with index qual clauses in the Path representation. This saves a little work during createplan and (probably more importantly) allows reuse of cached selectivity estimates during indexscan planning. Also fix latent bug: wrong plan would have been generated for a 'special operator' used in a nestloop-inner-indexscan join qual, because the special operator would not have gotten into the list of quals to recheck. This bug is only latent because at present the special-operator code could never trigger on a join qual, but sooner or later someone will want to do it.	2004-01-05 23:39:54 +00:00
Tom Lane	5c74ce23db	Improve UniquePath logic to detect the case where the input is already known unique (eg, it is a SELECT DISTINCT ... subquery), and not do a redundant unique-ification step.	2004-01-05 18:04:39 +00:00
Tom Lane	9091e8d1b2	Add the ability to extract OR indexscan conditions from OR-of-AND join conditions in which each OR subclause includes a constraint on the same relation. This implements the other useful side-effect of conversion to CNF format, without its unpleasant side-effects. As per pghackers discussion of a few weeks ago.	2004-01-05 05:07:36 +00:00
Tom Lane	82b4dd394f	Merge restrictlist_selectivity into clauselist_selectivity by teaching the latter to accept either RestrictInfo nodes or bare clause expressions; and cache the selectivity result in the RestrictInfo node when possible. This extends the caching behavior of approx_selectivity to many more contexts, and should reduce duplicate selectivity calculations.	2004-01-04 03:51:52 +00:00
Tom Lane	6cb1c0238b	Rewrite OR indexscan processing to be more flexible. We can now for the first time generate an OR indexscan for a two-column index when the WHERE condition is like 'col1 = foo AND (col2 = bar OR col2 = baz)' --- before, the OR had to be on the first column of the index or we'd not notice the possibility of using it. Some progress towards extracting OR indexscans from subclauses of an OR that references multiple relations, too, although this code is #ifdef'd out because it needs more work.	2004-01-04 00:07:32 +00:00
Tom Lane	be6c38b903	Adjust the definition of RestrictInfo's left_relids and right_relids fields: now they are valid whenever the clause is a binary opclause, not only when it is a potential join clause (there is a new boolean field canjoin to signal the latter condition). This lets us avoid recomputing the relid sets over and over while examining indexes. Still more work to do to make this as useful as it could be, because there are places that could use the info but don't have access to the RestrictInfo node.	2003-12-30 23:53:15 +00:00
Tom Lane	c607bd693f	Clean up the usage of canonicalize_qual(): in particular, be consistent about whether it is applied before or after eval_const_expressions(). I believe there were some corner cases where the system would fail to recognize that a partial index is applicable because of the previous inconsistency. Store normal rather than 'implicit AND' representations of constraints and index predicates in the catalogs. initdb forced due to representation change of constraints/predicates.	2003-12-28 21:57:37 +00:00
PostgreSQL Daemon	55b113257c	make sure the $Id tags are converted to $PostgreSQL as well ...	2003-11-29 22:41:33 +00:00
Tom Lane	48beecda7c	Remove geqo_random_seed parameter. Having geqo reset the global random() sequence every time it's called is bogus --- it interferes with user control over the seed, and actually decreases randomness overall (because a seed based on time(NULL) is pretty predictable). If you really want a reproducible result from geqo, do 'set seed = 0' before planning a query.	2003-09-07 15:26:54 +00:00
Bruce Momjian	46785776c4	Another pgindent run with updated typedefs.	2003-08-08 21:42:59 +00:00
Bruce Momjian	f3c3deb7d0	Update copyrights to 2003.	2003-08-04 02:40:20 +00:00
Bruce Momjian	089003fb46	pgindent run.	2003-08-04 00:43:34 +00:00
Tom Lane	3d09f6c560	Make cost estimates for SubqueryScan more realistic: charge cpu_tuple_cost for each row processed, and don't forget the evaluation cost of any restriction clauses attached to the node. Per discussion with Greg Stark.	2003-07-14 22:35:54 +00:00
Tom Lane	835bb975d8	Restructure building of join relation targetlists so that a join plan node emits only those vars that are actually needed above it in the plan tree. (There were comments in the code suggesting that this was done at some point in the dim past, but for a long time we have just made join nodes emit everything that either input emitted.) Aside from being marginally more efficient, this fixes the problem noted by Peter Eisentraut where a join above an IN-implemented-as-join might fail, because the subplan targetlist constructed in the latter case didn't meet the expectation of including everything. Along the way, fix some places that were O(N^2) in the targetlist length. This is not all the trouble spots for wide queries by any means, but it's a step forward.	2003-06-29 23:05:05 +00:00
Tom Lane	bee217924d	Support expressions of the form 'scalar op ANY (array)' and 'scalar op ALL (array)', where the operator is applied between the lefthand scalar and each element of the array. The operator must yield boolean; the result of the construct is the OR or AND of the per-element results, respectively. Original coding by Joe Conway, after an idea of Peter's. Rewritten by Tom to keep the implementation strictly separate from subqueries.	2003-06-29 00:33:44 +00:00
Bruce Momjian	111d8e522b	Back out array mega-patch. Joe Conway	2003-06-25 21:30:34 +00:00
Bruce Momjian	46bf651480	Array mega-patch. Joe Conway	2003-06-24 23:14:49 +00:00
Tom Lane	cb02610e50	Adjust nestloop-with-inner-indexscan plan generation so that we catch some cases of redundant clauses that were formerly not caught. We have to special-case this because the clauses involved never get attached to the same join restrictlist and so the existing logic does not notice that they are redundant.	2003-06-15 22:51:45 +00:00
Bruce Momjian	9167a566d6	Add missing DLLIMPORT for cpu_index_tuple_cost to src/include/optimizer/cost.h. This is required to compile the PostGIS extension module with Cygwin http://postgis.refractions.net Norman Vine	2003-06-11 15:01:15 +00:00
Tom Lane	e649796f12	Implement outer-level aggregates to conform to the SQL spec, with extensions to support our historical behavior. An aggregate belongs to the closest query level of any of the variables in its argument, or the current query level if there are no variables (e.g., COUNT(*)). The implementation involves adding an agglevelsup field to Aggref, and treating outer aggregates like outer variables at planning time.	2003-06-06 15:04:03 +00:00
Tom Lane	fc8d970cbc	Replace functional-index facility with expressional indexes. Any column of an index can now be a computed expression instead of a simple variable. Restrictions on expressions are the same as for predicates (only immutable functions, no sub-selects). This fixes problems recently introduced with inlining SQL functions, because the inlining transformation is applied to both expression trees so the planner can still match them up. Along the way, improve efficiency of handling index predicates (both predicates and index expressions are now cached by the relcache) and fix 7.3 oversight that didn't record dependencies of predicate expressions.	2003-05-28 16:04:02 +00:00
Tom Lane	f45df8c014	Cause CHAR(n) to TEXT or VARCHAR conversion to automatically strip trailing blanks, in hopes of reducing the surprise factor for newbies. Remove redundant operators for VARCHAR (it depends wholly on TEXT operations now). Clean up resolution of ambiguous operators/functions to avoid surprising choices for domains: domains are treated as equivalent to their base types and binary-coercibility is no longer considered a preference item when choosing among multiple operators/functions. IsBinaryCoercible now correctly reflects the notion that you need only relabel the type to get from type A to type B: that is, a domain is binary-coercible to its base type, but not vice versa. Various marginal cleanup, including merging the essentially duplicate resolution code in parse_func.c and parse_oper.c. Improve opr_sanity regression test to understand about binary compatibility (using pg_cast), and fix a couple of small errors in the catalogs revealed thereby. Restructure "special operator" handling to fetch operators via index opclasses rather than hardwiring assumptions about names (cleans up the pattern_ops stuff a little).	2003-05-26 00:11:29 +00:00
Tom Lane	2cf57c8f8d	Implement feature of new FE/BE protocol whereby RowDescription identifies the column by table OID and column number, if it's a simple column reference. Along the way, get rid of reskey/reskeyop fields in Resdoms. Turns out that representation was not convenient for either the planner or the executor; we can make the planner deliver exactly what the executor wants with no more effort. initdb forced due to change in stored rule representation.	2003-05-06 00:20:33 +00:00
Tom Lane	5f677af2da	Adjust subquery qual pushdown rules so that we can push down a qual into a UNION that has some type coercions applied to the component queries, so long as the qual itself does not reference any columns that have such coercions. Per example from Jonathan Bartlett 24-Apr-03.	2003-04-24 23:43:09 +00:00
Tom Lane	aa83bc04e0	Restructure parsetree representation of DECLARE CURSOR: now it's a utility statement (DeclareCursorStmt) with a SELECT query dangling from it, rather than a SELECT query with a few unusual fields in it. Add code to determine whether a planned query can safely be run backwards. If DECLARE CURSOR specifies SCROLL, ensure that the plan can be run backwards by adding a Materialize plan node if it can't. Without SCROLL, you get an error if you try to fetch backwards from a cursor that can't handle it. (There is still some discussion about what the exact behavior should be, but this is necessary infrastructure in any case.) Along the way, make EXPLAIN DECLARE CURSOR work.	2003-03-10 03:53:52 +00:00
Tom Lane	21591967bc	Turns out new IN implementation has got some problems in an UPDATE or DELETE with inherited target table. Fix it; add a regression test. Also, correct ancient misspelling of 'inherited'.	2003-03-05 20:01:04 +00:00
Tom Lane	056467ec6b	Teach planner how to propagate pathkeys from sub-SELECTs in FROM up to the outer query. (The implementation is a bit klugy, but it would take nontrivial restructuring to make it nicer, which this is probably not worth.) This avoids unnecessary sort steps in examples like SELECT foo,count(*) FROM (SELECT ... ORDER BY foo,bar) sub GROUP BY foo which means there is now a reasonable technique for controlling the order of inputs to custom aggregates, even in the grouping case.	2003-02-15 20:12:41 +00:00
Tom Lane	b5956a2f22	Detect case where an outer join can be reduced to a plain inner join because there are WHERE clauses that will reject the null-extended rows. Per suggestion from Brandon Craig Rhodes, 19-Nov-02.	2003-02-09 23:57:19 +00:00
Tom Lane	145014f811	Make further use of new bitmapset code: executor's chgParam, extParam, locParam lists can be converted to bitmapsets to speed updating. Also, replace 'locParam' with 'allParam', which contains all the paramIDs relevant to the node (i.e., the union of extParam and locParam); this saves a step during SetChangedParamList() without costing anything elsewhere.	2003-02-09 00:30:41 +00:00
Tom Lane	c15a4c2aef	Replace planner's representation of relation sets, per pghackers discussion. Instead of Lists of integers, we now store variable-length bitmap sets. This should be faster as well as less error-prone.	2003-02-08 20:20:55 +00:00
Tom Lane	2d1f940542	Minor code cleanup: remove no-longer-useful pull_subplans() function, and convert pull_agg_clause() into count_agg_clause(), which is a more efficient way of doing what it's really being used for.	2003-02-04 00:50:01 +00:00
Tom Lane	4cff59d8d5	Tweak planner and executor to avoid doing ExecProject() in table scan nodes where it's not really necessary. In many cases where the scan node is not the topmost plan node (eg, joins, aggregation), it's possible to just return the table tuple directly instead of generating an intermediate projection tuple. In preliminary testing, this reduced the CPU time needed for 'SELECT COUNT(*) FROM foo' by about 10%.	2003-02-03 15:07:08 +00:00
Tom Lane	2e46b762eb	Extend join-selectivity API (oprjoin interface) so that join type is passed to join selectivity estimators. Make use of this in eqjoinsel to derive non-bogus selectivity for IN clauses. Further tweaking of cost estimation for IN. initdb forced because of pg_proc.h changes.	2003-01-28 22:13:41 +00:00
Tom Lane	70fba70430	Upgrade cost estimation for joins, per discussion with Bradley Baetz. Try to model the effect of rescanning input tuples in mergejoins; account for JOIN_IN short-circuiting where appropriate. Also, recognize that mergejoin and hashjoin clauses may now be more than single operator calls, so we have to charge appropriate execution costs.	2003-01-27 20:51:54 +00:00
Tom Lane	9f5f212475	Allow the planner to collapse explicit inner JOINs together, rather than necessarily following the JOIN syntax to develop the query plan. The old behavior is still available by setting GUC variable JOIN_COLLAPSE_LIMIT to 1. Also create a GUC variable FROM_COLLAPSE_LIMIT to control the similar decision about when to collapse sub-SELECT lists into their parent lists. (This behavior existed already, but the limit was always GEQO_THRESHOLD/2; now it's separately adjustable.)	2003-01-25 23:10:30 +00:00
Tom Lane	f5e83662d0	Modify planner's implied-equality-deduction code so that when a set of known-equal expressions includes any constant expressions (including Params from outer queries), we actively suppress any 'var = var' clauses that are or could be deduced from the set, generating only the deducible 'var = const' clauses instead. The idea here is to push down the restrictions implied by the equality set to base relations whenever possible. Once we have applied the 'var = const' clauses, the 'var = var' clauses are redundant, and should be suppressed both to save work at execution and to avoid double-counting restrictivity.	2003-01-24 03:58:44 +00:00
Tom Lane	bdfbfde1b1	IN clauses appearing at top level of WHERE can now be handled as joins. There are two implementation techniques: the executor understands a new JOIN_IN jointype, which emits at most one matching row per left-hand row, or the result of the IN's sub-select can be fed through a DISTINCT filter and then joined as an ordinary relation. Along the way, some minor code cleanup in the optimizer; notably, break out most of the jointree-rearrangement preprocessing in planner.c and put it in a new file prep/prepjointree.c.	2003-01-20 18:55:07 +00:00
Tom Lane	b19adc1aae	Fix parse_agg.c to detect ungrouped Vars in sub-SELECTs; remove code that used to do it in planner. That was an ancient kluge that was never satisfactory; errors should be detected at parse time when possible. But at the time we didn't have the support mechanism (expression_tree_walker et al) to make it convenient to do in the parser.	2003-01-17 03:25:04 +00:00
Tom Lane	a4d82dd4b4	Adjust API of expression_tree_mutator and query_tree_mutator to simplify callers. It turns out the common case is that the caller does want to recurse into sub-queries, so push support for that into these subroutines.	2003-01-17 02:01:21 +00:00
Tom Lane	cde9f852e0	Now that switch_outer processing no longer relies on being run after join_references(), it's practical to consolidate all join_references() processing into the set_plan_references traversal in setrefs.c. This seems considerably cleaner than the old way where we did it for join quals in createplan.c and for targetlists in setrefs.c.	2003-01-15 23:10:32 +00:00
Tom Lane	de97072e3c	Allow merge and hash joins to occur on arbitrary expressions (anything not containing a volatile function), rather than only on 'Var = Var' clauses as before. This makes it practical to do flatten_join_alias_vars at the start of planning, which in turn eliminates a bunch of klugery inside the planner to deal with alias vars. As a free side effect, we now detect implied equality of non-Var expressions; for example in SELECT ... WHERE a.x = b.y and b.y = 42 we will deduce a.x = 42 and use that as a restriction qual on a. Also, we can remove the restriction introduced 12/5/02 to prevent pullup of subqueries whose targetlists contain sublinks. Still TODO: make statistical estimation routines in selfuncs.c and costsize.c smarter about expressions that are more complex than plain Vars. The need for this is considerably greater now that we have to be able to estimate the suitability of merge and hash join techniques on such expressions.	2003-01-15 19:35:48 +00:00
Tom Lane	56e1aab286	Reconsider mechanism for marking sub-selects that are at top level of a qualification clause (and hence can get away with being sloppy about distinguishing FALSE from UNKNOWN). We need to know this in subselect.c; marking the subplans in setrefs.c is too late.	2003-01-13 18:10:53 +00:00
Tom Lane	d4ce5a4f4c	Revise cost_qual_eval() to compute both startup (one-time) and per-tuple costs for expression evaluation, not only per-tuple cost as before. This extension is needed in order to deal realistically with hashed or materialized sub-selects.	2003-01-12 22:35:29 +00:00
Tom Lane	9f76d0d926	Fix GEQO to work again in CVS tip, by being more careful about memory allocation in best_inner_indexscan(). While at it, simplify GEQO's interface to the main planner --- make_join_rel() offers exactly the API it really wants, whereas calling make_rels_by_clause_joins() and make_rels_by_clauseless_joins() required jumping through hoops. Rewrite gimme_tree for clarity (sometimes iteration is much better than recursion), and approximately halve GEQO's runtime by recognizing that tours of the forms (a,b,c,d,...) and (b,a,c,d,...) are equivalent because of symmetry in make_join_rel().	2002-12-16 21:30:30 +00:00
Tom Lane	2d8d66628a	Clean up plantree representation of SubPlan-s --- SubLink does not appear in the planned representation of a subplan at all any more, only SubPlan. This means subselect.c doesn't scribble on its input anymore, which seems like a good thing; and there are no longer three different possible interpretations of a SubLink. Simplify node naming and improve comments in primnodes.h. No change to stored rules, though.	2002-12-14 00:17:59 +00:00
Tom Lane	b0422b215c	Preliminary code review for domain CHECK constraints patch: add documentation, make VALUE a non-reserved word again, use less invasive method of passing ConstraintTestValue into transformExpr, fix problems with nested constraint testing, do correct thing with NULL result from a constraint expression, remove memory leak. Domain checks still need much more work if we are going to allow ALTER DOMAIN, however.	2002-12-12 20:35:16 +00:00
Tom Lane	a0bf885f9e	Phase 2 of read-only-plans project: restructure expression-tree nodes so that all executable expression nodes inherit from a common supertype Expr. This is somewhat of an exercise in code purity rather than any real functional advance, but getting rid of the extra Oper or Func node formerly used in each operator or function call should provide at least a little space and speed improvement. initdb forced by changes in stored-rules representation.	2002-12-12 15:49:42 +00:00
Tom Lane	8e3a87fbd4	Teach planner to expand sufficiently simple SQL-language functions ('SELECT expression') inline, like macros, during the constant-folding phase of planning. The actual expansion is not difficult, but checking that we're not changing the semantics of the call turns out to be more subtle than one might think; in particular must pay attention to permissions issues, strictness, and volatility.	2002-12-01 21:05:14 +00:00
Tom Lane	935969415a	Be more realistic about plans involving Materialize nodes: take their cost into account while planning.	2002-11-30 05:21:03 +00:00
Tom Lane	04c8785c7b	Restructure planning of nestloop inner indexscans so that the set of usable joinclauses is determined accurately for each join. Formerly, the code only considered joinclauses that used all of the rels from the outer side of the join; thus for example FROM (a CROSS JOIN b) JOIN c ON (c.f1 = a.x AND c.f2 = b.y) could not exploit a two-column index on c(f1,f2), since neither of the qual clauses would be in the joininfo list it looked in. The new code does this correctly, and also is able to eliminate redundant clauses, thus fixing the problem noted 24-Oct-02 by Hans-Jürgen Schönig.	2002-11-24 21:52:15 +00:00
Tom Lane	6c1d4662af	Finish implementation of hashed aggregation. Add enable_hashagg GUC parameter to allow it to be forced off for comparison purposes. Add ORDER BY clauses to a bunch of regression test queries that will otherwise produce randomly-ordered output in the new regime.	2002-11-21 00:42:20 +00:00
Tom Lane	b60be3f2f8	Add an at-least-marginally-plausible method of estimating the number of groups produced by GROUP BY. This improves the accuracy of planning estimates for grouped subselects, and is needed to check whether a hashed aggregation plan risks memory overflow.	2002-11-19 23:22:00 +00:00
Bruce Momjian	6b603e67dc	Add DOMAIN check constraints. Rod Taylor	2002-11-15 02:50:21 +00:00
Tom Lane	2103b7baa2	Phase 2 of hashed-aggregation project. nodeAgg.c now knows how to do hashed aggregation, but there's not yet planner support for it.	2002-11-06 22:31:24 +00:00
Tom Lane	f6dba10e62	First phase of implementing hash-based grouping/aggregation. An AGG plan node now does its own grouping of the input rows, and has no need for a preceding GROUP node in the plan pipeline. This allows elimination of the misnamed tuplePerGroup option for GROUP, and actually saves more code in nodeGroup.c than it costs in nodeAgg.c, as well as being presumably faster. Restructure the API of query_planner so that we do not commit to using a sorted or unsorted plan in query_planner; instead grouping_planner makes the decision. (Right now it isn't any smarter than query_planner was, but that will change as soon as it has the option to select a hash- based aggregation step.) Despite all the hackery, no initdb needed since only in-memory node types changed.	2002-11-06 00:00:45 +00:00
Tom Lane	6fdc44be71	Tweak querytree-dependency-extraction code so that columns of tables that are explicitly JOINed are not considered dependencies unless they are actually used in the query: mere presence in the joinaliasvars list of a JOIN RTE doesn't count as being used. The patch touches a number of files because I needed to generalize the API of query_tree_walker to support an additional flag bit, but the changes are otherwise quite small.	2002-09-11 14:48:55 +00:00
Bruce Momjian	e50f52a074	pgindent run.	2002-09-04 20:31:48 +00:00
Tom Lane	0201dac1c3	Push down outer qualification clauses into UNION and INTERSECT subqueries. Per pghackers discussion from back around 1-August.	2002-08-29 16:03:49 +00:00
Peter Eisentraut	43515ba3f8	Remove _deadcode.	2002-07-24 19:16:43 +00:00
Peter Eisentraut	739adf32ee	Remove unused system table columns: pg_language.lancompiler pg_operator.oprprec pg_operator.oprisleft pg_proc.proimplicit pg_proc.probyte_pct pg_proc.properbyte_cpu pg_proc.propercall_cpu pg_proc.prooutin_ratio pg_shadow.usetrace pg_type.typprtlen pg_type.typreceive pg_type.typsend Attempts to use the obsoleted attributes of pg_operator or pg_proc in the CREATE commands will be greeted by a warning. For pg_type, there is no warning (yet) because pg_dump scripts still contain these attributes. Also remove new but already obsolete spellings isVolatile, isStable, isImmutable in WITH clause. (Use new syntax instead.)	2002-07-24 19:11:14 +00:00
Bruce Momjian	38dd3ae7d0	The attached patch fixes a build problem with GEQO when using the PX recombination operator, changes some elog() messages from LOG to DEBUG1, puts some debugging functions inside the appropriate #ifdef (not enabled by default), and makes a few other minor cleanups. BTW, the elog() change is motivated by at least one user who has sent a concerned email to -general asking exactly what the "ERX recombination operator" is, and what it is doing to their DBMS. Neil Conway	2002-07-20 04:59:10 +00:00
Bruce Momjian	d84fe82230	Update copyright to 2002.	2002-06-20 20:29:54 +00:00
Bruce Momjian	0dbfea39f3	Remove KSQO from GUC and move file to _deadcode.	2002-06-16 00:09:12 +00:00
Tom Lane	a5b370943e	Teach query_tree_walker, query_tree_mutator, and SS_finalize_plan to process function RTE expressions, which they were previously missing. This allows outer-Var references and subselects to work correctly in the arguments of a function RTE. Install check to prevent function RTEs from cross-referencing Vars of sibling FROM-items, which doesn't make any sense (if you want to join, write a JOIN or WHERE clause).	2002-05-18 18:49:41 +00:00
Tom Lane	51fd22abdd	Change set_plan_references and join_references to take an rtable List rather than a Query node; this allows set_plan_references to recurse into subplans correctly. Fixes core dump on full outer joins in subplans. Also, invoke preprocess_expression on function RTEs' function expressions. This seems to fix the planner's problems with outer-level Vars in function RTEs.	2002-05-18 02:25:50 +00:00
Tom Lane	3389a110d4	Get rid of long-since-vestigial Iter node type, in favor of adding a returns-set boolean field in Func and Oper nodes. This allows cleaner, more reliable tests for expressions returning sets in the planner and parser. For example, a WHERE clause returning a set is now detected and complained of in the parser, not only at runtime.	2002-05-12 23:43:04 +00:00
Tom Lane	f9e4f611a1	First pass at set-returning-functions in FROM, by Joe Conway with some kibitzing from Tom Lane. Not everything works yet, and there's no documentation or regression test, but let's commit this so Joe doesn't need to cope with tracking changes in so many files ...	2002-05-12 20:10:05 +00:00
Tom Lane	6c59886942	Second try at fixing join alias variables. Instead of attaching miscellaneous lists to join RTEs, attach a list of Vars and COALESCE expressions that will replace the join's alias variables during planning. This simplifies flatten_join_alias_vars while still making it easy to fix up varno references when transforming the query tree. Add regression test cases for interactions of subqueries with outer joins.	2002-04-28 19:54:29 +00:00
Tom Lane	4bdb4be62e	Divide functions into three volatility classes (immutable, stable, and volatile), rather than the old cachable/noncachable distinction. This allows indexscan optimizations in many places where we formerly didn't. Also, add a pronamespace column to pg_proc (it doesn't do anything yet, however).	2002-04-05 00:31:36 +00:00
Tom Lane	6eeb95f0f5	Restructure representation of join alias variables. An explicit JOIN now has an RTE of its own, and references to its outputs now are Vars referencing the JOIN RTE, rather than CASE-expressions. This allows reverse-listing in ruleutils.c to use the correct alias easily, rather than painfully reverse-engineering the alias namespace as it used to do. Also, nested FULL JOINs work correctly, because the result of the inner joins are simple Vars that the planner can cope with. This fixes a bug reported a couple times now, notably by Tatsuo on 18-Nov-01. The alias Vars are expanded into COALESCE expressions where needed at the very end of planning, rather than during parsing. Also, beginnings of support for showing plan qualifier expressions in EXPLAIN. There are probably still cases that need work. initdb forced due to change of stored-rule representation.	2002-03-12 00:52:10 +00:00
Tom Lane	63cc56de54	Suppress subquery pullup and pushdown when the subquery has any set-returning functions in its target list. This ensures that we won't rewrite the query in a way that places set-returning functions into quals (WHERE clauses). Cf. bug reports from Joe Conway.	2001-12-10 22:54:12 +00:00
Bruce Momjian	ea08e6cd55	New pgindent run with fixes suggested by Tom. Patch manually reviewed, initdb/regression tests pass.	2001-11-05 17:46:40 +00:00
Tom Lane	96ca8ffebc	Fix problems with subselects used in GROUP BY expressions, per gripe from Philip Warner. Side effect of change is that GROUP BY expressions will not be re-evaluated at multiple plan levels anymore, whereas this sometimes happened with old code.	2001-10-30 19:58:58 +00:00
Bruce Momjian	6783b2372e	Another pgindent run. Fixes enum indenting, and improves #endif spacing. Also adds space for one-line comments.	2001-10-28 06:26:15 +00:00
Bruce Momjian	b81844b173	pgindent run on all C files. Java run to follow. initdb/regression tests pass.	2001-10-25 05:50:21 +00:00
Tom Lane	6254465d06	Extend code that deduces implied equality clauses to detect whether a clause being added to a particular restriction-clause list is redundant with those already in the list. This avoids useless work at runtime, and (perhaps more importantly) keeps the selectivity estimation routines from generating too-small estimates of numbers of output rows. Also some minor improvements in OPTIMIZER_DEBUG displays.	2001-10-18 16:11:42 +00:00
Tom Lane	f933766ba7	Restructure pg_opclass, pg_amop, and pg_amproc per previous discussions in pgsql-hackers. pg_opclass now has a row for each opclass supported by each index AM, not a row for each opclass name. This allows pg_opclass to show directly whether an AM supports an opclass, and furthermore makes it possible to store additional information about an opclass that might be AM-dependent. pg_opclass and pg_amop now store "lossy" and "haskeytype" information that we previously expected the user to remember to provide in CREATE INDEX commands. Lossiness is no longer an index-level property, but is associated with the use of a particular operator in a particular index opclass. Along the way, IndexSupportInitialize now uses the syscaches to retrieve pg_amop and pg_amproc entries. I find this reduces backend launch time by about ten percent, at the cost of a couple more special cases in catcache.c's IndexScanOK. Initial work by Oleg Bartunov and Teodor Sigaev, further hacking by Tom Lane. initdb forced.	2001-08-21 16:36:06 +00:00
Tom Lane	421467cdc8	Fix optimizer to not try to push WHERE clauses down into a sub-SELECT that has a DISTINCT ON clause, per bug report from Anthony Wood. While at it, improve the DISTINCT-ON-clause recognizer routine to not be fooled by out- of-order DISTINCT lists.	2001-07-31 17:56:31 +00:00
Tom Lane	cdd230d628	Improve planning of OR indexscan plans: for quals like WHERE (a = 1 or a = 2) and b = 42 and an index on (a,b), include the clause b = 42 in the indexquals generated for each arm of the OR clause. Essentially this is an index- driven conversion from CNF to DNF. Implementation is a bit klugy, but better than not exploiting the extra quals at all ...	2001-06-05 17:13:52 +00:00
Tom Lane	7c579fa12d	Further work on making use of new statistics in planner. Adjust APIs of costsize.c routines to pass Query root, so that costsize can figure more things out by itself and not be so dependent on its callers to tell it everything it needs to know. Use selectivity of hash or merge clause to estimate number of tuples processed internally in these joins (this is more useful than it would've been before, since eqjoinsel is somewhat more accurate than before).	2001-06-05 05:26:05 +00:00
Tom Lane	be03eb25f3	Modify optimizer data structures so that IndexOptInfo lists built for create_index_paths are not immediately discarded, but are available for subsequent planner work. This allows avoiding redundant syscache lookups in several places. Change interface to operator selectivity estimation procedures to allow faster and more flexible estimation. Initdb forced due to change of pg_proc entries for selectivity functions!	2001-05-20 20:28:20 +00:00
Tom Lane	c23bc6fbb0	First cut at making indexscan cost estimates depend on correlation between index order and table order.	2001-05-09 23:13:37 +00:00
Tom Lane	f905d65ee3	Rewrite of planner statistics-gathering code. ANALYZE is now available as a separate statement (though it can still be invoked as part of VACUUM, too). pg_statistic redesigned to be more flexible about what statistics are stored. ANALYZE now collects a list of several of the most common values, not just one, plus a histogram (not just the min and max values). Random sampling is used to make the process reasonably fast even on very large tables. The number of values and histogram bins collected is now user-settable via an ALTER TABLE command. There is more still to do; the new stats are not being used everywhere they could be in the planner. But the remaining changes for this project should be localized, and the behavior is already better than before. A not-very-related change is that sorting now makes use of btree comparison routines if it can find one, rather than invoking '<' twice.	2001-05-07 00:43:27 +00:00
Tom Lane	d5096af2c4	Make the world safe for passing whole rows of views to functions. This already worked fine for whole rows of tables, but not so well for views...	2001-04-18 20:42:56 +00:00
Bruce Momjian	9e1552607a	pgindent run. Make it all clean.	2001-03-22 04:01:46 +00:00
Tom Lane	b29f68f611	Take OUTER JOIN semantics into account when estimating the size of join relations. It's not very bright, but at least it now knows that A LEFT JOIN B must produce at least as many rows as are in A ...	2001-02-16 00:03:08 +00:00
Bruce Momjian	623bf843d2	Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group.	2001-01-24 19:43:33 +00:00
Bruce Momjian	7df3bb50f0	Add all possible config file options.	2001-01-24 18:37:31 +00:00
Tom Lane	07c741e61c	Fix oversight in planning of GROUP queries: when an expression is used as both a GROUP BY item and an output expression, the top-level Group node should just copy up the evaluated expression value from its input, rather than re-evaluating the expression. Aside from any performance benefit this might offer, this avoids a crash when there is a sub-SELECT in said expression.	2001-01-09 03:48:51 +00:00
Tom Lane	ea166f1146	Planner speedup hacking. Avoid saving useless pathkeys, so that path comparison does not consider paths different when they differ only in uninteresting aspects of sort order. (We had a special case of this consideration for indexscans already, but generalize it to apply to ordered join paths too.) Be stricter about what is a canonical pathkey to allow faster pathkey comparison. Cache canonical pathkeys and dispersion stats for left and right sides of a RestrictInfo's clause, to avoid repeated computation. Total speedup will depend on number of tables in a query, but I see about 4x speedup of planning phase for a sample seven-table query.	2000-12-14 22:30:45 +00:00
Tom Lane	6543d81d65	Restructure handling of inheritance queries so that they work with outer joins, and clean things up a good deal at the same time. Append plan node no longer hacks on rangetable at runtime --- instead, all child tables are given their own RT entries during planning. Concept of multiple target tables pushed up into execMain, replacing bug-prone implementation within nodeAppend. Planner now supports generating Append plans for inheritance sets either at the top of the plan (the old way) or at the bottom. Expanding at the bottom is appropriate for tables used as sources, since they may appear inside an outer join; but we must still expand at the top when the target of an UPDATE or DELETE is an inheritance set, because we actually need a different targetlist and junkfilter for each target table in that case. Fortunately a target table can't be inside an outer join... Bizarre mutual recursion between union_planner and prepunion.c is gone --- in fact, union_planner doesn't really have much to do with union queries anymore, so I renamed it grouping_planner.	2000-11-12 00:37:02 +00:00
Tom Lane	2f35b4efdb	Re-implement LIMIT/OFFSET as a plan node type, instead of a hack in ExecutorRun. This allows LIMIT to work in a view. Also, LIMIT in a cursor declaration will behave in a reasonable fashion, whereas before it was overridden by the FETCH count.	2000-10-26 21:38:24 +00:00
Bruce Momjian	b32685a999	Add proofreader's changes to docs. Fix misspelling of disbursion to dispersion.	2000-10-05 19:48:34 +00:00
Tom Lane	05e3d0ee86	Reimplementation of UNION/INTERSECT/EXCEPT. INTERSECT/EXCEPT now meet the SQL92 semantics, including support for ALL option. All three can be used in subqueries and views. DISTINCT and ORDER BY work now in views, too. This rewrite fixes many problems with cross-datatype UNIONs and INSERT/SELECT where the SELECT yields different datatypes than the INSERT needs. I did that by making UNION subqueries and SELECT in INSERT be treated like subselects-in-FROM, thereby allowing an extra level of targetlist where the datatype conversions can be inserted safely. INITDB NEEDED!	2000-10-05 19:11:39 +00:00
Tom Lane	3a94e789f5	Subselects in FROM clause, per ISO syntax: FROM (SELECT ...) [AS] alias. (Don't forget that an alias is required.) Views reimplemented as expanding to subselect-in-FROM. Grouping, aggregates, DISTINCT in views actually work now (he says optimistically). No UNION support in subselects/views yet, but I have some ideas about that. Rule-related permissions checking moved out of rewriter and into executor. INITDB REQUIRED!	2000-09-29 18:21:41 +00:00
Tom Lane	ba2ea6e0f5	Fix GEQO optimizer to work correctly with new outer-join-capable query representation. Note that GEQO_RELS setting is now interpreted as the number of top-level items in the FROM list, not necessarily the number of relations in the query. This seems appropriate since we are only doing join-path searching over the top-level items.	2000-09-19 18:42:34 +00:00
Tom Lane	ed5003c584	First cut at full support for OUTER JOINs. There are still a few loose ends to clean up (see my message of same date to pghackers), but mostly it works. INITDB REQUIRED!	2000-09-12 21:07:18 +00:00
Tom Lane	7893462e44	Move pg_checkretval out of the planner (where it never belonged) into pg_proc.c (where it's actually used). Fix it to correctly handle tlists that contain resjunk target items, and improve error messages. This addresses bug reported by Krupnikov 6-July-00.	2000-08-21 20:55:31 +00:00
Tom Lane	37168b8da4	Clean up handling of variable-free qual clauses. System now does the right thing with variable-free clauses that contain noncachable functions, such as 'WHERE random() < 0.5' --- these are evaluated once per potential output tuple. Expressions that contain only Params are now candidates to be indexscan quals --- for example, 'var = ($1 + 1)' can now be indexed. Cope with RelabelType nodes atop potential indexscan variables --- this oversight prevents 7.0.* from recognizing some potentially indexscanable situations.	2000-08-13 02:50:35 +00:00
Tom Lane	9426047021	Clean up bogosities in use of random(3) and srandom(3) --- do not assume that RAND_MAX applies to them, since it doesn't. Instead add a config.h parameter MAX_RANDOM_VALUE. This is currently set at 2^31-1 but could be auto-configured if that ever proves necessary. Also fix some outright bugs like calling srand() where srandom() is appropriate.	2000-08-07 00:51:42 +00:00
Tom Lane	cd9f0ca545	Deduce equality constraints that are implied by transitivity of mergejoinable qual clauses, and add them to the query quals. For example, WHERE a = b AND b = c will cause us to add AND a = c. This is necessary to ensure that it's safe to use these variables as interchangeable sort keys, which is something 7.0 knows how to do. Should provide a useful improvement in planning ability, too.	2000-07-24 03:11:01 +00:00
Tom Lane	1aebc3618a	First phase of memory management rewrite (see backend/utils/mmgr/README for details). It doesn't really do that much yet, since there are no short-term memory contexts in the executor, but the infrastructure is in place and long-term contexts are handled reasonably. A few long- standing bugs have been fixed, such as 'VACUUM; anything' in a single query string crashing. Also, out-of-memory is now considered a recoverable ERROR, not FATAL. Eliminate a large amount of crufty, now-dead code in and around memory management. Fix problem with holding off SIGTRAP, SIGSEGV, etc in postmaster and backend startup.	2000-06-28 03:33:33 +00:00
Tom Lane	38db5fab29	Make inheritance planning logic a little simpler and clearer, hopefully even a little faster.	2000-06-20 04:22:21 +00:00
Tom Lane	1ee26b7764	Reimplement nodeMaterial to use a temporary BufFile (or even memory, if the materialized tupleset is small enough) instead of a temporary relation. This was something I was thinking of doing anyway for performance, and Jan says he needs it for TOAST because he doesn't want to cope with toasting noname relations. With this change, the 'noname table' support in heap.c is dead code, and I have accordingly removed it. Also clean up 'noname' plan handling in planner --- nonames are either sort or materialize plans, and it seems less confusing to handle them separately under those names.	2000-06-18 22:44:35 +00:00
Bruce Momjian	df43800fc8	Clean up #include's.	2000-06-15 03:33:12 +00:00
Tom Lane	ce7746201b	Cause inheritance patch to meet minimum coding standards (no gcc warnings).	2000-06-09 03:17:13 +00:00
Bruce Momjian	20ad43b576	Mark functions as static and ifdef NOT_USED as appropriate.	2000-06-08 22:38:00 +00:00
Peter Eisentraut	6a68f42648	The heralded `Grand Unified Configuration scheme' (GUC) That means you can now set your options in either or all of $PGDATA/configuration, some postmaster option (--enable-fsync=off), or set a SET command. The list of options is in backend/utils/misc/guc.c, documentation will be written post haste. pg_options is gone, so is that pq_geqo config file. Also removed were backend -K, -Q, and -T options (no longer applicable, although -d0 does the same as -Q). Added to configure an --enable-syslog option. changed all callers from TPRINTF to elog(DEBUG)	2000-05-31 00:28:42 +00:00
Tom Lane	0f1e39643d	Third round of fmgr updates: eliminate calls using fmgr() and fmgr_faddr() in favor of new-style calls. Lots of cleanup of sloppy casts to use XXXGetDatum and DatumGetXXX ...	2000-05-30 04:25:00 +00:00
Bruce Momjian	52f77df613	Ye-old pgindent run. Same 4-space tabs.	2000-04-12 17:17:23 +00:00
Tom Lane	1c72a8a37a	Fix extremely nasty little bug observed when a sub-SELECT appears in WHERE in a place where it can be part of a nestloop inner indexqual. As the code stood, it put the same physical sub-Plan node into both indxqual and indxqualorig of the IndexScan plan node. That confused later processing in the optimizer (which expected that tracing the subPlan list would visit each subplan node exactly once), and would probably have blown up in the executor if the planner hadn't choked first. Fix by making the 'fixed' indexqual be a complete deep copy of the original indexqual, rather than trying to share nodes below the topmost operator node. This had further ramifications though, because we were making the aforesaid list of sub-Plan nodes during SS_process_sublinks which is run before construction of the 'fixed' indexqual, meaning that the copy of the sub-Plan didn't show up in that list. Fix by rearranging logic so that the sub-Plan list is built by the final set_plan_references pass, not in SS_process_sublinks. This may sound like a mess, but it's actually a good deal cleaner now than it was before, because we are no longer dependent on the assumption that planning will never make a copy of a sub-Plan node.	2000-04-04 01:21:48 +00:00
Tom Lane	1d5e7a6f46	Repair logic flaw in cost estimator: cost_nestloop() was estimating CPU costs using the inner path's parent->rows count as the number of tuples processed per inner scan iteration. This is wrong when we are using an inner indexscan with indexquals based on join clauses, because the rows count in a Relation node reflects the selectivity of the restriction clauses for that rel only. Upshot was that if join clause was very selective, we'd drastically overestimate the true cost of the join. Fix is to calculate correct output-rows estimate for an inner indexscan when the IndexPath node is created and save it in the path node. Change of path node doesn't require initdb, since path nodes don't appear in saved rules.	2000-03-22 22:08:35 +00:00
Tom Lane	3ee8f7e207	Restructure planning code so that preprocessing of targetlist and quals to simplify constant expressions and expand SubLink nodes into SubPlans is done in a separate routine subquery_planner() that calls union_planner(). We formerly did most of this work in query_planner(), but that's the wrong place because it may never see the real targetlist. Splitting union_planner into two routines also allows us to avoid redundant work when union_planner is invoked recursively for UNION and inheritance cases. Upshot is that it is now possible to do something like select float8(count()) / (select count() from int4_tbl) from int4_tbl group by f1; which has never worked before.	2000-03-21 05:12:12 +00:00
Tom Lane	341b328b18	Fix a bunch of minor portability problems and maybe-bugs revealed by running gcc and HP's cc with warnings cranked way up. Signed vs unsigned comparisons, routines declared static and then defined not-static, that kind of thing. Tedious, but perhaps useful...	2000-03-17 02:36:41 +00:00
Tom Lane	b1577a7c78	New cost model for planning, incorporating a penalty for random page accesses versus sequential accesses, a (very crude) estimate of the effects of caching on random page accesses, and cost to evaluate WHERE- clause expressions. Export critical parameters for this model as SET variables. Also, create SET variables for the planner's enable flags (enable_seqscan, enable_indexscan, etc) so that these can be controlled more conveniently than via PGOPTIONS. Planner now estimates both startup cost (cost before retrieving first tuple) and total cost of each path, so it can optimize queries with LIMIT on a reasonable basis by interpolating between these costs. Same facility is a win for EXISTS(...) subqueries and some other cases. Redesign pathkey representation to achieve a major speedup in planning (I saw as much as 5X on a 10-way join); also minor changes in planner to reduce memory consumption by recycling discarded Path nodes and not constructing unnecessary lists. Minor cleanups to display more-plausible costs in some cases in EXPLAIN output. Initdb forced by change in interface to index cost estimation functions.	2000-02-15 20:49:31 +00:00
Tom Lane	d8733ce674	Repair planning bugs caused by my misguided removal of restrictinfo link fields in JoinPaths --- turns out that we do need that after all :-(. Also, rearrange planner so that only one RelOptInfo is created for a particular set of joined base relations, no matter how many different subsets of relations it can be created from. This saves memory and processing time compared to the old method of making a bunch of RelOptInfos and then removing the duplicates. Clean up the jointree iteration logic; not sure if it's better, but I sure find it more readable and plausible now, particularly for the case of 'bushy plans'.	2000-02-07 04:41:04 +00:00
Tom Lane	81fc1d5edb	Rename same() to sameseti() to have a slightly less generic name. Move nonoverlap_sets() and is_subset() to list.c, where they should have lived to begin with, and rename to nonoverlap_setsi and is_subseti since they only work on integer lists.	2000-02-06 03:27:35 +00:00
Tom Lane	78296c2797	Further cleanup for OR-of-AND WHERE-clauses. orindxpath can now handle extracting from an AND subclause just those opclauses that are relevant for a particular index. For example, we can now consider using an index on x to process WHERE (x = 1 AND y = 2) OR (x = 2 AND y = 4) OR ...	2000-02-05 18:26:09 +00:00
Tom Lane	dd979f66be	Redesign DISTINCT ON as discussed in pgsql-sql 1/25/00: syntax is now SELECT DISTINCT ON (expr [, expr ...]) targetlist ... and there is a check to make sure that the user didn't specify an ORDER BY that's incompatible with the DISTINCT operation. Reimplement nodeUnique and nodeGroup to use the proper datatype-specific equality function for each column being compared --- they used to do bitwise comparisons or convert the data to text strings and strcmp(). (To add insult to injury, they'd look up the conversion functions once for each tuple...) Parse/plan representation of DISTINCT is now a list of SortClause nodes. initdb forced by querytree change...	2000-01-27 18:11:50 +00:00
Bruce Momjian	5c25d60244	Add: * Portions Copyright (c) 1996-2000, PostgreSQL, Inc to all files copyright Regents of Berkeley. Man, that's a lot of files.	2000-01-26 05:58:53 +00:00
Tom Lane	8449df8a67	First cut at unifying regular selectivity estimation with indexscan selectivity estimation wasn't right. This is better...	2000-01-23 02:07:00 +00:00
Tom Lane	71ed7eb494	Revise handling of index-type-specific indexscan cost estimation, per pghackers discussion of 5-Jan-2000. The amopselect and amopnpages estimators are gone, and in their place is a per-AM amcostestimate procedure (linked to from pg_am, not pg_amop).	2000-01-22 23:50:30 +00:00
Tom Lane	7bc1fbe100	Remove no-longer-used symbols.	2000-01-11 03:59:31 +00:00
Tom Lane	166b5c1def	Another round of planner/optimizer work. This is just restructuring and code cleanup; no major improvements yet. However, EXPLAIN does produce more intuitive outputs for nested loops with indexscans now...	2000-01-09 00:26:47 +00:00
Tom Lane	7431796b46	fix_parsetree_attnums was not nearly smart enough about walking parse trees. Also rewrite find_all_inheritors() in a more intelligible style.	1999-12-14 03:35:28 +00:00
Tom Lane	a8ae19ec3d	aggregate(DISTINCT ...) works, per SQL spec. Note this forces initdb because of change of Aggref node in stored rules.	1999-12-13 01:27:21 +00:00
Tom Lane	f7f41c7c8c	Replace generic 'Illegal use of aggregates' error message with one that shows the specific ungrouped variable being complained of. Perhaps this will reduce user confusion...	1999-12-09 05:58:56 +00:00
Bruce Momjian	6f9ff92cc0	Tid access method feature from Hiroshi Inoue, Inoue@tpf.co.jp	1999-11-23 20:07:06 +00:00
Tom Lane	610dfa6d55	Combine index_info and find_secondary_indexes into a single routine that returns a list of RelOptInfos, eliminating the need for static state in index_info. That static state was a direct cause of coredumps; if anything decided to elog(ERROR) partway through an index_info search of pg_index, the next query would try to close a scan pointer that was pointing at no-longer-valid memory. Another example of the reasons to avoid static state variables...	1999-11-21 23:25:47 +00:00
Tom Lane	3eb1c82277	Fix planner and rewriter to follow SQL semantics for tables that are mentioned in FROM but not elsewhere in the query: such tables should be joined over anyway. Aside from being more standards-compliant, this allows removal of some very ugly hacks for COUNT(*) processing. Also, allow HAVING clause without aggregate functions, since SQL does. Clean up CREATE RULE statement-list syntax the same way Bruce just fixed the main stmtmulti production. CAUTION: addition of a field to RangeTblEntry nodes breaks stored rules; you will have to initdb if you have any rules.	1999-10-07 04:23:24 +00:00
Tom Lane	40f6524161	Implement constant-expression simplification per Bernard Frankpitt, plus some improvements from yours truly. The simplifier depends on the proiscachable field of pg_proc to tell it whether a function is safe to pre-evaluate --- things like nextval() are not, for example. Update pg_proc.h to contain reasonable cacheability information; as of 6.5.* hardly any functions were marked cacheable. I may have erred too far in the other direction; see recent mail to pghackers for more info. This update does not force an initdb, exactly, but you won't see much benefit from the simplifier until you do one.	1999-09-26 02:28:44 +00:00
Tom Lane	43d32d3683	First cut at doing something reasonable with OR-of-ANDs WHERE conditions. There are some pretty bogus heuristics in prepqual.c that try to decide whether to output CNF or DNF format; they need to be replaced, likely. Right now the code is probably too willing to choose DNF form, which might hurt performance in some cases that used to work OK. But at least we have a foundation to build on.	1999-09-13 00:17:25 +00:00
Tom Lane	2119cc0670	Further improvements in cnfify: reduce amount of self-recursion in or_normalize, remove detection of duplicate subexpressions (since it's highly unlikely to be worth the amount of time it takes), and introduce a dnfify() entry point so that unintelligible backwards logic in UNION processing can be eliminated. This is just an intermediate step --- next thing is to look at not forcing the qual into CNF form when it would be better off in DNF form.	1999-09-12 18:08:17 +00:00
Tom Lane	37d20eb855	Clean up some mistakes in handling of uplevel Vars in planner. Most parts of the planner should ignore, or indeed never even see, uplevel Vars because they will be or have been replaced by Params. There were a couple of places that got it wrong though, probably my fault from recent changes...	1999-08-26 05:09:06 +00:00
Tom Lane	e8140adb10	Further sort-order twiddling in optimizer: be smart about case where ORDER BY and GROUP BY request the same sort order.	1999-08-22 23:56:45 +00:00
Tom Lane	78114cd4d4	Further planner/optimizer cleanups. Move all set_tlist_references and fix_opids processing to a single recursive pass over the plan tree executed at the very tail end of planning, rather than haphazardly here and there at different places. Now that tlist Vars do not get modified until the very end, it's possible to get rid of the klugy var_equal and match_varid partial-matching routines, and just use plain equal() throughout the optimizer. This is a step towards allowing merge and hash joins to be done on expressions instead of only Vars ...	1999-08-22 20:15:04 +00:00
Tom Lane	db436adf76	Major revision of sort-node handling: push knowledge of query sort order down into planner, instead of handling it only at the very top level of the planner. This fixes many things. An explicit sort is now avoided if there is a cheaper alternative (typically an indexscan) not only for ORDER BY, but also for the internal sort of GROUP BY. It works even when there is no other reason (such as a WHERE condition) to consider the indexscan. It works for indexes on functions. It works for indexes on functions, backwards. It's just so cool... CAUTION: I have changed the representation of SortClause nodes, therefore THIS UPDATE BREAKS STORED RULES. You will need to initdb.	1999-08-21 03:49:17 +00:00
Tom Lane	e6381966c1	Major planner/optimizer revision: get rid of PathOrder node type, store all ordering information in pathkeys lists (which are now lists of lists of PathKeyItem nodes, not just lists of lists of vars). This was a big win --- the code is smaller and IMHO more understandable than it was, even though it handles more cases. I believe the node changes will not force an initdb for anyone; planner nodes don't show up in stored rules.	1999-08-16 02:17:58 +00:00
Tom Lane	8f9f6e51a8	Clean up optimizer's handling of indexscan quals that need to be commuted (ie, the index var appears on the right). These are now handled the same way as merge and hash join quals that need to be commuted: the actual reversing of the clause only happens if we actually choose the path and generate a plan from it. Furthermore, the clause is only reversed in the 'indexqual' field of the plan, not in the 'indxqualorig' field. This allows the clause to still be recognized and removed from qpquals of upper level join plans. Also, simplify and generalize match_clause_to_indexkey; now it recognizes binary-compatible indexes for join as well as restriction clauses.	1999-08-12 04:32:54 +00:00
Tom Lane	2ae51c86c9	Minor cleanups and code beautification; eliminate some routines that are now dead code.	1999-08-10 03:00:15 +00:00
Tom Lane	ecef2caae9	Clean up routines in setrefs.c by replacing individual tree walking logic with expression_tree_walker/mutator calls.	1999-08-09 00:56:05 +00:00
Tom Lane	6bc601b648	Create a standardized expression_tree_mutator support routine to go along with expression_tree_walker. (_walker is not suitable for routines that need to alter the tree structure significantly.) Other minor cleanups in clauses.c.	1999-08-09 00:51:26 +00:00
Tom Lane	e1fad50a5d	Revise generation of hashjoin paths: generate one path per hashjoinable clause, not one path for a randomly-chosen element of each set of clauses with the same join operator. That is, if you wrote SELECT ... WHERE t1.f1 = t2.f2 and t1.f3 = t2.f4, and both '=' ops were the same opcode (say, all four fields are int4), then the system would either consider hashing on f1=f2 or on f3=f4, but it would not consider both possibilities. Boo hiss. Also, revise estimation of hashjoin costs to include a penalty when the inner join var has a high disbursion --- ie, the most common value is pretty common. This tends to lead to badly skewed hash bucket occupancy and way more comparisons than you'd expect on average. I imagine that the cost calculation still needs tweaking, but at least it generates a more reasonable plan than before on George Young's example.	1999-08-06 04:00:17 +00:00
Tom Lane	04578a9180	Further cleanups of indexqual processing: simplify control logic in indxpath.c, avoid generation of redundant indexscan paths for the same relation and index.	1999-07-30 04:07:25 +00:00
Tom Lane	b62fdc13f0	Correct bug in best_innerjoin(): it should check all the rels that the inner path needs to join to, but it was only checking for the first one. Failure could only have been observed with an OR-clause that mentions 3 or more tables, and then only if the bogus path was actually selected as cheapest ...	1999-07-27 06:23:12 +00:00
Tom Lane	9e7e29e6c9	First cut at doing LIKE/regex indexing optimization in optimizer rather than parser. This has many advantages, such as not getting fooled by chance uses of operator names ~ and ~~ (the operators are identified by OID now), and not creating useless comparison operations in contexts where the comparisons will not actually be used as indexquals. The new code also recognizes exact-match LIKE and regex patterns, and produces an = indexqual instead of >= and <=. This change does NOT fix the problem with non-ASCII locales: the code still doesn't know how to generate an upper bound indexqual for non-ASCII collation order. But it's no worse than before, just the same deficiency in a different place... Also, dike out loc_restrictinfo fields in Plan nodes. These were doing nothing useful in the absence of 'expensive functions' optimization, and they took a considerable amount of processing to fill in.	1999-07-27 03:51:11 +00:00
Tom Lane	49ed4dd779	Further work on planning of indexscans. Cleaned up interfaces to index_selectivity so that it can be handed an indexqual clause list rather than a bunch of assorted derivative data.	1999-07-25 23:07:26 +00:00
Tom Lane	ac4913a0dd	Clean up messy clause-selectivity code in clausesel.c; repair bug identified by Hiroshi (incorrect cost attributed to OR clauses after multiple passes through set_rest_selec()). I think the code was trying to allow selectivities of OR subclauses to be passed in from outside, but noplace was actually passing any useful data, and set_rest_selec() was passing wrong data. Restructure representation of "indexqual" in IndexPath nodes so that it is the same as for indxqual in completed IndexScan nodes: namely, a toplevel list with an entry for each pass of the index scan, having sublists that are implicitly-ANDed index qual conditions for that pass. You don't want to know what the old representation was :-( Improve documentation of OR-clause indexscan functions. Remove useless 'notclause' field from RestrictInfo nodes. (This might force an initdb for anyone who has stored rules containing RestrictInfos, but I do not think that RestrictInfo ever appears in completed plans.)	1999-07-24 23:21:14 +00:00
Bruce Momjian	3406901a29	Move some system includes into c.h, and remove duplicates.	1999-07-17 20:18:55 +00:00
Bruce Momjian	773088809d	More cleanup	1999-07-16 17:07:40 +00:00
Bruce Momjian	a9591ce66a	Change #include's to use <> and "" as appropriate.	1999-07-15 23:04:24 +00:00
Bruce Momjian	40a89e08b2	Cleanups.	1999-07-15 20:32:30 +00:00
Bruce Momjian	4b2c2850bf	Clean up #include in /include directory. Add scripts for checking includes.	1999-07-15 15:21:54 +00:00
Bruce Momjian	0cf1b79528	Cleanup of /include #include's, for 6.6 only.	1999-07-14 01:20:30 +00:00
Bruce Momjian	9f7ac20e57	Cleanup of min tuple size.	1999-07-07 09:27:28 +00:00
Tom Lane	fd8e580bb7	Clean up problems with sublinks + grouping in planner. Not sure if they are all fixed, because rewriter is now the stumbling block, but at least some cases work that did not work before.	1999-06-21 01:20:57 +00:00
Tom Lane	86f36719db	Create a generic expression-tree-walker subroutine, which will gradually replace all of the boilerplate tree-walk-recursion code that currently exists in O(N) slightly different forms in N subroutines. I've had it with adding missing cases to these subroutines...	1999-06-19 03:41:45 +00:00
Tom Lane	b4210ae0f0	Fix problems with grouping/aggregation in queries that use inheritance ... basically it was completely busted :-(	1999-06-06 17:38:11 +00:00
Bruce Momjian	278bbf4572	Make functions static or NOT_USED as appropriate.	1999-05-26 12:57:23 +00:00
Bruce Momjian	fcff1cdf4e	Another pgindent run. Sorry folks.	1999-05-25 22:43:53 +00:00
Bruce Momjian	07842084fe	pgindent run over code.	1999-05-25 16:15:34 +00:00
Tom Lane	1332c1e144	Change GEQO optimizer to release memory after each gene is evaluated. This bounds memory usage to something reasonable even when many tables are being joined.	1999-05-17 00:25:34 +00:00
Tom Lane	fecb2b0024	Minor code cleanup in optimizer.	1999-05-16 19:45:37 +00:00
Tom Lane	507a0a2ab0	Rip out QueryTreeList structure, root and branch. Querytree lists are now plain old garden-variety Lists, allocated with palloc, rather than specialized expansible-array data allocated with malloc. This substantially simplifies their handling and eliminates several sources of memory leakage. Several basic types of erroneous queries (syntax error, attempt to insert a duplicate key into a unique index) now demonstrably leak zero bytes per query.	1999-05-13 07:29:22 +00:00
Jan Wieck	79c2576f77	Replaced targetlist entry in GroupClause by reference number in Resdom and GroupClause so changing of resno's doesn't confuse the grouping any more. Jan	1999-05-12 15:02:39 +00:00
Tom Lane	da5f1dd722	Revise union_planner and associated routines to clean up breakage from EXCEPT/HAVING patch. Cases involving nontrivial GROUP BY expressions now work again. Also, the code is at least somewhat better documented...	1999-05-03 00:38:44 +00:00
Tom Lane	605d84941d	Clean up cost_sort some more: most callers were double-counting the cost of reading the source data.	1999-05-01 19:47:42 +00:00
Tom Lane	4438b70b94	Repair some problems in planner's handling of HAVING clauses. This fixes a few of the problems Hiroshi Inoue complained of, but I have not touched the rewrite-related issues.	1999-04-19 01:43:12 +00:00
Tom Lane	ff38837fe9	Fix nasty bug in optimization of multiway joins: optimizer would sometimes generate a plan that omitted a sort step before merge.	1999-04-03 00:18:28 +00:00
Bruce Momjian	a564d2bf0f	geqo now at 11 tables	1999-03-07 12:00:40 +00:00
Tom Lane	e0345e09bf	Partial fix for copied-plan bugs reported by Hiroshi Inoue: _copyResult didn't copy subPlan structure completely. _copyAgg is still busted, apparently because of changes from EXCEPT/INTERSECT patch (get_agg_tlist_references is no longer sufficient to find all aggregates). No time to look at that tonight, however.	1999-03-03 00:02:42 +00:00

... 4 5 6 7 8 ...

630 Commits