Commit Graph

107 Commits

Author SHA1 Message Date
Tom Lane 25e46a504b Fix a couple of oversights associated with the "physical tlist" optimization:
we had several code paths where a physical tlist could be used for the input
to a Sort node, which is a dumb idea because any unneeded table columns will
increase the volume of data the sort has to push around.

(Unfortunately the easy-looking fix of calling disuse_physical_tlist during
make_sort_xxx doesn't work because in most cases we're already committed to
the current input tlist --- it's been marked with sort column numbers, or
we've built grouping column numbers using it, etc.  The tlist has to be
selected properly at the calling level before we start constructing sort-col
information.  This is easy enough to do, we were just failing to take the
point into consideration.)

Back-patch to 8.3.  I believe the problem probably exists clear back to 7.4
when the physical tlist optimization was added, but I'm afraid to back-patch
further than 8.3 without a great deal more study than I want to put into it.
The code in this area has drifted a lot over time.  The real-world importance
of these code paths is uncertain anyway --- I think in many cases we'd
probably prefer hash-based methods.
2008-04-17 21:22:14 +00:00
Bruce Momjian 9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
Bruce Momjian f6e8730d11 Re-run pgindent with updated list of typedefs. (Updated README should
avoid this problem in the future.)
2007-11-15 22:25:18 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane 82d8ab6fc4 Fix the plan-invalidation mechanism to treat regclass constants that refer to
a relation as a reason to invalidate a plan when the relation changes.  This
handles scenarios such as dropping/recreating a sequence that is referenced by
nextval('seq') in a cached plan.  Rather than teach plancache.c all about
digging through plan trees to find regclass Consts, we charge the planner's
setrefs.c with making a list of the relation OIDs on which each plan depends.
That way the list can be built cheaply during a plan tree traversal that has
to happen anyway.  Per bug #3662 and subsequent discussion.
2007-10-11 18:05:27 +00:00
Tom Lane 89db887b1e Keep the planner from failing on "WHERE false AND something IN (SELECT ...)".
eval_const_expressions simplifies this to just "WHERE false", but we have
already done pull_up_IN_clauses so the IN join will be done, or at least
planned, anyway.  The trouble case comes when the sub-SELECT is itself a join
and we decide to implement the IN by unique-ifying the sub-SELECT outputs:
with no remaining reference to the output Vars in WHERE, we won't have
propagated the Vars up to the upper join point, leading to "variable not found
in subplan target lists" error.  Fix by adding an extra scan of in_info_list
and forcing all Vars mentioned therein to be propagated up to the IN join
point.  Per bug report from Miroslav Sulc.
2007-10-04 20:44:47 +00:00
Tom Lane d26559dbf3 Teach tuplesort.c about "top N" sorting, in which only the first N tuples
need be returned.  We keep a heap of the current best N tuples and sift-up
new tuples into it as we scan the input.  For M input tuples this means
only about M*log(N) comparisons instead of M*log(M), not to mention a lot
less workspace when N is small --- avoiding spill-to-disk for large M
is actually the most attractive thing about it.  Patch includes planner
and executor support for invoking this facility in ORDER BY ... LIMIT
queries.  Greg Stark, with some editorialization by moi.
2007-05-04 01:13:45 +00:00
Tom Lane eab6b8b27e Turn the rangetable used by the executor into a flat list, and avoid storing
useless substructure for its RangeTblEntry nodes.  (I chose to keep using the
same struct node type and just zero out the link fields for unneeded info,
rather than making a separate ExecRangeTblEntry type --- it seemed too
fragile to have two different rangetable representations.)

Along the way, put subplans into a list in the toplevel PlannedStmt node,
and have SubPlan nodes refer to them by list index instead of direct pointers.
Vadim wanted to do that years ago, but I never understood what he was on about
until now.  It makes things a *whole* lot more robust, because we can stop
worrying about duplicate processing of subplans during expression tree
traversals.  That's been a constant source of bugs, and it's finally gone.

There are some consequent simplifications yet to be made, like not using
a separate EState for subplans in the executor, but I'll tackle that later.
2007-02-22 22:00:26 +00:00
Tom Lane 5a7471c307 Add COST and ROWS options to CREATE/ALTER FUNCTION, plus underlying pg_proc
columns procost and prorows, to allow simple user adjustment of the estimated
cost of a function call, as well as control of the estimated number of rows
returned by a set-returning function.  We might eventually wish to extend this
to allow function-specific estimation routines, but there seems to be
consensus that we should try a simple constant estimate first.  In particular
this provides a relatively simple way to control the order in which different
WHERE clauses are applied in a plan node, which is a Good Thing in view of the
fact that the recent EquivalenceClass planner rewrite made that much less
predictable than before.
2007-01-22 01:35:23 +00:00
Tom Lane f41803bb39 Refactor planner's pathkeys data structure to create a separate, explicit
representation of equivalence classes of variables.  This is an extensive
rewrite, but it brings a number of benefits:
* planner no longer fails in the presence of "incomplete" operator families
that don't offer operators for every possible combination of datatypes.
* avoid generating and then discarding redundant equality clauses.
* remove bogus assumption that derived equalities always use operators
named "=".
* mergejoins can work with a variety of sort orders (e.g., descending) now,
instead of tying each mergejoinable operator to exactly one sort order.
* better recognition of redundant sort columns.
* can make use of equalities appearing underneath an outer join.
2007-01-20 20:45:41 +00:00
Tom Lane a191a169d6 Change the planner-to-executor API so that the planner tells the executor
which comparison operators to use for plan nodes involving tuple comparison
(Agg, Group, Unique, SetOp).  Formerly the executor looked up the default
equality operator for the datatype, which was really pretty shaky, since it's
possible that the data being fed to the node is sorted according to some
nondefault operator class that could have an incompatible idea of equality.
The planner knows what it has sorted by and therefore can provide the right
equality operator to use.  Also, this change moves a couple of catalog lookups
out of the executor and into the planner, which should help startup time for
pre-planned queries by some small amount.  Modify the planner to remove some
other cavalier assumptions about always being able to use the default
operators.  Also add "nulls first/last" info to the Plan node for a mergejoin
--- neither the executor nor the planner can cope yet, but at least the API is
in place.
2007-01-10 18:06:05 +00:00
Bruce Momjian 29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane 7a3e30e608 Add INSERT/UPDATE/DELETE RETURNING, with basic docs and regression tests.
plpgsql support to come later.  Along the way, convert execMain's
SELECT INTO support into a DestReceiver, in order to eliminate some ugly
special cases.

Jonah Harris and Tom Lane
2006-08-12 02:52:06 +00:00
Bruce Momjian 085e559654 Change LIMIT/OFFSET to use int8
Dhanaraj M
2006-07-26 00:34:48 +00:00
Tom Lane cffd89ca73 Revise the planner's handling of "pseudoconstant" WHERE clauses, that is
clauses containing no variables and no volatile functions.  Such a clause
can be used as a one-time qual in a gating Result plan node, to suppress
plan execution entirely when it is false.  Even when the clause is true,
putting it in a gating node wins by avoiding repeated evaluation of the
clause.  In previous PG releases, query_planner() would do this for
pseudoconstant clauses appearing at the top level of the jointree, but
there was no ability to generate a gating Result deeper in the plan tree.
To fix it, get rid of the special case in query_planner(), and instead
process pseudoconstant clauses through the normal RestrictInfo qual
distribution mechanism.  When a pseudoconstant clause is found attached to
a path node in create_plan(), pull it out and generate a gating Result at
that point.  This requires special-casing pseudoconstants in selectivity
estimation and cost_qual_eval, but on the whole it's pretty clean.
It probably even makes the planner a bit faster than before for the normal
case of no pseudoconstants, since removing pull_constant_clauses saves one
useless traversal of the qual tree.  Per gripe from Phil Frost.
2006-07-01 18:38:33 +00:00
Bruce Momjian f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Tom Lane e3b9852728 Teach planner how to rearrange join order for some classes of OUTER JOIN.
Per my recent proposal.  I ended up basing the implementation on the
existing mechanism for enforcing valid join orders of IN joins --- the
rules for valid outer-join orders are somewhat similar.
2005-12-20 02:30:36 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane 2e1254e7fa Repair planning bug introduced in 7.4: outer-join ON clauses that referenced
only the inner-side relation would be considered as potential equijoin clauses,
which is wrong because the condition doesn't necessarily hold above the point
of the outer join.  Per test case from Kevin Grittner (bug#1916).
2005-09-28 21:17:02 +00:00
Tom Lane 4e5fbb34b3 Change the division of labor between grouping_planner and query_planner
so that the latter estimates the number of groups that grouping will
produce.  This is needed because it is primarily query_planner that
makes the decision between fast-start and fast-finish plans, and in the
original coding it was unable to make more than a crude rule-of-thumb
choice when the query involved grouping.  This revision helps us make
saner choices for queries like SELECT ... GROUP BY ... LIMIT, as in a
recent example from Mark Kirkwood.  Also move the responsibility for
canonicalizing sort_pathkeys and group_pathkeys into query_planner;
this information has to be available anyway to support the first change,
and doing it this way lets us get rid of compare_noncanonical_pathkeys
entirely.
2005-08-27 22:13:44 +00:00
Tom Lane dfdf07aab1 Fix up LIMIT/OFFSET planning so that we cope with non-constant LIMIT
or OFFSET clauses by using estimate_expression_value().  The main advantage
of this is that if the expression is a Param and we have a value for the
Param, we'll use that value rather than defaulting.  Also, fix some
thinkos in the logic for combining LIMIT/OFFSET with an externally
supplied tuple fraction (this covers cases like EXISTS(...LIMIT...)).
And make sure the results of all this are shown by EXPLAIN.  Per a
gripe from Merlin Moncure.
2005-08-18 17:51:12 +00:00
Tom Lane 9ab4d98168 Remove planner's private fields from Query struct, and put them into
a new PlannerInfo struct, which is passed around instead of the bare
Query in all the planning code.  This commit is essentially just a
code-beautification exercise, but it does open the door to making
larger changes to the planner data structures without having to muck
with the widely-known Query struct.
2005-06-05 22:32:58 +00:00
Tom Lane e2159f3842 Teach the planner to remove SubqueryScan nodes from the plan if they
aren't doing anything useful (ie, neither selection nor projection).
Also, extend to SubqueryScan the hacks already in place to avoid
unnecessary ExecProject calls when the result would just be the same
tuple the subquery already delivered.  This saves some overhead in
UNION and other set operations, as well as avoiding overhead for
unflatten-able subqueries.  Per example from Sokolov Yura.
2005-05-22 22:30:20 +00:00
Tom Lane 79a1b00226 Replace slightly klugy create_bitmap_restriction() function with a
more efficient routine in restrictinfo.c (which can make use of
make_restrictinfo_internal).
2005-04-25 02:14:48 +00:00
Tom Lane 5b05185262 Remove support for OR'd indexscans internal to a single IndexScan plan
node, as this behavior is now better done as a bitmap OR indexscan.
This allows considerable simplification in nodeIndexscan.c itself as
well as several planner modules concerned with indexscan plan generation.
Also we can improve the sharing of code between regular and bitmap
indexscans, since they are now working with nigh-identical Plan nodes.
2005-04-25 01:30:14 +00:00
Tom Lane 7ace43e0c2 Fix oversight in MIN/MAX optimization: must not return NULL entries
from index, since the aggregates ignore NULLs.
2005-04-12 05:11:28 +00:00
Tom Lane addc42c339 Create the planner mechanism for optimizing simple MIN and MAX queries
into indexscans on matching indexes.  For the moment, it only handles
int4 and text datatypes; next step is to add a column to pg_aggregate
so that all MIN/MAX aggregates can be handled.  Per my recent proposal.
2005-04-11 23:06:57 +00:00
Tom Lane 595ed2a855 Make the behavior of HAVING without GROUP BY conform to the SQL spec.
Formerly, if such a clause contained no aggregate functions we mistakenly
treated it as equivalent to WHERE.  Per spec it must cause the query to
be treated as a grouped query of a single group, the same as appearance
of aggregate functions would do.  Also, the HAVING filter must execute
after aggregate function computation even if it itself contains no
aggregate functions.
2005-03-10 23:21:26 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane 6bdfde9a77 When testing whether a sub-plan can do projection, use a general-purpose
check instead of hardwiring assumptions that only certain plan node types
can appear at the places where we are testing.  This was always a pretty
fragile assumption, and it turns out to be broken in 7.4 for certain cases
involving IN-subselect tests that need type coercion.
Also, modify code that builds finished Plan tree so that node types that
don't do projection always copy their input node's targetlist, rather than
having the tlist passed in from the caller.  The old method makes it too
easy to write broken code that thinks it can modify the tlist when it
cannot.
2004-01-18 00:50:03 +00:00
PostgreSQL Daemon 55b113257c make sure the $Id tags are converted to $PostgreSQL as well ... 2003-11-29 22:41:33 +00:00
Bruce Momjian 46785776c4 Another pgindent run with updated typedefs. 2003-08-08 21:42:59 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane 835bb975d8 Restructure building of join relation targetlists so that a join plan
node emits only those vars that are actually needed above it in the
plan tree.  (There were comments in the code suggesting that this was
done at some point in the dim past, but for a long time we have just
made join nodes emit everything that either input emitted.)  Aside from
being marginally more efficient, this fixes the problem noted by Peter
Eisentraut where a join above an IN-implemented-as-join might fail,
because the subplan targetlist constructed in the latter case didn't
meet the expectation of including everything.
Along the way, fix some places that were O(N^2) in the targetlist
length.  This is not all the trouble spots for wide queries by any
means, but it's a step forward.
2003-06-29 23:05:05 +00:00
Tom Lane bee217924d Support expressions of the form 'scalar op ANY (array)' and
'scalar op ALL (array)', where the operator is applied between the
lefthand scalar and each element of the array.  The operator must
yield boolean; the result of the construct is the OR or AND of the
per-element results, respectively.

Original coding by Joe Conway, after an idea of Peter's.  Rewritten
by Tom to keep the implementation strictly separate from subqueries.
2003-06-29 00:33:44 +00:00
Tom Lane 2cf57c8f8d Implement feature of new FE/BE protocol whereby RowDescription identifies
the column by table OID and column number, if it's a simple column
reference.  Along the way, get rid of reskey/reskeyop fields in Resdoms.
Turns out that representation was not convenient for either the planner
or the executor; we can make the planner deliver exactly what the
executor wants with no more effort.
initdb forced due to change in stored rule representation.
2003-05-06 00:20:33 +00:00
Tom Lane aa83bc04e0 Restructure parsetree representation of DECLARE CURSOR: now it's a
utility statement (DeclareCursorStmt) with a SELECT query dangling from
it, rather than a SELECT query with a few unusual fields in it.  Add
code to determine whether a planned query can safely be run backwards.
If DECLARE CURSOR specifies SCROLL, ensure that the plan can be run
backwards by adding a Materialize plan node if it can't.  Without SCROLL,
you get an error if you try to fetch backwards from a cursor that can't
handle it.  (There is still some discussion about what the exact
behavior should be, but this is necessary infrastructure in any case.)
Along the way, make EXPLAIN DECLARE CURSOR work.
2003-03-10 03:53:52 +00:00
Tom Lane f5e83662d0 Modify planner's implied-equality-deduction code so that when a set
of known-equal expressions includes any constant expressions (including
Params from outer queries), we actively suppress any 'var = var'
clauses that are or could be deduced from the set, generating only the
deducible 'var = const' clauses instead.  The idea here is to push down
the restrictions implied by the equality set to base relations whenever
possible.  Once we have applied the 'var = const' clauses, the 'var = var'
clauses are redundant, and should be suppressed both to save work at
execution and to avoid double-counting restrictivity.
2003-01-24 03:58:44 +00:00
Tom Lane bdfbfde1b1 IN clauses appearing at top level of WHERE can now be handled as joins.
There are two implementation techniques: the executor understands a new
JOIN_IN jointype, which emits at most one matching row per left-hand row,
or the result of the IN's sub-select can be fed through a DISTINCT filter
and then joined as an ordinary relation.
Along the way, some minor code cleanup in the optimizer; notably, break
out most of the jointree-rearrangement preprocessing in planner.c and
put it in a new file prep/prepjointree.c.
2003-01-20 18:55:07 +00:00
Tom Lane cde9f852e0 Now that switch_outer processing no longer relies on being run after
join_references(), it's practical to consolidate all join_references()
processing into the set_plan_references traversal in setrefs.c.  This
seems considerably cleaner than the old way where we did it for join
quals in createplan.c and for targetlists in setrefs.c.
2003-01-15 23:10:32 +00:00
Tom Lane de97072e3c Allow merge and hash joins to occur on arbitrary expressions (anything not
containing a volatile function), rather than only on 'Var = Var' clauses
as before.  This makes it practical to do flatten_join_alias_vars at the
start of planning, which in turn eliminates a bunch of klugery inside the
planner to deal with alias vars.  As a free side effect, we now detect
implied equality of non-Var expressions; for example in
	SELECT ... WHERE a.x = b.y and b.y = 42
we will deduce a.x = 42 and use that as a restriction qual on a.  Also,
we can remove the restriction introduced 12/5/02 to prevent pullup of
subqueries whose targetlists contain sublinks.
Still TODO: make statistical estimation routines in selfuncs.c and costsize.c
smarter about expressions that are more complex than plain Vars.  The need
for this is considerably greater now that we have to be able to estimate
the suitability of merge and hash join techniques on such expressions.
2003-01-15 19:35:48 +00:00
Tom Lane a0bf885f9e Phase 2 of read-only-plans project: restructure expression-tree nodes
so that all executable expression nodes inherit from a common supertype
Expr.  This is somewhat of an exercise in code purity rather than any
real functional advance, but getting rid of the extra Oper or Func node
formerly used in each operator or function call should provide at least
a little space and speed improvement.
initdb forced by changes in stored-rules representation.
2002-12-12 15:49:42 +00:00
Tom Lane 6c1d4662af Finish implementation of hashed aggregation. Add enable_hashagg GUC
parameter to allow it to be forced off for comparison purposes.
Add ORDER BY clauses to a bunch of regression test queries that will
otherwise produce randomly-ordered output in the new regime.
2002-11-21 00:42:20 +00:00
Tom Lane b60be3f2f8 Add an at-least-marginally-plausible method of estimating the number
of groups produced by GROUP BY.  This improves the accuracy of planning
estimates for grouped subselects, and is needed to check whether a
hashed aggregation plan risks memory overflow.
2002-11-19 23:22:00 +00:00
Tom Lane f6dba10e62 First phase of implementing hash-based grouping/aggregation. An AGG plan
node now does its own grouping of the input rows, and has no need for a
preceding GROUP node in the plan pipeline.  This allows elimination of
the misnamed tuplePerGroup option for GROUP, and actually saves more code
in nodeGroup.c than it costs in nodeAgg.c, as well as being presumably
faster.  Restructure the API of query_planner so that we do not commit to
using a sorted or unsorted plan in query_planner; instead grouping_planner
makes the decision.  (Right now it isn't any smarter than query_planner
was, but that will change as soon as it has the option to select a hash-
based aggregation step.)  Despite all the hackery, no initdb needed since
only in-memory node types changed.
2002-11-06 00:00:45 +00:00
Bruce Momjian e50f52a074 pgindent run. 2002-09-04 20:31:48 +00:00
Bruce Momjian d84fe82230 Update copyright to 2002. 2002-06-20 20:29:54 +00:00
Bruce Momjian 0dbfea39f3 Remove KSQO from GUC and move file to _deadcode. 2002-06-16 00:09:12 +00:00