Commit Graph

396 Commits

Author SHA1 Message Date
Bruce Momjian
fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane
c291203ca3 Fix EquivalenceClass code to handle volatile sort expressions in a more
predictable manner; in particular that if you say ORDER BY output-column-ref,
it will in fact sort by that specific column even if there are multiple
syntactic matches.  An example is
	SELECT random() AS a, random() AS b FROM ... ORDER BY b, a;
While the use-case for this might be a bit debatable, it worked as expected
in earlier releases, so we should preserve the behavior for 8.3.  Per my
recent proposal.

While at it, fix convert_subquery_pathkeys() to handle RelabelType stripping
in both directions; it needs this for the same reasons make_sort_from_pathkeys
does.
2007-11-08 21:49:48 +00:00
Tom Lane
1be0601681 Last week's patch for make_sort_from_pathkeys wasn't good enough: it has
to be able to discard top-level RelabelType nodes on *both* sides of the
equivalence-class-to-target-list comparison, since make_pathkey_from_sortinfo
might either add or remove a RelabelType.  Also fix the latter to do the
removal case cleanly.  Per example from Peter.
2007-11-08 19:25:37 +00:00
Tom Lane
82d8ab6fc4 Fix the plan-invalidation mechanism to treat regclass constants that refer to
a relation as a reason to invalidate a plan when the relation changes.  This
handles scenarios such as dropping/recreating a sequence that is referenced by
nextval('seq') in a cached plan.  Rather than teach plancache.c all about
digging through plan trees to find regclass Consts, we charge the planner's
setrefs.c with making a list of the relation OIDs on which each plan depends.
That way the list can be built cheaply during a plan tree traversal that has
to happen anyway.  Per bug #3662 and subsequent discussion.
2007-10-11 18:05:27 +00:00
Tom Lane
89db887b1e Keep the planner from failing on "WHERE false AND something IN (SELECT ...)".
eval_const_expressions simplifies this to just "WHERE false", but we have
already done pull_up_IN_clauses so the IN join will be done, or at least
planned, anyway.  The trouble case comes when the sub-SELECT is itself a join
and we decide to implement the IN by unique-ifying the sub-SELECT outputs:
with no remaining reference to the output Vars in WHERE, we won't have
propagated the Vars up to the upper join point, leading to "variable not found
in subplan target lists" error.  Fix by adding an extra scan of in_info_list
and forcing all Vars mentioned therein to be propagated up to the IN join
point.  Per bug report from Miroslav Sulc.
2007-10-04 20:44:47 +00:00
Tom Lane
cdf0231c88 Create a function variable "join_search_hook" to let plugins override the
join search order portion of the planner; this is specifically intended to
simplify developing a replacement for GEQO planning.  Patch by Julius
Stroffek, editorialized on by me.  I renamed make_one_rel_by_joins to
standard_join_search and make_rels_by_joins to join_search_one_level to better
reflect their place within this scheme.
2007-09-26 18:51:51 +00:00
Tom Lane
7125687511 Fix cost estimates for EXISTS subqueries that are evaluated as initPlans
(because they are uncorrelated with the immediate parent query).  We were
charging the full run cost to the parent node, disregarding the fact that
only one row need be fetched for EXISTS.  While this would only be a
cosmetic issue in most cases, it might possibly affect planning outcomes
if the parent query were itself a subquery to some upper query.
Per recent discussion with Steve Crawford.
2007-09-22 21:36:40 +00:00
Tom Lane
282d2a03dd HOT updates. When we update a tuple without changing any of its indexed
columns, and the new version can be stored on the same heap page, we no longer
generate extra index entries for the new version.  Instead, index searches
follow the HOT-chain links to ensure they find the correct tuple version.

In addition, this patch introduces the ability to "prune" dead tuples on a
per-page basis, without having to do a complete VACUUM pass to recover space.
VACUUM is still needed to clean up dead index entries, however.

Pavan Deolasee, with help from a bunch of other people.
2007-09-20 17:56:33 +00:00
Magnus Hagander
906b2e1b37 Rename DLLIMPORT macro to PGDLLIMPORT to avoid conflict with
third party includes (like tcl) that define DLLIMPORT.
2007-07-25 12:22:54 +00:00
Tom Lane
604ffd280b Create hooks to let a loadable plugin monitor (or even replace) the planner
and/or create plans for hypothetical situations; in particular, investigate
plans that would be generated using hypothetical indexes.  This is a
heavily-rewritten version of the hooks proposed by Gurjeet Singh for his
Index Advisor project.  In this formulation, the index advisor can be
entirely a loadable module instead of requiring a significant part to be
in the core backend, and plans can be generated for hypothetical indexes
without requiring the creation and rolling-back of system catalog entries.

The index advisor patch as-submitted is not compatible with these hooks,
but it needs significant work anyway due to other 8.2-to-8.3 planner
changes.  With these hooks in the core backend, development of the advisor
can proceed as a pgfoundry project.
2007-05-25 17:54:25 +00:00
Tom Lane
d7153c5fad Fix best_inner_indexscan to return both the cheapest-total-cost and
cheapest-startup-cost innerjoin indexscans, and make joinpath.c consider
both of these (when different) as the inside of a nestloop join.  The
original design was based on the assumption that indexscan paths always
have negligible startup cost, and so total cost is the only important
figure of merit; an assumption that's obviously broken by bitmap
indexscans.  This oversight could lead to choosing poor plans in cases
where fast-start behavior is more important than total cost, such as
LIMIT and IN queries.  8.1-vintage brain fade exposed by an example from
Chuck D.
2007-05-22 01:40:33 +00:00
Tom Lane
2415ad9831 Teach tuplestore.c to throw away data before the "mark" point when the caller
is using mark/restore but not rewind or backward-scan capability.  Insert a
materialize plan node between a mergejoin and its inner child if the inner
child is a sort that is expected to spill to disk.  The materialize shields
the sort from the need to do mark/restore and thereby allows it to perform
its final merge pass on-the-fly; while the materialize itself is normally
cheap since it won't spill to disk unless the number of tuples with equal
key values exceeds work_mem.

Greg Stark, with some kibitzing from Tom Lane.
2007-05-21 17:57:35 +00:00
Tom Lane
d26559dbf3 Teach tuplesort.c about "top N" sorting, in which only the first N tuples
need be returned.  We keep a heap of the current best N tuples and sift-up
new tuples into it as we scan the input.  For M input tuples this means
only about M*log(N) comparisons instead of M*log(M), not to mention a lot
less workspace when N is small --- avoiding spill-to-disk for large M
is actually the most attractive thing about it.  Patch includes planner
and executor support for invoking this facility in ORDER BY ... LIMIT
queries.  Greg Stark, with some editorialization by moi.
2007-05-04 01:13:45 +00:00
Tom Lane
66888f7424 Expose more cursor-related functionality in SPI: specifically, allow
access to the planner's cursor-related planning options, and provide new
FETCH/MOVE routines that allow access to the full power of those commands.
Small refactoring of planner(), pg_plan_query(), and pg_plan_queries()
APIs to make it convenient to pass the planning options down from SPI.

This is the core-code portion of Pavel Stehule's patch for scrollable
cursor support in plpgsql; I'll review and apply the plpgsql changes
separately.
2007-04-16 01:14:58 +00:00
Tom Lane
fa92d21a48 Avoid running build_index_pathkeys() in situations where there cannot
possibly be any useful pathkeys --- to wit, queries with neither any
join clauses nor any ORDER BY request.  It's nearly free to check for
this case and it saves a useful fraction of the planning time for simple
queries.
2007-04-15 20:09:28 +00:00
Tom Lane
eab6b8b27e Turn the rangetable used by the executor into a flat list, and avoid storing
useless substructure for its RangeTblEntry nodes.  (I chose to keep using the
same struct node type and just zero out the link fields for unneeded info,
rather than making a separate ExecRangeTblEntry type --- it seemed too
fragile to have two different rangetable representations.)

Along the way, put subplans into a list in the toplevel PlannedStmt node,
and have SubPlan nodes refer to them by list index instead of direct pointers.
Vadim wanted to do that years ago, but I never understood what he was on about
until now.  It makes things a *whole* lot more robust, because we can stop
worrying about duplicate processing of subplans during expression tree
traversals.  That's been a constant source of bugs, and it's finally gone.

There are some consequent simplifications yet to be made, like not using
a separate EState for subplans in the executor, but I'll tackle that later.
2007-02-22 22:00:26 +00:00
Tom Lane
9cbd0c155d Remove the Query structure from the executor's API. This allows us to stop
storing mostly-redundant Query trees in prepared statements, portals, etc.
To replace Query, a new node type called PlannedStmt is inserted by the
planner at the top of a completed plan tree; this carries just the fields of
Query that are still needed at runtime.  The statement lists kept in portals
etc. now consist of intermixed PlannedStmt and bare utility-statement nodes
--- no Query.  This incidentally allows us to remove some fields from Query
and Plan nodes that shouldn't have been there in the first place.

Still to do: simplify the execution-time range table; at the moment the
range table passed to the executor still contains Query trees for subqueries.

initdb forced due to change of stored rules.
2007-02-20 17:32:18 +00:00
Tom Lane
7c5e5439d2 Get rid of some old and crufty global variables in the planner. When
this code was last gone over, there wasn't really any alternative to
globals because we didn't have the PlannerInfo struct being passed all
through the planner code.  Now that we do, we can restructure things
to avoid non-reentrancy.  I'm fooling with this because otherwise I'd
have had to add another global variable for the planned compact
range table list.
2007-02-19 07:03:34 +00:00
Tom Lane
6bef118b01 Restructure code that is responsible for ensuring that clauseless joins are
considered when it is necessary to do so because of a join-order restriction
(that is, an outer-join or IN-subselect construct).  The former coding was a
bit ad-hoc and inconsistent, and it missed some cases, as exposed by Mario
Weilguni's recent bug report.  His specific problem was that an IN could be
turned into a "clauseless" join due to constant-propagation removing the IN's
joinclause, and if the IN's subselect involved more than one relation and
there was more than one such IN linking to the same upper relation, then the
only valid join orders involve "bushy" plans but we would fail to consider the
specific paths needed to get there.  (See the example case added to the join
regression test.)  On examining the code I wonder if there weren't some other
problem cases too; in particular it seems that GEQO was defending against a
different set of corner cases than the main planner was.  There was also an
efficiency problem, in that when we did realize we needed a clauseless join
because of an IN, we'd consider clauseless joins against every other relation
whether this was sensible or not.  It seems a better design is to use the
outer-join and in-clause lists as a backup heuristic, just as the rule of
joining only where there are joinclauses is a heuristic: we'll join two
relations if they have a usable joinclause *or* this might be necessary to
satisfy an outer-join or IN-clause join order restriction.  I refactored the
code to have just one place considering this instead of three, and made sure
that it covered all the cases that any of them had been considering.

Backpatch as far as 8.1 (which has only the IN-clause form of the disease).
By rights 8.0 and 7.4 should have the bug too, but they accidentally fail
to fail, because the joininfo structure used in those releases preserves some
memory of there having once been a joinclause between the inner and outer
sides of an IN, and so it leads the code in the right direction anyway.
I'll be conservative and not touch them.
2007-02-16 00:14:01 +00:00
Tom Lane
5a7471c307 Add COST and ROWS options to CREATE/ALTER FUNCTION, plus underlying pg_proc
columns procost and prorows, to allow simple user adjustment of the estimated
cost of a function call, as well as control of the estimated number of rows
returned by a set-returning function.  We might eventually wish to extend this
to allow function-specific estimation routines, but there seems to be
consensus that we should try a simple constant estimate first.  In particular
this provides a relatively simple way to control the order in which different
WHERE clauses are applied in a plan node, which is a Good Thing in view of the
fact that the recent EquivalenceClass planner rewrite made that much less
predictable than before.
2007-01-22 01:35:23 +00:00
Tom Lane
f41803bb39 Refactor planner's pathkeys data structure to create a separate, explicit
representation of equivalence classes of variables.  This is an extensive
rewrite, but it brings a number of benefits:
* planner no longer fails in the presence of "incomplete" operator families
that don't offer operators for every possible combination of datatypes.
* avoid generating and then discarding redundant equality clauses.
* remove bogus assumption that derived equalities always use operators
named "=".
* mergejoins can work with a variety of sort orders (e.g., descending) now,
instead of tying each mergejoinable operator to exactly one sort order.
* better recognition of redundant sort columns.
* can make use of equalities appearing underneath an outer join.
2007-01-20 20:45:41 +00:00
Tom Lane
a191a169d6 Change the planner-to-executor API so that the planner tells the executor
which comparison operators to use for plan nodes involving tuple comparison
(Agg, Group, Unique, SetOp).  Formerly the executor looked up the default
equality operator for the datatype, which was really pretty shaky, since it's
possible that the data being fed to the node is sorted according to some
nondefault operator class that could have an incompatible idea of equality.
The planner knows what it has sorted by and therefore can provide the right
equality operator to use.  Also, this change moves a couple of catalog lookups
out of the executor and into the planner, which should help startup time for
pre-planned queries by some small amount.  Modify the planner to remove some
other cavalier assumptions about always being able to use the default
operators.  Also add "nulls first/last" info to the Plan node for a mergejoin
--- neither the executor nor the planner can cope yet, but at least the API is
in place.
2007-01-10 18:06:05 +00:00
Bruce Momjian
29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane
a78fcfb512 Restructure operator classes to allow improved handling of cross-data-type
cases.  Operator classes now exist within "operator families".  While most
families are equivalent to a single class, related classes can be grouped
into one family to represent the fact that they are semantically compatible.
Cross-type operators are now naturally adjunct parts of a family, without
having to wedge them into a particular opclass as we had done originally.

This commit restructures the catalogs and cleans up enough of the fallout so
that everything still works at least as well as before, but most of the work
needed to actually improve the planner's behavior will come later.  Also,
there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way
to create a new family right now is to allow CREATE OPERATOR CLASS to make
one by default.  I owe some more documentation work, too.  But that can all
be done in smaller pieces once this infrastructure is in place.
2006-12-23 00:43:13 +00:00
Tom Lane
f18c57fdf1 Fix planner to do the right thing when a degenerate outer join (one whose
joinclause doesn't use any outer-side vars) requires a "bushy" plan to be
created.  The normal heuristic to avoid joins with no joinclause has to be
overridden in that case.  Problem is new in 8.2; before that we forced the
outer join order anyway.  Per example from Teodor.
2006-12-12 21:31:02 +00:00
Bruce Momjian
f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Tom Lane
0f8fc35a5a Increase default value of effective_cache_size to 128MB, per discussion. 2006-09-25 22:12:24 +00:00
Tom Lane
b74c543685 Improve usage of effective_cache_size parameter by assuming that all the
tables in the query compete for cache space, not just the one we are
currently costing an indexscan for.  This seems more realistic, and it
definitely will help in examples recently exhibited by Stefan
Kaltenbrunner.  To get the total size of all the tables involved, we must
tweak the handling of 'append relations' a bit --- formerly we looked up
information about the child tables on-the-fly during set_append_rel_pathlist,
but it needs to be done before we start doing any cost estimation, so
push it into the add_base_rels_to_query scan.
2006-09-19 22:49:53 +00:00
Tom Lane
7a3e30e608 Add INSERT/UPDATE/DELETE RETURNING, with basic docs and regression tests.
plpgsql support to come later.  Along the way, convert execMain's
SELECT INTO support into a DestReceiver, in order to eliminate some ugly
special cases.

Jonah Harris and Tom Lane
2006-08-12 02:52:06 +00:00
Joe Conway
9caafda579 Add support for multi-row VALUES clauses as part of INSERT statements
(e.g. "INSERT ... VALUES (...), (...), ...") and elsewhere as allowed
by the spec. (e.g. similar to a FROM clause subselect). initdb required.
Joe Conway and Tom Lane.
2006-08-02 01:59:48 +00:00
Tom Lane
09d3670df3 Change the relation_open protocol so that we obtain lock on a relation
(table or index) before trying to open its relcache entry.  This fixes
race conditions in which someone else commits a change to the relation's
catalog entries while we are in process of doing relcache load.  Problems
of that ilk have been reported sporadically for years, but it was not
really practical to fix until recently --- for instance, the recent
addition of WAL-log support for in-place updates helped.

Along the way, remove pg_am.amconcurrent: all AMs are now expected to support
concurrent update.
2006-07-31 20:09:10 +00:00
Peter Eisentraut
79bc99a467 Convert effective_cache_size to an integer, for better integration with
upcoming units feature.
2006-07-26 11:35:56 +00:00
Bruce Momjian
085e559654 Change LIMIT/OFFSET to use int8
Dhanaraj M
2006-07-26 00:34:48 +00:00
Tom Lane
98359c3e3f In the recent changes to make the planner account better for cache
effects in a nestloop inner indexscan, I had only dealt with plain index
scans and the index portion of bitmap scans.  But there will be cache
benefits for the heap accesses of bitmap scans too, so fix
cost_bitmap_heap_scan() to account for that.
2006-07-22 15:41:56 +00:00
Tom Lane
9b556322c5 Fix some missing inclusions identified with new pgcheckdefines tool. 2006-07-15 03:35:21 +00:00
Bruce Momjian
a22d76d96a Allow include files to compile own their own.
Strip unused include files out unused include files, and add needed
includes to C files.

The next step is to remove unused include files in C files.
2006-07-13 16:49:20 +00:00
Tom Lane
cffd89ca73 Revise the planner's handling of "pseudoconstant" WHERE clauses, that is
clauses containing no variables and no volatile functions.  Such a clause
can be used as a one-time qual in a gating Result plan node, to suppress
plan execution entirely when it is false.  Even when the clause is true,
putting it in a gating node wins by avoiding repeated evaluation of the
clause.  In previous PG releases, query_planner() would do this for
pseudoconstant clauses appearing at the top level of the jointree, but
there was no ability to generate a gating Result deeper in the plan tree.
To fix it, get rid of the special case in query_planner(), and instead
process pseudoconstant clauses through the normal RestrictInfo qual
distribution mechanism.  When a pseudoconstant clause is found attached to
a path node in create_plan(), pull it out and generate a gating Result at
that point.  This requires special-casing pseudoconstants in selectivity
estimation and cost_qual_eval, but on the whole it's pretty clean.
It probably even makes the planner a bit faster than before for the normal
case of no pseudoconstants, since removing pull_constant_clauses saves one
useless traversal of the qual tree.  Per gripe from Phil Frost.
2006-07-01 18:38:33 +00:00
Tom Lane
8a30cc2127 Make the planner estimate costs for nestloop inner indexscans on the basis
that the Mackert-Lohmann formula applies across all the repetitions of the
nestloop, not just each scan independently.  We use the M-L formula to
estimate the number of pages fetched from the index as well as from the table;
that isn't what it was designed for, but it seems reasonably applicable
anyway.  This makes large numbers of repetitions look much cheaper than
before, which accords with many reports we've received of overestimation
of the cost of a nestloop.  Also, change the index access cost model to
charge random_page_cost per index leaf page touched, while explicitly
not counting anything for access to metapage or upper tree pages.  This
may all need tweaking after we get some field experience, but in simple
tests it seems to be giving saner results than before.  The main thing
is to get the infrastructure in place to let cost_index() and amcostestimate
functions take repeated scans into account at all.  Per my recent proposal.

Note: this patch changes pg_proc.h, but I did not force initdb because
the changes are basically cosmetic --- the system does not look into
pg_proc to decide how to call an index amcostestimate function, and
there's no way to call such a function from SQL at all.
2006-06-06 17:59:58 +00:00
Tom Lane
e4de635a2b Increase the default value of cpu_index_tuple_cost from 0.001 to 0.005.
This shouldn't affect simple indexscans much, while for bitmap scans that
are touching a lot of index rows, this seems to bring the estimates more
in line with reality.  Per recent discussion.
2006-06-05 03:03:42 +00:00
Tom Lane
eed6c9ed7e Add a GUC parameter seq_page_cost, and use that everywhere we formerly
assumed that a sequential page fetch has cost 1.0.  This patch doesn't
in itself change the system's behavior at all, but it opens the door to
people adopting other units of measurement for EXPLAIN costs.  Also, if
we ever decide it's worth inventing per-tablespace access cost settings,
this change provides a workable intellectual framework for that.
2006-06-05 02:49:58 +00:00
Bruce Momjian
f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Tom Lane
336a6491aa Improve my initial, rather hacky implementation of joins to append
relations: fix the executor so that we can have an Append plan on the
inside of a nestloop and still pass down outer index keys to index scans
within the Append, then generate such plans as if they were regular
inner indexscans.  This avoids the need to evaluate the outer relation
multiple times.
2006-02-05 02:59:17 +00:00
Tom Lane
3893127431 Fix constraint exclusion to work in inherited UPDATE/DELETE queries
... in fact, it will be applied now in any query whatsoever.  I'm still
a bit concerned about the cycles that might be expended in failed proof
attempts, but given that CE is turned off by default, it's the user's
choice whether to expend those cycles or not.  (Possibly we should
change the simple bool constraint_exclusion parameter to something
more fine-grained?)
2006-02-04 23:03:20 +00:00
Tom Lane
8b109ebf14 Teach planner to convert simple UNION ALL subqueries into append relations,
thereby sharing code with the inheritance case.  This puts the UNION-ALL-view
approach to partitioned tables on par with inheritance, so far as constraint
exclusion is concerned: it works either way.  (Still need to update the docs
to say so.)  The definition of "simple UNION ALL" is a little simpler than
I would like --- basically the union arms can only be SELECT * FROM foo
--- but it's good enough for partitioned-table cases.
2006-02-03 21:08:49 +00:00
Bruce Momjian
59bb147353 Update random() usage so ranges are inclusive/exclusive as required. 2006-02-03 12:45:47 +00:00
Tom Lane
8a1468af4e Restructure planner's handling of inheritance. Rather than processing
inheritance trees on-the-fly, which pretty well constrained us to considering
only one way of planning inheritance, expand inheritance sets during the
planner prep phase, and build a side data structure that can be consulted
later to find which RTEs are members of which inheritance sets.  As proof of
concept, use the data structure to plan joins against inheritance sets more
efficiently: we can now use indexes on the set members in inner-indexscan
joins.  (The generated plans could be improved further, but it'll take some
executor changes.)  This data structure will also support handling UNION ALL
subqueries in the same way as inheritance sets, but that aspect of it isn't
finished yet.
2006-01-31 21:39:25 +00:00
Tom Lane
a1b7e70c5f Fix code that checks to see if an index can be considered to match the query's
requested sort order.  It was assuming that build_index_pathkeys always
generates a pathkey per index column, which was not true if implied equality
deduction had determined that two index columns were effectively equated to
each other.  Simplest fix seems to be to install an option that causes
build_index_pathkeys to support this behavior as well as the original one.
Per report from Brian Hirt.
2006-01-29 17:27:42 +00:00
Tom Lane
3a0a16cb7e Allow row comparisons to be used as indexscan qualifications.
This completes the project to upgrade our handling of row comparisons.
2006-01-25 20:29:24 +00:00
Tom Lane
e3b9852728 Teach planner how to rearrange join order for some classes of OUTER JOIN.
Per my recent proposal.  I ended up basing the implementation on the
existing mechanism for enforcing valid join orders of IN joins --- the
rules for valid outer-join orders are somewhat similar.
2005-12-20 02:30:36 +00:00
Tom Lane
da27c0a1ef Teach tid-scan code to make use of "ctid = ANY (array)" clauses, so that
"ctid IN (list)" will still work after we convert IN to ScalarArrayOpExpr.
Make some minor efficiency improvements while at it, such as ensuring that
multiple TIDs are fetched in physical heap order.  And fix EXPLAIN so that
it shows what's really going on for a TID scan.
2005-11-26 22:14:57 +00:00
Tom Lane
290166f934 Teach planner and executor to handle ScalarArrayOpExpr as an indexable
qualification when the underlying operator is indexable and useOr is true.
That is, indexkey op ANY (ARRAY[...]) is effectively translated into an
OR combination of one indexscan for each array element.  This only works
for bitmap index scans, of course, since regular indexscans no longer
support OR'ing of scans.  There are still some loose ends to clean up
before changing 'x IN (list)' to translate as a ScalarArrayOpExpr;
for instance predtest.c ought to be taught about it.  But this gets the
basic functionality in place.
2005-11-25 19:47:50 +00:00
Tom Lane
1bdf124b94 Restore the former RestrictInfo field valid_everywhere (but invert the flag
sense and rename to "outerjoin_delayed" to more clearly reflect what it
means).  I had decided that it was redundant in 8.1, but the folly of this
is exposed by a bug report from Sebastian Böck.  The place where it's
needed is to prevent orindxpath.c from cherry-picking arms of an outer-join
OR clause to form a relation restriction that isn't actually legal to push
down to the relation scan level.  There may be some legal cases that this
forbids optimizing, but we'd need much closer analysis to determine it.
2005-11-14 23:54:23 +00:00
Bruce Momjian
1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane
2e1254e7fa Repair planning bug introduced in 7.4: outer-join ON clauses that referenced
only the inner-side relation would be considered as potential equijoin clauses,
which is wrong because the condition doesn't necessarily hold above the point
of the outer join.  Per test case from Kevin Grittner (bug#1916).
2005-09-28 21:17:02 +00:00
Tom Lane
4e5fbb34b3 Change the division of labor between grouping_planner and query_planner
so that the latter estimates the number of groups that grouping will
produce.  This is needed because it is primarily query_planner that
makes the decision between fast-start and fast-finish plans, and in the
original coding it was unable to make more than a crude rule-of-thumb
choice when the query involved grouping.  This revision helps us make
saner choices for queries like SELECT ... GROUP BY ... LIMIT, as in a
recent example from Mark Kirkwood.  Also move the responsibility for
canonicalizing sort_pathkeys and group_pathkeys into query_planner;
this information has to be available anyway to support the first change,
and doing it this way lets us get rid of compare_noncanonical_pathkeys
entirely.
2005-08-27 22:13:44 +00:00
Bruce Momjian
a7f49252d2 enable_constraint_exclusion => constraint_exclusion
Also improve wording.
2005-08-22 17:35:03 +00:00
Tom Lane
dfdf07aab1 Fix up LIMIT/OFFSET planning so that we cope with non-constant LIMIT
or OFFSET clauses by using estimate_expression_value().  The main advantage
of this is that if the expression is a Param and we have a value for the
Param, we'll use that value rather than defaulting.  Also, fix some
thinkos in the logic for combining LIMIT/OFFSET with an externally
supplied tuple fraction (this covers cases like EXISTS(...LIMIT...)).
And make sure the results of all this are shown by EXPLAIN.  Per a
gripe from Merlin Moncure.
2005-08-18 17:51:12 +00:00
Tom Lane
a4ca842319 Fix a bunch of bad interactions between partial indexes and the new
planning logic for bitmap indexscans.  Partial indexes create corner
cases in which a scan might be done with no explicit index qual conditions,
and the code wasn't handling those cases nicely.  Also be a little
tenser about eliminating redundant clauses in the generated plan.
Per report from Dmitry Karasik.
2005-07-28 20:26:22 +00:00
Tom Lane
d007a95055 Simple constraint exclusion. For now, only child tables of inheritance
scans are candidates for exclusion; this should be fixed eventually.
Simon Riggs, with some help from Tom Lane.
2005-07-23 21:05:48 +00:00
Tom Lane
cc5e80b8d1 Teach planner about some cases where a restriction clause can be
propagated inside an outer join.  In particular, given
LEFT JOIN ON (A = B) WHERE A = constant, we cannot conclude that
B = constant at the top level (B might be null instead), but we
can nonetheless put a restriction B = constant into the quals for
B's relation, since no inner-side rows not meeting that condition
can contribute to the final result.  Similarly, given
FULL JOIN USING (J) WHERE J = constant, we can't directly conclude
that either input J variable = constant, but it's OK to push such
quals into each input rel.  Per recent gripe from Kim Bisgaard.
Along the way, remove 'valid_everywhere' flag from RestrictInfo,
as on closer analysis it was not being used for anything, and was
defined backwards anyway.
2005-07-02 23:00:42 +00:00
Tom Lane
2f1210629c Separate predicate-testing code out of indxpath.c, making it a module
in its own right.  As proposed by Simon Riggs, but with some editorializing
of my own.
2005-06-10 22:25:37 +00:00
Tom Lane
3b167a4099 If a LIMIT is applied to a UNION ALL query, plan each UNION arm as
if the limit were directly applied to it.  This does not actually
add a LIMIT plan node to the generated subqueries --- that would be
useless overhead --- but it does cause the planner to prefer fast-
start plans when the limit is small.  After an idea from Phil Endecott.
2005-06-10 02:21:05 +00:00
Tom Lane
a31ad27fc5 Simplify the planner's join clause management by storing join clauses
of a relation in a flat 'joininfo' list.  The former arrangement grouped
the join clauses according to the set of unjoined relids used in each;
however, profiling on test cases involving lots of joins proves that
that data structure is a net loss.  It takes more time to group the
join clauses together than is saved by avoiding duplicate tests later.
It doesn't help any that there are usually not more than one or two
clauses per group ...
2005-06-09 04:19:00 +00:00
Tom Lane
9ab4d98168 Remove planner's private fields from Query struct, and put them into
a new PlannerInfo struct, which is passed around instead of the bare
Query in all the planning code.  This commit is essentially just a
code-beautification exercise, but it does open the door to making
larger changes to the planner data structures without having to muck
with the widely-known Query struct.
2005-06-05 22:32:58 +00:00
Tom Lane
e2159f3842 Teach the planner to remove SubqueryScan nodes from the plan if they
aren't doing anything useful (ie, neither selection nor projection).
Also, extend to SubqueryScan the hacks already in place to avoid
unnecessary ExecProject calls when the result would just be the same
tuple the subquery already delivered.  This saves some overhead in
UNION and other set operations, as well as avoiding overhead for
unflatten-able subqueries.  Per example from Sokolov Yura.
2005-05-22 22:30:20 +00:00
Tom Lane
79a1b00226 Replace slightly klugy create_bitmap_restriction() function with a
more efficient routine in restrictinfo.c (which can make use of
make_restrictinfo_internal).
2005-04-25 02:14:48 +00:00
Tom Lane
5b05185262 Remove support for OR'd indexscans internal to a single IndexScan plan
node, as this behavior is now better done as a bitmap OR indexscan.
This allows considerable simplification in nodeIndexscan.c itself as
well as several planner modules concerned with indexscan plan generation.
Also we can improve the sharing of code between regular and bitmap
indexscans, since they are now working with nigh-identical Plan nodes.
2005-04-25 01:30:14 +00:00
Tom Lane
bc843d3960 First cut at planner support for bitmap index scans. Lots to do yet,
but the code is basically working.  Along the way, rewrite the entire
approach to processing OR index conditions, and make it work in join
cases for the first time ever.  orindxpath.c is now basically obsolete,
but I left it in for the time being to allow easy comparison testing
against the old implementation.
2005-04-22 21:58:32 +00:00
Tom Lane
14c7fba3f7 Rethink original decision to use AND/OR Expr nodes to represent bitmap
logic operations during planning.  Seems cleaner to create two new Path
node types, instead --- this avoids duplication of cost-estimation code.
Also, create an enable_bitmapscan GUC parameter to control use of bitmap
plans.
2005-04-21 19:18:13 +00:00
Tom Lane
e6f7edb9d5 Install some slightly realistic cost estimation for bitmap index scans. 2005-04-21 02:28:02 +00:00
Tom Lane
4a8c5d0375 Create executor and planner-backend support for decoupled heap and index
scans, using in-memory tuple ID bitmaps as the intermediary.  The planner
frontend (path creation and cost estimation) is not there yet, so none
of this code can be executed.  I have tested it using some hacked planner
code that is far too ugly to see the light of day, however.  Committing
now so that the bulk of the infrastructure changes go in before the tree
drifts under me.
2005-04-19 22:35:18 +00:00
Tom Lane
7ace43e0c2 Fix oversight in MIN/MAX optimization: must not return NULL entries
from index, since the aggregates ignore NULLs.
2005-04-12 05:11:28 +00:00
Tom Lane
addc42c339 Create the planner mechanism for optimizing simple MIN and MAX queries
into indexscans on matching indexes.  For the moment, it only handles
int4 and text datatypes; next step is to add a column to pg_aggregate
so that all MIN/MAX aggregates can be handled.  Per my recent proposal.
2005-04-11 23:06:57 +00:00
Tom Lane
ad161bcc8a Merge Resdom nodes into TargetEntry nodes to simplify code and save a
few palloc's.  I also chose to eliminate the restype and restypmod fields
entirely, since they are redundant with information stored in the node's
contained expression; re-examining the expression at need seems simpler
and more reliable than trying to keep restype/restypmod up to date.

initdb forced due to change in contents of stored rules.
2005-04-06 16:34:07 +00:00
Tom Lane
5db2e83852 Rethink the order of expression preprocessing: eval_const_expressions
really ought to run before canonicalize_qual, because it can now produce
forms that canonicalize_qual knows how to improve (eg, NOT clauses).
Also, because eval_const_expressions already knows about flattening
nested ANDs and ORs into N-argument form, the initial flatten_andors
pass in canonicalize_qual is now completely redundant and can be
removed.  This doesn't save a whole lot of code, but the time and
palloc traffic eliminated is a useful gain on large expression trees.
2005-03-28 00:58:26 +00:00
Tom Lane
926e8a00d3 Add a back-link from IndexOptInfo structs to their parent RelOptInfo
structs.  There are many places in the planner where we were passing
both a rel and an index to subroutines, and now need only pass the
index struct.  Notationally simpler, and perhaps a tad faster.
2005-03-27 06:29:49 +00:00
Tom Lane
febc9a613c Expand the 'special index operator' machinery to handle special cases
for boolean indexes.  Previously we would only use such an index with
WHERE clauses like 'indexkey = true' or 'indexkey = false'.  The new
code transforms the cases 'indexkey', 'NOT indexkey', 'indexkey IS TRUE',
and 'indexkey IS FALSE' into one of these.  While this is only marginally
useful in itself, I intend soon to change constant-expression simplification
so that 'foo = true' and 'foo = false' are reduced to just 'foo' and
'NOT foo' ... which would lose the ability to use boolean indexes for
such queries at all, if the indexscan machinery couldn't make the
reverse transformation.
2005-03-26 23:29:20 +00:00
Neil Conway
d344505d1b This patch moves some code for preprocessing FOR UPDATE from
grouping_planner() to preprocess_targetlist(), according to a comment
in grouping_planner(). I think the refactoring makes sense, and moves
some extraneous details out of grouping_planner().
2005-03-17 23:45:09 +00:00
Tom Lane
595ed2a855 Make the behavior of HAVING without GROUP BY conform to the SQL spec.
Formerly, if such a clause contained no aggregate functions we mistakenly
treated it as equivalent to WHERE.  Per spec it must cause the query to
be treated as a grouped query of a single group, the same as appearance
of aggregate functions would do.  Also, the HAVING filter must execute
after aggregate function computation even if it itself contains no
aggregate functions.
2005-03-10 23:21:26 +00:00
Tom Lane
0bf2587df4 Improve planner's estimation of the space needed for HashAgg plans:
look at the actual aggregate transition datatypes and the actual overhead
needed by nodeAgg.c, instead of using pessimistic round numbers.
Per a discussion with Michael Tiemann.
2005-01-28 19:34:28 +00:00
Tom Lane
94e4778a31 The result of a FULL or RIGHT join can't be assumed to be sorted by the
left input's sorting, because null rows may be inserted at various points.
Per report from Ferenc Lutischá¸n.
2005-01-23 02:21:36 +00:00
PostgreSQL Daemon
2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane
9309d5f2ba In ALTER COLUMN TYPE, strip any implicit coercion operations appearing
at the top level of the column's old default expression before adding
an implicit coercion to the new column type.  This seems to satisfy the
principle of least surprise, as per discussion of bug #1290.
2004-10-22 17:20:05 +00:00
Tom Lane
26112850ec Fix OR-index-scan planner to recognize that a partial index is usable
for scanning one term of an OR clause if the index's predicate is implied
by that same OR clause term (possibly in conjunction with top-level WHERE
clauses).  Per recent example from Dawid Kuroczko,
http://archives.postgresql.org/pgsql-performance/2004-10/msg00095.php
Also, fix a very long-standing bug in index predicate testing, namely the
bizarre ordering of decomposition of predicate and restriction clauses.
AFAICS the correct way is to break down the predicate all the way, and
then for each component term see if you can prove it from the entire
restriction set.  The original coding had a purely-implementation-artifact
distinction between ANDing at the top level and ANDing below that, and
proceeded to get the decomposition order wrong everywhere below the top
level, with the result that even slightly complicated AND/OR predicates
could not be proven.  For instance, given
create index foop on foo(f2) where f1=42 or f1=1
    or (f1 = 11 and f2 = 55);
the old code would fail to match this index to the query
select * from foo where f1 = 11 and f2 = 55;
when it obviously ought to match.
2004-10-11 22:57:00 +00:00
Tom Lane
47aa95e951 Clean up handling of inherited-table update queries, per bug report
from Sebastian Böck.  The fix involves being more consistent about
when rangetable entries are copied or modified.  Someday we really
need to fix this stuff to not scribble on its input data structures
in the first place...
2004-10-02 22:39:49 +00:00
Bruce Momjian
b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian
da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane
7643bed58e When using extended-query protocol, postpone planning of unnamed statements
until Bind is received, so that actual parameter values are visible to the
planner.  Make use of the parameter values for estimation purposes (but
don't fold them into the actual plan).  This buys back most of the
potential loss of plan quality that ensues from using out-of-line
parameters instead of putting literal values right into the query text.

This patch creates a notion of constant-folding expressions 'for
estimation purposes only', in which case we can be more aggressive than
the normal eval_const_expressions() logic can be.  Right now the only
difference in behavior is inserting bound values for Params, but it will
be interesting to look at other possibilities.  One that we've seen
come up repeatedly is reducing now() and related functions to current
values, so that queries like ... WHERE timestampcol > now() - '1 day'
have some chance of being planned effectively.

Oliver Jowett, with some kibitzing from Tom Lane.
2004-06-11 01:09:22 +00:00
Tom Lane
2f63232d30 Promote row expressions to full-fledged citizens of the expression syntax,
rather than allowing them only in a few special cases as before.  In
particular you can now pass a ROW() construct to a function that accepts
a rowtype parameter.  Internal generation of RowExprs fixes a number of
corner cases that used to not work very well, such as referencing the
whole-row result of a JOIN or subquery.  This represents a further step in
the work I started a month or so back to make rowtype values into
first-class citizens.
2004-05-10 22:44:49 +00:00
Tom Lane
989067bd22 Extend set-operation planning to keep track of the sort ordering induced
by the set operation, so that redundant sorts at higher levels can be
avoided.  This was foreseen a good while back, but not done.  Per request
from Karel Zak.
2004-04-07 18:17:25 +00:00
Tom Lane
04226b6404 Tweak planner so that index expressions and predicates are matched to
queries without regard to whether coercions are stated explicitly or
implicitly.  Per suggestion from Stephan Szabo.
2004-03-14 23:41:27 +00:00
Tom Lane
a536ed53bc Make use of statistics on index expressions. There are still some
corner cases that could stand improvement, but it does all the basic
stuff.  A byproduct is that the selectivity routines are no longer
constrained to working on simple Vars; we might in future be able to
improve the behavior for subexpressions that don't match indexes.
2004-02-17 00:52:53 +00:00
Tom Lane
3969f2924b Revise GEQO planner to make use of some heuristic knowledge about SQL, namely
that it's good to join where there are join clauses rather than where there
are not.  Also enable it to generate bushy plans at need, so that it doesn't
fail in the presence of multiple IN clauses containing sub-joins.  These
changes appear to improve the behavior enough that we can substantially reduce
the default pool size and generations count, thereby decreasing the runtime,
and yet get as good or better plans as we were getting in 7.4.  Consequently,
adjust the default GEQO parameters.  I also modified the way geqo_effort is
used so that it affects both population size and number of generations;
it's now useful as a single control to adjust the GEQO runtime-vs-plan-quality
tradeoff.  Bump geqo_threshold to 12, since even with these changes GEQO
seems to be slower than the regular planner at 11 relations.
2004-01-23 23:54:21 +00:00
Tom Lane
672a807028 Repair error apparently introduced in the initial coding of GUC: the
default value for geqo_effort is supposed to be 40, not 1.  The actual
'genetic' component of the GEQO algorithm has been practically disabled
since 7.1 because of this mistake.  Improve documentation while at it.
2004-01-21 23:33:34 +00:00
Tom Lane
6bdfde9a77 When testing whether a sub-plan can do projection, use a general-purpose
check instead of hardwiring assumptions that only certain plan node types
can appear at the places where we are testing.  This was always a pretty
fragile assumption, and it turns out to be broken in 7.4 for certain cases
involving IN-subselect tests that need type coercion.
Also, modify code that builds finished Plan tree so that node types that
don't do projection always copy their input node's targetlist, rather than
having the tlist passed in from the caller.  The old method makes it too
easy to write broken code that thinks it can modify the tlist when it
cannot.
2004-01-18 00:50:03 +00:00
Tom Lane
fa559a86ee Adjust indexscan planning logic to keep RestrictInfo nodes associated
with index qual clauses in the Path representation.  This saves a little
work during createplan and (probably more importantly) allows reuse of
cached selectivity estimates during indexscan planning.  Also fix latent
bug: wrong plan would have been generated for a 'special operator' used
in a nestloop-inner-indexscan join qual, because the special operator
would not have gotten into the list of quals to recheck.  This bug is
only latent because at present the special-operator code could never
trigger on a join qual, but sooner or later someone will want to do it.
2004-01-05 23:39:54 +00:00
Tom Lane
5c74ce23db Improve UniquePath logic to detect the case where the input is already
known unique (eg, it is a SELECT DISTINCT ... subquery), and not do a
redundant unique-ification step.
2004-01-05 18:04:39 +00:00
Tom Lane
9091e8d1b2 Add the ability to extract OR indexscan conditions from OR-of-AND
join conditions in which each OR subclause includes a constraint on
the same relation.  This implements the other useful side-effect of
conversion to CNF format, without its unpleasant side-effects.  As
per pghackers discussion of a few weeks ago.
2004-01-05 05:07:36 +00:00
Tom Lane
82b4dd394f Merge restrictlist_selectivity into clauselist_selectivity by
teaching the latter to accept either RestrictInfo nodes or bare
clause expressions; and cache the selectivity result in the RestrictInfo
node when possible.  This extends the caching behavior of approx_selectivity
to many more contexts, and should reduce duplicate selectivity
calculations.
2004-01-04 03:51:52 +00:00
Tom Lane
6cb1c0238b Rewrite OR indexscan processing to be more flexible. We can now for the
first time generate an OR indexscan for a two-column index when the WHERE
condition is like 'col1 = foo AND (col2 = bar OR col2 = baz)' --- before,
the OR had to be on the first column of the index or we'd not notice the
possibility of using it.  Some progress towards extracting OR indexscans
from subclauses of an OR that references multiple relations, too, although
this code is #ifdef'd out because it needs more work.
2004-01-04 00:07:32 +00:00