Commit Graph

97 Commits

Author SHA1 Message Date
Bruce Momjian 0239800893 Update copyright for the year 2010. 2010-01-02 16:58:17 +00:00
Tom Lane 1a95f12702 Eliminate a lot of list-management overhead within join_search_one_level
by adding a requirement that build_join_rel add new join RelOptInfos to the
appropriate list immediately at creation.  Per report from Robert Haas,
the list_concat_unique_ptr() calls that this change eliminates were taking
the lion's share of the runtime in larger join problems.  This doesn't do
anything to fix the fundamental combinatorial explosion in large join
problems, but it should push out the threshold of pain a bit further.

Note: because this changes the order in which joinrel lists are built,
it might result in changes in selected plans in cases where different
alternatives have exactly the same costs.  There is one example in the
regression tests.
2009-11-28 00:46:19 +00:00
Tom Lane 0adaf4cb31 Move the handling of SELECT FOR UPDATE locking and rechecking out of
execMain.c and into a new plan node type LockRows.  Like the recent change
to put table updating into a ModifyTable plan node, this increases planning
flexibility by allowing the operations to occur below the top level of the
plan tree.  It's necessary in any case to restore the previous behavior of
having FOR UPDATE locking occur before ModifyTable does.

This partially refactors EvalPlanQual to allow multiple rows-under-test
to be inserted into the EPQ machinery before starting an EPQ test query.
That isn't sufficient to fix EPQ's general bogosity in the face of plans
that return multiple rows per test row, though.  Since this patch is
mostly about getting some plan node infrastructure in place and not about
fixing ten-year-old bugs, I will leave EPQ improvements for another day.

Another behavioral change that we could now think about is doing FOR UPDATE
before LIMIT, but that too seems like it should be treated as a followon
patch.
2009-10-12 18:10:51 +00:00
Bruce Momjian d747140279 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list
provided by Andrew.
2009-06-11 14:49:15 +00:00
Bruce Momjian 511db38ace Update copyright for 2009. 2009-01-01 17:24:05 +00:00
Tom Lane e6ae3b5dbf Add a concept of "placeholder" variables to the planner. These are variables
that represent some expression that we desire to compute below the top level
of the plan, and then let that value "bubble up" as though it were a plain
Var (ie, a column value).

The immediate application is to allow sub-selects to be flattened even when
they are below an outer join and have non-nullable output expressions.
Formerly we couldn't flatten because such an expression wouldn't properly
go to NULL when evaluated above the outer join.  Now, we wrap it in a
PlaceHolderVar and arrange for the actual evaluation to occur below the outer
join.  When the resulting Var bubbles up through the join, it will be set to
NULL if necessary, yielding the correct results.  This fixes a planner
limitation that's existed since 7.1.

In future we might want to use this mechanism to re-introduce some form of
Hellerstein's "expensive functions" optimization, ie place the evaluation of
an expensive function at the most suitable point in the plan tree.
2008-10-21 20:42:53 +00:00
Tom Lane 44d5be0e53 Implement SQL-standard WITH clauses, including WITH RECURSIVE.
There are some unimplemented aspects: recursive queries must use UNION ALL
(should allow UNION too), and we don't have SEARCH or CYCLE clauses.
These might or might not get done for 8.4, but even without them it's a
pretty useful feature.

There are also a couple of small loose ends and definitional quibbles,
which I'll send a memo about to pgsql-hackers shortly.  But let's land
the patch now so we can get on with other development.

Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane
2008-10-04 21:56:55 +00:00
Tom Lane e006a24ad1 Implement SEMI and ANTI joins in the planner and executor. (Semijoins replace
the old JOIN_IN code, but antijoins are new functionality.)  Teach the planner
to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
joins respectively.  Also, LEFT JOINs with suitable upper-level IS NULL
filters are recognized as being anti joins.  Unify the InClauseInfo and
OuterJoinInfo infrastructure into "SpecialJoinInfo".  With that change,
it becomes possible to associate a SpecialJoinInfo with every join attempt,
which permits some cleanup of join selectivity estimation.  That needs to be
taken much further than this patch does, but the next step is to change the
API for oprjoin selectivity functions, which seems like material for a
separate patch.  So for the moment the output size estimates for semi and
especially anti joins are quite bogus.
2008-08-14 18:48:00 +00:00
Bruce Momjian 9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane afcf09dd90 Some further performance tweaks for planning large inheritance trees that
are mostly excluded by constraints: do the CE test a bit earlier to save
some adjust_appendrel_attrs() work on excluded children, and arrange to
use array indexing rather than rt_fetch() to fetch RTEs in the main body
of the planner.  The latter is something I'd wanted to do for awhile anyway,
but seeing list_nth_cell() as 35% of the runtime gets one's attention.
2007-04-21 21:01:45 +00:00
Tom Lane eab6b8b27e Turn the rangetable used by the executor into a flat list, and avoid storing
useless substructure for its RangeTblEntry nodes.  (I chose to keep using the
same struct node type and just zero out the link fields for unneeded info,
rather than making a separate ExecRangeTblEntry type --- it seemed too
fragile to have two different rangetable representations.)

Along the way, put subplans into a list in the toplevel PlannedStmt node,
and have SubPlan nodes refer to them by list index instead of direct pointers.
Vadim wanted to do that years ago, but I never understood what he was on about
until now.  It makes things a *whole* lot more robust, because we can stop
worrying about duplicate processing of subplans during expression tree
traversals.  That's been a constant source of bugs, and it's finally gone.

There are some consequent simplifications yet to be made, like not using
a separate EState for subplans in the executor, but I'll tackle that later.
2007-02-22 22:00:26 +00:00
Tom Lane f41803bb39 Refactor planner's pathkeys data structure to create a separate, explicit
representation of equivalence classes of variables.  This is an extensive
rewrite, but it brings a number of benefits:
* planner no longer fails in the presence of "incomplete" operator families
that don't offer operators for every possible combination of datatypes.
* avoid generating and then discarding redundant equality clauses.
* remove bogus assumption that derived equalities always use operators
named "=".
* mergejoins can work with a variety of sort orders (e.g., descending) now,
instead of tying each mergejoinable operator to exactly one sort order.
* better recognition of redundant sort columns.
* can make use of equalities appearing underneath an outer join.
2007-01-20 20:45:41 +00:00
Bruce Momjian 29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Bruce Momjian f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Tom Lane b74c543685 Improve usage of effective_cache_size parameter by assuming that all the
tables in the query compete for cache space, not just the one we are
currently costing an indexscan for.  This seems more realistic, and it
definitely will help in examples recently exhibited by Stefan
Kaltenbrunner.  To get the total size of all the tables involved, we must
tweak the handling of 'append relations' a bit --- formerly we looked up
information about the child tables on-the-fly during set_append_rel_pathlist,
but it needs to be done before we start doing any cost estimation, so
push it into the add_base_rels_to_query scan.
2006-09-19 22:49:53 +00:00
Joe Conway 9caafda579 Add support for multi-row VALUES clauses as part of INSERT statements
(e.g. "INSERT ... VALUES (...), (...), ...") and elsewhere as allowed
by the spec. (e.g. similar to a FROM clause subselect). initdb required.
Joe Conway and Tom Lane.
2006-08-02 01:59:48 +00:00
Tom Lane 09d3670df3 Change the relation_open protocol so that we obtain lock on a relation
(table or index) before trying to open its relcache entry.  This fixes
race conditions in which someone else commits a change to the relation's
catalog entries while we are in process of doing relcache load.  Problems
of that ilk have been reported sporadically for years, but it was not
really practical to fix until recently --- for instance, the recent
addition of WAL-log support for in-place updates helped.

Along the way, remove pg_am.amconcurrent: all AMs are now expected to support
concurrent update.
2006-07-31 20:09:10 +00:00
Bruce Momjian e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Bruce Momjian f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Tom Lane 336a6491aa Improve my initial, rather hacky implementation of joins to append
relations: fix the executor so that we can have an Append plan on the
inside of a nestloop and still pass down outer index keys to index scans
within the Append, then generate such plans as if they were regular
inner indexscans.  This avoids the need to evaluate the outer relation
multiple times.
2006-02-05 02:59:17 +00:00
Tom Lane 8b109ebf14 Teach planner to convert simple UNION ALL subqueries into append relations,
thereby sharing code with the inheritance case.  This puts the UNION-ALL-view
approach to partitioned tables on par with inheritance, so far as constraint
exclusion is concerned: it works either way.  (Still need to update the docs
to say so.)  The definition of "simple UNION ALL" is a little simpler than
I would like --- basically the union arms can only be SELECT * FROM foo
--- but it's good enough for partitioned-table cases.
2006-02-03 21:08:49 +00:00
Tom Lane 8a1468af4e Restructure planner's handling of inheritance. Rather than processing
inheritance trees on-the-fly, which pretty well constrained us to considering
only one way of planning inheritance, expand inheritance sets during the
planner prep phase, and build a side data structure that can be consulted
later to find which RTEs are members of which inheritance sets.  As proof of
concept, use the data structure to plan joins against inheritance sets more
efficiently: we can now use indexes on the set members in inner-indexscan
joins.  (The generated plans could be improved further, but it'll take some
executor changes.)  This data structure will also support handling UNION ALL
subqueries in the same way as inheritance sets, but that aspect of it isn't
finished yet.
2006-01-31 21:39:25 +00:00
Tom Lane e3b9852728 Teach planner how to rearrange join order for some classes of OUTER JOIN.
Per my recent proposal.  I ended up basing the implementation on the
existing mechanism for enforcing valid join orders of IN joins --- the
rules for valid outer-join orders are somewhat similar.
2005-12-20 02:30:36 +00:00
Bruce Momjian 436a2956d8 Re-run pgindent, fixing a problem where comment lines after a blank
comment line where output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:17:34 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane 5d27bf20b4 Make use of new list primitives list_append_unique and list_concat_unique
where applicable.
2005-07-28 22:27:02 +00:00
Tom Lane a31ad27fc5 Simplify the planner's join clause management by storing join clauses
of a relation in a flat 'joininfo' list.  The former arrangement grouped
the join clauses according to the set of unjoined relids used in each;
however, profiling on test cases involving lots of joins proves that
that data structure is a net loss.  It takes more time to group the
join clauses together than is saved by avoiding duplicate tests later.
It doesn't help any that there are usually not more than one or two
clauses per group ...
2005-06-09 04:19:00 +00:00
Tom Lane e3a33a9a9f Marginal hack to avoid spending a lot of time in find_join_rel during
large planning problems: when the list of join rels gets too long, make
an auxiliary hash table that hashes on the identifying Bitmapset.
2005-06-08 23:02:05 +00:00
Tom Lane 9a586fe0c5 Nab some low-hanging fruit: replace the planner's base_rel_list and
other_rel_list with a single array indexed by rangetable index.
This reduces find_base_rel from O(N) to O(1) without any real penalty.
While find_base_rel isn't one of the major bottlenecks in any profile
I've seen so far, it was starting to creep up on the radar screen
for complex queries --- so might as well fix it.
2005-06-06 04:13:36 +00:00
Tom Lane 9ab4d98168 Remove planner's private fields from Query struct, and put them into
a new PlannerInfo struct, which is passed around instead of the bare
Query in all the planning code.  This commit is essentially just a
code-beautification exercise, but it does open the door to making
larger changes to the planner data structures without having to muck
with the widely-known Query struct.
2005-06-05 22:32:58 +00:00
Tom Lane c1393173aa Avoid redundant relation lock grabs during planning, and make sure
that we acquire a lock on relations added to the query due to inheritance.
Formerly, no such lock was held throughout planning, which meant that
a schema change could occur to invalidate the plan before it's even
been completed.
2005-05-23 03:01:14 +00:00
Tom Lane bc843d3960 First cut at planner support for bitmap index scans. Lots to do yet,
but the code is basically working.  Along the way, rewrite the entire
approach to processing OR index conditions, and make it work in join
cases for the first time ever.  orindxpath.c is now basically obsolete,
but I left it in for the time being to allow easy comparison testing
against the old implementation.
2005-04-22 21:58:32 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane 5374d097de Change planner to use the current true disk file size as its estimate of
a relation's number of blocks, rather than the possibly-obsolete value
in pg_class.relpages.  Scale the value in pg_class.reltuples correspondingly
to arrive at a hopefully more accurate number of rows.  When pg_class
contains 0/0, estimate a tuple width from the column datatypes and divide
that into current file size to estimate number of rows.  This improved
methodology allows us to jettison the ancient hacks that put bogus default
values into pg_class when a table is first created.  Also, per a suggestion
from Simon, make VACUUM (but not VACUUM FULL or ANALYZE) adjust the value
it puts into pg_class.reltuples to try to represent the mean tuple density
instead of the minimal density that actually prevails just after VACUUM.
These changes alter the plans selected for certain regression tests, so
update the expected files accordingly.  (I removed join_1.out because
it's not clear if it still applies; we can add back any variant versions
as they are shown to be needed.)
2004-12-01 19:00:56 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane ae93e5fd6e Make the world very nearly safe for composite-type columns in tables.
1. Solve the problem of not having TOAST references hiding inside composite
values by establishing the rule that toasting only goes one level deep:
a tuple can contain toasted fields, but a composite-type datum that is
to be inserted into a tuple cannot.  Enforcing this in heap_formtuple
is relatively cheap and it avoids a large increase in the cost of running
the tuptoaster during final storage of a row.
2. Fix some interesting problems in expansion of inherited queries that
reference whole-row variables.  We never really did this correctly before,
but it's now relatively painless to solve by expanding the parent's
whole-row Var into a RowExpr() selecting the proper columns from the
child.
If you dike out the preventive check in CheckAttributeType(),
composite-type columns now seem to actually work.  However, we surely
cannot ship them like this --- without I/O for composite types, you
can't get pg_dump to dump tables containing them.  So a little more
work still to do.
2004-06-05 01:55:05 +00:00
Tom Lane 80c6847cc5 Desultory de-FastList-ification. RelOptInfo.reltargetlist is back to
being a plain List.
2004-06-01 03:03:05 +00:00
Neil Conway 72b6ad6313 Use the new List API function names throughout the backend, and disable the
list compatibility API by default. While doing this, I decided to keep
the llast() macro around and introduce llast_int() and llast_oid() variants.
2004-05-30 23:40:41 +00:00
Neil Conway d0b4399d81 Reimplement the linked list data structure used throughout the backend.
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.

The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
2004-05-26 04:41:50 +00:00
Neil Conway 1812d3b233 Remove the last traces of Joe Hellerstein's "xfunc" optimization. Patch
from Alvaro Herrera. Also, removed lispsort.c, since it is no longer
used.
2004-04-25 18:23:57 +00:00
Tom Lane a536ed53bc Make use of statistics on index expressions. There are still some
corner cases that could stand improvement, but it does all the basic
stuff.  A byproduct is that the selectivity routines are no longer
constrained to working on simple Vars; we might in future be able to
improve the behavior for subexpressions that don't match indexes.
2004-02-17 00:52:53 +00:00
Tom Lane b281ea8cf1 Whole-row references were broken for subqueries and functions, because
attr_needed/attr_widths optimization failed to allow for Vars with attno
zero in this case.  Per report from Tatsuo Ishii.
2003-12-08 18:19:58 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane 45708f5ebc Error message editing in backend/optimizer, backend/rewrite. 2003-07-25 00:01:09 +00:00
Tom Lane 835bb975d8 Restructure building of join relation targetlists so that a join plan
node emits only those vars that are actually needed above it in the
plan tree.  (There were comments in the code suggesting that this was
done at some point in the dim past, but for a long time we have just
made join nodes emit everything that either input emitted.)  Aside from
being marginally more efficient, this fixes the problem noted by Peter
Eisentraut where a join above an IN-implemented-as-join might fail,
because the subplan targetlist constructed in the latter case didn't
meet the expectation of including everything.
Along the way, fix some places that were O(N^2) in the targetlist
length.  This is not all the trouble spots for wide queries by any
means, but it's a step forward.
2003-06-29 23:05:05 +00:00
Tom Lane 056467ec6b Teach planner how to propagate pathkeys from sub-SELECTs in FROM up to
the outer query.  (The implementation is a bit klugy, but it would take
nontrivial restructuring to make it nicer, which this is probably not
worth.)  This avoids unnecessary sort steps in examples like
SELECT foo,count(*) FROM (SELECT ... ORDER BY foo,bar) sub GROUP BY foo
which means there is now a reasonable technique for controlling the
order of inputs to custom aggregates, even in the grouping case.
2003-02-15 20:12:41 +00:00