Commit Graph

141 Commits

Author SHA1 Message Date
Tom Lane 6bef118b01 Restructure code that is responsible for ensuring that clauseless joins are
considered when it is necessary to do so because of a join-order restriction
(that is, an outer-join or IN-subselect construct).  The former coding was a
bit ad-hoc and inconsistent, and it missed some cases, as exposed by Mario
Weilguni's recent bug report.  His specific problem was that an IN could be
turned into a "clauseless" join due to constant-propagation removing the IN's
joinclause, and if the IN's subselect involved more than one relation and
there was more than one such IN linking to the same upper relation, then the
only valid join orders involve "bushy" plans but we would fail to consider the
specific paths needed to get there.  (See the example case added to the join
regression test.)  On examining the code I wonder if there weren't some other
problem cases too; in particular it seems that GEQO was defending against a
different set of corner cases than the main planner was.  There was also an
efficiency problem, in that when we did realize we needed a clauseless join
because of an IN, we'd consider clauseless joins against every other relation
whether this was sensible or not.  It seems a better design is to use the
outer-join and in-clause lists as a backup heuristic, just as the rule of
joining only where there are joinclauses is a heuristic: we'll join two
relations if they have a usable joinclause *or* this might be necessary to
satisfy an outer-join or IN-clause join order restriction.  I refactored the
code to have just one place considering this instead of three, and made sure
that it covered all the cases that any of them had been considering.

Backpatch as far as 8.1 (which has only the IN-clause form of the disease).
By rights 8.0 and 7.4 should have the bug too, but they accidentally fail
to fail, because the joininfo structure used in those releases preserves some
memory of there having once been a joinclause between the inner and outer
sides of an IN, and so it leads the code in the right direction anyway.
I'll be conservative and not touch them.
2007-02-16 00:14:01 +00:00
Tom Lane c17117649b Repair bug in 8.2's new logic for planning outer joins: we have to allow joins
that overlap an outer join's min_righthand but aren't fully contained in it,
to support joining within the RHS after having performed an outer join that
can commute with this one.  Aside from the direct fix in make_join_rel(),
fix has_join_restriction() and GEQO's desirable_join() to consider this
possibility.  Per report from Ian Harding.
2007-02-13 02:31:03 +00:00
Peter Eisentraut 2cc01004c6 Remove remains of old depend target. 2007-01-20 17:16:17 +00:00
Bruce Momjian 29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane f18c57fdf1 Fix planner to do the right thing when a degenerate outer join (one whose
joinclause doesn't use any outer-side vars) requires a "bushy" plan to be
created.  The normal heuristic to avoid joins with no joinclause has to be
overridden in that case.  Problem is new in 8.2; before that we forced the
outer join order anyway.  Per example from Teodor.
2006-12-12 21:31:02 +00:00
Tom Lane 4df8de7a68 Fix check for whether a clauseless join has to be forced in the presence of
outer joins.  Originally it was only looking for overlap of the righthand
side of a left join, but we have to do it on the lefthand side too.
Per example from Jean-Pierre Pelletier.
2006-10-24 17:50:22 +00:00
Tom Lane 9b556322c5 Fix some missing inclusions identified with new pgcheckdefines tool. 2006-07-15 03:35:21 +00:00
Bruce Momjian e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Bruce Momjian f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Tom Lane e3b9852728 Teach planner how to rearrange join order for some classes of OUTER JOIN.
Per my recent proposal.  I ended up basing the implementation on the
existing mechanism for enforcing valid join orders of IN joins --- the
rules for valid outer-join orders are somewhat similar.
2005-12-20 02:30:36 +00:00
Bruce Momjian 436a2956d8 Re-run pgindent, fixing a problem where comment lines after a blank
comment line where output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:17:34 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane 1265724ff5 The random selection in function linear() could deliver a value equal to max
if geqo_rand() returns exactly 1.0, resulting in failure due to indexing
off the end of the pool array.  Also, since this is using inexact float math,
it seems wise to guard against roundoff error producing values slightly
outside the expected range.  Per report from bug@zedware.org.
2005-06-14 14:21:16 +00:00
Tom Lane a31ad27fc5 Simplify the planner's join clause management by storing join clauses
of a relation in a flat 'joininfo' list.  The former arrangement grouped
the join clauses according to the set of unjoined relids used in each;
however, profiling on test cases involving lots of joins proves that
that data structure is a net loss.  It takes more time to group the
join clauses together than is saved by avoiding duplicate tests later.
It doesn't help any that there are usually not more than one or two
clauses per group ...
2005-06-09 04:19:00 +00:00
Tom Lane e3a33a9a9f Marginal hack to avoid spending a lot of time in find_join_rel during
large planning problems: when the list of join rels gets too long, make
an auxiliary hash table that hashes on the identifying Bitmapset.
2005-06-08 23:02:05 +00:00
Tom Lane 9ab4d98168 Remove planner's private fields from Query struct, and put them into
a new PlannerInfo struct, which is passed around instead of the bare
Query in all the planning code.  This commit is essentially just a
code-beautification exercise, but it does open the door to making
larger changes to the planner data structures without having to muck
with the widely-known Query struct.
2005-06-05 22:32:58 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane dd29fc2f61 Fix another place broken by new List implementation :-(. Per example
from goranpop@nspoint.net.  I think this escaped notice because in
simple cases the list is NIL on entry.
2004-12-15 21:13:34 +00:00
Bruce Momjian a5d7ba773d Adjust comments previously moved to column 1 by pgident. 2004-10-07 15:21:58 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane 921d749bd4 Adjust our timezone library to use pg_time_t (typedef'd as int64) in
place of time_t, as per prior discussion.  The behavior does not change
on machines without a 64-bit-int type, but on machines with one, which
is most, we are rid of the bizarre boundary behavior at the edges of
the 32-bit-time_t range (1901 and 2038).  The system will now treat
times over the full supported timestamp range as being in your local
time zone.  It may seem a little bizarre to consider that times in
4000 BC are PST or EST, but this is surely at least as reasonable as
propagating Gregorian calendar rules back that far.

I did not modify the format of the zic timezone database files, which
means that for the moment the system will not know about daylight-savings
periods outside the range 1901-2038.  Given the way the files are set up,
it's not a simple decision like 'widen to 64 bits'; we have to actually
think about the range of years that need to be supported.  We should
probably inquire what the plans of the upstream zic people are before
making any decisions of our own.
2004-06-03 02:08:07 +00:00
Neil Conway 72b6ad6313 Use the new List API function names throughout the backend, and disable the
list compatibility API by default. While doing this, I decided to keep
the llast() macro around and introduce llast_int() and llast_oid() variants.
2004-05-30 23:40:41 +00:00
Neil Conway d0b4399d81 Reimplement the linked list data structure used throughout the backend.
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.

The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
2004-05-26 04:41:50 +00:00
Tom Lane 63bd0db121 Integrate src/timezone library for all platforms. There is more we can
and should do now that we control our own destiny for timezone handling,
but this commit gets the bulk of the picayune diffs in place.
Magnus Hagander and Tom Lane.
2004-05-21 05:08:06 +00:00
Tom Lane 3969f2924b Revise GEQO planner to make use of some heuristic knowledge about SQL, namely
that it's good to join where there are join clauses rather than where there
are not.  Also enable it to generate bushy plans at need, so that it doesn't
fail in the presence of multiple IN clauses containing sub-joins.  These
changes appear to improve the behavior enough that we can substantially reduce
the default pool size and generations count, thereby decreasing the runtime,
and yet get as good or better plans as we were getting in 7.4.  Consequently,
adjust the default GEQO parameters.  I also modified the way geqo_effort is
used so that it affects both population size and number of generations;
it's now useful as a single control to adjust the GEQO runtime-vs-plan-quality
tradeoff.  Bump geqo_threshold to 12, since even with these changes GEQO
seems to be slower than the regular planner at 11 relations.
2004-01-23 23:54:21 +00:00
Tom Lane 672a807028 Repair error apparently introduced in the initial coding of GUC: the
default value for geqo_effort is supposed to be 40, not 1.  The actual
'genetic' component of the GEQO algorithm has been practically disabled
since 7.1 because of this mistake.  Improve documentation while at it.
2004-01-21 23:33:34 +00:00
PostgreSQL Daemon 55b113257c make sure the $Id tags are converted to $PostgreSQL as well ... 2003-11-29 22:41:33 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Tom Lane 48beecda7c Remove geqo_random_seed parameter. Having geqo reset the global random()
sequence every time it's called is bogus --- it interferes with user
control over the seed, and actually decreases randomness overall
(because a seed based on time(NULL) is pretty predictable).  If you really
want a reproducible result from geqo, do 'set seed = 0' before planning
a query.
2003-09-07 15:26:54 +00:00
Tom Lane fcb90fdc95 Change some frequently-reached elog(DEBUG...) calls to ereport(DEBUG...)
for speed reasons.  (ereport falls out much more quickly when no output
is needed than elog does.)
2003-08-12 18:23:21 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane 45708f5ebc Error message editing in backend/optimizer, backend/rewrite. 2003-07-25 00:01:09 +00:00
Bruce Momjian 98b6f37e47 Make debug_ GUC varables output DEBUG1 rather than LOG, and mention in
docs that CLIENT/LOG_MIN_MESSAGES now controls debug_* output location.
Doc changes included.
2003-05-27 17:49:47 +00:00
Tom Lane de28dc9a04 Portal and memory management infrastructure for extended query protocol.
Both plannable queries and utility commands are now always executed
within Portals, which have been revamped so that they can handle the
load (they used to be good only for single SELECT queries).  Restructure
code to push command-completion-tag selection logic out of postgres.c,
so that it won't have to be duplicated between simple and extended queries.
initdb forced due to addition of a field to Query nodes.
2003-05-02 20:54:36 +00:00
Tom Lane bdfbfde1b1 IN clauses appearing at top level of WHERE can now be handled as joins.
There are two implementation techniques: the executor understands a new
JOIN_IN jointype, which emits at most one matching row per left-hand row,
or the result of the IN's sub-select can be fed through a DISTINCT filter
and then joined as an ordinary relation.
Along the way, some minor code cleanup in the optimizer; notably, break
out most of the jointree-rearrangement preprocessing in planner.c and
put it in a new file prep/prepjointree.c.
2003-01-20 18:55:07 +00:00
Tom Lane 9f76d0d926 Fix GEQO to work again in CVS tip, by being more careful about memory
allocation in best_inner_indexscan().  While at it, simplify GEQO's
interface to the main planner --- make_join_rel() offers exactly the
API it really wants, whereas calling make_rels_by_clause_joins() and
make_rels_by_clauseless_joins() required jumping through hoops.
Rewrite gimme_tree for clarity (sometimes iteration is much better than
recursion), and approximately halve GEQO's runtime by recognizing that
tours of the forms (a,b,c,d,...) and (b,a,c,d,...) are equivalent
because of symmetry in make_join_rel().
2002-12-16 21:30:30 +00:00
Tom Lane f6dba10e62 First phase of implementing hash-based grouping/aggregation. An AGG plan
node now does its own grouping of the input rows, and has no need for a
preceding GROUP node in the plan pipeline.  This allows elimination of
the misnamed tuplePerGroup option for GROUP, and actually saves more code
in nodeGroup.c than it costs in nodeAgg.c, as well as being presumably
faster.  Restructure the API of query_planner so that we do not commit to
using a sorted or unsorted plan in query_planner; instead grouping_planner
makes the decision.  (Right now it isn't any smarter than query_planner
was, but that will change as soon as it has the option to select a hash-
based aggregation step.)  Despite all the hackery, no initdb needed since
only in-memory node types changed.
2002-11-06 00:00:45 +00:00
Tom Lane 52c9d25933 Be careful to include postgres.h *before* any system headers, to ensure
that the right flavors of largefile-related definitions are seen.
Most of these changes are probably unnecessary, but better safe than
sorry.
2002-09-05 00:43:07 +00:00
Bruce Momjian e50f52a074 pgindent run. 2002-09-04 20:31:48 +00:00
Bruce Momjian 38dd3ae7d0 The attached patch fixes a build problem with GEQO when using the
PX recombination operator, changes some elog() messages from LOG
to DEBUG1, puts some debugging functions inside the appropriate
#ifdef (not enabled by default), and makes a few other minor
cleanups.

BTW, the elog() change is motivated by at least one user who
has sent a concerned email to -general asking exactly what the
"ERX recombination operator" is, and what it is doing to their
DBMS.

Neil Conway
2002-07-20 04:59:10 +00:00
Bruce Momjian d84fe82230 Update copyright to 2002. 2002-06-20 20:29:54 +00:00
Bruce Momjian a033daf566 Commit to match discussed elog() changes. Only update is that LOG is
now just below FATAL in server_min_messages.  Added more text to
highlight ordering difference between it and client_min_messages.

---------------------------------------------------------------------------

REALLYFATAL => PANIC
STOP => PANIC
New INFO level the prints to client by default
New LOG level the prints to server log by default
Cause VACUUM information to print only to the client
NOTICE => INFO where purely information messages are sent
DEBUG => LOG for purely server status messages
DEBUG removed, kept as backward compatible
DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1 added
DebugLvl removed in favor of new DEBUG[1-5] symbols
New server_min_messages GUC parameter with values:
        DEBUG[5-1], INFO, NOTICE, ERROR, LOG, FATAL, PANIC
New client_min_messages GUC parameter with values:
        DEBUG[5-1], LOG, INFO, NOTICE, ERROR, FATAL, PANIC
Server startup now logged with LOG instead of DEBUG
Remove debug_level GUC parameter
elog() numbers now start at 10
Add test to print error message if older elog() values are passed to elog()
Bootstrap mode now has a -d that requires an argument, like postmaster
2002-03-02 21:39:36 +00:00
Bruce Momjian b81844b173 pgindent run on all C files. Java run to follow. initdb/regression
tests pass.
2001-10-25 05:50:21 +00:00
Peter Eisentraut 12c1552066 Mark many strings in backend not covered by elog for translation. Also,
make strings in xlog.c look more like English and less like binary noise.
2001-06-03 14:53:56 +00:00
Bruce Momjian 857abb0e57 Add newlines around debug output in optimizer showing total costs. 2001-05-08 17:25:28 +00:00
Bruce Momjian 9e1552607a pgindent run. Make it all clean. 2001-03-22 04:01:46 +00:00
Bruce Momjian 623bf843d2 Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group. 2001-01-24 19:43:33 +00:00
Bruce Momjian 5088f0748a Change lcons(x, NIL) to makeList(x) where appropriate. 2001-01-17 17:26:45 +00:00