Commit Graph

96 Commits

Author SHA1 Message Date
Bruce Momjian 0e550ff617 Revert DTrace patch from Robert Lor 2009-04-02 20:59:10 +00:00
Bruce Momjian 227f817c1f Add support for additional DTrace probes.
Robert Lor
2009-04-02 19:14:34 +00:00
Bruce Momjian 511db38ace Update copyright for 2009. 2009-01-01 17:24:05 +00:00
Tom Lane 118461114e Performance fix for new anti-join code in nodeMergejoin.c: after finding a
match in antijoin mode, we should advance to next outer tuple not next inner.
We know we don't want to return this outer tuple, and there is no point in
advancing over matching inner tuples now, because we'd just have to do it
again if the next outer tuple has the same merge key.  This makes a noticeable
difference if there are lots of duplicate keys in both inputs.

Similarly, after finding a match in semijoin mode, arrange to advance to
the next outer tuple after returning the current match; or immediately,
if it fails the extra quals.  The rationale is the same.  (This is a
performance bug in existing releases; perhaps worth back-patching?  The
planner tries to avoid using mergejoin with lots of duplicates, so it may
not be a big issue in practice.)

Nestloop and hash got this right to start with, but I made some cosmetic
adjustments there to make the corresponding bits of logic look more similar.
2008-08-15 19:20:42 +00:00
Tom Lane e006a24ad1 Implement SEMI and ANTI joins in the planner and executor. (Semijoins replace
the old JOIN_IN code, but antijoins are new functionality.)  Teach the planner
to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
joins respectively.  Also, LEFT JOINs with suitable upper-level IS NULL
filters are recognized as being anti joins.  Unify the InClauseInfo and
OuterJoinInfo infrastructure into "SpecialJoinInfo".  With that change,
it becomes possible to associate a SpecialJoinInfo with every join attempt,
which permits some cleanup of join selectivity estimation.  That needs to be
taken much further than this patch does, but the next step is to change the
API for oprjoin selectivity functions, which seems like material for a
separate patch.  So for the moment the output size estimates for semi and
especially anti joins are quite bogus.
2008-08-14 18:48:00 +00:00
Tom Lane 226837e57e Since createplan.c no longer cares whether index operators are lossy, it has
no particular need to do get_op_opfamily_properties() while building an
indexscan plan.  Postpone that lookup until executor start.  This simplifies
createplan.c a lot more than it complicates nodeIndexscan.c, and makes things
more uniform since we already had to do it that way for RowCompare
expressions.  Should be a bit faster too, at least for plans that aren't
re-used many times, since we avoid palloc'ing and perhaps copying the
intermediate list data structure.
2008-04-13 20:51:21 +00:00
Bruce Momjian 9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane 2415ad9831 Teach tuplestore.c to throw away data before the "mark" point when the caller
is using mark/restore but not rewind or backward-scan capability.  Insert a
materialize plan node between a mergejoin and its inner child if the inner
child is a sort that is expected to spill to disk.  The materialize shields
the sort from the need to do mark/restore and thereby allows it to perform
its final merge pass on-the-fly; while the materialize itself is normally
cheap since it won't spill to disk unless the number of tuples with equal
key values exceeds work_mem.

Greg Stark, with some kibitzing from Tom Lane.
2007-05-21 17:57:35 +00:00
Tom Lane 5413eef8dc Repair failure to check that a table is still compatible with a previously
made query plan.  Use of ALTER COLUMN TYPE creates a hazard for cached
query plans: they could contain Vars that claim a column has a different
type than it now has.  Fix this by checking during plan startup that Vars
at relation scan level match the current relation tuple descriptor.  Since
at that point we already have at least AccessShareLock, we can be sure the
column type will not change underneath us later in the query.  However,
since a backend's locks do not conflict against itself, there is still a
hole for an attacker to exploit: he could try to execute ALTER COLUMN TYPE
while a query is in progress in the current backend.  Seal that hole by
rejecting ALTER TABLE whenever the target relation is already open in
the current backend.

This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see.  Our thanks to Jeff Trout for the initial report.

Security: CVE-2007-0556
2007-02-02 00:07:03 +00:00
Tom Lane ad429fe314 Teach nodeMergejoin how to handle DESC and/or NULLS FIRST sort orders.
So far only tested by hacking the planner ...
2007-01-11 17:19:13 +00:00
Tom Lane a191a169d6 Change the planner-to-executor API so that the planner tells the executor
which comparison operators to use for plan nodes involving tuple comparison
(Agg, Group, Unique, SetOp).  Formerly the executor looked up the default
equality operator for the datatype, which was really pretty shaky, since it's
possible that the data being fed to the node is sorted according to some
nondefault operator class that could have an incompatible idea of equality.
The planner knows what it has sorted by and therefore can provide the right
equality operator to use.  Also, this change moves a couple of catalog lookups
out of the executor and into the planner, which should help startup time for
pre-planned queries by some small amount.  Modify the planner to remove some
other cavalier assumptions about always being able to use the default
operators.  Also add "nulls first/last" info to the Plan node for a mergejoin
--- neither the executor nor the planner can cope yet, but at least the API is
in place.
2007-01-10 18:06:05 +00:00
Bruce Momjian 29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane a78fcfb512 Restructure operator classes to allow improved handling of cross-data-type
cases.  Operator classes now exist within "operator families".  While most
families are equivalent to a single class, related classes can be grouped
into one family to represent the fact that they are semantically compatible.
Cross-type operators are now naturally adjunct parts of a family, without
having to wedge them into a particular opclass as we had done originally.

This commit restructures the catalogs and cleans up enough of the fallout so
that everything still works at least as well as before, but most of the work
needed to actually improve the planner's behavior will come later.  Also,
there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way
to create a new family right now is to allow CREATE OPERATOR CLASS to make
one by default.  I owe some more documentation work, too.  But that can all
be done in smaller pieces once this infrastructure is in place.
2006-12-23 00:43:13 +00:00
Bruce Momjian f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Bruce Momjian e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Tom Lane 06e10abc0b Fix problems with cached tuple descriptors disappearing while still in use
by creating a reference-count mechanism, similar to what we did a long time
ago for catcache entries.  The back branches have an ugly solution involving
lots of extra copies, but this way is more efficient.  Reference counting is
only applied to tupdescs that are actually in caches --- there seems no need
to use it for tupdescs that are generated in the executor, since they'll go
away during plan shutdown by virtue of being in the per-query memory context.
Neil Conway and Tom Lane
2006-06-16 18:42:24 +00:00
Tom Lane b3358e2642 Fix bug introduced into mergejoin logic by performance improvement patch of
2005-05-13.  When we find that a new inner tuple can't possibly match any
outer tuple (because it contains a NULL), we can't immediately skip the
tuple when we are in NEXTINNER state.  Doing so can lead to emitting
multiple copies of the tuple in FillInner mode, because we may rescan the
tuple after returning to a previous marked tuple.  Instead, proceed to
NEXTOUTER state the same as we used to do.  After we've found that there's
no need to return to the marked position, we can go to SKIPINNER_ADVANCE
state instead of SKIP_TEST when the inner tuple is unmatchable; this
preserves the performance improvement.  Per bug report from Bruce.
I also made a couple of cosmetic code rearrangements and added a regression
test for the problem.
2006-03-17 19:38:12 +00:00
Bruce Momjian f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Tom Lane 2c0ef9777c Extend the ExecInitNode API so that plan nodes receive a set of flag
bits indicating which optional capabilities can actually be exercised
at runtime.  This will allow Sort and Material nodes, and perhaps later
other nodes, to avoid unnecessary overhead in common cases.
This commit just adds the infrastructure and arranges to pass the correct
flag values down to plan nodes; none of the actual optimizations are here
yet.  I'm committing this separately in case anyone wants to measure the
added overhead.  (It should be negligible.)

Simon Riggs and Tom Lane
2006-02-28 04:10:28 +00:00
Bruce Momjian 436a2956d8 Re-run pgindent, fixing a problem where comment lines after a blank
comment line where output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:17:34 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane 2ef172a2a4 Fix latent bug in ExecSeqRestrPos: it leaves the plan node's result slot
in an inconsistent state.  (This is only latent because in reality
ExecSeqRestrPos is dead code at the moment ... but someday maybe it won't
be.)  Add some comments about what the API for plan node mark/restore
actually is, because it's not immediately obvious.
2005-05-15 21:19:55 +00:00
Tom Lane fabef3044a Minor refactoring to eliminate duplicate code and make startup a
tad faster.
2005-05-14 21:29:23 +00:00
Tom Lane 184e7a73a5 Revise nodeMergejoin in light of example provided by Guillaume Smet.
When one side of the join has a NULL, we don't want to uselessly try
to match it against every remaining tuple of the other side.  While
at it, rewrite the comparison machinery to avoid multiple evaluations
of the left and right input expressions and to use a btree comparator
where available, instead of double operator calls.  Also revise the
state machine to eliminate redundant comparisons and hopefully make it
more readable too.
2005-05-13 21:20:16 +00:00
Tom Lane 278bd0cc22 For some reason access/tupmacs.h has been #including utils/memutils.h,
which is neither needed by nor related to that header.  Remove the bogus
inclusion and instead include the header in those C files that actually
need it.  Also fix unnecessary inclusions and bad inclusion order in
tsearch2 files.
2005-05-06 17:24:55 +00:00
Tom Lane f97aebd162 Revise TupleTableSlot code to avoid unnecessary construction and disassembly
of tuples when passing data up through multiple plan nodes.  A slot can now
hold either a normal "physical" HeapTuple, or a "virtual" tuple consisting
of Datum/isnull arrays.  Upper plan levels can usually just copy the Datum
arrays, avoiding heap_formtuple() and possible subsequent nocachegetattr()
calls to extract the data again.  This work extends Atsushi Ogawa's earlier
patch, which provided the key idea of adding Datum arrays to TupleTableSlots.
(I believe however that something like this was foreseen way back in Berkeley
days --- see the old comment on ExecProject.)  A test case involving many
levels of join of fairly wide tables (about 80 columns altogether) showed
about 3x overall speedup, though simple queries will probably not be
helped very much.

I have also duplicated some code in heaptuple.c in order to provide versions
of heap_formtuple and friends that use "bool" arrays to indicate null
attributes, instead of the old convention of "char" arrays containing either
'n' or ' '.  This provides a better match to the convention used by
ExecEvalExpr.  While I have not made a concerted effort to get rid of uses
of the old routines, I think they should be deprecated and eventually removed.
2005-03-16 21:38:10 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Neil Conway 72b6ad6313 Use the new List API function names throughout the backend, and disable the
list compatibility API by default. While doing this, I decided to keep
the llast() macro around and introduce llast_int() and llast_oid() variants.
2004-05-30 23:40:41 +00:00
Neil Conway d0b4399d81 Reimplement the linked list data structure used throughout the backend.
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.

The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
2004-05-26 04:41:50 +00:00
Tom Lane c1352052ef Replace the switching function ExecEvalExpr() with a macro that jumps
directly to the appropriate per-node execution function, using a function
pointer stored by ExecInitExpr.  This speeds things up by eliminating one
level of function call.  The function-pointer technique also enables further
small improvements such as only making one-time tests once (and then
changing the function pointer).  Overall this seems to gain about 10%
on evaluation of simple expressions, which isn't earthshaking but seems
a worthwhile gain for a relatively small hack.  Per recent discussion
on pghackers.
2004-03-17 01:02:24 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Peter Eisentraut feb4f44d29 Message editing: remove gratuitous variations in message wording, standardize
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
2003-09-25 06:58:07 +00:00
Bruce Momjian 46785776c4 Another pgindent run with updated typedefs. 2003-08-08 21:42:59 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane 5e6d691e0d Error message editing in backend/executor. 2003-07-21 17:05:12 +00:00
Tom Lane 94a3c60324 Ditch ExecGetTupType() in favor of the much simpler ExecGetResultType(),
which does the same thing.  Perhaps at one time there was a reason to
allow plan nodes to store their result types in different places, but
AFAICT that's been unnecessary for a good while.
2003-05-05 17:57:47 +00:00
Tom Lane bdfbfde1b1 IN clauses appearing at top level of WHERE can now be handled as joins.
There are two implementation techniques: the executor understands a new
JOIN_IN jointype, which emits at most one matching row per left-hand row,
or the result of the IN's sub-select can be fed through a DISTINCT filter
and then joined as an ordinary relation.
Along the way, some minor code cleanup in the optimizer; notably, break
out most of the jointree-rearrangement preprocessing in planner.c and
put it in a new file prep/prepjointree.c.
2003-01-20 18:55:07 +00:00
Tom Lane 5bab36e9f6 Revise executor APIs so that all per-query state structure is built in
a per-query memory context created by CreateExecutorState --- and destroyed
by FreeExecutorState.  This provides a final solution to the longstanding
problem of memory leaked by various ExecEndNode calls.
2002-12-15 16:17:59 +00:00
Tom Lane 3a4f7dde16 Phase 3 of read-only-plans project: ExecInitExpr now builds expression
execution state trees, and ExecEvalExpr takes an expression state tree
not an expression plan tree.  The plan tree is now read-only as far as
the executor is concerned.  Next step is to begin actually exploiting
this property.
2002-12-13 19:46:01 +00:00
Tom Lane a0bf885f9e Phase 2 of read-only-plans project: restructure expression-tree nodes
so that all executable expression nodes inherit from a common supertype
Expr.  This is somewhat of an exercise in code purity rather than any
real functional advance, but getting rid of the extra Oper or Func node
formerly used in each operator or function call should provide at least
a little space and speed improvement.
initdb forced by changes in stored-rules representation.
2002-12-12 15:49:42 +00:00
Tom Lane 1fd0c59e25 Phase 1 of read-only-plans project: cause executor state nodes to point
to plan nodes, not vice-versa.  All executor state nodes now inherit from
struct PlanState.  Copying of plan trees has been simplified by not
storing a list of SubPlans in Plan nodes (eliminating duplicate links).
The executor still needs such a list, but it can build it during
ExecutorStart since it has to scan the plan tree anyway.
No initdb forced since no stored-on-disk structures changed, but you
will need a full recompile because of node-numbering changes.
2002-12-05 15:50:39 +00:00
Bruce Momjian e50f52a074 pgindent run. 2002-09-04 20:31:48 +00:00
Bruce Momjian d84fe82230 Update copyright to 2002. 2002-06-20 20:29:54 +00:00
Bruce Momjian 92288a1cf9 Change made to elog:
o  Change all current CVS messages of NOTICE to WARNING.  We were going
to do this just before 7.3 beta but it has to be done now, as you will
see below.

o Change current INFO messages that should be controlled by
client_min_messages to NOTICE.

o Force remaining INFO messages, like from EXPLAIN, VACUUM VERBOSE, etc.
to always go to the client.

o Remove INFO from the client_min_messages options and add NOTICE.

Seems we do need three non-ERROR elog levels to handle the various
behaviors we need for these messages.

Regression passed.
2002-03-06 06:10:59 +00:00
Tom Lane f8c109528c Teach planner about the idea that a mergejoin won't necessarily read
both input streams to the end.  If one variable's range is much less
than the other, an indexscan-based merge can win by not scanning all
of the other table.  Per example from Reinhard Max.
2002-03-01 04:09:28 +00:00
Bruce Momjian 6783b2372e Another pgindent run. Fixes enum indenting, and improves #endif
spacing.  Also adds space for one-line comments.
2001-10-28 06:26:15 +00:00