Commit Graph

160 Commits

Author SHA1 Message Date
Tom Lane 08ccdf020e Fix oversight in planning for multiple indexscans driven by
ScalarArrayOpExpr index quals: we were estimating the right total
number of rows returned, but treating the index-access part of the
cost as if a single scan were fetching that many consecutive index
tuples.  Actually we should treat it as a multiple indexscan, and
if there are enough of 'em the Mackert-Lohman discount should kick in.
2006-07-01 22:07:23 +00:00
Tom Lane cffd89ca73 Revise the planner's handling of "pseudoconstant" WHERE clauses, that is
clauses containing no variables and no volatile functions.  Such a clause
can be used as a one-time qual in a gating Result plan node, to suppress
plan execution entirely when it is false.  Even when the clause is true,
putting it in a gating node wins by avoiding repeated evaluation of the
clause.  In previous PG releases, query_planner() would do this for
pseudoconstant clauses appearing at the top level of the jointree, but
there was no ability to generate a gating Result deeper in the plan tree.
To fix it, get rid of the special case in query_planner(), and instead
process pseudoconstant clauses through the normal RestrictInfo qual
distribution mechanism.  When a pseudoconstant clause is found attached to
a path node in create_plan(), pull it out and generate a gating Result at
that point.  This requires special-casing pseudoconstants in selectivity
estimation and cost_qual_eval, but on the whole it's pretty clean.
It probably even makes the planner a bit faster than before for the normal
case of no pseudoconstants, since removing pull_constant_clauses saves one
useless traversal of the qual tree.  Per gripe from Phil Frost.
2006-07-01 18:38:33 +00:00
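
Editorial note: a minimal sketch, not the actual executor code, of the one-time-qual idea behind a gating Result node described above. The clause contains no variables, so it is evaluated exactly once; if it is false, the child plan is never run. All names here are hypothetical.

#include <stdbool.h>

/* Hypothetical state for a gating node's one-time qual. */
typedef struct GateState
{
    bool    qual_checked;   /* have we evaluated the one-time qual yet? */
    bool    qual_result;    /* result of that single evaluation */
} GateState;

static bool
gate_passes(GateState *state, bool (*eval_const_qual)(void))
{
    if (!state->qual_checked)
    {
        /* evaluate the variable-free clause exactly once */
        state->qual_result = eval_const_qual();
        state->qual_checked = true;
    }
    /* if false, the caller skips executing the child plan entirely */
    return state->qual_result;
}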
Tom Lane 8a30cc2127 Make the planner estimate costs for nestloop inner indexscans on the basis
that the Mackert-Lohman formula applies across all the repetitions of the
nestloop, not just each scan independently.  We use the M-L formula to
estimate the number of pages fetched from the index as well as from the table;
that isn't what it was designed for, but it seems reasonably applicable
anyway.  This makes large numbers of repetitions look much cheaper than
before, which accords with many reports we've received of overestimation
of the cost of a nestloop.  Also, change the index access cost model to
charge random_page_cost per index leaf page touched, while explicitly
not counting anything for access to metapage or upper tree pages.  This
may all need tweaking after we get some field experience, but in simple
tests it seems to be giving saner results than before.  The main thing
is to get the infrastructure in place to let cost_index() and amcostestimate
functions take repeated scans into account at all.  Per my recent proposal.

Note: this patch changes pg_proc.h, but I did not force initdb because
the changes are basically cosmetic --- the system does not look into
pg_proc to decide how to call an index amcostestimate function, and
there's no way to call such a function from SQL at all.
2006-06-06 17:59:58 +00:00
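
Editorial note: for reference, a hedged sketch of the Mackert-Lohman style page-fetch estimate this costing relies on. It follows the general shape of index_pages_fetched() but simplifies edge cases, so treat it as an approximation rather than the shipped code.

/*
 * Approximate number of distinct pages fetched when tuples_fetched tuples
 * are pulled (possibly repeatedly) from a relation of total_pages pages,
 * given cache_pages of buffer space.  Simplified Mackert-Lohman estimate.
 */
static double
estimate_pages_fetched(double tuples_fetched, double total_pages,
                       double cache_pages)
{
    double      T = (total_pages > 1) ? total_pages : 1.0;
    double      b = (cache_pages > 1) ? cache_pages : 1.0;
    double      pages;

    if (T <= b)
    {
        /* whole relation fits in cache: fetches saturate at T pages */
        pages = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
        if (pages > T)
            pages = T;
    }
    else
    {
        double      lim = (2.0 * T * b) / (2.0 * T - b);

        if (tuples_fetched <= lim)
            pages = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
        else
            pages = b + (tuples_fetched - lim) * (T - b) / T;
    }
    return pages;
}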
Tom Lane 7868590c61 While making the seq_page_cost changes, I was struck by the fact that
cost_nonsequential_access() is really totally inappropriate for its only
remaining use, namely estimating I/O costs in cost_sort().  The routine
was designed on the assumption that disk caching might eliminate the need
for some re-reads on a random basis, but there's nothing very random in
that sense about sort's access pattern --- it'll always be picking up the
oldest outputs.  If we had a good fix on the effective cache size we
might consider charging zero for I/O unless the sort temp file size
exceeds it, but that's probably putting much too much faith in the
parameter.  Instead just drop the logic in favor of a fixed compromise
between seq_page_cost and random_page_cost per page of sort I/O.
2006-06-05 20:56:33 +00:00
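
Editorial note: an illustration of what a "fixed compromise" per page of sort I/O could look like; the weights here are illustrative, not necessarily the exact ones adopted.

/*
 * Sketch: charge each page of sort temp-file I/O a blend of sequential
 * and random page costs, assuming most accesses are sequential.
 */
static double
sort_io_cost_per_page(double seq_page_cost, double random_page_cost)
{
    return 0.75 * seq_page_cost + 0.25 * random_page_cost;
}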
Tom Lane eed6c9ed7e Add a GUC parameter seq_page_cost, and use that everywhere we formerly
assumed that a sequential page fetch has cost 1.0.  This patch doesn't
in itself change the system's behavior at all, but it opens the door to
people adopting other units of measurement for EXPLAIN costs.  Also, if
we ever decide it's worth inventing per-tablespace access cost settings,
this change provides a workable intellectual framework for that.
2006-06-05 02:49:58 +00:00
Bruce Momjian f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Tom Lane df700e6b40 Improve tuplesort.c to support variable merge order. The original coding
with fixed merge order (fixed number of "tapes") was based on obsolete
assumptions, namely that tape drives are expensive.  Since our "tapes"
are really just a couple of buffers, we can have a lot of them given
adequate workspace.  This allows reduction of the number of merge passes
with consequent savings of I/O during large sorts.

Simon Riggs with some rework by Tom Lane
2006-02-19 05:54:06 +00:00
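
Editorial note: a rough way to see why more "tapes" helps is that with M-way merges the number of passes over N initial runs shrinks roughly logarithmically in M. A hedged back-of-the-envelope sketch, ignoring polyphase-merge refinements:

#include <math.h>

/*
 * Rough estimate of merge passes for nruns initial runs merged
 * merge_order runs at a time.  Intended only to show why a larger
 * merge order saves I/O passes on large sorts.
 */
static int
estimate_merge_passes(double nruns, int merge_order)
{
    if (nruns <= 1.0 || merge_order < 2)
        return 0;
    return (int) ceil(log(nruns) / log((double) merge_order));
}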
Tom Lane 336a6491aa Improve my initial, rather hacky implementation of joins to append
relations: fix the executor so that we can have an Append plan on the
inside of a nestloop and still pass down outer index keys to index scans
within the Append, then generate such plans as if they were regular
inner indexscans.  This avoids the need to evaluate the outer relation
multiple times.
2006-02-05 02:59:17 +00:00
Tom Lane 6e07709760 Implement SQL-compliant treatment of row comparisons for < <= > >= cases
(previously we only did = and <> correctly).  Also, allow row comparisons
with any operators that are in btree opclasses, not only those with these
specific names.  This gets rid of a whole lot of indefensible assumptions
about the behavior of particular operators based on their names ... though
it's still true that IN and NOT IN expand to "= ANY".  The patch adds a
RowCompareExpr expression node type, and makes some changes in the
representation of ANY/ALL/ROWCOMPARE SubLinks so that they can share code
with RowCompareExpr.

I have not yet done anything about making RowCompareExpr an indexable
operator, but will look at that soon.

initdb forced due to changes in stored rules.
2005-12-28 01:30:02 +00:00
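
Editorial note: the SQL-spec semantics being implemented are lexicographic, not pairwise, comparison. A hypothetical two-column illustration (non-null integers only):

/*
 * Sketch of SQL row-comparison semantics for a two-column row:
 * (a, b) < (c, d)  means  a < c OR (a = c AND b < d),
 * i.e. lexicographic ordering using btree-compatible operators.
 */
static bool
row_less_than(int a, int b, int c, int d)
{
    if (a != c)
        return a < c;
    return b < d;
}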
Tom Lane da27c0a1ef Teach tid-scan code to make use of "ctid = ANY (array)" clauses, so that
"ctid IN (list)" will still work after we convert IN to ScalarArrayOpExpr.
Make some minor efficiency improvements while at it, such as ensuring that
multiple TIDs are fetched in physical heap order.  And fix EXPLAIN so that
it shows what's really going on for a TID scan.
2005-11-26 22:14:57 +00:00
Bruce Momjian 436a2956d8 Re-run pgindent, fixing a problem where comment lines after a blank
comment line were output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:17:34 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane e011459029 Make set_function_size_estimates() marginally smarter: per original
comment, it can at least test whether the expression returns set.
2005-10-05 17:19:19 +00:00
Tom Lane 974e3cf30a cost_agg really ought to charge something per output tuple; else there
are cases where it appears to have zero run cost.
2005-08-27 22:37:00 +00:00
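
Editorial note: a hedged sketch of the kind of per-output-tuple charge meant here; the exact form and constant in cost_agg are not shown.

/*
 * Sketch: ensure aggregation shows a nonzero run cost by charging
 * something for each group/row it emits, on top of per-input work.
 */
static double
agg_output_cost(double output_tuples, double cpu_tuple_cost)
{
    return output_tuples * cpu_tuple_cost;
}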
Tom Lane 9ab4d98168 Remove planner's private fields from Query struct, and put them into
a new PlannerInfo struct, which is passed around instead of the bare
Query in all the planning code.  This commit is essentially just a
code-beautification exercise, but it does open the door to making
larger changes to the planner data structures without having to muck
with the widely-known Query struct.
2005-06-05 22:32:58 +00:00
Tom Lane bc843d3960 First cut at planner support for bitmap index scans. Lots to do yet,
but the code is basically working.  Along the way, rewrite the entire
approach to processing OR index conditions, and make it work in join
cases for the first time ever.  orindxpath.c is now basically obsolete,
but I left it in for the time being to allow easy comparison testing
against the old implementation.
2005-04-22 21:58:32 +00:00
Tom Lane 14c7fba3f7 Rethink original decision to use AND/OR Expr nodes to represent bitmap
logic operations during planning.  Seems cleaner to create two new Path
node types, instead --- this avoids duplication of cost-estimation code.
Also, create an enable_bitmapscan GUC parameter to control use of bitmap
plans.
2005-04-21 19:18:13 +00:00
Tom Lane e6f7edb9d5 Install some slightly realistic cost estimation for bitmap index scans. 2005-04-21 02:28:02 +00:00
Tom Lane 4a8c5d0375 Create executor and planner-backend support for decoupled heap and index
scans, using in-memory tuple ID bitmaps as the intermediary.  The planner
frontend (path creation and cost estimation) is not there yet, so none
of this code can be executed.  I have tested it using some hacked planner
code that is far too ugly to see the light of day, however.  Committing
now so that the bulk of the infrastructure changes go in before the tree
drifts under me.
2005-04-19 22:35:18 +00:00
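
Editorial note: the decoupling described above boils down to a two-phase pattern. A simplified, hypothetical sketch of the control flow (a page-level bitmap, not the executor's actual TID bitmap):

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal "TID bitmap": one flag per heap page. */
typedef struct
{
    bool   *page_hit;       /* page_hit[blkno] == true if any match there */
    size_t  npages;
} PageBitmap;

/* Phase 1: the index scan marks pages containing matching tuples. */
static void
bitmap_mark(PageBitmap *bm, size_t blkno)
{
    if (blkno < bm->npages)
        bm->page_hit[blkno] = true;
}

/* Phase 2: visit heap pages in physical order, each at most once. */
static void
bitmap_heap_visit(const PageBitmap *bm, void (*visit_page)(size_t blkno))
{
    for (size_t blkno = 0; blkno < bm->npages; blkno++)
        if (bm->page_hit[blkno])
            visit_page(blkno);
}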
Tom Lane 280de290d7 In cost_mergejoin, the early-exit effect should not apply to the
outer side of an outer join.  Per andrew@supernews.
2005-04-04 01:43:12 +00:00
Tom Lane bf3dbb5881 First steps towards index scans with heap access decoupled from index
access: define new index access method functions 'amgetmulti' that can
fetch multiple TIDs per call.  (The functions exist but are totally
untested as yet.)  Since I was modifying pg_am anyway, remove the
no-longer-needed 'rel' parameter from amcostestimate functions, and
also remove the vestigial amowner column that was creating useless
work for Alvaro's shared-object-dependencies project.
Initdb forced due to changes in pg_am.
2005-03-27 23:53:05 +00:00
Tom Lane 926e8a00d3 Add a back-link from IndexOptInfo structs to their parent RelOptInfo
structs.  There are many places in the planner where we were passing
both a rel and an index to subroutines, and now need only pass the
index struct.  Notationally simpler, and perhaps a tad faster.
2005-03-27 06:29:49 +00:00
Tom Lane 849074f9ae Revise hash join code so that we can increase the number of batches
on-the-fly, and thereby avoid blowing out memory when the planner has
underestimated the hash table size.  Hash join will now obey the
work_mem limit with some faithfulness.  Per my recent proposal
(hash aggregate part isn't done yet though).
2005-03-06 22:15:05 +00:00
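
Editorial note: a minimal sketch of the "increase batches on the fly" idea with assumed names; the real code also repartitions the in-memory hash table when the batch count grows.

/*
 * Sketch: when the in-memory hash table grows past work_mem, double the
 * number of batches so roughly half the tuples spill to temp files.
 * Batch assignment uses extra hash bits, so doubling keeps it stable.
 */
static int
maybe_grow_batches(double hashtable_bytes, double work_mem_bytes, int nbatch)
{
    while (work_mem_bytes > 0 && hashtable_bytes > work_mem_bytes)
    {
        nbatch *= 2;                 /* more batches => smaller batches */
        hashtable_bytes /= 2.0;      /* about half the tuples get written out */
    }
    return nbatch;
}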
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' before and after the change, to make sure that I
only picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane 4e91824b94 Make some adjustments to reduce platform dependencies in plan selection.
In particular, there was a mathematical tie between the two possible
nestloop-with-materialized-inner-scan plans for a join (ie, we computed
the same cost with either input on the inside), resulting in a roundoff
error driven choice, if the relations were both small enough to fit in
sort_mem.  Add a small cost factor to ensure we prefer materializing the
smaller input.  This changes several regression test plans, but with any
luck we will now have more stability across platforms.
2004-12-02 01:34:18 +00:00
Tom Lane 529db99c6e Avoid overflow in cost_sort when work_mem exceeds 1Gb. 2004-10-23 00:05:27 +00:00
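
Editorial note: the general pattern for this kind of fix, hedged and not the literal patch, is to do the byte-size arithmetic in double rather than in a 32-bit integer type.

/*
 * Sketch: computing the memory limit in bytes as a double avoids
 * overflowing a 32-bit integer when work_mem (in kilobytes) is large.
 */
static double
work_mem_bytes(int work_mem_kb)
{
    return (double) work_mem_kb * 1024.0;
}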
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane fcbc438727 Label CVS tip as 8.0devel instead of 7.5devel. Adjust various comments
and documentation to reference 8.0 instead of 7.5.
2004-08-04 21:34:35 +00:00
Tom Lane 3485cc3a7c Adjust cost_nonsequential_access() to have more reasonable behavior
when random_page_cost has a small value.  Per Manfred Koizar, though
I didn't use his equation exactly.
2004-06-10 21:02:00 +00:00
Tom Lane ae93e5fd6e Make the world very nearly safe for composite-type columns in tables.
1. Solve the problem of not having TOAST references hiding inside composite
values by establishing the rule that toasting only goes one level deep:
a tuple can contain toasted fields, but a composite-type datum that is
to be inserted into a tuple cannot.  Enforcing this in heap_formtuple
is relatively cheap and it avoids a large increase in the cost of running
the tuptoaster during final storage of a row.
2. Fix some interesting problems in expansion of inherited queries that
reference whole-row variables.  We never really did this correctly before,
but it's now relatively painless to solve by expanding the parent's
whole-row Var into a RowExpr() selecting the proper columns from the
child.
If you dike out the preventive check in CheckAttributeType(),
composite-type columns now seem to actually work.  However, we surely
cannot ship them like this --- without I/O for composite types, you
can't get pg_dump to dump tables containing them.  So a little more
work still to do.
2004-06-05 01:55:05 +00:00
Tom Lane 80c6847cc5 Desultory de-FastList-ification. RelOptInfo.reltargetlist is back to
being a plain List.
2004-06-01 03:03:05 +00:00
Neil Conway 72b6ad6313 Use the new List API function names throughout the backend, and disable the
list compatibility API by default. While doing this, I decided to keep
the llast() macro around and introduce llast_int() and llast_oid() variants.
2004-05-30 23:40:41 +00:00
Neil Conway d0b4399d81 Reimplement the linked list data structure used throughout the backend.
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.

The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
2004-05-26 04:41:50 +00:00
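
Editorial note: a hedged sketch of the resulting data layout, close in spirit to pg_list.h of that era but simplified here (the real cells also carry int and Oid variants).

/* Simplified sketch: the header carries length and tail for O(1) append. */
typedef struct ListCellSketch
{
    void                   *ptr_value;      /* payload (pointer-list case) */
    struct ListCellSketch  *next;
} ListCellSketch;

typedef struct ListSketch
{
    int              length;    /* maintained count: length() is O(1) */
    ListCellSketch  *head;
    ListCellSketch  *tail;      /* lappend() links new cells here in O(1) */
} ListSketch;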
Tom Lane e5170860ee Support FULL JOIN with no join clauses, such as X FULL JOIN Y ON TRUE.
That particular corner case is not exactly compelling, but given 7.4's
ability to discard redundant join clauses, it is possible for the situation
to arise from queries that are not so obviously silly.  Per bug report
of 6-Apr-04.
2004-04-06 18:46:03 +00:00
Tom Lane a536ed53bc Make use of statistics on index expressions. There are still some
corner cases that could stand improvement, but it does all the basic
stuff.  A byproduct is that the selectivity routines are no longer
constrained to working on simple Vars; we might in future be able to
improve the behavior for subexpressions that don't match indexes.
2004-02-17 00:52:53 +00:00
Tom Lane 391c3811a2 Rename SortMem and VacuumMem to work_mem and maintenance_work_mem.
Make btree index creation and initial validation of foreign-key constraints
use maintenance_work_mem rather than work_mem as their memory limit.
Add some code to guc.c to allow these variables to be referenced by their
old names in SHOW and SET commands, for backwards compatibility.
2004-02-03 17:34:04 +00:00
Tom Lane 0ee53b5c33 Don't return an overoptimistic result from join_in_selectivity when
we have detected that an IN subquery must return unique results.
2004-01-19 03:52:28 +00:00
Tom Lane b0c4a50bbb Instead of rechecking lossy index operators by putting them into the
regular qpqual ('filter condition'), add special-purpose code to
nodeIndexscan.c to recheck them.  This ends up being almost no net addition
of code, because the removal of planner code balances out the extra
executor code, but it is significantly more efficient when a lossy
operator is involved in an OR indexscan.  The old implementation had
to recheck the entire indexqual in such cases.
2004-01-06 04:31:01 +00:00
Tom Lane fa559a86ee Adjust indexscan planning logic to keep RestrictInfo nodes associated
with index qual clauses in the Path representation.  This saves a little
work during createplan and (probably more importantly) allows reuse of
cached selectivity estimates during indexscan planning.  Also fix latent
bug: wrong plan would have been generated for a 'special operator' used
in a nestloop-inner-indexscan join qual, because the special operator
would not have gotten into the list of quals to recheck.  This bug is
only latent because at present the special-operator code could never
trigger on a join qual, but sooner or later someone will want to do it.
2004-01-05 23:39:54 +00:00
Tom Lane 9091e8d1b2 Add the ability to extract OR indexscan conditions from OR-of-AND
join conditions in which each OR subclause includes a constraint on
the same relation.  This implements the other useful side-effect of
conversion to CNF format, without its unpleasant side-effects.  As
per pghackers discussion of a few weeks ago.
2004-01-05 05:07:36 +00:00
Tom Lane 82b4dd394f Merge restrictlist_selectivity into clauselist_selectivity by
teaching the latter to accept either RestrictInfo nodes or bare
clause expressions; and cache the selectivity result in the RestrictInfo
node when possible.  This extends the caching behavior of approx_selectivity
to many more contexts, and should reduce duplicate selectivity
calculations.
2004-01-04 03:51:52 +00:00
Bruce Momjian ed96bfde18 Here is the definition of relation_byte_size() in optimizer/path/costsize.c:
----------------------------------------------------------------------
/*
 * relation_byte_size
 *        Estimate the storage space in bytes for a given number of tuples
 *        of a given width (size in bytes).
 */
static double
relation_byte_size(double tuples, int width)
{
        return tuples * (MAXALIGN(width) + MAXALIGN(sizeof(HeapTupleData)));
}

----------------------------------------------------------------------

Shouldn't this be HeapTupleHeaderData and not HeapTupleData ?

(Of course, from a costing perspective these shouldn't be very different but ...)

Sailesh Krishnamurthy
2003-12-18 03:46:45 +00:00
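
Editorial note: for what it's worth, the change being suggested in that message would look roughly like this; a sketch only, and whether or when it was applied is not shown here.

/* Sketch: count per-tuple overhead as the on-disk tuple header,
 * not the in-memory HeapTupleData struct. */
static double
relation_byte_size_suggested(double tuples, int width)
{
    return tuples * (MAXALIGN(width) + MAXALIGN(sizeof(HeapTupleHeaderData)));
}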
Tom Lane 7f8f7665fc Planner failed to be smart about binary-compatible expressions in pathkeys
and hash bucket-size estimation.  Issue has been there awhile but is more
critical in 7.4 because it affects varchar columns.  Per report from
Greg Stark.
2003-12-03 17:45:10 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Tom Lane a1dcd8f6dd Add a little more smarts to estimate_hash_bucketsize(): if there's no
statistics, but there is a unique index on the column, we can safely
assume it's well-distributed.
2003-10-05 22:44:25 +00:00
Bruce Momjian 46785776c4 Another pgindent run with updated typedefs. 2003-08-08 21:42:59 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane 45708f5ebc Error message editing in backend/optimizer, backend/rewrite. 2003-07-25 00:01:09 +00:00