Commit Graph

277 Commits

Author SHA1 Message Date
Tom Lane
63cc56de54 Suppress subquery pullup and pushdown when the subquery has any
set-returning functions in its target list.  This ensures that we
won't rewrite the query in a way that places set-returning functions
into quals (WHERE clauses).  Cf. bug reports from Joe Conway.
2001-12-10 22:54:12 +00:00
Tom Lane
c5c97318f9 In find_mergeclauses_for_pathkeys, it's okay to return multiple merge
clauses per path key.  Indeed, we *must* do so or we will be unable to
form a valid plan for FULL JOIN with overlapping join conditions, eg
select * from a full join b on
a.v1 = b.v1 and a.v2 = b.v2 and a.v1 = b.v2.
2001-11-11 20:33:53 +00:00
Tom Lane
ad511a3ff3 sort_inner_and_outer needs a check to ensure that it's consumed all the
mergeclauses in RIGHT/FULL join cases, just like the other routines have.
I'm not quite sure why I thought it didn't need one --- but Nick
Fankhauser's recent bug report proves that it does.
2001-11-11 19:18:54 +00:00
Bruce Momjian
ea08e6cd55 New pgindent run with fixes suggested by Tom. Patch manually reviewed,
initdb/regression tests pass.
2001-11-05 17:46:40 +00:00
Bruce Momjian
c41b6b1b9c Fix small problem Tom Lane found with pgindent run. 2001-10-30 05:38:56 +00:00
Bruce Momjian
6783b2372e Another pgindent run. Fixes enum indenting, and improves #endif
spacing.  Also adds space for one-line comments.
2001-10-28 06:26:15 +00:00
Bruce Momjian
b81844b173 pgindent run on all C files. Java run to follow. initdb/regression
tests pass.
2001-10-25 05:50:21 +00:00
Tom Lane
6254465d06 Extend code that deduces implied equality clauses to detect whether a
clause being added to a particular restriction-clause list is redundant
with those already in the list.  This avoids useless work at runtime,
and (perhaps more importantly) keeps the selectivity estimation routines
from generating too-small estimates of numbers of output rows.
Also some minor improvements in OPTIMIZER_DEBUG displays.
2001-10-18 16:11:42 +00:00
Tom Lane
f933766ba7 Restructure pg_opclass, pg_amop, and pg_amproc per previous discussions in
pgsql-hackers.  pg_opclass now has a row for each opclass supported by each
index AM, not a row for each opclass name.  This allows pg_opclass to show
directly whether an AM supports an opclass, and furthermore makes it possible
to store additional information about an opclass that might be AM-dependent.
pg_opclass and pg_amop now store "lossy" and "haskeytype" information that we
previously expected the user to remember to provide in CREATE INDEX commands.
Lossiness is no longer an index-level property, but is associated with the
use of a particular operator in a particular index opclass.

Along the way, IndexSupportInitialize now uses the syscaches to retrieve
pg_amop and pg_amproc entries.  I find this reduces backend launch time by
about ten percent, at the cost of a couple more special cases in catcache.c's
IndexScanOK.

Initial work by Oleg Bartunov and Teodor Sigaev, further hacking by Tom Lane.

initdb forced.
2001-08-21 16:36:06 +00:00
Tom Lane
246793469e Modify partial-index-predicate applicability tester to test whether
clauses are equal(), before trying to match them up using btree opclass
inference rules.  This allows it to recognize many simple cases involving
non-btree operations, for example 'x IS NULL'.  Clean up code a little.
2001-08-06 18:09:45 +00:00
Tom Lane
421467cdc8 Fix optimizer to not try to push WHERE clauses down into a sub-SELECT that
has a DISTINCT ON clause, per bug report from Anthony Wood.  While at it,
improve the DISTINCT-ON-clause recognizer routine to not be fooled by out-
of-order DISTINCT lists.
2001-07-31 17:56:31 +00:00
Tom Lane
40db52af34 Do not push down quals into subqueries that have LIMIT/OFFSET clauses,
since the added qual could change the set of rows that get past the
LIMIT.  Per discussion on pgsql-sql 7/15/01.
2001-07-16 17:57:02 +00:00
Tom Lane
f31dc0ada7 Partial indexes work again, courtesy of Martijn van Oosterhout.
Note: I didn't force an initdb, figuring that one today was enough.
However, there is a new function in pg_proc.h, and pg_dump won't be
able to dump partial indexes until you add that function.
2001-07-16 05:07:00 +00:00
Tom Lane
4d58a7ca87 Optimizer can now estimate selectivity of IS NULL, IS NOT NULL,
IS TRUE, etc, with some degree of verisimilitude.  Split out
selectivity support functions from builtins.h into a new header
file selfuncs.h, so as to reduce the number of header files builtins.h
must depend on.  Fix a few missing inclusions exposed thereby.
From Joe Conway, with some kibitzing from Tom Lane.
2001-06-25 21:11:45 +00:00
Tom Lane
1f1ca182be Make inet/cidr << and <<= operators indexable. From Alex Pilosov <alex@pilosoft.com>. 2001-06-17 02:05:20 +00:00
Tom Lane
01a819abe3 Make planner compute the number of hash buckets the same way that
nodeHash.c will compute it (by sharing code).
2001-06-11 00:17:08 +00:00
Tom Lane
a8fe109ac1 Fix thinko in hash cost estimation: average frequency
should be computed from total number of distinct values in whole
relation, not # distinct values we expect to have after restriction
clauses are applied.
2001-06-10 02:59:35 +00:00
Tom Lane
cdd230d628 Improve planning of OR indexscan plans: for quals like
WHERE (a = 1 or a = 2) and b = 42
and an index on (a,b), include the clause b = 42 in the indexquals
generated for each arm of the OR clause.  Essentially this is an index-
driven conversion from CNF to DNF.  Implementation is a bit klugy, but
better than not exploiting the extra quals at all ...
2001-06-05 17:13:52 +00:00
Tom Lane
7c579fa12d Further work on making use of new statistics in planner. Adjust APIs
of costsize.c routines to pass Query root, so that costsize can figure
more things out by itself and not be so dependent on its callers to tell
it everything it needs to know.  Use selectivity of hash or merge clause
to estimate number of tuples processed internally in these joins
(this is more useful than it would've been before, since eqjoinsel is
somewhat more accurate than before).
2001-06-05 05:26:05 +00:00
Tom Lane
be03eb25f3 Modify optimizer data structures so that IndexOptInfo lists built for
create_index_paths are not immediately discarded, but are available for
subsequent planner work.  This allows avoiding redundant syscache lookups
in several places.  Change interface to operator selectivity estimation
procedures to allow faster and more flexible estimation.
Initdb forced due to change of pg_proc entries for selectivity functions!
2001-05-20 20:28:20 +00:00
Tom Lane
c23bc6fbb0 First cut at making indexscan cost estimates depend on correlation
between index order and table order.
2001-05-09 23:13:37 +00:00
Tom Lane
6cda3ad8fe Cause planner to make use of average-column-width statistic that is now
collected by ANALYZE.  Also, add some modest amount of intelligence to
guesses that are used for varlena columns in the absence of any ANALYZE
statistics.  The 'width' reported by EXPLAIN is finally something less
than totally bogus for varlena columns ... and, in consequence, hashjoin
estimating should be a little better ...
2001-05-09 00:35:09 +00:00
Bruce Momjian
857abb0e57 Add newlines around debug output in optimizer showing total costs. 2001-05-08 17:25:28 +00:00
Tom Lane
f905d65ee3 Rewrite of planner statistics-gathering code. ANALYZE is now available as
a separate statement (though it can still be invoked as part of VACUUM, too).
pg_statistic redesigned to be more flexible about what statistics are
stored.  ANALYZE now collects a list of several of the most common values,
not just one, plus a histogram (not just the min and max values).  Random
sampling is used to make the process reasonably fast even on very large
tables.  The number of values and histogram bins collected is now
user-settable via an ALTER TABLE command.

There is more still to do; the new stats are not being used everywhere
they could be in the planner.  But the remaining changes for this project
should be localized, and the behavior is already better than before.

A not-very-related change is that sorting now makes use of btree comparison
routines if it can find one, rather than invoking '<' twice.
2001-05-07 00:43:27 +00:00
Tom Lane
a43f20cb0a Tweak nestloop costing to weight restart cost of inner path more heavily.
Without this, it was making some pretty silly decisions about whether an
expensive sub-SELECT should be the inner or outer side of a join...
2001-04-25 22:04:37 +00:00
Tom Lane
f9094c44c0 Prevent generation of invalid plans for RIGHT or FULL joins with multiple
join clauses.  The mergejoin executor wants all the join clauses to appear
as merge quals, not as extra joinquals, for these kinds of joins.  But the
planner would consider plans in which partially-sorted input paths were
used, leading to only some of the join clauses becoming merge quals.
This is fine for inner/left joins, not fine for right/full joins.
2001-04-15 00:48:17 +00:00
Bruce Momjian
7cf952e7b4 Fix comments that were mis-wrapped, for Tom Lane. 2001-03-23 04:49:58 +00:00
Bruce Momjian
0686d49da0 Remove dashes in comments that don't need them, rewrap with pgindent. 2001-03-22 06:16:21 +00:00
Bruce Momjian
9e1552607a pgindent run. Make it all clean. 2001-03-22 04:01:46 +00:00
Tom Lane
13cc7eb3e2 Clean up two rather nasty bugs in operator selection code.
1. If there is exactly one pg_operator entry of the right name and oprkind,
oper() and related routines would return that entry whether its input type
had anything to do with the request or not.  This is just premature
optimization: we shouldn't return the single candidate until after we verify
that it really is a valid candidate, ie, is at least coercion-compatible
with the given types.

2. oper() and related routines only promise a coercion-compatible result.
Unfortunately, there were quite a few callers that assumed the returned
operator is binary-compatible with the given datatype; they would proceed
to call it without making any datatype coercions.  These callers include
sorting, grouping, aggregation, and VACUUM ANALYZE.  In general I think
it is appropriate for these callers to require an exact or binary-compatible
match, so I've added a new routine compatible_oper() that only succeeds if
it can find an operator that doesn't require any run-time conversions.
Callers now call oper() or compatible_oper() depending on whether they are
prepared to deal with type conversion or not.

The upshot of these bugs is revealed by the following silliness in PL/Tcl's
selftest: it creates an operator @< on int4, and then tries to use it to
sort a char(N) column.  The system would let it do that :-( (and evidently
has done so since 6.3 :-( :-().  The result in this case was just a silly
sort order, but the reverse combination would've provoked coredump from
trying to dereference integers.  With this fix you get more reasonable
behavior:
pltcl_test=# select * from T_pkey1 order by key1, key2 using @<;
ERROR:  Unable to identify an operator '@<' for types 'bpchar' and 'bpchar'
        You will have to retype this query using an explicit cast
2001-02-16 03:16:58 +00:00
Tom Lane
b29f68f611 Take OUTER JOIN semantics into account when estimating the size of join
relations.  It's not very bright, but at least it now knows that
A LEFT JOIN B must produce at least as many rows as are in A ...
2001-02-16 00:03:08 +00:00
Tom Lane
83b4ab53ad Update a couple of obsolete comments. 2001-02-15 17:46:40 +00:00
Tom Lane
503f042cd7 Fix inappropriate attempt to push down qual clauses into a view that
has UNION/INTERSECT/EXCEPT operations.  Per bug report from Ferrier.
2001-02-03 21:17:52 +00:00
Bruce Momjian
623bf843d2 Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group. 2001-01-24 19:43:33 +00:00
Tom Lane
b06fbc7ad2 Fix performance issue with qualifications on VIEWs: outer query should
try to push restrictions on the view down into the view subquery,
so that they can become indexscan quals or what-have-you rather than
being applied at the top level of the subquery.  7.0 and before were
able to do this, though in a much klugier way, and I'd hate to have
anyone complaining that 7.1 is stupider than 7.0 ...
2001-01-18 07:12:37 +00:00
Bruce Momjian
5088f0748a Change lcons(x, NIL) to makeList(x) where appropriate. 2001-01-17 17:26:45 +00:00
Tom Lane
97cfb9d606 Make sure make_rels_by_clause_joins doesn't return multiple references
to same joinrel.  Although make_rels_by_joins doesn't mind, GEQO has
an Assert that doesn't like this.
2000-12-18 06:50:51 +00:00
Tom Lane
ea166f1146 Planner speedup hacking. Avoid saving useless pathkeys, so that path
comparison does not consider paths different when they differ only in
uninteresting aspects of sort order.  (We had a special case of this
consideration for indexscans already, but generalize it to apply to
ordered join paths too.)  Be stricter about what is a canonical pathkey
to allow faster pathkey comparison.  Cache canonical pathkeys and
dispersion stats for left and right sides of a RestrictInfo's clause,
to avoid repeated computation.  Total speedup will depend on number of
tables in a query, but I see about 4x speedup of planning phase for
a sample seven-table query.
2000-12-14 22:30:45 +00:00
Tom Lane
17b843d677 Cache eval cost of qualification expressions in RestrictInfo nodes to
avoid repeated evaluations in cost_qual_eval().  This turns out to save
a useful fraction of planning time.  No change to external representation
of RestrictInfo --- although that node type doesn't appear in stored
rules anyway.
2000-12-12 23:33:34 +00:00
Tom Lane
bbea3643a3 Store current LC_COLLATE and LC_CTYPE settings in pg_control during initdb;
re-adopt these settings at every postmaster or standalone-backend startup.
This should fix problems with indexes becoming corrupt due to failure to
provide consistent locale environment for postmaster at all times.  Also,
refuse to start up a non-locale-enabled compilation in a database originally
initdb'd with a non-C locale.  Suppress LIKE index optimization if locale
is not "C" or "POSIX" (are there any other locales where it's safe?).
Issue NOTICE during initdb if selected locale disables LIKE optimization.
2000-11-25 20:33:54 +00:00
Tom Lane
48437f5c3a Ensure that mergejoin plan will be considered for FULL OUTER JOIN even
if enable_mergejoin = OFF.  Must do this, because we have no other
implementation method for full joins.
2000-11-23 03:57:31 +00:00
Tom Lane
a933ee38bb Change SearchSysCache coding conventions so that a reference count is
maintained for each cache entry.  A cache entry will not be freed until
the matching ReleaseSysCache call has been executed.  This eliminates
worries about cache entries getting dropped while still in use.  See
my posting to pg-hackers of even date for more info.
2000-11-16 22:30:52 +00:00
Tom Lane
6543d81d65 Restructure handling of inheritance queries so that they work with outer
joins, and clean things up a good deal at the same time.  Append plan node
no longer hacks on rangetable at runtime --- instead, all child tables are
given their own RT entries during planning.  Concept of multiple target
tables pushed up into execMain, replacing bug-prone implementation within
nodeAppend.  Planner now supports generating Append plans for inheritance
sets either at the top of the plan (the old way) or at the bottom.  Expanding
at the bottom is appropriate for tables used as sources, since they may
appear inside an outer join; but we must still expand at the top when the
target of an UPDATE or DELETE is an inheritance set, because we actually need
a different targetlist and junkfilter for each target table in that case.
Fortunately a target table can't be inside an outer join...  Bizarre mutual
recursion between union_planner and prepunion.c is gone --- in fact,
union_planner doesn't really have much to do with union queries anymore,
so I renamed it grouping_planner.
2000-11-12 00:37:02 +00:00
Tom Lane
09a8912f73 Ensure clause_selectivity() behaves sanely when examining an uplevel Var
or a Var that references a subquery output.
2000-10-25 21:48:12 +00:00
Bruce Momjian
b32685a999 Add proofreader's changes to docs.
Fix misspelling of disbursion to dispersion.
2000-10-05 19:48:34 +00:00
Tom Lane
05e3d0ee86 Reimplementation of UNION/INTERSECT/EXCEPT. INTERSECT/EXCEPT now meet the
SQL92 semantics, including support for ALL option.  All three can be used
in subqueries and views.  DISTINCT and ORDER BY work now in views, too.
This rewrite fixes many problems with cross-datatype UNIONs and INSERT/SELECT
where the SELECT yields different datatypes than the INSERT needs.  I did
that by making UNION subqueries and SELECT in INSERT be treated like
subselects-in-FROM, thereby allowing an extra level of targetlist where the
datatype conversions can be inserted safely.
INITDB NEEDED!
2000-10-05 19:11:39 +00:00
Tom Lane
3a94e789f5 Subselects in FROM clause, per ISO syntax: FROM (SELECT ...) [AS] alias.
(Don't forget that an alias is required.)  Views reimplemented as expanding
to subselect-in-FROM.  Grouping, aggregates, DISTINCT in views actually
work now (he says optimistically).  No UNION support in subselects/views
yet, but I have some ideas about that.  Rule-related permissions checking
moved out of rewriter and into executor.
INITDB REQUIRED!
2000-09-29 18:21:41 +00:00
Tom Lane
ba2ea6e0f5 Fix GEQO optimizer to work correctly with new outer-join-capable
query representation.  Note that GEQO_RELS setting is now interpreted
as the number of top-level items in the FROM list, not necessarily the
number of relations in the query.  This seems appropriate since we are
only doing join-path searching over the top-level items.
2000-09-19 18:42:34 +00:00
Tom Lane
8ae9ad1cb8 Reimplement LIKE/ESCAPE as operators so that indexscan optimization
can still work, per recent discussion on pghackers.  Correct some bugs
in ILIKE implementation.
2000-09-15 18:45:31 +00:00
Tom Lane
ed5003c584 First cut at full support for OUTER JOINs. There are still a few loose
ends to clean up (see my message of same date to pghackers), but mostly
it works.  INITDB REQUIRED!
2000-09-12 21:07:18 +00:00