Commit Graph

49 Commits

Author SHA1 Message Date
Simon Riggs 8cb53654db Add DROP INDEX CONCURRENTLY [IF EXISTS], uses ShareUpdateExclusiveLock 2012-04-06 10:21:40 +01:00
Tom Lane c6a11b89e4 Teach SPGiST to store nulls and do whole-index scans.
This patch fixes the other major compatibility-breaking limitation of
SPGiST, that it didn't store anything for null values of the indexed
column, and so could not support whole-index scans or "x IS NULL"
tests.  The approach is to create a wholly separate search tree for
the null entries, and use fixed "allTheSame" insertion and search
rules when processing this tree, instead of calling the index opclass
methods.  This way the opclass methods do not need to worry about
dealing with nulls.

Catversion bump is for pg_am updates as well as the change in on-disk
format of SPGiST indexes; there are some tweaks in SPGiST WAL records
as well.

Heavily rewritten version of a patch by Oleg Bartunov and Teodor Sigaev.
(The original also stored nulls separately, but it reused GIN code to do
so; which required undesirable compromises in the on-disk format, and
would likely lead to bugs due to the GIN code being required to work in
two very different contexts.)
2012-03-11 16:29:59 -04:00
Tom Lane de5a08c59d Tweak duplicate-index-column regression test to avoid locale sensitivity.
The originally-chosen test case gives different results in es_EC locale
because of unusual rule for sorting strings beginning with "LL".  Adjust
the comparison value to avoid that, while hopefully not introducing new
locale dependencies elsewhere.  Per report from Jaime Casanova.
2012-01-12 14:18:08 -05:00
Tom Lane 15ba590792 Adjust SP-GiST regression tests to be less locale-sensitive.
The original test cases gave varying results depending on whether the
locale sorts digits before or after letters.  Since that's not really
what we wish to test here, adjust the test data to not contain any strings
beginning with digits.  Per report from Pavel Stehule.
2011-12-29 17:04:36 -05:00
Tom Lane e2c2c2e8b1 Improve planner's handling of duplicated index column expressions.
It's potentially useful for an index to repeat the same indexable column
or expression in multiple index columns, if the columns have different
opclasses.  (If they share opclasses too, the duplicate column is pretty
useless, but nonetheless we've allowed such cases since 9.0.)  However,
the planner failed to cope with this, because createplan.c was relying on
simple equal() matching to figure out which index column each index qual
is intended for.  We do have that information available upstream in
indxpath.c, though, so the fix is to not flatten the multi-level indexquals
list when putting it into an IndexPath.  Then we can rely on the sublist
structure to identify target index columns in createplan.c.  There's a
similar issue for index ORDER BYs (the KNNGIST feature), so introduce a
multi-level-list representation for that too.  This adds a bit more
representational overhead, but we might more or less buy that back by not
having to search for matching index columns anymore in createplan.c;
likewise btcostestimate saves some cycles.

Per bug #6351 from Christian Rudolph.  Likely symptoms include the "btree
index keys must be ordered by attribute" failure shown there, as well as
"operator MMMM is not a member of opfamily NNNN".

Although this is a pre-existing problem that can be demonstrated in 9.0 and
9.1, I'm not going to back-patch it, because the API changes in the planner
seem likely to break things such as index plugins.  The corner cases where
this matters seem too narrow to justify possibly breaking things in a minor
release.
2011-12-23 18:45:14 -05:00
Tom Lane 9220362493 Teach SP-GiST to do index-only scans.
Operator classes can specify whether or not they support this; this
preserves the flexibility to use lossy representations within an index.

In passing, move constant data about a given index into the rd_amcache
cache area, instead of doing fresh lookups each time we start an index
operation.  This is mainly to try to make sure that spgcanreturn() has
insignificant cost; I still don't have any proof that it matters for
actual index accesses.  Also, get rid of useless copying of FmgrInfo
pointers; we can perfectly well use the relcache's versions in-place.
2011-12-19 14:58:41 -05:00
Tom Lane 8daeb5ddd6 Add SP-GiST (space-partitioned GiST) index access method.
SP-GiST is comparable to GiST in flexibility, but supports non-balanced
partitioned search structures rather than balanced trees.  As described at
PGCon 2011, this new indexing structure can beat GiST in both index build
time and query speed for search problems that it is well matched to.

There are a number of areas that could still use improvement, but at this
point the code seems committable.

Teodor Sigaev and Oleg Bartunov, with considerable revisions by Tom Lane
2011-12-17 16:42:30 -05:00
Tom Lane 882368e854 Fix btree stop-at-nulls logic properly.
As pointed out by Naoya Anzai, my previous try at this was a few bricks
shy of a load, because I had forgotten that the initial-positioning logic
might not try to skip over nulls at the end of the index the scan will
start from.  We ought to fix that, because it represents an unnecessary
inefficiency, but first let's get the scan-stop logic back to a safe
state.  With this patch, we preserve the performance benefit requested
in bug #6278 for the case of scanning forward into NULLs (in a NULLS
LAST index), but the reverse case of scanning backward across NULLs
when there's no suitable initial-positioning qual is still inefficient.
2011-11-02 17:53:49 -04:00
Tom Lane a5652d3e05 Restore correct btree preprocessing of "indexedcol IS NULL" conditions.
Such a condition is unsatisfiable in combination with any other type of
btree-indexable condition (since we assume btree operators are always
strict).  8.3 and 8.4 had an explicit test for this, which I removed in
commit 29c4ad9829, mistakenly thinking that
the case would be subsumed by the more general handling of IS (NOT) NULL
added in that patch.  Put it back, and improve the comments about it, and
add a regression test case.

Per bug #6079 from Renat Nasyrov, and analysis by Dean Rasheed.
2011-06-29 19:46:47 -04:00
Tom Lane 88452d5ba6 Implement ALTER TABLE ADD UNIQUE/PRIMARY KEY USING INDEX.
This feature allows a unique or pkey constraint to be created using an
already-existing unique index.  While the constraint isn't very
functionally different from the bare index, it's nice to be able to do that
for documentation purposes.  The main advantage over just issuing a plain
ALTER TABLE ADD UNIQUE/PRIMARY KEY is that the index can be created with
CREATE INDEX CONCURRENTLY, so that there is not a long interval where the
table is locked against updates.

On the way, refactor some of the code in DefineIndex() and index_create()
so that we don't have to pass through those functions in order to create
the index constraint's catalog entries.  Also, in parse_utilcmd.c, pass
around the ParseState pointer in struct CreateStmtContext to save on
notation, and add error location pointers to some error reports that didn't
have one before.

Gurjeet Singh, reviewed by Steve Singer and Tom Lane
2011-01-25 15:43:05 -05:00
Tom Lane 73912e7fbd Fix GIN to support null keys, empty and null items, and full index scans.
Per my recent proposal(s).  Null key datums can now be returned by
extractValue and extractQuery functions, and will be stored in the index.
Also, placeholder entries are made for indexable items that are NULL or
contain no keys according to extractValue.  This means that the index is
now always complete, having at least one entry for every indexed heap TID,
and so we can get rid of the prohibition on full-index scans.  A full-index
scan is implemented much the same way as partial-match scans were already:
we build a bitmap representing all the TIDs found in the index, and then
drive the results off that.

Also, introduce a concept of a "search mode" that can be requested by
extractQuery when the operator requires matching to empty items (this is
just as cheap as matching to a single key) or requires a full index scan
(which is not so cheap, but it sure beats failing or giving wrong answers).
The behavior remains backward compatible for opclasses that don't return
any null keys or request a non-default search mode.

Using these features, we can now make the GIN index opclass for anyarray
behave in a way that matches the actual anyarray operators for &&, <@, @>,
and = ... which it failed to do before in assorted corner cases.

This commit fixes the core GIN code and ginarrayprocs.c, updates the
documentation, and adds some simple regression test cases for the new
behaviors using the array operators.  The tsearch and contrib GIN opclass
support functions still need to be looked over and probably fixed.

Another thing I intend to fix separately is that this is pretty inefficient
for cases where more than one scan condition needs a full-index search:
we'll run duplicate GinScanEntrys, each one of which builds a large bitmap.
There is some existing logic to merge duplicate GinScanEntrys but it needs
refactoring to make it work for entries belonging to different scan keys.

Note that most of gin.h has been split out into a new file gin_private.h,
so that gin.h doesn't export anything that's not supposed to be used by GIN
opclasses or the rest of the backend.  I did quite a bit of other code
beautification work as well, mostly fixing comments and choosing more
appropriate names for things.
2011-01-07 19:16:24 -05:00
Tom Lane 554506871b KNNGIST, otherwise known as order-by-operator support for GIST.
This commit represents a rather heavily editorialized version of
Teodor's builtin_knngist_itself-0.8.2 and builtin_knngist_proc-0.8.1
patches.  I redid the opclass API to add a separate Distance method
instead of turning the Consistent method into an illogical mess,
fixed some bit-rot in the rbtree interfaces, and generally worked over
the code style and comments.

There's still no non-code documentation to speak of, but I'll work on
that separately.  Some contrib-module changes are also yet to come
(right now, point <-> point is the only KNN-ified operator).

Teodor Sigaev and Tom Lane
2010-12-03 20:53:29 -05:00
Peter Eisentraut fc946c39ae Remove useless whitespace at end of lines 2010-11-23 22:34:55 +02:00
Teodor Sigaev 4cbe473938 Add point_ops opclass for GiST. 2010-01-14 16:31:09 +00:00
Tom Lane 29c4ad9829 Support "x IS NOT NULL" clauses as indexscan conditions. This turns out
to be just a minor extension of the previous patch that made "x IS NULL"
indexable, because we can treat the IS NOT NULL condition as if it were
"x < NULL" or "x > NULL" (depending on the index's NULLS FIRST/LAST option),
just like IS NULL is treated like "x = NULL".  Aside from any possible
usefulness in its own right, this is an important improvement for
index-optimized MAX/MIN aggregates: it is now reliably possible to get
a column's min or max value cheaply, even when there are a lot of nulls
cluttering the interesting end of the index.
2010-01-01 21:53:49 +00:00
Tom Lane d68e08d1fe Allow the index name to be omitted in CREATE INDEX, causing the system to
choose an index name the same as it would do for an unnamed index constraint.
(My recent changes to the index naming logic have helped to ensure that this
will be a reasonable choice.)  Per a suggestion from Peter.

A necessary side-effect is to promote CONCURRENTLY to type_func_name_keyword
status, ie, it can't be a table/column/index name anymore unless quoted.
This is not all bad, since we have heard more than once of people typing
CREATE INDEX CONCURRENTLY ON foo (...) and getting a normal index build of
an index named "concurrently", which was not what they wanted.  Now this
syntax will result in a concurrent build of an index with system-chosen
name; which they can rename afterwards if they want something else.
2009-12-23 17:41:45 +00:00
Tom Lane 527f0ae3fa Department of second thoughts: let's show the exact key during unique index
build failures, too.  Refactor a bit more since that error message isn't
spelled the same.
2009-08-01 20:59:17 +00:00
Tom Lane b680ae4bdb Improve unique-constraint-violation error messages to include the exact
values being complained of.

In passing, also remove the arbitrary length limitation in the similar
error detail message for foreign key violations.

Itagaki Takahiro
2009-08-01 19:59:41 +00:00
Teodor Sigaev 49475aab8d Correct calculations of overlap and contains operations over polygons. 2009-07-28 09:48:00 +00:00
Tom Lane 8835d63b27 Experiment with using EXPLAIN COSTS OFF in regression tests.
This is a simple test to see whether COSTS OFF will help much with getting
EXPLAIN output that's sufficiently platform-independent for use in the
regression tests.  The planner does have some freedom of choice in these
examples (plain via bitmap indexscan), so I'm not sure what will happen.
2009-07-27 00:26:03 +00:00
Tom Lane 27cb66fdfe Multi-column GIN indexes. Teodor Sigaev 2008-07-11 21:06:29 +00:00
Tom Lane 4e82a95476 Replace "amgetmulti" AM functions with "amgetbitmap", in which the whole
indexscan always occurs in one call, and the results are returned in a
TIDBitmap instead of a limited-size array of TIDs.  This should improve
speed a little by reducing AM entry/exit overhead, and it is necessary
infrastructure if we are ever to support bitmap indexes.

In an only slightly related change, add support for TIDBitmaps to preserve
(somewhat lossily) the knowledge that particular TIDs reported by an index
need to have their quals rechecked when the heap is visited.  This facility
is not really used yet; we'll need to extend the forced-recheck feature to
plain indexscans before it's useful, and that hasn't been coded yet.
The intent is to use it to clean up 8.3's horrid @@@ kluge for text search
with weighted queries.  There might be other uses in future, but that one
alone is sufficient reason.

Heikki Linnakangas, with some adjustments by me.
2008-04-10 22:25:26 +00:00
Tom Lane 2aae35d049 Mention the index name in 'could not create unique index' errors,
per suggestion from Rene Gollent.
2007-10-29 21:31:28 +00:00
Tom Lane 282d2a03dd HOT updates. When we update a tuple without changing any of its indexed
columns, and the new version can be stored on the same heap page, we no longer
generate extra index entries for the new version.  Instead, index searches
follow the HOT-chain links to ensure they find the correct tuple version.

In addition, this patch introduces the ability to "prune" dead tuples on a
per-page basis, without having to do a complete VACUUM pass to recover space.
VACUUM is still needed to clean up dead index entries, however.

Pavan Deolasee, with help from a bunch of other people.
2007-09-20 17:56:33 +00:00
Peter Eisentraut f4a3789b39 Clarify some error messages about duplicate things. 2007-06-03 22:16:03 +00:00
Tom Lane f02a82b6ad Make 'col IS NULL' clauses be indexable conditions.
Teodor Sigaev, with some kibitzing from Tom Lane.
2007-04-06 22:33:43 +00:00
Tom Lane 4431758229 Support ORDER BY ... NULLS FIRST/LAST, and add ASC/DESC/NULLS FIRST/NULLS LAST
per-column options for btree indexes.  The planner's support for this is still
pretty rudimentary; it does not yet know how to plan mergejoins with
nondefault ordering options.  The documentation is pretty rudimentary, too.
I'll work on improving that stuff later.

Note incompatible change from prior behavior: ORDER BY ... USING will now be
rejected if the operator is not a less-than or greater-than member of some
btree opclass.  This prevents less-than-sane behavior if an operator that
doesn't actually define a proper sort ordering is selected.
2007-01-09 02:14:16 +00:00
Tom Lane ba920e1c91 Rename contains/contained-by operators to @> and <@, per discussion that
agreed these symbols are less easily confused.  I made new pg_operator
entries (with new OIDs) for the old names, so as to provide backward
compatibility while making it pretty easy to remove the old names in
some future release cycle.  This commit only touches the core datatypes,
contrib will be fixed separately.
2006-09-10 00:29:35 +00:00
Tom Lane e093dcdd28 Add the ability to create indexes 'concurrently', that is, without
blocking concurrent writes to the table.  Greg Stark, with a little help
from Tom Lane.
2006-08-25 04:06:58 +00:00
Teodor Sigaev 001d30ee6b Add support to GIN for =(anyarray,anyarray) operation 2006-07-11 19:49:14 +00:00
Teodor Sigaev 8a3631f8d8 GIN: Generalized Inverted iNdex.
text[], int4[], Tsearch2 support for GIN.
2006-05-02 11:28:56 +00:00
Tom Lane 2a8d3d83ef R-tree is dead ... long live GiST. 2005-11-07 17:36:47 +00:00
Tom Lane 784b948984 Fix platform-dependency in recently added regression tests.
Per buildfarm results.
2005-07-01 23:18:01 +00:00
Tom Lane e7e1694295 Migrate rtree_gist functionality into the core system, and add some
basic regression tests for GiST to the standard regression tests.
I took the opportunity to add an rtree-equivalent gist opclass for
circles; the contrib version only covered boxes and polygons, but
indexing circles is very handy for distance searches.
2005-07-01 19:19:05 +00:00
Tom Lane addc42c339 Create the planner mechanism for optimizing simple MIN and MAX queries
into indexscans on matching indexes.  For the moment, it only handles
int4 and text datatypes; next step is to add a column to pg_aggregate
so that all MIN/MAX aggregates can be handled.  Per my recent proposal.
2005-04-11 23:06:57 +00:00
Tom Lane 42ce74bf17 COMMENT ON casts, conversions, languages, operator classes, and
large objects.  Dump all these in pg_dump; also add code to pg_dump
user-defined conversions.  Make psql's large object code rely on
the backend for inserting/deleting LOB comments, instead of trying to
hack pg_description directly.  Documentation and regression tests added.

Christopher Kings-Lynne, code reviewed by Tom
2003-11-21 22:32:49 +00:00
Tom Lane fa5c8a055a Cross-data-type comparisons are now indexable by btrees, pursuant to my
pghackers proposal of 8-Nov.  All the existing cross-type comparison
operators (int2/int4/int8 and float4/float8) have appropriate support.
The original proposal of storing the right-hand-side datatype as part of
the primary key for pg_amop and pg_amproc got modified a bit in the event;
it is easier to store zero as the 'default' case and only store a nonzero
when the operator is actually cross-type.  Along the way, remove the
long-since-defunct bigbox_ops operator class.
2003-11-12 21:15:59 +00:00
Peter Eisentraut feb4f44d29 Message editing: remove gratuitous variations in message wording, standardize
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
2003-09-25 06:58:07 +00:00
Tom Lane ec7aa4b515 Error message editing in backend/access. 2003-07-21 20:29:40 +00:00
Tom Lane d998fac953 Add a regression test to catch future silliness in the index-building
area...
2003-05-29 01:09:13 +00:00
Tom Lane fc8d970cbc Replace functional-index facility with expressional indexes. Any column
of an index can now be a computed expression instead of a simple variable.
Restrictions on expressions are the same as for predicates (only immutable
functions, no sub-selects).  This fixes problems recently introduced with
inlining SQL functions, because the inlining transformation is applied to
both expression trees so the planner can still match them up.  Along the
way, improve efficiency of handling index predicates (both predicates and
index expressions are now cached by the relcache) and fix 7.3 oversight
that didn't record dependencies of predicate expressions.
2003-05-28 16:04:02 +00:00
Tom Lane 7d66bf261c Add some minimal exercising of functional-index feature to regression
tests.
2001-08-27 23:23:35 +00:00
Tom Lane f31dc0ada7 Partial indexes work again, courtesy of Martijn van Oosterhout.
Note: I didn't force an initdb, figuring that one today was enough.
However, there is a new function in pg_proc.h, and pg_dump won't be
able to dump partial indexes until you add that function.
2001-07-16 05:07:00 +00:00
Tom Lane 598ea2c359 Finish repairing 6.5's problems with r-tree indexes: create appropriate
selectivity functions and make the r-tree operators use them.  The
estimation functions themselves are just stubs, unfortunately, but
perhaps someday someone will make them compute realistic estimates.
Change pg_am so that the optimizer can reliably tell the difference
between ordered and unordered indexes --- before it would think that
an r-tree index can be scanned in '<<' order, which is not right AFAIK.
Repair broken negator links for network_sup and related ops.
Initdb forced.  This might be my last initdb force for 7.0 ... hope so
anyway ...
2000-02-17 03:40:02 +00:00
Thomas G. Lockhart 9c1b29816e Update output to new psql conventions. 2000-01-05 17:31:08 +00:00
Bruce Momjian 0d203b745d Re-apply Darren's char2-16 removal code. 1998-04-26 04:12:15 +00:00
Bruce Momjian db21523314 Back out char2-char16 removal. Add later. 1998-04-07 18:14:38 +00:00
Bruce Momjian 57b5966405 The following uuencoded, gzip'd file will ...
1. Remove the char2, char4, char8 and char16 types from postgresql
2. Change references of char16 to name in the regression tests.
3. Rename the char16.sql regression test to name.sql.  4. Modify
the regression test scripts and outputs to match up.

Might require new regression.{SYSTEM} files...

Darren King
1998-03-30 17:28:21 +00:00
Marc G. Fournier 588ae64c44 More splits and cleanups... 1997-04-06 06:07:13 +00:00