/*-------------------------------------------------------------------------
 *
 * index.h
 *	  prototypes for catalog/index.c.
 *
 *
 * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/catalog/index.h
 *
 *-------------------------------------------------------------------------
 */
#ifndef INDEX_H
#define INDEX_H

#include "catalog/objectaddress.h"
#include "nodes/execnodes.h"


#define DEFAULT_INDEX_TYPE	"btree"

/* Typedef for callback function for IndexBuildHeapScan */
typedef void (*IndexBuildCallback) (Relation index,
                                    HeapTuple htup,
                                    Datum *values,
                                    bool *isnull,
                                    bool tupleIsAlive,
                                    void *state);

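/*
 * Usage sketch (illustrative, not part of the original header). The typedef
 * above only fixes the callback's shape: a scan hands each heap tuple to the
 * callback together with an opaque state pointer. The standalone code below
 * mimics that pattern with made-up stand-ins. DemoTuple, DemoBuildCallback,
 * DemoBuildState, demo_count_callback, demo_build_scan and demo_run are all
 * hypothetical and are not PostgreSQL definitions; the sketch makes no claim
 * about how index.c drives the real scan or what its return value counts.
 *
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical stand-in for a heap tuple; NOT the PostgreSQL HeapTuple.
typedef struct DemoTuple
{
    int  key;
    bool alive;
} DemoTuple;

// Same idea as IndexBuildCallback, reduced to the demo types.
typedef void (*DemoBuildCallback) (const DemoTuple *tup,
                                   bool tupleIsAlive,
                                   void *state);

// Caller-owned bookkeeping, reached through the opaque state pointer.
typedef struct DemoBuildState
{
    double live;
    double dead;
} DemoBuildState;

// Callback: tally tuples by liveness.
static void
demo_count_callback(const DemoTuple *tup, bool tupleIsAlive, void *state)
{
    DemoBuildState *bs = (DemoBuildState *) state;

    (void) tup;                 // the key is not needed for counting
    if (tupleIsAlive)
        bs->live += 1;
    else
        bs->dead += 1;
}

// Hypothetical scan driver: invokes the callback once per tuple and
// returns how many tuples it visited.
static double
demo_build_scan(const DemoTuple *tuples, size_t ntuples,
                DemoBuildCallback callback, void *callback_state)
{
    double visited = 0;

    assert(callback != NULL);
    for (size_t i = 0; i < ntuples; i++)
    {
        callback(&tuples[i], tuples[i].alive, callback_state);
        visited += 1;
    }
    return visited;
}

// Runs the scan over three sample tuples, two of them live.
static double
demo_run(DemoBuildState *out)
{
    static const DemoTuple tuples[] = {{1, true}, {2, false}, {3, true}};

    out->live = 0;
    out->dead = 0;
    return demo_build_scan(tuples, sizeof(tuples) / sizeof(tuples[0]),
                           demo_count_callback, out);
}
```
 *
 * Calling demo_run() with a DemoBuildState leaves the live and dead tallies
 * in that struct. Threading state through a void pointer lets one scan
 * driver serve callbacks that keep different bookkeeping.
 */
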
/* Action code for index_set_state_flags */
typedef enum
{
	INDEX_CREATE_SET_READY,
	INDEX_CREATE_SET_VALID,
	INDEX_DROP_CLEAR_VALID,
	INDEX_DROP_SET_DEAD
} IndexStateFlagsAction;

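/*
 * Usage sketch (illustrative, not part of the original header). The header
 * only names the four actions; it does not say which catalog fields each
 * one touches. The standalone code below ASSUMES that they toggle three
 * boolean flags called indisready, indisvalid and indislive, and that
 * INDEX_DROP_SET_DEAD clears both indisready and indislive. Treat those
 * transitions as a guess to be checked against catalog/index.c. The enum is
 * repeated so the sketch compiles on its own; DemoIndexFlags and
 * demo_set_state_flags are hypothetical names.
 *
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Copy of the enum from this header, so the sketch is self-contained.
typedef enum
{
    INDEX_CREATE_SET_READY,
    INDEX_CREATE_SET_VALID,
    INDEX_DROP_CLEAR_VALID,
    INDEX_DROP_SET_DEAD
} IndexStateFlagsAction;

// Hypothetical in-memory stand-in for the per-index state flags.
typedef struct DemoIndexFlags
{
    bool indisvalid;
    bool indisready;
    bool indislive;
} DemoIndexFlags;

// Apply one action to the flags (assumed transitions, see the lead-in).
static void
demo_set_state_flags(DemoIndexFlags *f, IndexStateFlagsAction action)
{
    assert(f != NULL);
    switch (action)
    {
        case INDEX_CREATE_SET_READY:
            f->indisready = true;
            break;
        case INDEX_CREATE_SET_VALID:
            f->indisvalid = true;
            break;
        case INDEX_DROP_CLEAR_VALID:
            f->indisvalid = false;
            break;
        case INDEX_DROP_SET_DEAD:
            f->indisready = false;
            f->indislive = false;
            break;
    }
}
```
 *
 * Under these assumptions a concurrent build would step through SET_READY
 * and then SET_VALID, while a concurrent drop would step through
 * CLEAR_VALID and then SET_DEAD.
 */
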
extern void index_check_primary_key(Relation heapRel,
                                    IndexInfo *indexInfo,
                                    bool is_alter_table);

extern Oid index_create(Relation heapRelation,
                        const char *indexRelationName,
                        Oid indexRelationId,
                        Oid relFileNode,
                        IndexInfo *indexInfo,
                        List *indexColNames,
                        Oid accessMethodObjectId,
                        Oid tableSpaceId,
                        Oid *collationObjectId,
                        Oid *classObjectId,
                        int16 *coloptions,
                        Datum reloptions,
                        bool isprimary,
                        bool isconstraint,
                        bool deferrable,
                        bool initdeferred,
                        bool allow_system_table_mods,
                        bool skip_build,
                        bool concurrent,
                        bool is_internal,
                        bool if_not_exists);

extern ObjectAddress index_constraint_create(Relation heapRelation,
                                             Oid indexRelationId,
                                             IndexInfo *indexInfo,
                                             const char *constraintName,
                                             char constraintType,
                                             bool deferrable,
                                             bool initdeferred,
                                             bool mark_as_primary,
                                             bool update_pgindex,
                                             bool remove_old_dependencies,
                                             bool allow_system_table_mods,
                                             bool is_internal);

extern void index_drop(Oid indexId, bool concurrent);

extern IndexInfo *BuildIndexInfo(Relation index);

extern void BuildSpeculativeIndexInfo(Relation index, IndexInfo *ii);

extern void FormIndexDatum(IndexInfo *indexInfo,
                           TupleTableSlot *slot,
                           EState *estate,
                           Datum *values,
                           bool *isnull);

extern void index_build(Relation heapRelation,
                        Relation indexRelation,
                        IndexInfo *indexInfo,
                        bool isprimary,
                        bool isreindex);

extern double IndexBuildHeapScan(Relation heapRelation,
                                 Relation indexRelation,
                                 IndexInfo *indexInfo,
                                 bool allow_sync,
                                 IndexBuildCallback callback,
                                 void *callback_state);
extern double IndexBuildHeapRangeScan(Relation heapRelation,
                                      Relation indexRelation,
                                      IndexInfo *indexInfo,
                                      bool allow_sync,
                                      BlockNumber start_blockno,
                                      BlockNumber end_blockno,
                                      IndexBuildCallback callback,
                                      void *callback_state);

extern void validate_index(Oid heapId, Oid indexId, Snapshot snapshot);

extern void index_set_state_flags(Oid indexId, IndexStateFlagsAction action);

extern void reindex_index(Oid indexId, bool skip_constraint_checks,
                          char relpersistence);

/* Flag bits for reindex_relation(): */
#define REINDEX_REL_PROCESS_TOAST           0x01
#define REINDEX_REL_SUPPRESS_INDEX_USE      0x02
#define REINDEX_REL_CHECK_CONSTRAINTS       0x04
#define REINDEX_REL_FORCE_INDEXES_UNLOGGED  0x08
#define REINDEX_REL_FORCE_INDEXES_PERMANENT 0x10

extern bool reindex_relation(Oid relid, int flags);
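
/*
 * Usage sketch (illustrative, not part of the original header). The flag
 * macros above are distinct bits, so a caller would normally OR the wanted
 * ones into the int passed as "flags". The standalone code below shows only
 * that bit arithmetic; it does not call reindex_relation() and says nothing
 * about what each flag makes the backend do. The macros are repeated so the
 * sketch compiles on its own; demo_make_flags and demo_has_flags are
 * hypothetical helpers.
 *
```c
#include <assert.h>
#include <stdbool.h>

// Copies of the flag bits from this header, so the sketch is self-contained.
#define REINDEX_REL_PROCESS_TOAST           0x01
#define REINDEX_REL_SUPPRESS_INDEX_USE      0x02
#define REINDEX_REL_CHECK_CONSTRAINTS       0x04
#define REINDEX_REL_FORCE_INDEXES_UNLOGGED  0x08
#define REINDEX_REL_FORCE_INDEXES_PERMANENT 0x10

// Hypothetical helper: build a flags word from two yes/no choices.
static int
demo_make_flags(bool process_toast, bool check_constraints)
{
    int flags = 0;

    if (process_toast)
        flags |= REINDEX_REL_PROCESS_TOAST;
    if (check_constraints)
        flags |= REINDEX_REL_CHECK_CONSTRAINTS;
    return flags;
}

// Hypothetical helper: true if every bit in "wanted" is set in "flags".
static bool
demo_has_flags(int flags, int wanted)
{
    assert(wanted != 0);
    return (flags & wanted) == wanted;
}
```
 *
 * A value built this way is what a caller would pass as the second argument
 * of reindex_relation(); testing with a mask comparison rather than equality
 * keeps the check correct when other bits are also set.
 */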

extern bool ReindexIsProcessingHeap(Oid heapOid);
extern bool ReindexIsProcessingIndex(Oid indexOid);
extern Oid IndexGetRelation(Oid indexId, bool missing_ok);

#endif   /* INDEX_H */