/*-------------------------------------------------------------------------
 *
 * index.h
 *	  prototypes for catalog/index.c.
 *
 *
 * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/catalog/index.h
 *
 *-------------------------------------------------------------------------
 */
#ifndef INDEX_H
#define INDEX_H

#include "catalog/objectaddress.h"
#include "nodes/execnodes.h"


#define DEFAULT_INDEX_TYPE	"btree"

/* Action code for index_set_state_flags */
typedef enum
{
	INDEX_CREATE_SET_READY,
	INDEX_CREATE_SET_VALID,
	INDEX_DROP_CLEAR_VALID,
	INDEX_DROP_SET_DEAD,
} IndexStateFlagsAction;
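
/*
 * Illustrative sketch, not part of this header: a simplified, standalone
 * model of how the four action codes above might map onto the pg_index
 * flags indisready, indisvalid and indislive.  DemoIndexFlags,
 * DemoStateFlagsAction and demo_apply_action() are hypothetical names
 * invented for this example.  The mapping is an assumption inferred from
 * the action names; the authoritative logic is index_set_state_flags()
 * in catalog/index.c, which may touch further columns and checks more
 * preconditions.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local copy of the action codes so that the sketch compiles on its own. */
typedef enum
{
	DEMO_CREATE_SET_READY,
	DEMO_CREATE_SET_VALID,
	DEMO_DROP_CLEAR_VALID,
	DEMO_DROP_SET_DEAD,
} DemoStateFlagsAction;

/* Hypothetical stand-in for the three pg_index state columns. */
typedef struct DemoIndexFlags
{
	bool		indisready;
	bool		indisvalid;
	bool		indislive;
} DemoIndexFlags;

/* Apply one action to the flag set (assumed mapping, see lead-in). */
static inline void
demo_apply_action(DemoIndexFlags *f, DemoStateFlagsAction action)
{
	assert(f != NULL);

	switch (action)
	{
		case DEMO_CREATE_SET_READY:
			f->indisready = true;	/* index now receives inserts */
			break;
		case DEMO_CREATE_SET_VALID:
			f->indisvalid = true;	/* index may now be used by queries */
			break;
		case DEMO_DROP_CLEAR_VALID:
			f->indisvalid = false;	/* first step of a concurrent drop */
			break;
		case DEMO_DROP_SET_DEAD:
			f->indisready = false;	/* final pre-drop state */
			f->indislive = false;
			break;
	}
}
```

/*
 * In this model a freshly created index (ready = false, valid = false,
 * live = true) and an about-to-die index (all three false) remain
 * distinguishable through indislive.
 */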

/* options for REINDEX */
typedef struct ReindexParams
{
	bits32		options;		/* bitmask of REINDEXOPT_* */
	Oid			tablespaceOid;	/* New tablespace to move indexes to.
								 * InvalidOid to do nothing. */
} ReindexParams;

/* flag bits for ReindexParams->options */
#define REINDEXOPT_VERBOSE			0x01	/* print progress info */
#define REINDEXOPT_REPORT_PROGRESS	0x02	/* report pgstat progress */
#define REINDEXOPT_MISSING_OK		0x04	/* skip missing relations */
#define REINDEXOPT_CONCURRENTLY		0x08	/* concurrent mode */
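
/*
 * Illustrative sketch, not part of this header: one way the options
 * bitmask could be built and tested.  The demo_* typedefs stand in for
 * PostgreSQL's bits32 and Oid (both assumed to be 32-bit unsigned
 * integers), and DemoReindexParams and demo_reindex_option_set() are
 * hypothetical names invented for this example.  The bit values are
 * copied from the REINDEXOPT_* macros above.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for bits32 and Oid so that the sketch compiles on its own. */
typedef uint32_t demo_bits32;
typedef uint32_t demo_oid;

#define DEMO_INVALID_OID				((demo_oid) 0)

/* Same bit values as the REINDEXOPT_* macros. */
#define DEMO_REINDEXOPT_VERBOSE			0x01
#define DEMO_REINDEXOPT_REPORT_PROGRESS	0x02
#define DEMO_REINDEXOPT_MISSING_OK		0x04
#define DEMO_REINDEXOPT_CONCURRENTLY	0x08

typedef struct DemoReindexParams
{
	demo_bits32 options;		/* bitmask of DEMO_REINDEXOPT_* */
	demo_oid	tablespaceOid;	/* DEMO_INVALID_OID to do nothing */
} DemoReindexParams;

/* Return true if every bit in "option" is set in params->options. */
static inline bool
demo_reindex_option_set(const DemoReindexParams *params, demo_bits32 option)
{
	assert(params != NULL);
	return (params->options & option) == option;
}
```

/*
 * A caller would OR the wanted bits together when filling in the struct
 * and test them individually later, leaving tablespaceOid invalid when no
 * tablespace move is requested.
 */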

/* state info for validate_index bulkdelete callback */
typedef struct ValidateIndexState
{
	Tuplesortstate *tuplesort;	/* for sorting the index TIDs */
	/* statistics (for debug purposes only): */
	double		htups,
				itups,
				tups_inserted;
} ValidateIndexState;

extern void index_check_primary_key(Relation heapRel,
									const IndexInfo *indexInfo,
									bool is_alter_table,
									const IndexStmt *stmt);

#define INDEX_CREATE_IS_PRIMARY				(1 << 0)
#define INDEX_CREATE_ADD_CONSTRAINT			(1 << 1)
#define INDEX_CREATE_SKIP_BUILD				(1 << 2)
#define INDEX_CREATE_CONCURRENT				(1 << 3)
#define INDEX_CREATE_IF_NOT_EXISTS			(1 << 4)
#define INDEX_CREATE_PARTITIONED			(1 << 5)
#define INDEX_CREATE_INVALID				(1 << 6)

extern Oid index_create(Relation heapRelation,
						const char *indexRelationName,
						Oid indexRelationId,
						Oid parentIndexRelid,
						Oid parentConstraintId,
						RelFileNumber relFileNumber,
						IndexInfo *indexInfo,
						const List *indexColNames,
						Oid accessMethodId,
						Oid tableSpaceId,
						const Oid *collationIds,
						const Oid *opclassIds,
						const Datum *opclassOptions,
						const int16 *coloptions,
						const NullableDatum *stattargets,
						Datum reloptions,
						bits16 flags,
						bits16 constr_flags,
						bool allow_system_table_mods,
						bool is_internal,
						Oid *constraintId);

#define INDEX_CONSTR_CREATE_MARK_AS_PRIMARY		(1 << 0)
#define INDEX_CONSTR_CREATE_DEFERRABLE			(1 << 1)
#define INDEX_CONSTR_CREATE_INIT_DEFERRED		(1 << 2)
#define INDEX_CONSTR_CREATE_UPDATE_INDEX		(1 << 3)
#define INDEX_CONSTR_CREATE_REMOVE_OLD_DEPS		(1 << 4)
#define INDEX_CONSTR_CREATE_WITHOUT_OVERLAPS	(1 << 5)

extern Oid index_concurrently_create_copy(Relation heapRelation,
										  Oid oldIndexId,
										  Oid tablespaceOid,
										  const char *newName);

extern void index_concurrently_build(Oid heapRelationId,
									 Oid indexRelationId);

extern void index_concurrently_swap(Oid newIndexId,
									Oid oldIndexId,
									const char *oldName);

extern void index_concurrently_set_dead(Oid heapId,
										Oid indexId);

extern ObjectAddress index_constraint_create(Relation heapRelation,
											 Oid indexRelationId,
											 Oid parentConstraintId,
											 const IndexInfo *indexInfo,
											 const char *constraintName,
											 char constraintType,
											 bits16 constr_flags,
											 bool allow_system_table_mods,
											 bool is_internal);

extern void index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode);

extern IndexInfo *BuildIndexInfo(Relation index);

extern IndexInfo *BuildDummyIndexInfo(Relation index);

extern bool CompareIndexInfo(const IndexInfo *info1, const IndexInfo *info2,
							 const Oid *collations1, const Oid *collations2,
							 const Oid *opfamilies1, const Oid *opfamilies2,
							 const AttrMap *attmap);

extern void BuildSpeculativeIndexInfo(Relation index, IndexInfo *ii);

extern void FormIndexDatum(IndexInfo *indexInfo,
						   TupleTableSlot *slot,
						   EState *estate,
						   Datum *values,
						   bool *isnull);

extern void index_build(Relation heapRelation,
						Relation indexRelation,
						IndexInfo *indexInfo,
						bool isreindex,
						bool parallel);

extern void validate_index(Oid heapId, Oid indexId, Snapshot snapshot);

extern void index_set_state_flags(Oid indexId, IndexStateFlagsAction action);

extern Oid IndexGetRelation(Oid indexId, bool missing_ok);

extern void reindex_index(const ReindexStmt *stmt, Oid indexId,
						  bool skip_constraint_checks, char persistence,
						  const ReindexParams *params);

/* Flag bits for reindex_relation(): */
#define REINDEX_REL_PROCESS_TOAST			0x01
#define REINDEX_REL_SUPPRESS_INDEX_USE		0x02
#define REINDEX_REL_CHECK_CONSTRAINTS		0x04
#define REINDEX_REL_FORCE_INDEXES_UNLOGGED	0x08
#define REINDEX_REL_FORCE_INDEXES_PERMANENT	0x10

extern bool reindex_relation(const ReindexStmt *stmt, Oid relid, int flags,
							 const ReindexParams *params);

extern bool ReindexIsProcessingHeap(Oid heapOid);
extern bool ReindexIsProcessingIndex(Oid indexOid);

extern void ResetReindexState(int nestLevel);
extern Size EstimateReindexStateSpace(void);
extern void SerializeReindexState(Size maxsize, char *start_address);
extern void RestoreReindexState(const void *reindexstate);

extern void IndexSetParentIndex(Relation partitionIdx, Oid parentOid);


/*
 * itemptr_encode - Encode ItemPointer as int64/int8
 *
 * This representation must produce values encoded as int64 that sort in the
 * same order as their corresponding original TID values would (using the
 * default int8 opclass to produce a result equivalent to the default TID
 * opclass).
 *
 * As noted in validate_index(), sorting the encoded int64 values can be
 * significantly faster than sorting the item pointers directly.
 */
static inline int64
itemptr_encode(ItemPointer itemptr)
{
	BlockNumber block = ItemPointerGetBlockNumber(itemptr);
	OffsetNumber offset = ItemPointerGetOffsetNumber(itemptr);
	int64		encoded;

	/*
	 * Use the 16 least significant bits for the offset.  The 32 adjacent
	 * bits are used for the block number.  Since the remaining 16 high bits
	 * are unused, there cannot be negative encoded values (we assume a two's
	 * complement representation).
	 */
	encoded = ((uint64) block << 16) | (uint16) offset;

	return encoded;
}

/*
 * itemptr_decode - Decode int64/int8 representation back to ItemPointer
 */
static inline void
itemptr_decode(ItemPointer itemptr, int64 encoded)
{
	BlockNumber block = (BlockNumber) (encoded >> 16);
	OffsetNumber offset = (OffsetNumber) (encoded & 0xFFFF);

	ItemPointerSet(itemptr, block, offset);
}
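
/*
 * Illustrative sketch, not part of this header: a standalone version of
 * the same bit layout, useful for checking by hand that the encoding
 * round-trips and preserves (block, offset) ordering.  DemoTid and the
 * demo_tid_* functions are hypothetical names invented for this example;
 * DemoTid stands in for ItemPointerData and is assumed to hold a 32-bit
 * block number and a 16-bit offset.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for ItemPointerData: 32-bit block number, 16-bit offset. */
typedef struct DemoTid
{
	uint32_t	block;
	uint16_t	offset;
} DemoTid;

/* Pack block into bits 16..47 and offset into bits 0..15. */
static inline int64_t
demo_tid_encode(DemoTid tid)
{
	return (int64_t) (((uint64_t) tid.block << 16) | (uint64_t) tid.offset);
}

/* Reverse of demo_tid_encode(); the input must be non-negative. */
static inline DemoTid
demo_tid_decode(int64_t encoded)
{
	DemoTid		tid;

	assert(encoded >= 0);
	tid.block = (uint32_t) (encoded >> 16);
	tid.offset = (uint16_t) (encoded & 0xFFFF);
	return tid;
}

/* Order two TIDs by block, then by offset: returns -1, 0 or 1. */
static inline int
demo_tid_compare(DemoTid a, DemoTid b)
{
	if (a.block != b.block)
		return (a.block < b.block) ? -1 : 1;
	if (a.offset != b.offset)
		return (a.offset < b.offset) ? -1 : 1;
	return 0;
}
```

/*
 * Because the block number occupies the more significant bits, comparing
 * two encoded values as plain integers gives the same answer as
 * demo_tid_compare() on the original pairs, and the top 16 bits stay
 * zero, so the result is never negative.
 */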

#endif							/* INDEX_H */