postgresql/src/include/catalog/index.h

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

220 lines
6.7 KiB
C
Raw Normal View History

/*-------------------------------------------------------------------------
*
* index.h
* prototypes for catalog/index.c.
*
*
* Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
2010-09-20 22:08:53 +02:00
* src/include/catalog/index.h
*
*-------------------------------------------------------------------------
*/
#ifndef INDEX_H
#define INDEX_H
#include "catalog/objectaddress.h"
1999-07-16 19:07:40 +02:00
#include "nodes/execnodes.h"
#define DEFAULT_INDEX_TYPE "btree"
Fix assorted bugs in CREATE/DROP INDEX CONCURRENTLY. Commit 8cb53654dbdb4c386369eb988062d0bbb6de725e, which introduced DROP INDEX CONCURRENTLY, managed to break CREATE INDEX CONCURRENTLY via a poor choice of catalog state representation. The pg_index state for an index that's reached the final pre-drop stage was the same as the state for an index just created by CREATE INDEX CONCURRENTLY. This meant that the (necessary) change to make RelationGetIndexList ignore about-to-die indexes also made it ignore freshly-created indexes; which is catastrophic because the latter do need to be considered in HOT-safety decisions. Failure to do so leads to incorrect index entries and subsequently wrong results from queries depending on the concurrently-created index. To fix, add an additional boolean column "indislive" to pg_index, so that the freshly-created and about-to-die states can be distinguished. (This change obviously is only possible in HEAD. This patch will need to be back-patched, but in 9.2 we'll use a kluge consisting of overloading the formerly-impossible state of indisvalid = true and indisready = false.) In addition, change CREATE/DROP INDEX CONCURRENTLY so that the pg_index flag changes they make without exclusive lock on the index are made via heap_inplace_update() rather than a normal transactional update. The latter is not very safe because moving the pg_index tuple could result in concurrent SnapshotNow scans finding it twice or not at all, thus possibly resulting in index corruption. This is a pre-existing bug in CREATE INDEX CONCURRENTLY, which was copied into the DROP code. In addition, fix various places in the code that ought to check to make sure that the indexes they are manipulating are valid and/or ready as appropriate. These represent bugs that have existed since 8.2, since a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid index behind, and we ought not try to do anything that might fail with such an index. Also fix RelationReloadIndexInfo to ensure it copies all the pg_index columns that are allowed to change after initial creation. Previously we could have been left with stale values of some fields in an index relcache entry. It's not clear whether this actually had any user-visible consequences, but it's at least a bug waiting to happen. In addition, do some code and docs review for DROP INDEX CONCURRENTLY; some cosmetic code cleanup but mostly addition and revision of comments. This will need to be back-patched, but in a noticeably different form, so I'm committing it to HEAD before working on the back-patch. Problem reported by Amit Kapila, diagnosis by Pavan Deolassee, fix by Tom Lane and Andres Freund.
2012-11-29 03:25:27 +01:00
/* Action code for index_set_state_flags */
typedef enum
{
INDEX_CREATE_SET_READY,
INDEX_CREATE_SET_VALID,
INDEX_DROP_CLEAR_VALID,
INDEX_DROP_SET_DEAD,
} IndexStateFlagsAction;
/* options for REINDEX */
typedef struct ReindexParams
{
bits32 options; /* bitmask of REINDEXOPT_* */
Oid tablespaceOid; /* New tablespace to move indexes to.
* InvalidOid to do nothing. */
} ReindexParams;
/* flag bits for ReindexParams->flags */
#define REINDEXOPT_VERBOSE 0x01 /* print progress info */
#define REINDEXOPT_REPORT_PROGRESS 0x02 /* report pgstat progress */
#define REINDEXOPT_MISSING_OK 0x04 /* skip missing relations */
#define REINDEXOPT_CONCURRENTLY 0x08 /* concurrent mode */
/* state info for validate_index bulkdelete callback */
typedef struct ValidateIndexState
{
Tuplesortstate *tuplesort; /* for sorting the index TIDs */
/* statistics (for debug purposes only): */
double htups,
itups,
tups_inserted;
} ValidateIndexState;
extern void index_check_primary_key(Relation heapRel,
const IndexInfo *indexInfo,
bool is_alter_table,
const IndexStmt *stmt);
#define INDEX_CREATE_IS_PRIMARY (1 << 0)
#define INDEX_CREATE_ADD_CONSTRAINT (1 << 1)
#define INDEX_CREATE_SKIP_BUILD (1 << 2)
#define INDEX_CREATE_CONCURRENT (1 << 3)
#define INDEX_CREATE_IF_NOT_EXISTS (1 << 4)
#define INDEX_CREATE_PARTITIONED (1 << 5)
#define INDEX_CREATE_INVALID (1 << 6)
extern Oid index_create(Relation heapRelation,
const char *indexRelationName,
Oid indexRelationId,
Oid parentIndexRelid,
Oid parentConstraintId,
Change internal RelFileNode references to RelFileNumber or RelFileLocator. We have been using the term RelFileNode to refer to either (1) the integer that is used to name the sequence of files for a certain relation within the directory set aside for that tablespace/database combination; or (2) that value plus the OIDs of the tablespace and database; or occasionally (3) the whole series of files created for a relation based on those values. Using the same name for more than one thing is confusing. Replace RelFileNode with RelFileNumber when we're talking about just the single number, i.e. (1) from above, and with RelFileLocator when we're talking about all the things that are needed to locate a relation's files on disk, i.e. (2) from above. In the places where we refer to (3) as a relfilenode, instead refer to "relation storage". Since there is a ton of SQL code in the world that knows about pg_class.relfilenode, don't change the name of that column, or of other SQL-facing things that derive their name from it. On the other hand, do adjust closely-related internal terminology. For example, the structure member names dbNode and spcNode appear to be derived from the fact that the structure itself was called RelFileNode, so change those to dbOid and spcOid. Likewise, various variables with names like rnode and relnode get renamed appropriately, according to how they're being used in context. Hopefully, this is clearer than before. It is also preparation for future patches that intend to widen the relfilenumber fields from its current width of 32 bits. Variables that store a relfilenumber are now declared as type RelFileNumber rather than type Oid; right now, these are the same, but that can now more easily be changed. Dilip Kumar, per an idea from me. Reviewed also by Andres Freund. I fixed some whitespace issues, changed a couple of words in a comment, and made one other minor correction. Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
RelFileNumber relFileNumber,
IndexInfo *indexInfo,
const List *indexColNames,
Oid accessMethodId,
Oid tableSpaceId,
const Oid *collationIds,
const Oid *opclassIds,
const Datum *opclassOptions,
const int16 *coloptions,
const NullableDatum *stattargets,
Datum reloptions,
bits16 flags,
bits16 constr_flags,
bool allow_system_table_mods,
bool is_internal,
Oid *constraintId);
#define INDEX_CONSTR_CREATE_MARK_AS_PRIMARY (1 << 0)
#define INDEX_CONSTR_CREATE_DEFERRABLE (1 << 1)
#define INDEX_CONSTR_CREATE_INIT_DEFERRED (1 << 2)
#define INDEX_CONSTR_CREATE_UPDATE_INDEX (1 << 3)
#define INDEX_CONSTR_CREATE_REMOVE_OLD_DEPS (1 << 4)
#define INDEX_CONSTR_CREATE_WITHOUT_OVERLAPS (1 << 5)
extern Oid index_concurrently_create_copy(Relation heapRelation,
Oid oldIndexId,
Oid tablespaceOid,
const char *newName);
extern void index_concurrently_build(Oid heapRelationId,
Oid indexRelationId);
extern void index_concurrently_swap(Oid newIndexId,
Oid oldIndexId,
const char *oldName);
extern void index_concurrently_set_dead(Oid heapId,
Oid indexId);
extern ObjectAddress index_constraint_create(Relation heapRelation,
Oid indexRelationId,
Oid parentConstraintId,
const IndexInfo *indexInfo,
const char *constraintName,
char constraintType,
bits16 constr_flags,
bool allow_system_table_mods,
bool is_internal);
extern void index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode);
extern IndexInfo *BuildIndexInfo(Relation index);
Fix misbehavior with expression indexes on ON COMMIT DELETE ROWS tables. We implement ON COMMIT DELETE ROWS by truncating tables marked that way, which requires also truncating/rebuilding their indexes. But RelationTruncateIndexes asks the relcache for up-to-date copies of any index expressions, which may cause execution of eval_const_expressions on them, which can result in actual execution of subexpressions. This is a bad thing to have happening during ON COMMIT. Manuel Rigger reported that use of a SQL function resulted in crashes due to expectations that ActiveSnapshot would be set, which it isn't. The most obvious fix perhaps would be to push a snapshot during PreCommit_on_commit_actions, but I think that would just open the door to more problems: CommitTransaction explicitly expects that no user-defined code can be running at this point. Fortunately, since we know that no tuples exist to be indexed, there seems no need to use the real index expressions or predicates during RelationTruncateIndexes. We can set up dummy index expressions instead (we do need something that will expose the right data type, as there are places that build index tupdescs based on this), and just ignore predicates and exclusion constraints. In a green field it'd likely be better to reimplement ON COMMIT DELETE ROWS using the same "init fork" infrastructure used for unlogged relations. That seems impractical without catalog changes though, and even without that it'd be too big a change to back-patch. So for now do it like this. Per private report from Manuel Rigger. This has been broken forever, so back-patch to all supported branches.
2019-12-01 19:09:26 +01:00
extern IndexInfo *BuildDummyIndexInfo(Relation index);
extern bool CompareIndexInfo(const IndexInfo *info1, const IndexInfo *info2,
const Oid *collations1, const Oid *collations2,
const Oid *opfamilies1, const Oid *opfamilies2,
const AttrMap *attmap);
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
extern void BuildSpeculativeIndexInfo(Relation index, IndexInfo *ii);
extern void FormIndexDatum(IndexInfo *indexInfo,
TupleTableSlot *slot,
EState *estate,
Datum *values,
bool *isnull);
extern void index_build(Relation heapRelation,
Relation indexRelation,
IndexInfo *indexInfo,
bool isreindex,
bool parallel);
extern void validate_index(Oid heapId, Oid indexId, Snapshot snapshot);
Fix assorted bugs in CREATE/DROP INDEX CONCURRENTLY. Commit 8cb53654dbdb4c386369eb988062d0bbb6de725e, which introduced DROP INDEX CONCURRENTLY, managed to break CREATE INDEX CONCURRENTLY via a poor choice of catalog state representation. The pg_index state for an index that's reached the final pre-drop stage was the same as the state for an index just created by CREATE INDEX CONCURRENTLY. This meant that the (necessary) change to make RelationGetIndexList ignore about-to-die indexes also made it ignore freshly-created indexes; which is catastrophic because the latter do need to be considered in HOT-safety decisions. Failure to do so leads to incorrect index entries and subsequently wrong results from queries depending on the concurrently-created index. To fix, add an additional boolean column "indislive" to pg_index, so that the freshly-created and about-to-die states can be distinguished. (This change obviously is only possible in HEAD. This patch will need to be back-patched, but in 9.2 we'll use a kluge consisting of overloading the formerly-impossible state of indisvalid = true and indisready = false.) In addition, change CREATE/DROP INDEX CONCURRENTLY so that the pg_index flag changes they make without exclusive lock on the index are made via heap_inplace_update() rather than a normal transactional update. The latter is not very safe because moving the pg_index tuple could result in concurrent SnapshotNow scans finding it twice or not at all, thus possibly resulting in index corruption. This is a pre-existing bug in CREATE INDEX CONCURRENTLY, which was copied into the DROP code. In addition, fix various places in the code that ought to check to make sure that the indexes they are manipulating are valid and/or ready as appropriate. These represent bugs that have existed since 8.2, since a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid index behind, and we ought not try to do anything that might fail with such an index. Also fix RelationReloadIndexInfo to ensure it copies all the pg_index columns that are allowed to change after initial creation. Previously we could have been left with stale values of some fields in an index relcache entry. It's not clear whether this actually had any user-visible consequences, but it's at least a bug waiting to happen. In addition, do some code and docs review for DROP INDEX CONCURRENTLY; some cosmetic code cleanup but mostly addition and revision of comments. This will need to be back-patched, but in a noticeably different form, so I'm committing it to HEAD before working on the back-patch. Problem reported by Amit Kapila, diagnosis by Pavan Deolassee, fix by Tom Lane and Andres Freund.
2012-11-29 03:25:27 +01:00
extern void index_set_state_flags(Oid indexId, IndexStateFlagsAction action);
extern Oid IndexGetRelation(Oid indexId, bool missing_ok);
extern void reindex_index(const ReindexStmt *stmt, Oid indexId,
bool skip_constraint_checks, char persistence,
const ReindexParams *params);
/* Flag bits for reindex_relation(): */
#define REINDEX_REL_PROCESS_TOAST 0x01
#define REINDEX_REL_SUPPRESS_INDEX_USE 0x02
#define REINDEX_REL_CHECK_CONSTRAINTS 0x04
#define REINDEX_REL_FORCE_INDEXES_UNLOGGED 0x08
#define REINDEX_REL_FORCE_INDEXES_PERMANENT 0x10
extern bool reindex_relation(const ReindexStmt *stmt, Oid relid, int flags,
const ReindexParams *params);
extern bool ReindexIsProcessingHeap(Oid heapOid);
extern bool ReindexIsProcessingIndex(Oid indexOid);
extern void ResetReindexState(int nestLevel);
extern Size EstimateReindexStateSpace(void);
extern void SerializeReindexState(Size maxsize, char *start_address);
extern void RestoreReindexState(const void *reindexstate);
extern void IndexSetParentIndex(Relation partitionIdx, Oid parentOid);
/*
* itemptr_encode - Encode ItemPointer as int64/int8
*
* This representation must produce values encoded as int64 that sort in the
* same order as their corresponding original TID values would (using the
* default int8 opclass to produce a result equivalent to the default TID
* opclass).
*
* As noted in validate_index(), this can be significantly faster.
*/
static inline int64
itemptr_encode(ItemPointer itemptr)
{
BlockNumber block = ItemPointerGetBlockNumber(itemptr);
OffsetNumber offset = ItemPointerGetOffsetNumber(itemptr);
int64 encoded;
/*
* Use the 16 least significant bits for the offset. 32 adjacent bits are
* used for the block number. Since remaining bits are unused, there
* cannot be negative encoded values (We assume a two's complement
* representation).
*/
encoded = ((uint64) block << 16) | (uint16) offset;
return encoded;
}
/*
* itemptr_decode - Decode int64/int8 representation back to ItemPointer
*/
static inline void
itemptr_decode(ItemPointer itemptr, int64 encoded)
{
BlockNumber block = (BlockNumber) (encoded >> 16);
OffsetNumber offset = (OffsetNumber) (encoded & 0xFFFF);
ItemPointerSet(itemptr, block, offset);
}
#endif /* INDEX_H */