1996-08-28 03:59:28 +02:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
1999-02-14 00:22:53 +01:00
|
|
|
* relcache.h
|
1997-09-07 07:04:48 +02:00
|
|
|
* Relation descriptor cache definitions.
|
1996-08-28 03:59:28 +02:00
|
|
|
*
|
|
|
|
*
|
2020-01-01 18:21:45 +01:00
|
|
|
* Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
|
2000-01-26 06:58:53 +01:00
|
|
|
* Portions Copyright (c) 1994, Regents of the University of California
|
1996-08-28 03:59:28 +02:00
|
|
|
*
|
2010-09-20 22:08:53 +02:00
|
|
|
* src/include/utils/relcache.h
|
1996-08-28 03:59:28 +02:00
|
|
|
*
|
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
|
1997-09-07 07:04:48 +02:00
|
|
|
#ifndef RELCACHE_H
|
1996-08-28 03:59:28 +02:00
|
|
|
#define RELCACHE_H
|
|
|
|
|
Implement operator class parameters
PostgreSQL provides a set of template index access methods, where opclasses have
much freedom in the semantics of indexing. These index AMs are GiST, GIN,
SP-GiST and BRIN. Their opclasses define the representation of keys, operations on
them and supported search strategies. So, it's natural that opclasses may
face some tradeoffs, which require a user-side decision. This commit implements
opclass parameters allowing users to set some values, which tell the opclass how to
index the particular dataset.
This commit doesn't introduce new storage in the system catalog. Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.
In order to avoid changing the signature of each opclass support function, we
implement a unified way to pass options to opclass support functions. Options
are set to fn_expr as the constant bytea expression. It's possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.
This commit comes with some examples of opclass options usage. We parametrize
signature length in GiST. That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops. Also we parametrize maximum number of integer ranges for
gist__int_ops. However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.
Catversion is bumped.
Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviewed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 18:17:11 +02:00
|
|
|
#include "postgres.h"
|
2008-06-19 02:46:06 +02:00
|
|
|
#include "access/tupdesc.h"
|
|
|
|
#include "nodes/bitmapset.h"
|
|
|
|
|
|
|
|
|
2017-11-07 18:28:35 +01:00
|
|
|
/*
|
|
|
|
* Name of relcache init file(s), used to speed up backend startup
|
|
|
|
*/
|
|
|
|
#define RELCACHE_INIT_FILENAME "pg_internal.init"
|
|
|
|
|
2008-06-19 02:46:06 +02:00
|
|
|
typedef struct RelationData *Relation;
|
|
|
|
|
|
|
|
/* ----------------
|
|
|
|
* RelationPtr is used in the executor to support index scans
|
|
|
|
* where we have to keep track of several index relations in an
|
2014-05-06 18:12:18 +02:00
|
|
|
* array. -cim 9/10/89
|
2008-06-19 02:46:06 +02:00
|
|
|
* ----------------
|
|
|
|
*/
|
|
|
|
typedef Relation *RelationPtr;
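The two typedefs above make `Relation` an opaque handle: this header names `struct RelationData` without defining it, so callers can pass handles (and arrays of handles) around without seeing the fields. A minimal single-file sketch of the same pattern follows. Every `Demo*` name and the struct body are hypothetical, invented for illustration, and are not PostgreSQL's definitions.

```c
#include <stddef.h>

/* "Header" part: the struct is only named here, never defined. */
typedef struct DemoRelationData *DemoRelation;
typedef DemoRelation *DemoRelationPtr;	/* array of handles, like RelationPtr */

unsigned int demo_relation_oid(DemoRelation rel);
size_t		demo_count_open(DemoRelationPtr rels, size_t n);

/* "Implementation" part: only this code knows the layout (hypothetical). */
struct DemoRelationData
{
	unsigned int oid;
};

/* Accessor, so callers never touch the fields directly. */
unsigned int
demo_relation_oid(DemoRelation rel)
{
	return rel->oid;
}

/* Count the non-NULL handles in an array of n handles. */
size_t
demo_count_open(DemoRelationPtr rels, size_t n)
{
	size_t		count = 0;

	for (size_t i = 0; i < n; i++)
		if (rels[i] != NULL)
			count++;
	return count;
}
```

Keeping the struct definition out of the header means code that only holds a handle does not need to be recompiled when the layout changes.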
|
1996-08-28 03:59:28 +02:00
|
|
|
|
|
|
|
/*
|
2006-07-31 22:09:10 +02:00
|
|
|
* Routines to open (lookup) and close a relcache entry
|
1996-08-28 03:59:28 +02:00
|
|
|
*/
|
|
|
|
extern Relation RelationIdGetRelation(Oid relationId);
|
1997-09-08 04:41:22 +02:00
|
|
|
extern void RelationClose(Relation relation);
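The open and close routines above are meant to be used as a pair. The sketch below illustrates one way such a pairing can work, assuming reference-count semantics (each open pins the entry, each close unpins it). That assumption, the fixed two-entry table, and all `demo_*` names are hypothetical and are not taken from this header.

```c
#include <stddef.h>

/* Hypothetical cache entry: an identifier plus a pin count. */
typedef struct DemoEntry
{
	unsigned int oid;
	int			refcnt;
} DemoEntry;

/* Toy "cache" with two preloaded entries (illustrative OIDs). */
static DemoEntry demo_cache[] = {{1259, 0}, {1249, 0}};

/* Look up an entry by OID and pin it; returns NULL if not found. */
DemoEntry *
demo_open(unsigned int oid)
{
	size_t		n = sizeof(demo_cache) / sizeof(demo_cache[0]);

	for (size_t i = 0; i < n; i++)
	{
		if (demo_cache[i].oid == oid)
		{
			demo_cache[i].refcnt++;
			return &demo_cache[i];
		}
	}
	return NULL;
}

/* Release one pin taken by demo_open(); ignores NULL and unpinned entries. */
void
demo_close(DemoEntry *entry)
{
	if (entry != NULL && entry->refcnt > 0)
		entry->refcnt--;
}
```

In this model, an entry whose count has dropped back to zero is one the cache would be free to flush or rebuild.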
|
1999-10-04 01:55:40 +02:00
|
|
|
|
2000-06-17 23:49:04 +02:00
|
|
|
/*
|
|
|
|
* Routines to compute/retrieve additional cached information
|
|
|
|
*/
|
2016-06-18 21:22:34 +02:00
|
|
|
extern List *RelationGetFKeyList(Relation relation);
|
2000-06-17 23:49:04 +02:00
|
|
|
extern List *RelationGetIndexList(Relation relation);
|
Implement multivariate n-distinct coefficients
Add support for explicitly declared statistic objects (CREATE
STATISTICS), allowing collection of statistics on more complex
combinations than individual table columns. Companion commands DROP
STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are
added too. All this DDL has been designed so that more statistic types
can be added later on, such as multivariate most-common-values and
multivariate histograms between columns of a single table, leaving room
for permitting columns on multiple tables, too, as well as expressions.
This commit only adds support for collection of n-distinct coefficients
on user-specified sets of columns in a single table. This is useful to
estimate number of distinct groups in GROUP BY and DISTINCT clauses;
estimation errors there can cause over-allocation of memory in hashed
aggregates, for instance, so it's a worthwhile problem to solve. A new
special pseudo-type pg_ndistinct is used.
(num-distinct estimation was deemed sufficiently useful by itself that
this is worthwhile even if no further statistic types are added
immediately; so much so that another version of essentially the same
functionality was submitted by Kyotaro Horiguchi:
https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp
though this commit does not use that code.)
Author: Tomas Vondra. Some code rework by Álvaro.
Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes,
Ideriha Takeshi
Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz
https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 18:06:10 +01:00
|
|
|
extern List *RelationGetStatExtList(Relation relation);
|
2017-01-19 18:00:00 +01:00
|
|
|
extern Oid RelationGetPrimaryKeyIndex(Relation relation);
|
2014-05-14 20:55:48 +02:00
|
|
|
extern Oid RelationGetReplicaIndex(Relation relation);
|
2003-05-28 18:04:02 +02:00
|
|
|
extern List *RelationGetIndexExpressions(Relation relation);
|
Fix misbehavior with expression indexes on ON COMMIT DELETE ROWS tables.
We implement ON COMMIT DELETE ROWS by truncating tables marked that
way, which requires also truncating/rebuilding their indexes. But
RelationTruncateIndexes asks the relcache for up-to-date copies of any
index expressions, which may cause execution of eval_const_expressions
on them, which can result in actual execution of subexpressions.
This is a bad thing to have happening during ON COMMIT. Manuel Rigger
reported that use of a SQL function resulted in crashes due to
expectations that ActiveSnapshot would be set, which it isn't.
The most obvious fix perhaps would be to push a snapshot during
PreCommit_on_commit_actions, but I think that would just open the door
to more problems: CommitTransaction explicitly expects that no
user-defined code can be running at this point.
Fortunately, since we know that no tuples exist to be indexed, there
seems no need to use the real index expressions or predicates during
RelationTruncateIndexes. We can set up dummy index expressions
instead (we do need something that will expose the right data type,
as there are places that build index tupdescs based on this), and
just ignore predicates and exclusion constraints.
In a green field it'd likely be better to reimplement ON COMMIT DELETE
ROWS using the same "init fork" infrastructure used for unlogged
relations. That seems impractical without catalog changes though,
and even without that it'd be too big a change to back-patch.
So for now do it like this.
Per private report from Manuel Rigger. This has been broken forever,
so back-patch to all supported branches.
2019-12-01 19:09:26 +01:00
|
|
|
extern List *RelationGetDummyIndexExpressions(Relation relation);
|
2003-05-28 18:04:02 +02:00
|
|
|
extern List *RelationGetIndexPredicate(Relation relation);
|
2020-03-30 18:17:11 +02:00
|
|
|
extern Datum *RelationGetIndexRawAttOptions(Relation relation);
|
|
|
|
extern bytea **RelationGetIndexAttOptions(Relation relation, bool copy);
|
Add new wal_level, logical, sufficient for logical decoding.
When wal_level=logical, we'll log columns from the old tuple as
configured by the REPLICA IDENTITY facility added in commit
07cacba983ef79be4a84fcd0e0ca3b5fcb85dd65. This makes it possible for
a properly-configured logical replication solution to correctly
follow table updates even if they change the chosen key columns,
or, with REPLICA IDENTITY FULL, even if the table has no key at
all. Note that updates which do not modify the replica identity
column won't log anything extra, making the choice of a good key
(i.e. one that will rarely be changed) important to performance
when wal_level=logical is configured.
Each insert, update, or delete to a catalog table will also log
the CMIN and/or CMAX values stamped by the current transaction.
This is necessary because logical decoding will require access to
historical snapshots of the catalog in order to decode some data
types, and the CMIN/CMAX values that we may need in order to judge
row visibility may have been overwritten by the time we need them.
Andres Freund, reviewed in various versions by myself, Heikki
Linnakangas, KONDO Mitsumasa, and many others.
2013-12-11 00:33:45 +01:00
|
|
|
|
|
|
|
typedef enum IndexAttrBitmapKind
|
|
|
|
{
|
2019-01-15 18:07:10 +01:00
|
|
|
INDEX_ATTR_BITMAP_ALL,
|
2013-12-11 00:33:45 +01:00
|
|
|
INDEX_ATTR_BITMAP_KEY,
|
2017-01-19 18:00:00 +01:00
|
|
|
INDEX_ATTR_BITMAP_PRIMARY_KEY,
|
2013-12-11 00:33:45 +01:00
|
|
|
INDEX_ATTR_BITMAP_IDENTITY_KEY
|
|
|
|
} IndexAttrBitmapKind;
|
|
|
|
|
|
|
|
extern Bitmapset *RelationGetIndexAttrBitmap(Relation relation,
|
2019-07-22 03:01:50 +02:00
|
|
|
IndexAttrBitmapKind attrKind);
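The enum above selects which set of indexed columns the caller wants. The sketch below shows how a caller might use such a per-kind column set to decide whether an update touched the replica identity columns. It is a simplification under stated assumptions: a 64-bit mask stands in for the variable-length `Bitmapset`, bit i stands for column i, and every `Demo*` name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the four bitmap kinds declared above. */
typedef enum DemoAttrKind
{
	DEMO_ATTR_ALL,
	DEMO_ATTR_KEY,
	DEMO_ATTR_PRIMARY_KEY,
	DEMO_ATTR_IDENTITY_KEY
} DemoAttrKind;

/* One cached column mask per kind (stand-in for cached Bitmapsets). */
typedef struct DemoRel
{
	uint64_t	attrs[4];
} DemoRel;

/* Return the column mask for the requested kind. */
uint64_t
demo_get_attr_bitmap(const DemoRel *rel, DemoAttrKind kind)
{
	return rel->attrs[kind];
}

/* True if any modified column belongs to the identity key. */
bool
demo_update_touches_identity(const DemoRel *rel, uint64_t modified)
{
	return (demo_get_attr_bitmap(rel, DEMO_ATTR_IDENTITY_KEY) & modified) != 0;
}
```

An intersection test like this is the kind of check that lets an update which leaves the identity columns alone skip any extra logging.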
|
2013-12-11 00:33:45 +01:00
|
|
|
|
2009-12-07 06:22:23 +01:00
|
|
|
extern void RelationGetExclusionInfo(Relation indexRelation,
|
2019-05-22 19:04:48 +02:00
|
|
|
Oid **operators,
|
|
|
|
Oid **procs,
|
|
|
|
uint16 **strategies);
|
2000-06-17 23:49:04 +02:00
|
|
|
|
2001-10-07 01:21:45 +02:00
|
|
|
extern void RelationInitIndexAccessInfo(Relation relation);
|
|
|
|
|
2017-01-19 18:00:00 +01:00
|
|
|
/* caller must include pg_publication.h */
|
|
|
|
struct PublicationActions;
|
|
|
|
extern struct PublicationActions *GetRelationPublicationActions(Relation relation);
|
|
|
|
|
tableam: introduce table AM infrastructure.
This introduces the concept of table access methods, i.e. CREATE
ACCESS METHOD ... TYPE TABLE and
CREATE TABLE ... USING (storage-engine).
No table access functionality is delegated to table AMs as of this
commit; that will be done in following commits.
Subsequent commits will incrementally abstract table access
functionality to be routed through table access methods. That change
is too large to be reviewed & committed at once, so it'll be done
incrementally.
Docs will be updated at the end, as adding them incrementally would
likely make them less coherent, and definitely is a lot more work,
without a lot of benefit.
Table access methods are specified similarly to index access methods,
i.e. pg_am.amhandler returns, as INTERNAL, a pointer to a struct with
callbacks. In contrast to index AMs, that struct needs to live as long
as a backend; typically that's achieved by just returning a pointer to
a constant struct.
Psql's \d+ now displays a table's access method. That can be disabled
with HIDE_TABLEAM=true, which is mainly useful so regression tests can
be run against different AMs. It's quite possible that this behaviour
still needs to be fine tuned.
For now it's not allowed to set a table AM for a partitioned table, as
we've not resolved how partitions would inherit that. Disallowing
allows us to introduce, if we decide that's the way forward, such a
behaviour without a compatibility break.
Catversion bumped, to add the heap table AM and references to it.
Author: Haribabu Kommi, Andres Freund, Alvaro Herrera, Dimitri Golgov and others
Discussion:
https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
https://postgr.es/m/20190107235616.6lur25ph22u5u5av@alap3.anarazel.de
https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 18:54:38 +01:00
|
|
|
extern void RelationInitTableAccessMethod(Relation relation);
|
|
|
|
|
Provide database object names as separate fields in error messages.
This patch addresses the problem that applications currently have to
extract object names from possibly-localized textual error messages,
if they want to know for example which index caused a UNIQUE_VIOLATION
failure. It adds new error message fields to the wire protocol, which
can carry the name of a table, table column, data type, or constraint
associated with the error. (Since the protocol spec has always instructed
clients to ignore unrecognized field types, this should not create any
compatibility problem.)
Support for providing these new fields has been added to just a limited set
of error reports (mainly, those in the "integrity constraint violation"
SQLSTATE class), but we will doubtless add them to more calls in future.
Pavel Stehule, reviewed and extensively revised by Peter Geoghegan, with
additional hacking by Tom Lane.
2013-01-29 23:06:26 +01:00
|
|
|
/*
|
|
|
|
* Routines to support ereport() reports of relation-related errors
|
|
|
|
*/
|
|
|
|
extern int errtable(Relation rel);
|
|
|
|
extern int errtablecol(Relation rel, int attnum);
|
|
|
|
extern int errtablecolname(Relation rel, const char *colname);
|
|
|
|
extern int errtableconstraint(Relation rel, const char *conname);
|
|
|
|
|
2000-08-06 06:40:08 +02:00
|
|
|
/*
|
|
|
|
* Routines for backend startup
|
|
|
|
*/
|
|
|
|
extern void RelationCacheInitialize(void);
|
|
|
|
extern void RelationCacheInitializePhase2(void);
|
2009-08-12 22:53:31 +02:00
|
|
|
extern void RelationCacheInitializePhase3(void);
|
2000-08-06 06:40:08 +02:00
|
|
|
|
2001-06-29 23:08:25 +02:00
|
|
|
/*
|
|
|
|
* Routine to create a relcache entry for an about-to-be-created relation
|
|
|
|
*/
|
|
|
|
extern Relation RelationBuildLocalRelation(const char *relname,
|
2019-05-22 19:04:48 +02:00
|
|
|
Oid relnamespace,
|
|
|
|
TupleDesc tupDesc,
|
|
|
|
Oid relid,
|
|
|
|
Oid accessmtd,
|
|
|
|
Oid relfilenode,
|
|
|
|
Oid reltablespace,
|
|
|
|
bool shared_relation,
|
|
|
|
bool mapped_relation,
|
|
|
|
char relpersistence,
|
|
|
|
char relkind);
|
2001-06-29 23:08:25 +02:00
|
|
|
|
2010-02-03 02:14:17 +01:00
|
|
|
/*
|
Skip WAL for new relfilenodes, under wal_level=minimal.
Until now, only selected bulk operations (e.g. COPY) did this. If a
given relfilenode received both a WAL-skipping COPY and a WAL-logged
operation (e.g. INSERT), recovery could lose tuples from the COPY. See
src/backend/access/transam/README section "Skipping WAL for New
RelFileNode" for the new coding rules. Maintainers of table access
methods should examine that section.
To maintain data durability, just before commit, we choose between an
fsync of the relfilenode and copying its contents to WAL. A new GUC,
wal_skip_threshold, guides that choice. If this change slows a workload
that creates small, permanent relfilenodes under wal_level=minimal, try
adjusting wal_skip_threshold. Users setting a timeout on COMMIT may
need to adjust that timeout, and log_min_duration_statement analysis
will reflect time consumption moving to COMMIT from commands like COPY.
Internally, this requires a reliable determination of whether
RollbackAndReleaseCurrentSubTransaction() would unlink a relation's
current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the
specification of rd_createSubid such that the field is zero when a new
rel has an old rd_node. Make relcache.c retain entries for certain
dropped relations until end of transaction.
Bump XLOG_PAGE_MAGIC, since this introduces XLOG_GIST_ASSIGN_LSN.
Future servers accept older WAL, so this bump is discretionary.
Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert
Haas. Heikki Linnakangas and Michael Paquier implemented earlier
designs that materially clarified the problem. Reviewed, in earlier
designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane,
Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout.
Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
2020-04-04 21:25:34 +02:00
|
|
|
* Routines to manage assignment of new relfilenode to a relation
|
2010-02-03 02:14:17 +01:00
|
|
|
*/
|
2019-03-29 04:01:14 +01:00
|
|
|
extern void RelationSetNewRelfilenode(Relation relation, char persistence);
|
2020-04-04 21:25:34 +02:00
|
|
|
extern void RelationAssumeNewRelfilenode(Relation relation);
|
2010-02-03 02:14:17 +01:00
|
|
|
|
1999-10-04 01:55:40 +02:00
|
|
|
/*
|
|
|
|
* Routines for flushing/rebuilding relcache entries in various scenarios
|
|
|
|
*/
|
2001-06-29 23:08:25 +02:00
|
|
|
extern void RelationForgetRelation(Oid rid);
|
|
|
|
|
2005-01-10 21:02:24 +01:00
|
|
|
extern void RelationCacheInvalidateEntry(Oid relationId);
|
1996-08-28 03:59:28 +02:00
|
|
|
|
2000-01-31 05:35:57 +01:00
|
|
|
extern void RelationCacheInvalidate(void);
|
1996-08-28 03:59:28 +02:00
|
|
|
|
2010-02-07 21:48:13 +01:00
|
|
|
extern void RelationCloseSmgrByOid(Oid relationId);
|
|
|
|
|
2020-04-04 21:25:34 +02:00
|
|
|
#ifdef USE_ASSERT_CHECKING
|
|
|
|
extern void AssertPendingSyncs_RelationCache(void);
|
|
|
|
#else
|
|
|
|
#define AssertPendingSyncs_RelationCache() do {} while (0)
|
|
|
|
#endif
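The conditional block above declares a real function in assert-enabled builds and an empty `do {} while (0)` macro otherwise, so call sites compile unchanged in both configurations. A small sketch of why the `do {} while (0)` form is used follows. `DEMO_ASSERT_CHECKING`, `DemoAssertPendingSyncs`, and `demo_finish` are hypothetical stand-ins, not PostgreSQL names.

```c
/* Define DEMO_ASSERT_CHECKING at compile time to enable the real check. */
#ifdef DEMO_ASSERT_CHECKING
static int	demo_checks_run = 0;

static void
DemoAssertPendingSyncs(void)
{
	demo_checks_run++;			/* a real check would validate state here */
}
#else
/* Expands to exactly one statement, so it is safe in an unbraced if/else. */
#define DemoAssertPendingSyncs() do {} while (0)
#endif

/* Returns the number of pending items after an optional sanity check. */
int
demo_finish(int pending)
{
	if (pending > 0)
		DemoAssertPendingSyncs();	/* the trailing ';' completes the statement */
	else
		return 0;
	return pending;
}
```

Had the macro expanded to nothing or to a bare `{}`, the semicolon at the call site could change how an unbraced `if`/`else` parses; the `do {} while (0)` form avoids that.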
|
2004-07-01 02:52:04 +02:00
|
|
|
extern void AtEOXact_RelationCache(bool isCommit);
|
2004-09-16 18:58:44 +02:00
|
|
|
extern void AtEOSubXact_RelationCache(bool isCommit, SubTransactionId mySubid,
|
2019-05-22 19:04:48 +02:00
|
|
|
SubTransactionId parentSubid);
|
1999-09-04 20:42:15 +02:00
|
|
|
|
2002-02-19 21:11:20 +01:00
|
|
|
/*
|
2009-08-12 22:53:31 +02:00
|
|
|
* Routines to help manage rebuilding of relcache init files
|
2002-02-19 21:11:20 +01:00
|
|
|
*/
|
Fix the logic for putting relations into the relcache init file.
Commit f3b5565dd4e59576be4c772da364704863e6a835 was a couple of bricks shy
of a load; specifically, it missed putting pg_trigger_tgrelid_tgname_index
into the relcache init file, because that index is not used by any
syscache. However, we have historically nailed that index into cache for
performance reasons. The upshot was that load_relcache_init_file always
decided that the init file was busted and silently ignored it, resulting
in a significant hit to backend startup speed.
To fix, reinstantiate RelationIdIsInInitFile() as a wrapper around
RelationSupportsSysCache(), which can know about additional relations
that should be in the init file despite being unknown to syscache.c.
Also install some guards against future mistakes of this type: make
write_relcache_init_file Assert that all nailed relations get written to
the init file, and make load_relcache_init_file emit a WARNING if it takes
the "wrong number of nailed relations" exit path. Now that we remove the
init files during postmaster startup, that case should never occur in the
field, even if we are starting a minor-version update that added or removed
rels from the nailed set. So the warning shouldn't ever be seen by end
users, but it will show up in the regression tests if somebody breaks this
logic.
Back-patch to all supported branches, like the previous commit.
2015-06-25 20:39:05 +02:00
|
|
|
extern bool RelationIdIsInInitFile(Oid relationId);
|
2011-08-16 19:11:54 +02:00
|
|
|
extern void RelationCacheInitFilePreInvalidate(void);
|
|
|
|
extern void RelationCacheInitFilePostInvalidate(void);
|
2009-08-12 22:53:31 +02:00
|
|
|
extern void RelationCacheInitFileRemove(void);
|
2000-11-21 22:16:06 +01:00
|
|
|
|
2002-02-19 21:11:20 +01:00
|
|
|
/* should be used only by relcache.c and catcache.c */
|
|
|
|
extern bool criticalRelcachesBuilt;
|
2010-02-26 03:01:40 +01:00
|
|
|
|
2009-08-12 22:53:31 +02:00
|
|
|
/* should be used only by relcache.c and postinit.c */
|
|
|
|
extern bool criticalSharedRelcachesBuilt;
|
2001-10-28 07:26:15 +01:00
|
|
|
|
Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
|
|
|
#endif /* RELCACHE_H */
|