/*-------------------------------------------------------------------------
 *
 * heap.c
 *	  code to create and destroy POSTGRES heap relations
 *
 * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *	  src/backend/catalog/heap.c
 *
 *
 * INTERFACE ROUTINES
 *		heap_create()			- Create an uncataloged heap relation
 *		heap_create_with_catalog() - Create a cataloged relation
 *		heap_drop_with_catalog() - Removes named relation from catalogs
 *
 * NOTES
 *	  this code taken from access/heap/create.c, which contains
 *	  the old heap_create_with_catalog, amcreate, and amdestroy.
 *	  those routines will soon call these routines using the function
 *	  manager, just like the poorly named "NewXXX" routines do.  The
 *	  "New" routines are all going to die soon, once and for all!
 *	  -cim 1/13/91
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"

#include "access/genam.h"
#include "access/multixact.h"
#include "access/relation.h"
#include "access/table.h"
#include "access/tableam.h"
#include "catalog/binary_upgrade.h"
#include "catalog/catalog.h"
#include "catalog/heap.h"
#include "catalog/index.h"
#include "catalog/objectaccess.h"
#include "catalog/partition.h"
#include "catalog/pg_am.h"
#include "catalog/pg_attrdef.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_foreign_table.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_opclass.h"
#include "catalog/pg_partitioned_table.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_subscription_rel.h"
#include "catalog/pg_tablespace.h"
#include "catalog/pg_type.h"
#include "catalog/storage.h"
#include "commands/tablecmds.h"
#include "commands/typecmds.h"
#include "common/int.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/optimizer.h"
#include "parser/parse_coerce.h"
#include "parser/parse_collate.h"
#include "parser/parse_expr.h"
#include "parser/parse_relation.h"
#include "parser/parsetree.h"
#include "partitioning/partdesc.h"
#include "pgstat.h"
#include "storage/lmgr.h"
#include "storage/predicate.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
1996-07-09 08:22:35 +02:00
|
|
|
|
2015-03-11 03:33:25 +01:00
|
|
|
/* Potentially set by pg_upgrade_support functions */
|
2011-01-08 03:25:34 +01:00
|
|
|
Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
|
|
|
|
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
|
|
|
RelFileNumber binary_upgrade_next_heap_pg_class_relfilenumber = InvalidRelFileNumber;
|
|
|
|
RelFileNumber binary_upgrade_next_toast_pg_class_relfilenumber = InvalidRelFileNumber;
|
2010-02-03 02:14:17 +01:00
|
|
|
|
static void AddNewRelationTuple(Relation pg_class_desc,
								Relation new_rel_desc,
								Oid new_rel_oid,
								Oid new_type_oid,
								Oid reloftype,
								Oid relowner,
								char relkind,
								TransactionId relfrozenxid,
								TransactionId relminmxid,
								Datum relacl,
								Datum reloptions);
static ObjectAddress AddNewRelationType(const char *typeName,
										Oid typeNamespace,
										Oid new_rel_oid,
										char new_rel_kind,
										Oid ownerid,
										Oid new_row_type,
										Oid new_array_type);
static void RelationRemoveInheritance(Oid relid);
static Oid	StoreRelCheck(Relation rel, const char *ccname, Node *expr,
						  bool is_validated, bool is_local, int inhcount,
						  bool is_no_inherit, bool is_internal);
static void StoreConstraints(Relation rel, List *cooked_constraints,
							 bool is_internal);
static bool MergeWithExistingConstraint(Relation rel, const char *ccname, Node *expr,
										bool allow_merge, bool is_local,
										bool is_initially_valid,
										bool is_no_inherit);
static void SetRelationNumChecks(Relation rel, int numchecks);
static Node *cookConstraint(ParseState *pstate,
							Node *raw_constraint,
							char *relname);


/* ----------------------------------------------------------------
 *				XXX UGLY HARD CODED BADNESS FOLLOWS XXX
 *
 *		these should all be moved to someplace in the lib/catalog
 *		module, if not obliterated first.
 * ----------------------------------------------------------------
 */


/*
 * Note:
 *		Should the system special case these attributes in the future?
 *		Advantage:	consume much less space in the ATTRIBUTE relation.
 *		Disadvantage:  special cases will be all over the place.
 */

/*
 * The initializers below do not include trailing variable length fields,
 * but that's OK - we're never going to reference anything beyond the
 * fixed-size portion of the structure anyway.  Fields that can default
 * to zeroes are also not mentioned.
 */

static const FormData_pg_attribute a1 = {
	.attname = {"ctid"},
	.atttypid = TIDOID,
	.attlen = sizeof(ItemPointerData),
	.attnum = SelfItemPointerAttributeNumber,
	.attcacheoff = -1,
	.atttypmod = -1,
	.attbyval = false,
	.attalign = TYPALIGN_SHORT,
	.attstorage = TYPSTORAGE_PLAIN,
	.attnotnull = true,
	.attislocal = true,
};

static const FormData_pg_attribute a2 = {
|
2018-08-29 08:36:30 +02:00
|
|
|
.attname = {"xmin"},
|
|
|
|
.atttypid = XIDOID,
|
|
|
|
.attlen = sizeof(TransactionId),
|
|
|
|
.attnum = MinTransactionIdAttributeNumber,
|
|
|
|
.attcacheoff = -1,
|
|
|
|
.atttypmod = -1,
|
|
|
|
.attbyval = true,
|
2020-03-04 16:34:25 +01:00
|
|
|
.attalign = TYPALIGN_INT,
|
2021-05-23 18:12:09 +02:00
|
|
|
.attstorage = TYPSTORAGE_PLAIN,
|
2018-08-29 08:36:30 +02:00
|
|
|
.attnotnull = true,
|
|
|
|
.attislocal = true,
|
1996-07-09 08:22:35 +02:00
|
|
|
};
|
|
|
|
|
Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to merge this
now. It's painful to maintain externally, too complicated to commit
after the code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
static const FormData_pg_attribute a3 = {
	.attname = {"cmin"},
	.atttypid = CIDOID,
	.attlen = sizeof(CommandId),
	.attnum = MinCommandIdAttributeNumber,
	.attcacheoff = -1,
	.atttypmod = -1,
	.attbyval = true,
	.attalign = TYPALIGN_INT,
	.attstorage = TYPSTORAGE_PLAIN,
	.attnotnull = true,
	.attislocal = true,
};
static const FormData_pg_attribute a4 = {
	.attname = {"xmax"},
	.atttypid = XIDOID,
	.attlen = sizeof(TransactionId),
	.attnum = MaxTransactionIdAttributeNumber,
	.attcacheoff = -1,
	.atttypmod = -1,
	.attbyval = true,
	.attalign = TYPALIGN_INT,
	.attstorage = TYPSTORAGE_PLAIN,
	.attnotnull = true,
	.attislocal = true,
};
static const FormData_pg_attribute a5 = {
	.attname = {"cmax"},
	.atttypid = CIDOID,
	.attlen = sizeof(CommandId),
	.attnum = MaxCommandIdAttributeNumber,
	.attcacheoff = -1,
	.atttypmod = -1,
	.attbyval = true,
	.attalign = TYPALIGN_INT,
	.attstorage = TYPSTORAGE_PLAIN,
	.attnotnull = true,
	.attislocal = true,
};
/*
 * We decided to call this attribute "tableoid" rather than say
 * "classoid" on the basis that in the future there may be more than one
 * table of a particular class/type.  In any case table is still the word
 * used in SQL.
 */
static const FormData_pg_attribute a6 = {
	.attname = {"tableoid"},
	.atttypid = OIDOID,
	.attlen = sizeof(Oid),
	.attnum = TableOidAttributeNumber,
	.attcacheoff = -1,
	.atttypmod = -1,
	.attbyval = true,
	.attalign = TYPALIGN_INT,
	.attstorage = TYPSTORAGE_PLAIN,
	.attnotnull = true,
	.attislocal = true,
};
static const FormData_pg_attribute *const SysAtt[] = {&a1, &a2, &a3, &a4, &a5, &a6};

/*
 * This function returns a Form_pg_attribute pointer for a system attribute.
 * Note that we elog if the presented attno is invalid, which would only
 * happen if there's a problem upstream.
 */
const FormData_pg_attribute *
SystemAttributeDefinition(AttrNumber attno)
{
	if (attno >= 0 || attno < -(int) lengthof(SysAtt))
		elog(ERROR, "invalid system attribute number %d", attno);
	return SysAtt[-attno - 1];
}
/*
 * If the given name is a system attribute name, return a Form_pg_attribute
 * pointer for a prototype definition.  If not, return NULL.
 */
const FormData_pg_attribute *
SystemAttributeByName(const char *attname)
{
	int			j;

	for (j = 0; j < (int) lengthof(SysAtt); j++)
	{
		const FormData_pg_attribute *att = SysAtt[j];
		if (strcmp(NameStr(att->attname), attname) == 0)
			return att;
	}

	return NULL;
}
/* ----------------------------------------------------------------
 *				XXX END OF UGLY HARD CODED BADNESS XXX
 * ---------------------------------------------------------------- */


/* ----------------------------------------------------------------
 *		heap_create		- Create an uncataloged heap relation
 *
 *		Note API change: the caller must now always provide the OID
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
 *		to use for the relation.  The relfilenumber may be (and in
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
Currently, database OIDs, relfilenodes, and tablespace OIDs can all
change when a cluster is upgraded using pg_upgrade. It seems better
to preserve them, because (1) it makes troubleshooting pg_upgrade
easier, since you don't have to do a lot of work to match up files
in the old and new clusters, (2) it allows 'rsync' to save bandwidth
when used to re-sync a cluster after an upgrade, and (3) if we ever
encrypt or sign blocks, we would likely want to use a nonce that
depends on these values.
This patch only arranges to preserve relfilenodes and tablespace
OIDs. The task of preserving database OIDs is left for another patch,
since it involves some complexities that don't exist in these cases.
Database OIDs have a similar issue, but there are some tricky points
in that case that do not apply to these cases, so that problem is left
for another patch.
Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.
Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
2022-01-17 19:32:44 +01:00
|
|
|
* the simplest cases is) left unspecified.
|
|
|
|
*
|
|
|
|
* create_storage indicates whether or not to create the storage.
|
|
|
|
* However, even if create_storage is true, no storage will be
|
|
|
|
* created if the relkind is one that doesn't have storage.
|
2005-04-14 03:38:22 +02:00
|
|
|
*
|
2001-08-10 20:57:42 +02:00
|
|
|
* rel->rd_rel is initialized by RelationBuildLocalRelation,
|
|
|
|
* and is mostly zeroes at return.
|
1996-07-09 08:22:35 +02:00
|
|
|
* ----------------------------------------------------------------
|
|
|
|
*/
Relation
heap_create(const char *relname,
			Oid relnamespace,
			Oid reltablespace,
			Oid relid,
			RelFileNumber relfilenumber,
			Oid accessmtd,
			TupleDesc tupDesc,
			char relkind,
			char relpersistence,
			bool shared_relation,
			bool mapped_relation,
			bool allow_system_table_mods,
			TransactionId *relfrozenxid,
			MultiXactId *relminmxid,
			bool create_storage)
{
	Relation	rel;

	/* The caller must have provided an OID for the relation. */
	Assert(OidIsValid(relid));

	/*
	 * Don't allow creating relations in pg_catalog directly, even though it
	 * is allowed to move user defined relations there.  Semantics with search
	 * paths including pg_catalog are too confusing for now.
	 *
	 * But allow creating indexes on relations in pg_catalog even if
	 * allow_system_table_mods = off, upper layers already guarantee it's on a
	 * user defined relation, not a system one.
	 */
	if (!allow_system_table_mods &&
		((IsCatalogNamespace(relnamespace) && relkind != RELKIND_INDEX) ||
		 IsToastNamespace(relnamespace)) &&
		IsNormalProcessingMode())
		ereport(ERROR,
				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
				 errmsg("permission denied to create \"%s.%s\"",
						get_namespace_name(relnamespace), relname),
				 errdetail("System catalog modifications are currently disallowed.")));

	*relfrozenxid = InvalidTransactionId;
	*relminmxid = InvalidMultiXactId;

	/*
	 * Force reltablespace to zero if the relation kind does not support
	 * tablespaces.  This is mainly just for cleanliness' sake.
	 */
	if (!RELKIND_HAS_TABLESPACE(relkind))
		reltablespace = InvalidOid;

	/* Don't create storage for relkinds without physical storage. */
	if (!RELKIND_HAS_STORAGE(relkind))
		create_storage = false;
	else
	{
		/*
		 * If relfilenumber is unspecified by the caller then create storage
		 * with oid same as relid.
		 */
		if (!RelFileNumberIsValid(relfilenumber))
			relfilenumber = relid;
	}

	/*
	 * Never allow a pg_class entry to explicitly specify the database's
	 * default tablespace in reltablespace; force it to zero instead. This
	 * ensures that if the database is cloned with a different default
	 * tablespace, the pg_class entry will still match where CREATE DATABASE
	 * will put the physically copied relation.
	 *
	 * Yes, this is a bit of a hack.
	 */
	if (reltablespace == MyDatabaseTableSpace)
		reltablespace = InvalidOid;

	/*
	 * build the relcache entry.
	 */
	rel = RelationBuildLocalRelation(relname,
									 relnamespace,
									 tupDesc,
									 relid,
									 accessmtd,
									 relfilenumber,
									 reltablespace,
									 shared_relation,
									 mapped_relation,
									 relpersistence,
									 relkind);

	/*
	 * Have the storage manager create the relation's disk file, if needed.
	 *
	 * For tables, the AM callback creates both the main and the init fork.
	 * For others, only the main fork is created; the other forks will be
	 * created on demand.
	 */
	if (create_storage)
	{
		if (RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind))
			table_relation_set_new_filelocator(rel, &rel->rd_locator,
											   relpersistence,
											   relfrozenxid, relminmxid);
		else if (RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
			RelationCreateStorage(rel->rd_locator, relpersistence, true);
		else
			Assert(false);
	}

	/*
	 * If a tablespace is specified, removal of that tablespace is normally
	 * protected by the existence of a physical file; but for relations with
	 * no files, add a pg_shdepend entry to account for that.
	 */
	if (!create_storage && reltablespace != InvalidOid)
		recordDependencyOnTablespace(RelationRelationId, relid,
									 reltablespace);

	/* ensure that stats are dropped if transaction aborts */
	pgstat_create_relation(rel);

	return rel;
}


/* ----------------------------------------------------------------
 *		heap_create_with_catalog	- Create a cataloged relation
 *
 * this is done in multiple steps:
 *
 * 1) CheckAttributeNamesTypes() is used to make certain the tuple
 *	  descriptor contains a valid set of attribute names and types
 *
 * 2) pg_class is opened and get_relname_relid()
 *	  performs a scan to ensure that no relation with the
 *	  same name already exists.
 *
 * 3) heap_create() is called to create the new relation on disk.
 *
 * 4) TypeCreate() is called to define a new type corresponding
 *	  to the new relation.
 *
 * 5) AddNewRelationTuple() is called to register the
 *	  relation in pg_class.
 *
 * 6) AddNewAttributeTuples() is called to register the
 *	  new relation's schema in pg_attribute.
 *
 * 7) StoreConstraints is called()		- vadim 08/22/97
 *
 * 8) the relations are closed and the new relation's oid
 *	  is returned.
 *
 * ----------------------------------------------------------------
 */


/* --------------------------------
 *		CheckAttributeNamesTypes
 *
 *		this is used to make certain the tuple descriptor contains a
 *		valid set of attribute names and datatypes.  a problem simply
 *		generates ereport(ERROR) which aborts the current transaction.
 *
 *		relkind is the relkind of the relation to be created.
 *		flags controls which datatypes are allowed, cf CheckAttributeType.
 * --------------------------------
 */
void
CheckAttributeNamesTypes(TupleDesc tupdesc, char relkind,
                         int flags)
{
    int         i;
    int         j;
    int         natts = tupdesc->natts;

    /* Sanity check on column count */
    if (natts < 0 || natts > MaxHeapAttributeNumber)
        ereport(ERROR,
                (errcode(ERRCODE_TOO_MANY_COLUMNS),
                 errmsg("tables can have at most %d columns",
                        MaxHeapAttributeNumber)));

    /*
     * first check for collision with system attribute names
     *
     * Skip this for a view or type relation, since those don't have system
     * attributes.
     */
    if (relkind != RELKIND_VIEW && relkind != RELKIND_COMPOSITE_TYPE)
    {
        for (i = 0; i < natts; i++)
        {
            Form_pg_attribute attr = TupleDescAttr(tupdesc, i);

            if (SystemAttributeByName(NameStr(attr->attname)) != NULL)
                ereport(ERROR,
                        (errcode(ERRCODE_DUPLICATE_COLUMN),
                         errmsg("column name \"%s\" conflicts with a system column name",
                                NameStr(attr->attname))));
        }
    }

    /*
     * next check for repeated attribute names
     */
    for (i = 1; i < natts; i++)
    {
        for (j = 0; j < i; j++)
        {
            if (strcmp(NameStr(TupleDescAttr(tupdesc, j)->attname),
                       NameStr(TupleDescAttr(tupdesc, i)->attname)) == 0)
                ereport(ERROR,
                        (errcode(ERRCODE_DUPLICATE_COLUMN),
                         errmsg("column name \"%s\" specified more than once",
                                NameStr(TupleDescAttr(tupdesc, j)->attname))));
        }
    }

    /*
     * next check the attribute types
     */
    for (i = 0; i < natts; i++)
    {
        CheckAttributeType(NameStr(TupleDescAttr(tupdesc, i)->attname),
                           TupleDescAttr(tupdesc, i)->atttypid,
                           TupleDescAttr(tupdesc, i)->attcollation,
                           NIL, /* assume we're creating a new rowtype */
                           flags);
    }
}

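The duplicate-column check above compares each name only against the names before it, so the error points at the second occurrence of a repeated name. A minimal standalone sketch of that pairwise scan (illustrative names, not backend code):

```c
#include <assert.h>
#include <string.h>

/*
 * Sketch of the duplicate-name scan in CheckAttributeNamesTypes():
 * compare names[i] against every earlier name; the first repeat is
 * reported at its second occurrence.
 */
static int
find_duplicate(const char **names, int n)
{
    for (int i = 1; i < n; i++)
        for (int j = 0; j < i; j++)
            if (strcmp(names[j], names[i]) == 0)
                return i;       /* index of the repeated occurrence */
    return -1;                  /* no duplicates */
}
```

The O(n^2) cost is acceptable here because n is bounded by MaxHeapAttributeNumber.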
/* --------------------------------
 *		CheckAttributeType
 *
 *		Verify that the proposed datatype of an attribute is legal.
 *		This is needed mainly because there are types (and pseudo-types)
 *		in the catalogs that we do not support as elements of real tuples.
 *		We also check some other properties required of a table column.
 *
 * If the attribute is being proposed for addition to an existing table or
 * composite type, pass a one-element list of the rowtype OID as
 * containing_rowtypes.  When checking a to-be-created rowtype, it's
 * sufficient to pass NIL, because there could not be any recursive reference
 * to a not-yet-existing rowtype.
 *
 * flags is a bitmask controlling which datatypes we allow.  For the most
 * part, pseudo-types are disallowed as attribute types, but there are some
 * exceptions: ANYARRAYOID, RECORDOID, and RECORDARRAYOID can be allowed
 * in some cases.  (This works because values of those type classes are
 * self-identifying to some extent.  However, RECORDOID and RECORDARRAYOID
 * are reliably identifiable only within a session, since the identity info
 * may use a typmod that is only locally assigned.  The caller is expected
 * to know whether these cases are safe.)
 *
 * flags can also control the phrasing of the error messages.  If
 * CHKATYPE_IS_PARTKEY is specified, "attname" should be a partition key
 * column number as text, not a real column name.
 * --------------------------------
 */
void
CheckAttributeType(const char *attname,
                   Oid atttypid, Oid attcollation,
                   List *containing_rowtypes,
                   int flags)
{
    char        att_typtype = get_typtype(atttypid);
    Oid         att_typelem;

    /* since this function recurses, it could be driven to stack overflow */
    check_stack_depth();

    if (att_typtype == TYPTYPE_PSEUDO)
    {
        /*
         * We disallow pseudo-type columns, with the exception of ANYARRAY,
         * RECORD, and RECORD[] when the caller says that those are OK.
         *
         * We don't need to worry about recursive containment for RECORD and
         * RECORD[] because (a) no named composite type should be allowed to
         * contain those, and (b) two "anonymous" record types couldn't be
         * considered to be the same type, so infinite recursion isn't
         * possible.
         */
        if (!((atttypid == ANYARRAYOID && (flags & CHKATYPE_ANYARRAY)) ||
              (atttypid == RECORDOID && (flags & CHKATYPE_ANYRECORD)) ||
              (atttypid == RECORDARRAYOID && (flags & CHKATYPE_ANYRECORD))))
        {
            if (flags & CHKATYPE_IS_PARTKEY)
                ereport(ERROR,
                        (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
                /* translator: first %s is an integer not a name */
                         errmsg("partition key column %s has pseudo-type %s",
                                attname, format_type_be(atttypid))));
            else
                ereport(ERROR,
                        (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
                         errmsg("column \"%s\" has pseudo-type %s",
                                attname, format_type_be(atttypid))));
        }
    }
    else if (att_typtype == TYPTYPE_DOMAIN)
    {
        /*
         * If it's a domain, recurse to check its base type.
         */
        CheckAttributeType(attname, getBaseType(atttypid), attcollation,
                           containing_rowtypes,
                           flags);
    }
    else if (att_typtype == TYPTYPE_COMPOSITE)
    {
        /*
         * For a composite type, recurse into its attributes.
         */
        Relation    relation;
        TupleDesc   tupdesc;
        int         i;

        /*
         * Check for self-containment.  Eventually we might be able to allow
         * this (just return without complaint, if so) but it's not clear how
         * many other places would require anti-recursion defenses before it
         * would be safe to allow tables to contain their own rowtype.
         */
        if (list_member_oid(containing_rowtypes, atttypid))
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
                     errmsg("composite type %s cannot be made a member of itself",
                            format_type_be(atttypid))));

        containing_rowtypes = lappend_oid(containing_rowtypes, atttypid);

        relation = relation_open(get_typ_typrelid(atttypid), AccessShareLock);

        tupdesc = RelationGetDescr(relation);

        for (i = 0; i < tupdesc->natts; i++)
        {
            Form_pg_attribute attr = TupleDescAttr(tupdesc, i);

            if (attr->attisdropped)
                continue;
            CheckAttributeType(NameStr(attr->attname),
                               attr->atttypid, attr->attcollation,
                               containing_rowtypes,
                               flags & ~CHKATYPE_IS_PARTKEY);
        }

        relation_close(relation, AccessShareLock);

        containing_rowtypes = list_delete_last(containing_rowtypes);
    }
    else if (att_typtype == TYPTYPE_RANGE)
    {
        /*
         * If it's a range, recurse to check its subtype.
         */
        CheckAttributeType(attname, get_range_subtype(atttypid),
                           get_range_collation(atttypid),
                           containing_rowtypes,
                           flags);
    }
    else if (OidIsValid((att_typelem = get_element_type(atttypid))))
    {
        /*
         * Must recurse into array types, too, in case they are composite.
         */
        CheckAttributeType(attname, att_typelem, attcollation,
                           containing_rowtypes,
                           flags);
    }

    /*
     * This might not be strictly invalid per SQL standard, but it is pretty
     * useless, and it cannot be dumped, so we must disallow it.
     */
    if (!OidIsValid(attcollation) && type_is_collatable(atttypid))
    {
        if (flags & CHKATYPE_IS_PARTKEY)
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
            /* translator: first %s is an integer not a name */
                     errmsg("no collation was derived for partition key column %s with collatable type %s",
                            attname, format_type_be(atttypid)),
                     errhint("Use the COLLATE clause to set the collation explicitly.")));
        else
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
                     errmsg("no collation was derived for column \"%s\" with collatable type %s",
                            attname, format_type_be(atttypid)),
                     errhint("Use the COLLATE clause to set the collation explicitly.")));
    }
}

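CheckAttributeType's anti-recursion scheme is worth isolating: while recursing through a composite type's members it keeps a stack of the rowtype OIDs currently being expanded (lappend_oid on entry, list_delete_last on exit), and finding the same OID again means the type would contain itself. A standalone sketch of that idea, with assumed names and a plain array standing in for the backend's List:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;       /* stand-in for the backend's Oid */

#define MAX_DEPTH 64

/* Returns true if 'oid' is already on the containment stack. */
static bool
on_stack(const Oid *stack, int depth, Oid oid)
{
    for (int i = 0; i < depth; i++)
        if (stack[i] == oid)
            return true;
    return false;
}

/*
 * members[t] lists the member rowtype OIDs of composite type t,
 * 0-terminated.  Returns false (the "error" case) if t directly or
 * indirectly contains itself.
 */
static bool
check_type(Oid t, const Oid **members, Oid *stack, int depth)
{
    if (on_stack(stack, depth, t))
        return false;           /* "cannot be made a member of itself" */
    if (depth >= MAX_DEPTH)
        return false;           /* defensive depth limit */
    stack[depth++] = t;         /* push, like lappend_oid() */
    for (const Oid *m = members[t]; *m != 0; m++)
        if (!check_type(*m, members, stack, depth))
            return false;
    return true;                /* pop is implicit on return, like list_delete_last() */
}
```

Unlike this sketch, the real code raises ereport(ERROR) immediately rather than propagating a failure code, and the stack also lets a domain or array layer sit between the composite and the recursive reference.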
/*
 * InsertPgAttributeTuples
 *		Construct and insert a set of tuples in pg_attribute.
 *
 * Caller has already opened and locked pg_attribute.  tupdesc contains the
 * attributes to insert.  attcacheoff is always initialized to -1.  attoptions
 * supplies the values for the attoptions fields and must contain the same
 * number of elements as tupdesc or be NULL.  The other variable-length fields
 * of pg_attribute are always initialized to null values.
 *
 * indstate is the index state for CatalogTupleInsertWithInfo.  It can be
 * passed as NULL, in which case we'll fetch the necessary info.  (Don't do
 * this when inserting multiple attributes, because it's a tad more
 * expensive.)
 *
 * new_rel_oid is the relation OID assigned to the attributes inserted.
 * If set to InvalidOid, the relation OID from tupdesc is used instead.
 */
void
InsertPgAttributeTuples(Relation pg_attribute_rel,
                        TupleDesc tupdesc,
                        Oid new_rel_oid,
                        const Datum *attoptions,
                        CatalogIndexState indstate)
{
    TupleTableSlot **slot;
    TupleDesc   td;
    int         nslots;
    int         natts = 0;
    int         slotCount = 0;
    bool        close_index = false;

    td = RelationGetDescr(pg_attribute_rel);

    /* Initialize the number of slots to use */
    nslots = Min(tupdesc->natts,
                 (MAX_CATALOG_MULTI_INSERT_BYTES / sizeof(FormData_pg_attribute)));
    slot = palloc(sizeof(TupleTableSlot *) * nslots);
    for (int i = 0; i < nslots; i++)
        slot[i] = MakeSingleTupleTableSlot(td, &TTSOpsHeapTuple);

    while (natts < tupdesc->natts)
    {
        Form_pg_attribute attrs = TupleDescAttr(tupdesc, natts);

        ExecClearTuple(slot[slotCount]);

        memset(slot[slotCount]->tts_isnull, false,
               slot[slotCount]->tts_tupleDescriptor->natts * sizeof(bool));

        if (new_rel_oid != InvalidOid)
            slot[slotCount]->tts_values[Anum_pg_attribute_attrelid - 1] = ObjectIdGetDatum(new_rel_oid);
        else
            slot[slotCount]->tts_values[Anum_pg_attribute_attrelid - 1] = ObjectIdGetDatum(attrs->attrelid);

        slot[slotCount]->tts_values[Anum_pg_attribute_attname - 1] = NameGetDatum(&attrs->attname);
        slot[slotCount]->tts_values[Anum_pg_attribute_atttypid - 1] = ObjectIdGetDatum(attrs->atttypid);
        slot[slotCount]->tts_values[Anum_pg_attribute_attlen - 1] = Int16GetDatum(attrs->attlen);
        slot[slotCount]->tts_values[Anum_pg_attribute_attnum - 1] = Int16GetDatum(attrs->attnum);
        slot[slotCount]->tts_values[Anum_pg_attribute_attcacheoff - 1] = Int32GetDatum(-1);
        slot[slotCount]->tts_values[Anum_pg_attribute_atttypmod - 1] = Int32GetDatum(attrs->atttypmod);
        slot[slotCount]->tts_values[Anum_pg_attribute_attndims - 1] = Int16GetDatum(attrs->attndims);
        slot[slotCount]->tts_values[Anum_pg_attribute_attbyval - 1] = BoolGetDatum(attrs->attbyval);
        slot[slotCount]->tts_values[Anum_pg_attribute_attalign - 1] = CharGetDatum(attrs->attalign);
        slot[slotCount]->tts_values[Anum_pg_attribute_attstorage - 1] = CharGetDatum(attrs->attstorage);
        slot[slotCount]->tts_values[Anum_pg_attribute_attcompression - 1] = CharGetDatum(attrs->attcompression);
        slot[slotCount]->tts_values[Anum_pg_attribute_attnotnull - 1] = BoolGetDatum(attrs->attnotnull);
        slot[slotCount]->tts_values[Anum_pg_attribute_atthasdef - 1] = BoolGetDatum(attrs->atthasdef);
        slot[slotCount]->tts_values[Anum_pg_attribute_atthasmissing - 1] = BoolGetDatum(attrs->atthasmissing);
        slot[slotCount]->tts_values[Anum_pg_attribute_attidentity - 1] = CharGetDatum(attrs->attidentity);
        slot[slotCount]->tts_values[Anum_pg_attribute_attgenerated - 1] = CharGetDatum(attrs->attgenerated);
        slot[slotCount]->tts_values[Anum_pg_attribute_attisdropped - 1] = BoolGetDatum(attrs->attisdropped);
        slot[slotCount]->tts_values[Anum_pg_attribute_attislocal - 1] = BoolGetDatum(attrs->attislocal);
        slot[slotCount]->tts_values[Anum_pg_attribute_attinhcount - 1] = Int16GetDatum(attrs->attinhcount);
        slot[slotCount]->tts_values[Anum_pg_attribute_attcollation - 1] = ObjectIdGetDatum(attrs->attcollation);
        if (attoptions && attoptions[natts] != (Datum) 0)
            slot[slotCount]->tts_values[Anum_pg_attribute_attoptions - 1] = attoptions[natts];
        else
            slot[slotCount]->tts_isnull[Anum_pg_attribute_attoptions - 1] = true;
2024-01-13 18:14:53 +01:00
|
|
|
/*
|
|
|
|
* The remaining fields are not set for new columns.
|
|
|
|
*/
|
|
|
|
slot[slotCount]->tts_isnull[Anum_pg_attribute_attstattarget - 1] = true;
|
2020-07-31 03:54:26 +02:00
|
|
|
slot[slotCount]->tts_isnull[Anum_pg_attribute_attacl - 1] = true;
|
|
|
|
slot[slotCount]->tts_isnull[Anum_pg_attribute_attfdwoptions - 1] = true;
|
|
|
|
slot[slotCount]->tts_isnull[Anum_pg_attribute_attmissingval - 1] = true;
|
|
|
|
|
|
|
|
ExecStoreVirtualTuple(slot[slotCount]);
|
|
|
|
slotCount++;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If slots are full or the end of processing has been reached, insert
|
|
|
|
* a batch of tuples.
|
|
|
|
*/
|
|
|
|
if (slotCount == nslots || natts == tupdesc->natts - 1)
|
|
|
|
{
|
|
|
|
/* fetch index info only when we know we need it */
|
|
|
|
if (!indstate)
|
|
|
|
{
|
|
|
|
indstate = CatalogOpenIndexes(pg_attribute_rel);
|
|
|
|
close_index = true;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* insert the new tuples and update the indexes */
|
|
|
|
CatalogTuplesMultiInsertWithInfo(pg_attribute_rel, slot, slotCount,
|
|
|
|
indstate);
|
|
|
|
slotCount = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
natts++;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (close_index)
|
|
|
|
CatalogCloseIndexes(indstate);
|
|
|
|
for (int i = 0; i < nslots; i++)
|
|
|
|
ExecDropSingleTupleTableSlot(slot[i]);
|
|
|
|
pfree(slot);
|
2008-11-14 02:57:42 +01:00
|
|
|
}
|
2009-06-11 16:49:15 +02:00
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
/* --------------------------------
 *		AddNewAttributeTuples
 *
 *		this registers the new relation's schema by adding
 *		tuples to pg_attribute.
 * --------------------------------
 */
static void
AddNewAttributeTuples(Oid new_rel_oid,
					  TupleDesc tupdesc,
					  char relkind)
{
	Relation	rel;
	CatalogIndexState indstate;
	int			natts = tupdesc->natts;
	ObjectAddress myself,
				referenced;

	/*
	 * open pg_attribute and its indexes.
	 */
	rel = table_open(AttributeRelationId, RowExclusiveLock);

	indstate = CatalogOpenIndexes(rel);

	InsertPgAttributeTuples(rel, tupdesc, new_rel_oid, NULL, indstate);

	/* add dependencies on their datatypes and collations */
	for (int i = 0; i < natts; i++)
	{
		/* Add dependency info */
		ObjectAddressSubSet(myself, RelationRelationId, new_rel_oid, i + 1);
		ObjectAddressSet(referenced, TypeRelationId,
						 tupdesc->attrs[i].atttypid);
		recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);

		/* The default collation is pinned, so don't bother recording it */
		if (OidIsValid(tupdesc->attrs[i].attcollation) &&
			tupdesc->attrs[i].attcollation != DEFAULT_COLLATION_OID)
		{
			ObjectAddressSet(referenced, CollationRelationId,
							 tupdesc->attrs[i].attcollation);
			recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);
		}
	}

	/*
	 * Next we add the system attributes.  Skip all for a view or type
	 * relation.  We don't bother with making datatype dependencies here,
	 * since presumably all these types are pinned.
	 */
	if (relkind != RELKIND_VIEW && relkind != RELKIND_COMPOSITE_TYPE)
	{
		TupleDesc	td;

		td = CreateTupleDesc(lengthof(SysAtt), (FormData_pg_attribute **) &SysAtt);

		InsertPgAttributeTuples(rel, td, new_rel_oid, NULL, indstate);
		FreeTupleDesc(td);
	}

	/*
	 * clean up
	 */
	CatalogCloseIndexes(indstate);

	table_close(rel, RowExclusiveLock);
}

/* --------------------------------
 *		InsertPgClassTuple
 *
 *		Construct and insert a new tuple in pg_class.
 *
 * Caller has already opened and locked pg_class.
 * Tuple data is taken from new_rel_desc->rd_rel, except for the
 * variable-width fields which are not present in a cached reldesc.
 * relacl and reloptions are passed in Datum form (to avoid having
 * to reference the data types in heap.h).  Pass (Datum) 0 to set them
 * to NULL.
 * --------------------------------
 */
void
InsertPgClassTuple(Relation pg_class_desc,
				   Relation new_rel_desc,
				   Oid new_rel_oid,
				   Datum relacl,
				   Datum reloptions)
{
	Form_pg_class rd_rel = new_rel_desc->rd_rel;
	Datum		values[Natts_pg_class];
	bool		nulls[Natts_pg_class];
	HeapTuple	tup;

	/* This is a tad tedious, but way cleaner than what we used to do... */
	memset(values, 0, sizeof(values));
	memset(nulls, false, sizeof(nulls));

	values[Anum_pg_class_oid - 1] = ObjectIdGetDatum(new_rel_oid);
	values[Anum_pg_class_relname - 1] = NameGetDatum(&rd_rel->relname);
	values[Anum_pg_class_relnamespace - 1] = ObjectIdGetDatum(rd_rel->relnamespace);
	values[Anum_pg_class_reltype - 1] = ObjectIdGetDatum(rd_rel->reltype);
	values[Anum_pg_class_reloftype - 1] = ObjectIdGetDatum(rd_rel->reloftype);
	values[Anum_pg_class_relowner - 1] = ObjectIdGetDatum(rd_rel->relowner);
	values[Anum_pg_class_relam - 1] = ObjectIdGetDatum(rd_rel->relam);
	values[Anum_pg_class_relfilenode - 1] = ObjectIdGetDatum(rd_rel->relfilenode);
	values[Anum_pg_class_reltablespace - 1] = ObjectIdGetDatum(rd_rel->reltablespace);
	values[Anum_pg_class_relpages - 1] = Int32GetDatum(rd_rel->relpages);
	values[Anum_pg_class_reltuples - 1] = Float4GetDatum(rd_rel->reltuples);
	values[Anum_pg_class_relallvisible - 1] = Int32GetDatum(rd_rel->relallvisible);
	values[Anum_pg_class_reltoastrelid - 1] = ObjectIdGetDatum(rd_rel->reltoastrelid);
	values[Anum_pg_class_relhasindex - 1] = BoolGetDatum(rd_rel->relhasindex);
	values[Anum_pg_class_relisshared - 1] = BoolGetDatum(rd_rel->relisshared);
	values[Anum_pg_class_relpersistence - 1] = CharGetDatum(rd_rel->relpersistence);
	values[Anum_pg_class_relkind - 1] = CharGetDatum(rd_rel->relkind);
	values[Anum_pg_class_relnatts - 1] = Int16GetDatum(rd_rel->relnatts);
	values[Anum_pg_class_relchecks - 1] = Int16GetDatum(rd_rel->relchecks);
	values[Anum_pg_class_relhasrules - 1] = BoolGetDatum(rd_rel->relhasrules);
	values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers);
	values[Anum_pg_class_relrowsecurity - 1] = BoolGetDatum(rd_rel->relrowsecurity);
	values[Anum_pg_class_relforcerowsecurity - 1] = BoolGetDatum(rd_rel->relforcerowsecurity);
	values[Anum_pg_class_relhassubclass - 1] = BoolGetDatum(rd_rel->relhassubclass);
	values[Anum_pg_class_relispopulated - 1] = BoolGetDatum(rd_rel->relispopulated);
	values[Anum_pg_class_relreplident - 1] = CharGetDatum(rd_rel->relreplident);
	values[Anum_pg_class_relispartition - 1] = BoolGetDatum(rd_rel->relispartition);
	values[Anum_pg_class_relrewrite - 1] = ObjectIdGetDatum(rd_rel->relrewrite);
	values[Anum_pg_class_relfrozenxid - 1] = TransactionIdGetDatum(rd_rel->relfrozenxid);
	values[Anum_pg_class_relminmxid - 1] = MultiXactIdGetDatum(rd_rel->relminmxid);
	if (relacl != (Datum) 0)
		values[Anum_pg_class_relacl - 1] = relacl;
	else
		nulls[Anum_pg_class_relacl - 1] = true;
	if (reloptions != (Datum) 0)
		values[Anum_pg_class_reloptions - 1] = reloptions;
	else
		nulls[Anum_pg_class_reloptions - 1] = true;

	/* relpartbound is set by updating this tuple, if necessary */
	nulls[Anum_pg_class_relpartbound - 1] = true;

	tup = heap_form_tuple(RelationGetDescr(pg_class_desc), values, nulls);

	/* finally insert the new tuple, update the indexes, and clean up */
	CatalogTupleInsert(pg_class_desc, tup);

	heap_freetuple(tup);
}

/* --------------------------------
|
1999-02-02 04:45:56 +01:00
|
|
|
* AddNewRelationTuple
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
|
|
|
* this registers the new relation in the catalogs by
|
|
|
|
* adding a tuple to pg_class.
|
|
|
|
* --------------------------------
|
|
|
|
*/
|
1997-08-19 23:40:56 +02:00
|
|
|
static void
|
1999-02-02 04:45:56 +01:00
|
|
|
AddNewRelationTuple(Relation pg_class_desc,
|
1996-07-09 08:22:35 +02:00
|
|
|
Relation new_rel_desc,
|
|
|
|
Oid new_rel_oid,
|
2001-02-12 21:07:21 +01:00
|
|
|
Oid new_type_oid,
|
2010-01-29 00:21:13 +01:00
|
|
|
Oid reloftype,
|
2005-08-26 05:08:15 +02:00
|
|
|
Oid relowner,
|
2006-07-02 04:23:23 +02:00
|
|
|
char relkind,
|
2019-03-29 04:01:14 +01:00
|
|
|
TransactionId relfrozenxid,
|
|
|
|
TransactionId relminmxid,
|
2009-10-05 21:24:49 +02:00
|
|
|
Datum relacl,
|
2006-07-04 00:45:41 +02:00
|
|
|
Datum reloptions)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
|
|
|
Form_pg_class new_rel_reltup;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
/*
|
2001-02-12 21:07:21 +01:00
|
|
|
* first we update some of the information in our uncataloged relation's
|
1996-07-09 08:22:35 +02:00
|
|
|
* relation descriptor.
|
|
|
|
*/
|
|
|
|
new_rel_reltup = new_rel_desc->rd_rel;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2021-12-03 13:38:26 +01:00
|
|
|
/* The relation is empty */
|
|
|
|
new_rel_reltup->relpages = 0;
|
|
|
|
new_rel_reltup->reltuples = -1;
|
|
|
|
new_rel_reltup->relallvisible = 0;
|
|
|
|
|
|
|
|
/* Sequences always have a known size */
|
|
|
|
if (relkind == RELKIND_SEQUENCE)
|
2001-05-07 02:43:27 +02:00
|
|
|
{
|
2021-12-03 13:38:26 +01:00
|
|
|
new_rel_reltup->relpages = 1;
|
|
|
|
new_rel_reltup->reltuples = 1;
|
2001-05-07 02:43:27 +02:00
|
|
|
}
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2019-03-29 04:01:14 +01:00
|
|
|
new_rel_reltup->relfrozenxid = relfrozenxid;
|
|
|
|
new_rel_reltup->relminmxid = relminmxid;
|
2005-08-26 05:08:15 +02:00
|
|
|
new_rel_reltup->relowner = relowner;
|
2001-02-12 21:07:21 +01:00
|
|
|
new_rel_reltup->reltype = new_type_oid;
|
2010-01-29 00:21:13 +01:00
|
|
|
	new_rel_reltup->reloftype = reloftype;

	/* relispartition is always set by updating this tuple later */
	new_rel_reltup->relispartition = false;

	/* fill rd_att's type ID with something sane even if reltype is zero */
	new_rel_desc->rd_att->tdtypeid = new_type_oid ? new_type_oid : RECORDOID;
	new_rel_desc->rd_att->tdtypmod = -1;

	/* Now build and insert the tuple */
	InsertPgClassTuple(pg_class_desc, new_rel_desc, new_rel_oid,
					   relacl, reloptions);
}

/* --------------------------------
 *		AddNewRelationType -
 *
 *		define a composite type corresponding to the new relation
 * --------------------------------
 */
static ObjectAddress
AddNewRelationType(const char *typeName,
				   Oid typeNamespace,
				   Oid new_rel_oid,
				   char new_rel_kind,
				   Oid ownerid,
				   Oid new_row_type,
				   Oid new_array_type)
{
	return
		TypeCreate(new_row_type,	/* optional predetermined OID */
				   typeName,	/* type name */
				   typeNamespace,	/* type namespace */
				   new_rel_oid, /* relation oid */
				   new_rel_kind,	/* relation kind */
				   ownerid,		/* owner's ID */
				   -1,			/* internal size (varlena) */
				   TYPTYPE_COMPOSITE,	/* type-type (composite) */
				   TYPCATEGORY_COMPOSITE,	/* type-category (ditto) */
				   false,		/* composite types are never preferred */
				   DEFAULT_TYPDELIM,	/* default array delimiter */
				   F_RECORD_IN, /* input procedure */
				   F_RECORD_OUT,	/* output procedure */
				   F_RECORD_RECV,	/* receive procedure */
				   F_RECORD_SEND,	/* send procedure */
				   InvalidOid,	/* typmodin procedure - none */
				   InvalidOid,	/* typmodout procedure - none */
				   InvalidOid,	/* analyze procedure - default */
				   InvalidOid,	/* subscript procedure - none */
				   InvalidOid,	/* array element type - irrelevant */
				   false,		/* this is not an array type */
				   new_array_type,	/* array type if any */
				   InvalidOid,	/* domain base type - irrelevant */
				   NULL,		/* default value - none */
				   NULL,		/* default binary representation */
				   false,		/* passed by reference */
				   TYPALIGN_DOUBLE, /* alignment - must be the largest! */
				   TYPSTORAGE_EXTENDED, /* fully TOASTable */
				   -1,			/* typmod */
				   0,			/* array dimensions for typBaseType */
				   false,		/* Type NOT NULL */
				   InvalidOid); /* rowtypes never have a collation */
}
|
|
|
|
|
|
|
|
/* --------------------------------
|
1997-11-28 18:28:02 +01:00
|
|
|
* heap_create_with_catalog
|
1996-07-09 08:22:35 +02:00
|
|
|
*
|
|
|
|
* creates a new cataloged relation. see comments above.
|
2009-10-05 21:24:49 +02:00
|
|
|
*
|
|
|
|
* Arguments:
|
|
|
|
* relname: name to give to new rel
|
|
|
|
* relnamespace: OID of namespace it goes in
|
|
|
|
* reltablespace: OID of tablespace it goes in
|
|
|
|
* relid: OID to assign to new rel, or InvalidOid to select a new OID
|
|
|
|
* reltypeid: OID to assign to rel's rowtype, or InvalidOid to select one
|
2012-03-08 21:52:26 +01:00
|
|
|
* reloftypeid: if a typed table, OID of underlying type; else InvalidOid
|
2009-10-05 21:24:49 +02:00
|
|
|
* ownerid: OID of new rel's owner
|
2021-04-09 06:53:07 +02:00
|
|
|
* accessmtd: OID of new rel's access method
|
2009-10-05 21:24:49 +02:00
|
|
|
* tupdesc: tuple descriptor (source of column definitions)
|
|
|
|
* cooked_constraints: list of precooked check constraints and defaults
|
|
|
|
* relkind: relkind for new rel
|
2012-03-08 21:52:26 +01:00
|
|
|
* relpersistence: rel's persistence status (permanent, temp, or unlogged)
|
2017-08-16 06:22:32 +02:00
|
|
|
* shared_relation: true if it's to be a shared relation
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
|
|
|
* mapped_relation: true if the relation will use the relfilenumber map
|
2009-10-05 21:24:49 +02:00
|
|
|
* oncommit: ON COMMIT marking (only relevant if it's a temp table)
|
|
|
|
* reloptions: reloptions in Datum form, or (Datum) 0 if none
|
2017-08-16 06:22:32 +02:00
|
|
|
* use_user_acl: true if should look for user-defined default permissions;
|
|
|
|
* if false, relacl is always set NULL
|
|
|
|
* allow_system_table_mods: true to allow creation in system namespaces
|
2015-03-03 18:03:33 +01:00
|
|
|
* is_internal: is this a system-generated catalog?
|
2009-10-05 21:24:49 +02:00
|
|
|
*
|
Change many routines to return ObjectAddress rather than OID
The changed routines are mostly those that can be directly called by
ProcessUtilitySlow; the intention is to make the affected object
information more precise, in support for future event trigger changes.
Originally it was envisioned that the OID of the affected object would
be enough, and in most cases that is correct, but upon actually
implementing the event trigger changes it turned out that ObjectAddress
is more widely useful.
Additionally, some command execution routines grew an output argument
that's an object address which provides further info about the executed
command. To wit:
* for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of
the new constraint
* for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the
schema that originally contained the object.
* for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address
of the object added to or dropped from the extension.
There's no user-visible change in this commit, and no functional change
either.
Discussion: 20150218213255.GC6717@tamriel.snowman.net
Reviewed-By: Stephen Frost, Andres Freund
2015-03-03 18:10:50 +01:00
|
|
|
* Output parameters:
|
|
|
|
* typaddress: if not null, gets the object address of the new pg_type entry
|
Don't create pg_type entries for sequences or toast tables.
Commit f7f70d5e2 left one inconsistency behind: we're still creating
pg_type entries for the composite types of sequences and toast tables,
but not arrays over those composites. But there seems precious little
reason to have named composite types for toast tables, and not much more
to have them for sequences (especially given the thought that sequences
may someday not be standalone relations at all).
So, let's close that inconsistency by removing these composite types,
rather than adding arrays for them. This buys back a little bit of
the initial pg_type bloat added by the previous patch, and could be
a significant savings in a large database with many toast tables.
Aside from a small logic rearrangement in heap_create_with_catalog,
this patch mostly needs to clean up some places that were assuming that
pg_class.reltype always has a valid value. Those are really pre-existing
bugs, given that it's documented otherwise; notably, the plpgsql changes
fix code that gives "cache lookup failed for type 0" on indexes today.
But none of these seem interesting enough to back-patch.
Also, remove the pg_dump/pg_upgrade infrastructure for propagating
a toast table's pg_type OID into the new database, since we no longer
need that.
Discussion: https://postgr.es/m/761F1389-C6A8-4C15-80CE-950C961F5341@gmail.com
2020-07-07 21:43:22 +02:00
|
|
|
* (this must be null if the relkind is one that doesn't get a pg_type entry)
|
Change many routines to return ObjectAddress rather than OID
The changed routines are mostly those that can be directly called by
ProcessUtilitySlow; the intention is to make the affected object
information more precise, in support for future event trigger changes.
Originally it was envisioned that the OID of the affected object would
be enough, and in most cases that is correct, but upon actually
implementing the event trigger changes it turned out that ObjectAddress
is more widely useful.
Additionally, some command execution routines grew an output argument
that's an object address which provides further info about the executed
command. To wit:
* for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of
the new constraint
* for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the
schema that originally contained the object.
* for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address
of the object added to or dropped from the extension.
There's no user-visible change in this commit, and no functional change
either.
Discussion: 20150218213255.GC6717@tamriel.snowman.net
Reviewed-By: Stephen Frost, Andres Freund
2015-03-03 18:10:50 +01:00
|
|
|
*
|
2009-10-05 21:24:49 +02:00
|
|
|
* Returns the OID of the new relation
|
1996-07-09 08:22:35 +02:00
|
|
|
* --------------------------------
|
|
|
|
*/
|
|
|
|
Oid
|
2002-03-31 08:26:32 +02:00
|
|
|
heap_create_with_catalog(const char *relname,
|
2002-03-26 20:17:02 +01:00
|
|
|
Oid relnamespace,
|
2004-06-18 08:14:31 +02:00
|
|
|
Oid reltablespace,
|
2005-04-14 03:38:22 +02:00
|
|
|
Oid relid,
|
2009-09-27 00:42:03 +02:00
|
|
|
Oid reltypeid,
|
2010-01-29 00:21:13 +01:00
|
|
|
Oid reloftypeid,
|
2005-08-26 05:08:15 +02:00
|
|
|
Oid ownerid,
|
tableam: introduce table AM infrastructure.
This introduces the concept of table access methods, i.e. CREATE
ACCESS METHOD ... TYPE TABLE and
CREATE TABLE ... USING (storage-engine).
No table access functionality is delegated to table AMs as of this
commit, that'll be done in following commits.
Subsequent commits will incrementally abstract table access
functionality to be routed through table access methods. That change
is too large to be reviewed & committed at once, so it'll be done
incrementally.
Docs will be updated at the end, as adding them incrementally would
likely make them less coherent, and definitely is a lot more work,
without a lot of benefit.
Table access methods are specified similar to index access methods,
i.e. pg_am.amhandler returns, as INTERNAL, a pointer to a struct with
callbacks. In contrast to index AMs that struct needs to live as long
as a backend, typically that's achieved by just returning a pointer to
a constant struct.
Psql's \d+ now displays a table's access method. That can be disabled
with HIDE_TABLEAM=true, which is mainly useful so regression tests can
be run against different AMs. It's quite possible that this behaviour
still needs to be fine tuned.
For now it's not allowed to set a table AM for a partitioned table, as
we've not resolved how partitions would inherit that. Disallowing
allows us to introduce, if we decide that's the way forward, such a
behaviour without a compatibility break.
Catversion bumped, to add the heap table AM and references to it.
Author: Haribabu Kommi, Andres Freund, Alvaro Herrera, Dimitri Golgov and others
Discussion:
https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
https://postgr.es/m/20190107235616.6lur25ph22u5u5av@alap3.anarazel.de
https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 18:54:38 +01:00
|
|
|
Oid accessmtd,
|
1998-08-06 07:13:14 +02:00
|
|
|
TupleDesc tupdesc,
|
2008-05-10 01:32:05 +02:00
|
|
|
List *cooked_constraints,
|
1999-02-02 04:45:56 +01:00
|
|
|
char relkind,
|
2010-12-13 18:34:26 +01:00
|
|
|
char relpersistence,
|
2002-04-27 23:24:34 +02:00
|
|
|
bool shared_relation,
|
2010-02-07 21:48:13 +01:00
|
|
|
bool mapped_relation,
|
2002-11-11 23:19:25 +01:00
|
|
|
OnCommitAction oncommit,
|
2006-07-04 00:45:41 +02:00
|
|
|
Datum reloptions,
|
2009-10-05 21:24:49 +02:00
|
|
|
bool use_user_acl,
|
2012-10-23 23:07:26 +02:00
|
|
|
bool allow_system_table_mods,
|
Change many routines to return ObjectAddress rather than OID
The changed routines are mostly those that can be directly called by
ProcessUtilitySlow; the intention is to make the affected object
information more precise, in support for future event trigger changes.
Originally it was envisioned that the OID of the affected object would
be enough, and in most cases that is correct, but upon actually
implementing the event trigger changes it turned out that ObjectAddress
is more widely useful.
Additionally, some command execution routines grew an output argument
that's an object address which provides further info about the executed
command. To wit:
* for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of
the new constraint
* for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the
schema that originally contained the object.
* for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address
of the object added to or dropped from the extension.
There's no user-visible change in this commit, and no functional change
either.
Discussion: 20150218213255.GC6717@tamriel.snowman.net
Reviewed-By: Stephen Frost, Andres Freund
2015-03-03 18:10:50 +01:00
|
|
|
bool is_internal,
|
2018-03-21 14:13:24 +01:00
|
|
|
Oid relrewrite,
|
Change many routines to return ObjectAddress rather than OID
The changed routines are mostly those that can be directly called by
ProcessUtilitySlow; the intention is to make the affected object
information more precise, in support for future event trigger changes.
Originally it was envisioned that the OID of the affected object would
be enough, and in most cases that is correct, but upon actually
implementing the event trigger changes it turned out that ObjectAddress
is more widely useful.
Additionally, some command execution routines grew an output argument
that's an object address which provides further info about the executed
command. To wit:
* for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of
the new constraint
* for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the
schema that originally contained the object.
* for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address
of the object added to or dropped from the extension.
There's no user-visible change in this commit, and no functional change
either.
Discussion: 20150218213255.GC6717@tamriel.snowman.net
Reviewed-By: Stephen Frost, Andres Freund
2015-03-03 18:10:50 +01:00
|
|
|
ObjectAddress *typaddress)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
|
|
|
Relation pg_class_desc;
|
|
|
|
Relation new_rel_desc;
|
2009-10-05 21:24:49 +02:00
|
|
|
Acl *relacl;
|
2010-07-26 01:21:22 +02:00
|
|
|
Oid existing_relid;
|
2007-05-12 02:55:00 +02:00
|
|
|
Oid old_type_oid;
|
2001-02-12 21:07:21 +01:00
|
|
|
Oid new_type_oid;
|
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
Currently, database OIDs, relfilenodes, and tablespace OIDs can all
change when a cluster is upgraded using pg_upgrade. It seems better
to preserve them, because (1) it makes troubleshooting pg_upgrade
easier, since you don't have to do a lot of work to match up files
in the old and new clusters, (2) it allows 'rsync' to save bandwidth
when used to re-sync a cluster after an upgrade, and (3) if we ever
encrypt or sign blocks, we would likely want to use a nonce that
depends on these values.
This patch only arranges to preserve relfilenodes and tablespace
OIDs. The task of preserving database OIDs is left for another patch,
since it involves some complexities that don't exist in these cases.
Database OIDs have a similar issue, but there are some tricky points
in that case that do not apply to these cases, so that problem is left
for another patch.
Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.
Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
2022-01-17 19:32:44 +01:00
|
|
|
|
|
|
|
/* By default set to InvalidOid unless overridden by binary-upgrade */
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
	RelFileNumber relfilenumber = InvalidRelFileNumber;
	TransactionId relfrozenxid;
	MultiXactId relminmxid;

	pg_class_desc = table_open(RelationRelationId, RowExclusiveLock);

	/*
	 * sanity checks
	 */
	Assert(IsNormalProcessingMode() || IsBootstrapProcessingMode());

	/*
	 * Validate proposed tupdesc for the desired relkind.  If
	 * allow_system_table_mods is on, allow ANYARRAY to be used; this is a
	 * hack to allow creating pg_statistic and cloning it during VACUUM FULL.
	 */
	CheckAttributeNamesTypes(tupdesc, relkind,
							 allow_system_table_mods ? CHKATYPE_ANYARRAY : 0);
	/*
	 * This would fail later on anyway, if the relation already exists.  But
	 * by catching it here we can emit a nicer error message.
	 */
	existing_relid = get_relname_relid(relname, relnamespace);
	if (existing_relid != InvalidOid)
		ereport(ERROR,
				(errcode(ERRCODE_DUPLICATE_TABLE),
				 errmsg("relation \"%s\" already exists", relname)));
	/*
	 * Since we are going to create a rowtype as well, also check for
	 * collision with an existing type name.  If there is one and it's an
	 * autogenerated array, we can rename it out of the way; otherwise we can
	 * at least give a good error message.
	 */
Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring a pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used); only oids assigned later will be above
FirstBootstrapObjectId. As the oid column is now a normal column, the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to merge this
now. It's painful to maintain externally, too complicated to commit
after the code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
	old_type_oid = GetSysCacheOid2(TYPENAMENSP, Anum_pg_type_oid,
								   CStringGetDatum(relname),
								   ObjectIdGetDatum(relnamespace));
	if (OidIsValid(old_type_oid))
	{
		if (!moveArrayTypeName(old_type_oid, relname, relnamespace))
			ereport(ERROR,
					(errcode(ERRCODE_DUPLICATE_OBJECT),
					 errmsg("type \"%s\" already exists", relname),
					 errhint("A relation has an associated type of the same name, "
							 "so you must use a name that doesn't conflict "
							 "with any existing type.")));
	}
	/*
	 * Shared relations must be in pg_global (last-ditch check)
	 */
	if (shared_relation && reltablespace != GLOBALTABLESPACE_OID)
		elog(ERROR, "shared relations must be placed in pg_global tablespace");
	/*
	 * Allocate an OID for the relation, unless we were told what to use.
	 *
	 * The OID will be the relfilenumber as well, so make sure it doesn't
	 * collide with either pg_class OIDs or existing physical files.
	 */
	if (!OidIsValid(relid))
	{
		/* Use binary-upgrade override for pg_class.oid and relfilenumber */
		if (IsBinaryUpgrade)
		{
			/*
			 * Indexes are not supported here; they use
			 * binary_upgrade_next_index_pg_class_oid.
			 */
			Assert(relkind != RELKIND_INDEX);
			Assert(relkind != RELKIND_PARTITIONED_INDEX);

			if (relkind == RELKIND_TOASTVALUE)
			{
				/* There might be no TOAST table, so we have to test for it. */
				if (OidIsValid(binary_upgrade_next_toast_pg_class_oid))
				{
					relid = binary_upgrade_next_toast_pg_class_oid;
					binary_upgrade_next_toast_pg_class_oid = InvalidOid;
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
Currently, database OIDs, relfilenodes, and tablespace OIDs can all
change when a cluster is upgraded using pg_upgrade. It seems better
to preserve them, because (1) it makes troubleshooting pg_upgrade
easier, since you don't have to do a lot of work to match up files
in the old and new clusters, (2) it allows 'rsync' to save bandwidth
when used to re-sync a cluster after an upgrade, and (3) if we ever
encrypt or sign blocks, we would likely want to use a nonce that
depends on these values.
This patch only arranges to preserve relfilenodes and tablespace
OIDs. The task of preserving database OIDs is left for another patch,
since it involves some tricky points that do not apply to these cases.
Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.
Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
2022-01-17 19:32:44 +01:00
|
|
|
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
|
|
|
if (!RelFileNumberIsValid(binary_upgrade_next_toast_pg_class_relfilenumber))
|
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
Currently, database OIDs, relfilenodes, and tablespace OIDs can all
change when a cluster is upgraded using pg_upgrade. It seems better
to preserve them, because (1) it makes troubleshooting pg_upgrade
easier, since you don't have to do a lot of work to match up files
in the old and new clusters, (2) it allows 'rsync' to save bandwidth
when used to re-sync a cluster after an upgrade, and (3) if we ever
encrypt or sign blocks, we would likely want to use a nonce that
depends on these values.
This patch only arranges to preserve relfilenodes and tablespace
OIDs. The task of preserving database OIDs is left for another patch,
since it involves some complexities that don't exist in these cases.
Database OIDs have a similar issue, but there are some tricky points
in that case that do not apply to these cases, so that problem is left
for another patch.
Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.
Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
2022-01-17 19:32:44 +01:00
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
|
|
|
errmsg("toast relfilenumber value not set when in binary upgrade mode")));
|
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
Currently, database OIDs, relfilenodes, and tablespace OIDs can all
change when a cluster is upgraded using pg_upgrade. It seems better
to preserve them, because (1) it makes troubleshooting pg_upgrade
easier, since you don't have to do a lot of work to match up files
in the old and new clusters, (2) it allows 'rsync' to save bandwidth
when used to re-sync a cluster after an upgrade, and (3) if we ever
encrypt or sign blocks, we would likely want to use a nonce that
depends on these values.
This patch only arranges to preserve relfilenodes and tablespace
OIDs. The task of preserving database OIDs is left for another patch,
since it involves some complexities that don't exist in these cases.
Database OIDs have a similar issue, but there are some tricky points
in that case that do not apply to these cases, so that problem is left
for another patch.
Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.
Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
2022-01-17 19:32:44 +01:00
|
|
|
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
|
|
|
relfilenumber = binary_upgrade_next_toast_pg_class_relfilenumber;
|
|
|
|
binary_upgrade_next_toast_pg_class_relfilenumber = InvalidRelFileNumber;
|
2021-12-03 13:38:26 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
if (!OidIsValid(binary_upgrade_next_heap_pg_class_oid))
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
|
|
|
|
errmsg("pg_class heap OID value not set when in binary upgrade mode")));
|
|
|
|
|
|
|
|
relid = binary_upgrade_next_heap_pg_class_oid;
|
|
|
|
binary_upgrade_next_heap_pg_class_oid = InvalidOid;
|
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
Currently, database OIDs, relfilenodes, and tablespace OIDs can all
change when a cluster is upgraded using pg_upgrade. It seems better
to preserve them, because (1) it makes troubleshooting pg_upgrade
easier, since you don't have to do a lot of work to match up files
in the old and new clusters, (2) it allows 'rsync' to save bandwidth
when used to re-sync a cluster after an upgrade, and (3) if we ever
encrypt or sign blocks, we would likely want to use a nonce that
depends on these values.
This patch only arranges to preserve relfilenodes and tablespace
OIDs. The task of preserving database OIDs is left for another patch,
since it involves some complexities that don't exist in these cases.
Database OIDs have a similar issue, but there are some tricky points
in that case that do not apply to these cases, so that problem is left
for another patch.
Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.
Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
2022-01-17 19:32:44 +01:00
|
|
|
|
|
|
|
if (RELKIND_HAS_STORAGE(relkind))
|
|
|
|
{
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
|
|
|
if (!RelFileNumberIsValid(binary_upgrade_next_heap_pg_class_relfilenumber))
|
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
Currently, database OIDs, relfilenodes, and tablespace OIDs can all
change when a cluster is upgraded using pg_upgrade. It seems better
to preserve them, because (1) it makes troubleshooting pg_upgrade
easier, since you don't have to do a lot of work to match up files
in the old and new clusters, (2) it allows 'rsync' to save bandwidth
when used to re-sync a cluster after an upgrade, and (3) if we ever
encrypt or sign blocks, we would likely want to use a nonce that
depends on these values.
This patch only arranges to preserve relfilenodes and tablespace
OIDs. The task of preserving database OIDs is left for another patch,
since it involves some complexities that don't exist in these cases.
Database OIDs have a similar issue, but there are some tricky points
in that case that do not apply to these cases, so that problem is left
for another patch.
Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.
Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
2022-01-17 19:32:44 +01:00
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
|
|
|
errmsg("relfilenumber value not set when in binary upgrade mode")));
|
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
Currently, database OIDs, relfilenodes, and tablespace OIDs can all
change when a cluster is upgraded using pg_upgrade. It seems better
to preserve them, because (1) it makes troubleshooting pg_upgrade
easier, since you don't have to do a lot of work to match up files
in the old and new clusters, (2) it allows 'rsync' to save bandwidth
when used to re-sync a cluster after an upgrade, and (3) if we ever
encrypt or sign blocks, we would likely want to use a nonce that
depends on these values.
This patch only arranges to preserve relfilenodes and tablespace
OIDs. The task of preserving database OIDs is left for another patch,
since it involves some complexities that don't exist in these cases.
Database OIDs have a similar issue, but there are some tricky points
in that case that do not apply to these cases, so that problem is left
for another patch.
Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.
Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
2022-01-17 19:32:44 +01:00
|
|
|
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.
Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".
Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.
On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.
Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.
Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 17:39:09 +02:00
|
|
|
relfilenumber = binary_upgrade_next_heap_pg_class_relfilenumber;
|
|
|
|
binary_upgrade_next_heap_pg_class_relfilenumber = InvalidRelFileNumber;
|
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
Currently, database OIDs, relfilenodes, and tablespace OIDs can all
change when a cluster is upgraded using pg_upgrade. It seems better
to preserve them, because (1) it makes troubleshooting pg_upgrade
easier, since you don't have to do a lot of work to match up files
in the old and new clusters, (2) it allows 'rsync' to save bandwidth
when used to re-sync a cluster after an upgrade, and (3) if we ever
encrypt or sign blocks, we would likely want to use a nonce that
depends on these values.
This patch only arranges to preserve relfilenodes and tablespace
OIDs. The task of preserving database OIDs is left for another patch,
since it involves some complexities that don't exist in these cases.
Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.
Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
2022-01-17 19:32:44 +01:00
|
|
|
}
|
2021-12-03 13:38:26 +01:00
|
|
|
}
|
2010-02-03 02:14:17 +01:00
|
|
|
}
|
2021-12-03 13:38:26 +01:00
|
|
|
|
|
|
|
if (!OidIsValid(relid))
|
2022-09-28 15:45:27 +02:00
|
|
|
relid = GetNewRelFileNumber(reltablespace, pg_class_desc,
|
|
|
|
relpersistence);
|
2010-01-06 04:04:03 +01:00
|
|
|
}
|
2005-08-12 03:36:05 +02:00
|
|
|
|
2009-10-05 21:24:49 +02:00
|
|
|
/*
|
|
|
|
* Determine the relation's initial permissions.
|
|
|
|
*/
|
|
|
|
if (use_user_acl)
|
|
|
|
{
|
|
|
|
switch (relkind)
|
|
|
|
{
|
|
|
|
case RELKIND_RELATION:
|
|
|
|
case RELKIND_VIEW:
|
2013-03-04 01:23:31 +01:00
|
|
|
case RELKIND_MATVIEW:
|
2011-01-02 05:48:11 +01:00
|
|
|
case RELKIND_FOREIGN_TABLE:
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
case RELKIND_PARTITIONED_TABLE:
|
2017-10-12 00:35:19 +02:00
|
|
|
relacl = get_user_default_acl(OBJECT_TABLE, ownerid,
|
2009-10-05 21:24:49 +02:00
|
|
|
relnamespace);
|
|
|
|
break;
|
|
|
|
case RELKIND_SEQUENCE:
|
2017-10-12 00:35:19 +02:00
|
|
|
relacl = get_user_default_acl(OBJECT_SEQUENCE, ownerid,
|
2009-10-05 21:24:49 +02:00
|
|
|
relnamespace);
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
relacl = NULL;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
|
|
|
relacl = NULL;
|
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
/*
|
2002-08-11 23:17:35 +02:00
|
|
|
* Create the relcache entry (mostly dummy at this point) and the physical
|
|
|
|
* disk file. (If we fail further down, it's the smgr's responsibility to
|
|
|
|
* remove the disk file again.)
|
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
2022-01-17 19:32:44 +01:00
|
|
|
*
|
|
|
|
* NB: Note that passing create_storage = true is correct even for binary
|
|
|
|
* upgrade. The storage we create here will be replaced later, but we
|
|
|
|
* need to have something on disk in the meanwhile.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
2002-04-27 23:24:34 +02:00
|
|
|
new_rel_desc = heap_create(relname,
|
|
|
|
relnamespace,
|
2004-06-18 08:14:31 +02:00
|
|
|
reltablespace,
|
2005-04-14 03:38:22 +02:00
|
|
|
relid,
|
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
2022-07-06 17:39:09 +02:00
|
|
|
relfilenumber,
|
tableam: introduce table AM infrastructure.
This introduces the concept of table access methods, i.e. CREATE
ACCESS METHOD ... TYPE TABLE and
CREATE TABLE ... USING (storage-engine).
No table access functionality is delegated to table AMs as of this
commit, that'll be done in following commits.
Subsequent commits will incrementally abstract table access
functionality to be routed through table access methods. That change
is too large to be reviewed & committed at once, so it'll be done
incrementally.
Docs will be updated at the end, as adding them incrementally would
likely make them less coherent, and definitely is a lot more work,
without a lot of benefit.
Table access methods are specified similar to index access methods,
i.e. pg_am.amhandler returns, as INTERNAL, a pointer to a struct with
callbacks. In contrast to index AMs that struct needs to live as long
as a backend, typically that's achieved by just returning a pointer to
a constant struct.
Psql's \d+ now displays a table's access method. That can be disabled
with HIDE_TABLEAM=true, which is mainly useful so regression tests can
be run against different AMs. It's quite possible that this behaviour
still needs to be fine tuned.
For now it's not allowed to set a table AM for a partitioned table, as
we've not resolved how partitions would inherit that. Disallowing
allows us to introduce, if we decide that's the way forward, such a
behaviour without a compatibility break.
Catversion bumped, to add the heap table AM and references to it.
Author: Haribabu Kommi, Andres Freund, Alvaro Herrera, Dimitri Golgov and others
Discussion:
https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
https://postgr.es/m/20190107235616.6lur25ph22u5u5av@alap3.anarazel.de
https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 18:54:38 +01:00
|
|
|
accessmtd,
|
2002-04-27 23:24:34 +02:00
|
|
|
tupdesc,
|
2004-08-31 19:10:36 +02:00
|
|
|
relkind,
|
2010-12-13 18:34:26 +01:00
|
|
|
relpersistence,
|
2002-04-27 23:24:34 +02:00
|
|
|
shared_relation,
|
2013-06-03 16:22:31 +02:00
|
|
|
mapped_relation,
|
2019-03-29 04:01:14 +01:00
|
|
|
allow_system_table_mods,
|
|
|
|
&relfrozenxid,
|
pg_upgrade: Preserve relfilenodes and tablespace OIDs.
2022-01-17 19:32:44 +01:00
|
|
|
&relminmxid,
|
|
|
|
true);
|
1999-02-02 04:45:56 +01:00
|
|
|
|
2005-08-12 03:36:05 +02:00
|
|
|
Assert(relid == RelationGetRelid(new_rel_desc));
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2018-03-21 14:13:24 +01:00
|
|
|
new_rel_desc->rd_rel->relrewrite = relrewrite;
|
|
|
|
|
2005-08-12 03:36:05 +02:00
|
|
|
/*
|
Don't create pg_type entries for sequences or toast tables.
Commit f7f70d5e2 left one inconsistency behind: we're still creating
pg_type entries for the composite types of sequences and toast tables,
but not arrays over those composites. But there seems precious little
reason to have named composite types for toast tables, and not much more
to have them for sequences (especially given the thought that sequences
may someday not be standalone relations at all).
So, let's close that inconsistency by removing these composite types,
rather than adding arrays for them. This buys back a little bit of
the initial pg_type bloat added by the previous patch, and could be
a significant savings in a large database with many toast tables.
Aside from a small logic rearrangement in heap_create_with_catalog,
this patch mostly needs to clean up some places that were assuming that
pg_class.reltype always has a valid value. Those are really pre-existing
bugs, given that it's documented otherwise; notably, the plpgsql changes
fix code that gives "cache lookup failed for type 0" on indexes today.
But none of these seem interesting enough to back-patch.
Also, remove the pg_dump/pg_upgrade infrastructure for propagating
a toast table's pg_type OID into the new database, since we no longer
need that.
Discussion: https://postgr.es/m/761F1389-C6A8-4C15-80CE-950C961F5341@gmail.com
2020-07-07 21:43:22 +02:00
|
|
|
* Decide whether to create a pg_type entry for the relation's rowtype.
|
|
|
|
* These types are made except where the use of a relation as such is an
|
Create composite array types for initdb-created relations.
When we invented arrays of composite types (commit bc8036fc6),
we excluded system catalogs, basically just on the grounds of not
wanting to bloat pg_type. However, it's definitely inconsistent that
catalogs' composite types can't be put into arrays when others can.
Another problem is that the exclusion is done by checking
IsUnderPostmaster in heap_create_with_catalog, which means that
(1) If a user tries to create a table in single-user mode, it doesn't
get an array type. That's bad in itself, plus it breaks pg_upgrade.
(2) If someone drops and recreates a system view or information_schema
view (as we occasionally recommend doing), it will now have an array
type where it did not before, making for still more inconsistency.
So this is all pretty messy. Let's just get rid of the inconsistency
and decree that system-created relations should have array types if
similar user-created ones would, i.e. it only depends on the relkind.
As of HEAD, that means that the initial contents of pg_type grow from
411 rows to 605, which is a lot of growth percentage-wise, but it's
still quite a small catalog compared to others.
Wenjing Zeng, reviewed by Shawn Wang, further hacking by me
Discussion: https://postgr.es/m/761F1389-C6A8-4C15-80CE-950C961F5341@gmail.com
2020-07-06 20:21:16 +02:00
|
|
|
* implementation detail: toast tables, sequences and indexes.
|
Support arrays of composite types, including the rowtypes of regular tables
and views (but not system catalogs, nor sequences or toast tables). Get rid
of the hardwired convention that a type's array type is named exactly "_type",
instead using a new column pg_type.typarray to provide the linkage. (It still
will be named "_type", though, except in odd corner cases such as
maximum-length type names.)
Along the way, make tracking of owner and schema dependencies for types more
uniform: a type directly created by the user has these dependencies, while a
table rowtype or auto-generated array type does not have them, but depends on
its parent object instead.
David Fetter, Andrew Dunstan, Tom Lane
2007-05-11 19:57:14 +02:00
|
|
|
*/
|
Create composite array types for initdb-created relations.
2020-07-06 20:21:16 +02:00
|
|
|
if (!(relkind == RELKIND_SEQUENCE ||
|
|
|
|
relkind == RELKIND_TOASTVALUE ||
|
|
|
|
relkind == RELKIND_INDEX ||
|
|
|
|
relkind == RELKIND_PARTITIONED_INDEX))
|
Support arrays of composite types, including the rowtypes of regular tables
2007-05-11 19:57:14 +02:00
|
|
|
{
|
Don't create pg_type entries for sequences or toast tables.
2020-07-07 21:43:22 +02:00
|
|
|
Oid new_array_oid;
|
|
|
|
ObjectAddress new_type_addr;
|
Support arrays of composite types, including the rowtypes of regular tables
2007-05-11 19:57:14 +02:00
|
|
|
char *relarrayname;
|
|
|
|
|
Don't create pg_type entries for sequences or toast tables.
2020-07-07 21:43:22 +02:00
|
|
|
/*
|
|
|
|
* We'll make an array over the composite type, too. For largely
|
|
|
|
* historical reasons, the array type's OID is assigned first.
|
|
|
|
*/
|
|
|
|
new_array_oid = AssignTypeArrayOid();
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Make the pg_type entry for the composite type. The OID of the
|
|
|
|
* composite type can be preselected by the caller, but if reltypeid
|
|
|
|
* is InvalidOid, we'll generate a new OID for it.
|
|
|
|
*
|
|
|
|
* NOTE: we could get a unique-index failure here, in case someone
|
|
|
|
* else is creating the same type name in parallel but hadn't
|
|
|
|
* committed yet when we checked for a duplicate name above.
|
|
|
|
*/
|
|
|
|
new_type_addr = AddNewRelationType(relname,
|
|
|
|
relnamespace,
|
|
|
|
relid,
|
|
|
|
relkind,
|
|
|
|
ownerid,
|
|
|
|
reltypeid,
|
|
|
|
new_array_oid);
|
|
|
|
new_type_oid = new_type_addr.objectId;
|
|
|
|
if (typaddress)
|
|
|
|
*typaddress = new_type_addr;
|
|
|
|
|
|
|
|
/* Now create the array type. */
|
Support arrays of composite types, including the rowtypes of regular tables
2007-05-11 19:57:14 +02:00
|
|
|
relarrayname = makeArrayTypeName(relname, relnamespace);
|
|
|
|
|
|
|
|
TypeCreate(new_array_oid, /* force the type's OID to this */
|
|
|
|
relarrayname, /* Array type name */
|
|
|
|
relnamespace, /* Same namespace as parent */
|
|
|
|
InvalidOid, /* Not composite, no relationOid */
|
|
|
|
0, /* relkind, also N/A here */
|
Repair a longstanding bug in CLUSTER and the rewriting variants of ALTER
TABLE: if the command is executed by someone other than the table owner (eg,
a superuser) and the table has a toast table, the toast table's pg_type row
ends up with the wrong typowner, ie, the command issuer not the table owner.
This is quite harmless for most purposes, since no interesting permissions
checks consult the pg_type row. However, it could lead to unexpected failures
if one later tries to drop the role that issued the command (in 8.1 or 8.2),
or strange warnings from pg_dump afterwards (in 8.3 and up, which will allow
the DROP ROLE because we don't create a "redundant" owner dependency for table
rowtypes). Problem identified by Cott Lang.
Back-patch to 8.1. The problem is actually far older --- the CLUSTER variant
can be demonstrated in 7.0 --- but it's mostly cosmetic before 8.1 because we
didn't track ownership dependencies before 8.1. Also, fixing it before 8.1
would require changing the call signature of heap_create_with_catalog(), which
seems to carry a nontrivial risk of breaking add-on modules.
2009-02-24 02:38:10 +01:00
|
|
|
ownerid, /* owner's ID */
|
Support arrays of composite types, including the rowtypes of regular tables
2007-05-11 19:57:14 +02:00
|
|
|
-1, /* Internal size (varlena) */
|
|
|
|
TYPTYPE_BASE, /* Not composite - typelem is */
|
Replace the hard-wired type knowledge in TypeCategory() and IsPreferredType()
with system catalog lookups, as was foreseen to be necessary almost since
their creation. Instead put the information into two new pg_type columns,
typcategory and typispreferred. Add support for setting these when
creating a user-defined base type.
The category column is just a "char" (i.e. a poor man's enum), allowing
a crude form of user extensibility of the category list: just use an
otherwise-unused character. This seems sufficient for foreseen uses,
but we could upgrade to having an actual category catalog someday, if
there proves to be a huge demand for custom type categories.
In this patch I have attempted to hew exactly to the behavior of the
previous hardwired logic, except for introducing new type categories for
arrays, composites, and enums. In particular the default preferred state
for user-defined types remains TRUE. That seems worth revisiting, but it
should be done as a separate patch from introducing the infrastructure.
Likewise, any adjustment of the standard set of categories should be done
separately.
2008-07-30 19:05:05 +02:00
|
|
|
TYPCATEGORY_ARRAY, /* type-category (array) */
|
2008-07-30 21:35:13 +02:00
|
|
|
false, /* array types are never preferred */
|
Support arrays of composite types, including the rowtypes of regular tables
2007-05-11 19:57:14 +02:00
|
|
|
DEFAULT_TYPDELIM, /* default array delimiter */
|
|
|
|
F_ARRAY_IN, /* array input proc */
|
|
|
|
F_ARRAY_OUT, /* array output proc */
|
|
|
|
F_ARRAY_RECV, /* array recv (bin) proc */
|
|
|
|
F_ARRAY_SEND, /* array send (bin) proc */
|
|
|
|
InvalidOid, /* typmodin procedure - none */
|
|
|
|
InvalidOid, /* typmodout procedure - none */
|
2012-03-04 02:20:19 +01:00
|
|
|
F_ARRAY_TYPANALYZE, /* array analyze procedure */
|
Support subscripting of arbitrary types, not only arrays.
This patch generalizes the subscripting infrastructure so that any
data type can be subscripted, if it provides a handler function to
define what that means. Traditional variable-length (varlena) arrays
all use array_subscript_handler(), while the existing fixed-length
types that support subscripting use raw_array_subscript_handler().
It's expected that other types that want to use subscripting notation
will define their own handlers. (This patch provides no such new
features, though; it only lays the foundation for them.)
To do this, move the parser's semantic processing of subscripts
(including coercion to whatever data type is required) into a
method callback supplied by the handler. On the execution side,
replace the ExecEvalSubscriptingRef* layer of functions with direct
calls to callback-supplied execution routines. (Thus, essentially
no new run-time overhead should be caused by this patch. Indeed,
there is room to remove some overhead by supplying specialized
execution routines. This patch does a little bit in that line,
but more could be done.)
Additional work is required here and there to remove formerly
hard-wired assumptions about the result type, collation, etc
of a SubscriptingRef expression node; and to remove assumptions
that the subscript values must be integers.
One useful side-effect of this is that we now have a less squishy
mechanism for identifying whether a data type is a "true" array:
instead of wiring in weird rules about typlen, we can look to see
if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this
to be bulletproof, we have to forbid user-defined types from using
that handler directly; but there seems no good reason for them to
do so.
This patch also removes assumptions that the number of subscripts
is limited to MAXDIM (6), or indeed has any hard-wired limit.
That limit still applies to types handled by array_subscript_handler
or raw_array_subscript_handler, but to discourage other dependencies
on this constant, I've moved it from c.h to utils/array.h.
Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov,
Peter Eisentraut, Pavel Stehule
Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com
Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
2020-12-09 18:40:37 +01:00
|
|
|
F_ARRAY_SUBSCRIPT_HANDLER, /* array subscript procedure */
|
Support arrays of composite types, including the rowtypes of regular tables
2007-05-11 19:57:14 +02:00
|
|
|
new_type_oid, /* array element type - the rowtype */
|
|
|
|
true, /* yes, this is an array type */
|
|
|
|
InvalidOid, /* this has no array type */
|
|
|
|
InvalidOid, /* domain base type - irrelevant */
|
|
|
|
NULL, /* default value - none */
|
|
|
|
NULL, /* default binary representation */
|
|
|
|
false, /* passed by reference */
|
2020-03-04 16:34:25 +01:00
|
|
|
TYPALIGN_DOUBLE, /* alignment - must be the largest! */
|
|
|
|
TYPSTORAGE_EXTENDED, /* fully TOASTable */
|
Support arrays of composite types, including the rowtypes of regular tables
and views (but not system catalogs, nor sequences or toast tables). Get rid
of the hardwired convention that a type's array type is named exactly "_type",
instead using a new column pg_type.typarray to provide the linkage. (It still
will be named "_type", though, except in odd corner cases such as
maximum-length type names.)
Along the way, make tracking of owner and schema dependencies for types more
uniform: a type directly created by the user has these dependencies, while a
table rowtype or auto-generated array type does not have them, but depends on
its parent object instead.
David Fetter, Andrew Dunstan, Tom Lane
2007-05-11 19:57:14 +02:00
|
|
|
-1, /* typmod */
|
|
|
|
0, /* array dimensions for typBaseType */
|
2011-02-08 22:04:18 +01:00
|
|
|
false, /* Type NOT NULL */
|
2011-04-22 23:43:18 +02:00
|
|
|
InvalidOid); /* rowtypes never have a collation */
|
Support arrays of composite types, including the rowtypes of regular tables
and views (but not system catalogs, nor sequences or toast tables). Get rid
of the hardwired convention that a type's array type is named exactly "_type",
instead using a new column pg_type.typarray to provide the linkage. (It still
will be named "_type", though, except in odd corner cases such as
maximum-length type names.)
Along the way, make tracking of owner and schema dependencies for types more
uniform: a type directly created by the user has these dependencies, while a
table rowtype or auto-generated array type does not have them, but depends on
its parent object instead.
David Fetter, Andrew Dunstan, Tom Lane
2007-05-11 19:57:14 +02:00
|
|
|
|
|
|
|
pfree(relarrayname);
|
|
|
|
}
	else
	{
		/* Caller should not be expecting a type to be created. */
		Assert(reltypeid == InvalidOid);
		Assert(typaddress == NULL);

		new_type_oid = InvalidOid;
	}

	/*
	 * now create an entry in pg_class for the relation.
	 *
	 * NOTE: we could get a unique-index failure here, in case someone else is
	 * creating the same relation name in parallel but hadn't committed yet
	 * when we checked for a duplicate name above.
	 */
	AddNewRelationTuple(pg_class_desc,
						new_rel_desc,
						relid,
						new_type_oid,
						reloftypeid,
						ownerid,
						relkind,
						relfrozenxid,
						relminmxid,
						PointerGetDatum(relacl),
						reloptions);

	/*
	 * now add tuples to pg_attribute for the attributes in our new relation.
	 */
	AddNewAttributeTuples(relid, new_rel_desc->rd_att, relkind);

	/*
	 * Make a dependency link to force the relation to be deleted if its
	 * namespace is.  Also make a dependency link to its owner, as well as
	 * dependencies for any roles mentioned in the default ACL.
	 *
	 * For composite types, these dependencies are tracked for the pg_type
	 * entry, so we needn't record them here.  Likewise, TOAST tables don't
	 * need a namespace dependency (they live in a pinned namespace) nor an
	 * owner dependency (they depend indirectly through the parent table), nor
	 * should they have any ACL entries.  The same applies for extension
	 * dependencies.
	 *
	 * Also, skip this in bootstrap mode, since we don't make dependencies
	 * while bootstrapping.
	 */
	if (relkind != RELKIND_COMPOSITE_TYPE &&
		relkind != RELKIND_TOASTVALUE &&
		!IsBootstrapProcessingMode())
	{
		ObjectAddress myself,
					referenced;
		ObjectAddresses *addrs;

		ObjectAddressSet(myself, RelationRelationId, relid);

		recordDependencyOnOwner(RelationRelationId, relid, ownerid);

		recordDependencyOnNewAcl(RelationRelationId, relid, 0, ownerid, relacl);

		recordDependencyOnCurrentExtension(&myself, false);

		addrs = new_object_addresses();

		ObjectAddressSet(referenced, NamespaceRelationId, relnamespace);
		add_exact_object_address(&referenced, addrs);

		if (reloftypeid)
		{
			ObjectAddressSet(referenced, TypeRelationId, reloftypeid);
			add_exact_object_address(&referenced, addrs);
		}

		/*
		 * Make a dependency link to force the relation to be deleted if its
		 * access method is.
		 *
		 * No need to add an explicit dependency for the toast table, as the
		 * main table depends on it.
		 */
		if (RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE)
		{
			ObjectAddressSet(referenced, AccessMethodRelationId, accessmtd);
			add_exact_object_address(&referenced, addrs);
		}

		record_object_address_dependencies(&myself, addrs, DEPENDENCY_NORMAL);
		free_object_addresses(addrs);
	}

	/* Post creation hook for new relation */
	InvokeObjectPostCreateHookArg(RelationRelationId, relid, 0, is_internal);

	/*
	 * Store any supplied constraints and defaults.
	 *
	 * NB: this may do a CommandCounterIncrement and rebuild the relcache
	 * entry, so the relation must be valid and self-consistent at this point.
	 * In particular, there are not yet constraints and defaults anywhere.
	 */
	StoreConstraints(new_rel_desc, cooked_constraints, is_internal);

	/*
	 * If there's a special on-commit action, remember it
	 */
	if (oncommit != ONCOMMIT_NOOP)
		register_on_commit_action(relid, oncommit);

	/*
	 * ok, the relation has been cataloged, so close our relations and return
	 * the OID of the newly created relation.
	 */
	table_close(new_rel_desc, NoLock);	/* do not unlock till end of xact */
	table_close(pg_class_desc, RowExclusiveLock);

	return relid;
}

/*
 * RelationRemoveInheritance
 *
 * Formerly, this routine checked for child relations and aborted the
 * deletion if any were found.  Now we rely on the dependency mechanism
 * to check for or delete child relations.  By the time we get here,
 * there are no children and we need only remove any pg_inherits rows
 * linking this relation to its parent(s).
 */
static void
RelationRemoveInheritance(Oid relid)
{
	Relation	catalogRelation;
	SysScanDesc scan;
	ScanKeyData key;
	HeapTuple	tuple;

	catalogRelation = table_open(InheritsRelationId, RowExclusiveLock);

	ScanKeyInit(&key,
				Anum_pg_inherits_inhrelid,
				BTEqualStrategyNumber, F_OIDEQ,
				ObjectIdGetDatum(relid));

	scan = systable_beginscan(catalogRelation, InheritsRelidSeqnoIndexId, true,
							  NULL, 1, &key);

	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
		CatalogTupleDelete(catalogRelation, &tuple->t_self);

	systable_endscan(scan);
	table_close(catalogRelation, RowExclusiveLock);
}

/*
 * DeleteRelationTuple
 *
 * Remove pg_class row for the given relid.
 *
 * Note: this is shared by relation deletion and index deletion.  It's
 * not intended for use anyplace else.
 */
void
DeleteRelationTuple(Oid relid)
{
	Relation	pg_class_desc;
	HeapTuple	tup;

	/* Grab an appropriate lock on the pg_class relation */
	pg_class_desc = table_open(RelationRelationId, RowExclusiveLock);

	tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
	if (!HeapTupleIsValid(tup))
		elog(ERROR, "cache lookup failed for relation %u", relid);

	/* delete the relation tuple from pg_class, and finish up */
	CatalogTupleDelete(pg_class_desc, &tup->t_self);

	ReleaseSysCache(tup);

	table_close(pg_class_desc, RowExclusiveLock);
}

/*
 * DeleteAttributeTuples
 *
 * Remove pg_attribute rows for the given relid.
 *
 * Note: this is shared by relation deletion and index deletion.  It's
 * not intended for use anyplace else.
 */
void
DeleteAttributeTuples(Oid relid)
{
	Relation	attrel;
	SysScanDesc scan;
	ScanKeyData key[1];
	HeapTuple	atttup;

	/* Grab an appropriate lock on the pg_attribute relation */
	attrel = table_open(AttributeRelationId, RowExclusiveLock);

	/* Use the index to scan only attributes of the target relation */
	ScanKeyInit(&key[0],
				Anum_pg_attribute_attrelid,
				BTEqualStrategyNumber, F_OIDEQ,
				ObjectIdGetDatum(relid));

	scan = systable_beginscan(attrel, AttributeRelidNumIndexId, true,
							  NULL, 1, key);

	/* Delete all the matching tuples */
	while ((atttup = systable_getnext(scan)) != NULL)
		CatalogTupleDelete(attrel, &atttup->t_self);

	/* Clean up after the scan */
	systable_endscan(scan);
	table_close(attrel, RowExclusiveLock);
}

/*
 * DeleteSystemAttributeTuples
 *
 * Remove pg_attribute rows for system columns of the given relid.
 *
 * Note: this is only used when converting a table to a view.  Views don't
 * have system columns, so we should remove them from pg_attribute.
 */
void
DeleteSystemAttributeTuples(Oid relid)
{
	Relation	attrel;
	SysScanDesc scan;
	ScanKeyData key[2];
	HeapTuple	atttup;

	/* Grab an appropriate lock on the pg_attribute relation */
	attrel = table_open(AttributeRelationId, RowExclusiveLock);

	/* Use the index to scan only system attributes of the target relation */
	ScanKeyInit(&key[0],
				Anum_pg_attribute_attrelid,
				BTEqualStrategyNumber, F_OIDEQ,
				ObjectIdGetDatum(relid));
	ScanKeyInit(&key[1],
				Anum_pg_attribute_attnum,
				BTLessEqualStrategyNumber, F_INT2LE,
				Int16GetDatum(0));

	scan = systable_beginscan(attrel, AttributeRelidNumIndexId, true,
							  NULL, 2, key);

	/* Delete all the matching tuples */
	while ((atttup = systable_getnext(scan)) != NULL)
		CatalogTupleDelete(attrel, &atttup->t_self);

	/* Clean up after the scan */
	systable_endscan(scan);
	table_close(attrel, RowExclusiveLock);
}

/*
 * RemoveAttributeById
 *
 * This is the guts of ALTER TABLE DROP COLUMN: actually mark the attribute
 * deleted in pg_attribute.  We also remove pg_statistic entries for it.
 * (Everything else needed, such as getting rid of any pg_attrdef entry,
 * is handled by dependency.c.)
 */
void
RemoveAttributeById(Oid relid, AttrNumber attnum)
{
	Relation	rel;
	Relation	attr_rel;
	HeapTuple	tuple;
	Form_pg_attribute attStruct;
	char		newattname[NAMEDATALEN];
	Datum		valuesAtt[Natts_pg_attribute] = {0};
	bool		nullsAtt[Natts_pg_attribute] = {0};
	bool		replacesAtt[Natts_pg_attribute] = {0};

	/*
	 * Grab an exclusive lock on the target table, which we will NOT release
	 * until end of transaction.  (In the simple case where we are directly
	 * dropping this column, ATExecDropColumn already did this ... but when
	 * cascading from a drop of some other object, we may not have any lock.)
	 */
	rel = relation_open(relid, AccessExclusiveLock);

	attr_rel = table_open(AttributeRelationId, RowExclusiveLock);

	tuple = SearchSysCacheCopy2(ATTNUM,
								ObjectIdGetDatum(relid),
								Int16GetDatum(attnum));
	if (!HeapTupleIsValid(tuple))	/* shouldn't happen */
		elog(ERROR, "cache lookup failed for attribute %d of relation %u",
			 attnum, relid);
	attStruct = (Form_pg_attribute) GETSTRUCT(tuple);

	/* Mark the attribute as dropped */
	attStruct->attisdropped = true;

	/*
	 * Set the type OID to invalid.  A dropped attribute's type link cannot be
	 * relied on (once the attribute is dropped, the type might be too).
	 * Fortunately we do not need the type row --- the only really essential
	 * information is the type's typlen and typalign, which are preserved in
	 * the attribute's attlen and attalign.  We set atttypid to zero here as a
	 * means of catching code that incorrectly expects it to be valid.
	 */
	attStruct->atttypid = InvalidOid;

	/* Remove any not-null constraint the column may have */
	attStruct->attnotnull = false;

	/* Unset this so no one tries to look up the generation expression */
	attStruct->attgenerated = '\0';

	/*
	 * Change the column name to something that isn't likely to conflict
	 */
	snprintf(newattname, sizeof(newattname),
			 "........pg.dropped.%d........", attnum);
	namestrcpy(&(attStruct->attname), newattname);

	/* Clear the missing value */
	attStruct->atthasmissing = false;
	nullsAtt[Anum_pg_attribute_attmissingval - 1] = true;
	replacesAtt[Anum_pg_attribute_attmissingval - 1] = true;

	/*
	 * Clear the other nullable fields.  This saves some space in pg_attribute
	 * and removes no longer useful information.
	 */
	nullsAtt[Anum_pg_attribute_attstattarget - 1] = true;
	replacesAtt[Anum_pg_attribute_attstattarget - 1] = true;
	nullsAtt[Anum_pg_attribute_attacl - 1] = true;
	replacesAtt[Anum_pg_attribute_attacl - 1] = true;
	nullsAtt[Anum_pg_attribute_attoptions - 1] = true;
	replacesAtt[Anum_pg_attribute_attoptions - 1] = true;
	nullsAtt[Anum_pg_attribute_attfdwoptions - 1] = true;
	replacesAtt[Anum_pg_attribute_attfdwoptions - 1] = true;

	tuple = heap_modify_tuple(tuple, RelationGetDescr(attr_rel),
							  valuesAtt, nullsAtt, replacesAtt);

	CatalogTupleUpdate(attr_rel, &tuple->t_self, tuple);

	/*
	 * Because updating the pg_attribute row will trigger a relcache flush for
	 * the target relation, we need not do anything else to notify other
	 * backends of the change.
	 */

	table_close(attr_rel, RowExclusiveLock);

	RemoveStatistics(relid, attnum);

	relation_close(rel, NoLock);
}
|
|
|
|
|
2004-08-28 23:05:26 +02:00
|
|
|
/*
|
|
|
|
* heap_drop_with_catalog - removes specified relation from catalogs
|
2001-06-18 18:13:21 +02:00
|
|
|
*
|
2002-07-12 20:43:19 +02:00
|
|
|
* Note that this routine is not responsible for dropping objects that are
|
|
|
|
* linked to the pg_class entry via dependencies (for example, indexes and
|
|
|
|
* constraints). Those are deleted by the dependency-tracing logic in
|
|
|
|
* dependency.c before control gets here. In general, therefore, this routine
|
|
|
|
* should never be called directly; go through performDeletion() instead.
|
1996-11-06 08:31:26 +01:00
|
|
|
*/
|
|
|
|
void
|
2004-08-28 23:05:26 +02:00
|
|
|
heap_drop_with_catalog(Oid relid)
|
1996-11-06 08:31:26 +01:00
|
|
|
{
|
1998-08-19 04:04:17 +02:00
|
|
|
Relation rel;
|
2017-04-11 15:08:36 +02:00
|
|
|
HeapTuple tuple;
|
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
|
|
|
Oid parentOid = InvalidOid,
|
|
|
|
defaultPartOid = InvalidOid;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
1996-11-06 08:31:26 +01:00
|
|
|
/*
|
2017-04-11 15:08:36 +02:00
|
|
|
* To drop a partition safely, we must grab exclusive lock on its parent,
|
|
|
|
* because another backend might be about to execute a query on the parent
|
|
|
|
* table. If it relies on previously cached partition descriptor, then it
|
|
|
|
* could attempt to access the just-dropped relation as its partition. We
|
|
|
|
* must therefore take a table lock strong enough to prevent all queries
|
|
|
|
* on the table from proceeding until we commit and send out a
|
2018-03-21 16:03:35 +01:00
|
|
|
* shared-cache-inval notice that will make them update their partition
|
|
|
|
* descriptors.
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
*/
|
2017-04-11 15:08:36 +02:00
|
|
|
tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
|
2017-11-28 01:22:08 +01:00
|
|
|
if (!HeapTupleIsValid(tuple))
|
|
|
|
elog(ERROR, "cache lookup failed for relation %u", relid);
|
2017-04-11 15:08:36 +02:00
|
|
|
if (((Form_pg_class) GETSTRUCT(tuple))->relispartition)
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
{
|
ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY
Allow a partition be detached from its partitioned table without
blocking concurrent queries, by running in two transactions and only
requiring ShareUpdateExclusive in the partitioned table.
Because it runs in two transactions, it cannot be used in a transaction
block. This is the main reason to use dedicated syntax: so that users
can choose to use the original mode if they need it. But also, it
doesn't work when a default partition exists (because an exclusive lock
would still need to be obtained on it, in order to change its partition
constraint.)
In case the second transaction is cancelled or a crash occurs, there's
ALTER TABLE .. DETACH PARTITION .. FINALIZE, which executes the final
steps.
The main trick to make this work is the addition of column
pg_inherits.inhdetachpending, initially false; can only be set true in
the first part of this command. Once that is committed, concurrent
transactions that use a PartitionDirectory will include or ignore
partitions so marked: in optimizer they are ignored if the row is marked
committed for the snapshot; in executor they are always included. As a
result, and because of the way PartitionDirectory caches partition
descriptors, queries that were planned before the detach will see the
rows in the detached partition and queries that are planned after the
detach, won't.
A CHECK constraint is created that duplicates the partition constraint.
This is probably not strictly necessary, and some users will prefer to
remove it afterwards, but if the partition is re-attached to a
partitioned table, the constraint needn't be rechecked.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200803234854.GA24158@alvherre.pgsql
2021-03-25 22:00:28 +01:00
|
|
|
/*
|
|
|
|
* We have to lock the parent if the partition is being detached,
|
|
|
|
* because it's possible that some query still has a partition
|
|
|
|
* descriptor that includes this partition.
|
|
|
|
*/
|
|
|
|
parentOid = get_partition_parent(relid, true);
|
2017-04-28 20:00:58 +02:00
|
|
|
LockRelationOid(parentOid, AccessExclusiveLock);
|
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If this is not the default partition, dropping it will change the
|
|
|
|
* default partition's partition constraint, so we must lock it.
|
|
|
|
*/
|
|
|
|
defaultPartOid = get_default_partition_oid(parentOid);
|
|
|
|
if (OidIsValid(defaultPartOid) && relid != defaultPartOid)
|
|
|
|
LockRelationOid(defaultPartOid, AccessExclusiveLock);
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
	}

	ReleaseSysCache(tuple);

	/*
	 * Open and lock the relation.
	 */
	rel = relation_open(relid, AccessExclusiveLock);

	/*
	 * There can no longer be anyone *else* touching the relation, but we
	 * might still have open queries or cursors, or pending trigger events, in
	 * our own session.
	 */
	CheckTableNotInUse(rel, "DROP TABLE");

	/*
	 * This effectively deletes all rows in the table, and may be done in a
	 * serializable transaction.  In that case we must record a rw-conflict in
	 * to this transaction from each transaction holding a predicate lock on
	 * the table.
	 */
	CheckTableForSerializableConflictIn(rel);

	/*
	 * Delete pg_foreign_table tuple first.
	 */
	if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
	{
		Relation	ftrel;
		HeapTuple	fttuple;

		ftrel = table_open(ForeignTableRelationId, RowExclusiveLock);

		fttuple = SearchSysCache1(FOREIGNTABLEREL, ObjectIdGetDatum(relid));
		if (!HeapTupleIsValid(fttuple))
			elog(ERROR, "cache lookup failed for foreign table %u", relid);

		CatalogTupleDelete(ftrel, &fttuple->t_self);

		ReleaseSysCache(fttuple);
		table_close(ftrel, RowExclusiveLock);
	}

	/*
	 * If a partitioned table, delete the pg_partitioned_table tuple.
	 */
	if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
		RemovePartitionKeyByRelId(relid);
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
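The default-partition behavior described in the commit message above can be sketched in SQL (a hedged example; all names are illustrative):

```sql
-- Tuples that match no other partition route to the default partition.
CREATE TABLE cities (name text, population int) PARTITION BY LIST (name);
CREATE TABLE cities_ab PARTITION OF cities
    FOR VALUES IN ('Amsterdam', 'Berlin');
CREATE TABLE cities_other PARTITION OF cities DEFAULT;

INSERT INTO cities VALUES ('Lisbon', 545000);  -- lands in cities_other
```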

	/*
	 * If the relation being dropped is the default partition itself,
	 * invalidate its entry in pg_partitioned_table.
	 */
	if (relid == defaultPartOid)
		update_default_partition_oid(parentOid, InvalidOid);

	/*
	 * Schedule unlinking of the relation's physical files at commit.
	 */
	if (RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
		RelationDropStorage(rel);
pgstat: scaffolding for transactional stats creation / drop.
One problematic part of the current statistics collector design is that there
is no reliable way of getting rid of statistics entries. Because of that
pgstat_vacuum_stat() (called by [auto-]vacuum) matches all stats for the
current database with the catalog contents and tries to drop now-superfluous
entries. That's quite expensive. What's worse, it doesn't work on physical
replicas, despite physical replicas collecting statistics entries.
This commit introduces infrastructure to create / drop statistics entries
transactionally, together with the underlying catalog objects (functions,
relations, subscriptions). pgstat_xact.c maintains a list of stats entries
created / dropped transactionally in the current transaction. To ensure the
removal of statistics entries is durable, dropped statistics entries are
included in commit / abort (and prepare) records, which also ensures that
stats entries are dropped on standbys.
Statistics entries created separately from creating the underlying catalog
object (e.g. when stats were previously lost due to an immediate restart)
are *not* WAL logged. However that can only happen outside of the transaction
creating the catalog object, so it does not lead to "leaked" statistics
entries.
For this to work, functions creating / dropping functions / relations /
subscriptions need to call into pgstat. For subscriptions this was already
done when dropping subscriptions, via pgstat_report_subscription_drop() (now
renamed to pgstat_drop_subscription()).
This commit does not actually drop stats yet, it just provides the
infrastructure. It is however a largely independent piece of infrastructure,
so committing it separately makes sense.
Bumps XLOG_PAGE_MAGIC.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-07 03:22:22 +02:00

	/* ensure that stats are dropped if transaction commits */
	pgstat_drop_relation(rel);

	/*
	 * Close relcache entry, but *keep* AccessExclusiveLock on the relation
	 * until transaction commit.  This ensures no one else will try to do
	 * something with the doomed relation.
	 */
	relation_close(rel, NoLock);

	/*
	 * Remove any associated relation synchronization states.
	 */
	RemoveSubscriptionRel(InvalidOid, relid);

	/*
	 * Forget any ON COMMIT action for the rel
	 */
	remove_on_commit_action(relid);

	/*
	 * Flush the relation from the relcache.  We want to do this before
	 * starting to remove catalog entries, just to be certain that no relcache
	 * entry rebuild will happen partway through.  (That should not really
	 * matter, since we don't do CommandCounterIncrement here, but let's be
	 * safe.)
	 */
	RelationForgetRelation(relid);

	/*
	 * remove inheritance information
	 */
	RelationRemoveInheritance(relid);

	/*
	 * delete statistics
	 */
	RemoveStatistics(relid, 0);

	/*
	 * delete attribute tuples
	 */
	DeleteAttributeTuples(relid);

	/*
	 * delete relation tuple
	 */
	DeleteRelationTuple(relid);

	if (OidIsValid(parentOid))
	{
		/*
		 * If this is not the default partition, the partition constraint of
		 * the default partition has changed to include the portion of the key
		 * space previously covered by the dropped partition.
		 */
		if (OidIsValid(defaultPartOid) && relid != defaultPartOid)
			CacheInvalidateRelcacheByRelid(defaultPartOid);

		/*
		 * Invalidate the parent's relcache so that the partition is no longer
		 * included in its partition descriptor.
		 */
		CacheInvalidateRelcacheByRelid(parentOid);

		/* keep the lock */
	}
}


/*
 * RelationClearMissing
 *
 * Set atthasmissing and attmissingval to false/null for all attributes
 * where they are currently set.  This can be safely and usefully done if
 * the table is rewritten (e.g. by VACUUM FULL or CLUSTER) where we know there
 * are no rows left with less than a full complement of attributes.
 *
 * The caller must have an AccessExclusive lock on the relation.
 */
void
RelationClearMissing(Relation rel)
{
	Relation	attr_rel;
	Oid			relid = RelationGetRelid(rel);
	int			natts = RelationGetNumberOfAttributes(rel);
	int			attnum;
	Datum		repl_val[Natts_pg_attribute];
	bool		repl_null[Natts_pg_attribute];
	bool		repl_repl[Natts_pg_attribute];
	Form_pg_attribute attrtuple;
	HeapTuple	tuple,
				newtuple;

	memset(repl_val, 0, sizeof(repl_val));
	memset(repl_null, false, sizeof(repl_null));
	memset(repl_repl, false, sizeof(repl_repl));

	repl_val[Anum_pg_attribute_atthasmissing - 1] = BoolGetDatum(false);
	repl_null[Anum_pg_attribute_attmissingval - 1] = true;

	repl_repl[Anum_pg_attribute_atthasmissing - 1] = true;
	repl_repl[Anum_pg_attribute_attmissingval - 1] = true;

	/* Get a lock on pg_attribute */
	attr_rel = table_open(AttributeRelationId, RowExclusiveLock);

	/* process each non-system attribute, including any dropped columns */
	for (attnum = 1; attnum <= natts; attnum++)
	{
		tuple = SearchSysCache2(ATTNUM,
								ObjectIdGetDatum(relid),
								Int16GetDatum(attnum));
		if (!HeapTupleIsValid(tuple))	/* shouldn't happen */
			elog(ERROR, "cache lookup failed for attribute %d of relation %u",
				 attnum, relid);

		attrtuple = (Form_pg_attribute) GETSTRUCT(tuple);

		/* ignore any where atthasmissing is not true */
		if (attrtuple->atthasmissing)
		{
			newtuple = heap_modify_tuple(tuple, RelationGetDescr(attr_rel),
										 repl_val, repl_null, repl_repl);

			CatalogTupleUpdate(attr_rel, &newtuple->t_self, newtuple);

			heap_freetuple(newtuple);
		}

		ReleaseSysCache(tuple);
	}

	/*
	 * Our update of the pg_attribute rows will force a relcache rebuild, so
	 * there's nothing else to do here.
	 */
	table_close(attr_rel, RowExclusiveLock);
}

/*
 * SetAttrMissing
 *
 * Set the missing value of a single attribute.  This should only be used by
 * binary upgrade.  Takes an AccessExclusive lock on the relation owning the
 * attribute.
 */
void
SetAttrMissing(Oid relid, char *attname, char *value)
{
	Datum		valuesAtt[Natts_pg_attribute] = {0};
	bool		nullsAtt[Natts_pg_attribute] = {0};
	bool		replacesAtt[Natts_pg_attribute] = {0};
	Datum		missingval;
	Form_pg_attribute attStruct;
	Relation	attrrel,
				tablerel;
	HeapTuple	atttup,
				newtup;

	/* lock the table the attribute belongs to */
	tablerel = table_open(relid, AccessExclusiveLock);

	/* Don't do anything unless it's a plain table */
	if (tablerel->rd_rel->relkind != RELKIND_RELATION)
	{
		table_close(tablerel, AccessExclusiveLock);
		return;
	}

	/* Lock the attribute row and get the data */
	attrrel = table_open(AttributeRelationId, RowExclusiveLock);
	atttup = SearchSysCacheAttName(relid, attname);
	if (!HeapTupleIsValid(atttup))
		elog(ERROR, "cache lookup failed for attribute %s of relation %u",
			 attname, relid);
	attStruct = (Form_pg_attribute) GETSTRUCT(atttup);

	/* get an array value from the value string */
	missingval = OidFunctionCall3(F_ARRAY_IN,
								  CStringGetDatum(value),
								  ObjectIdGetDatum(attStruct->atttypid),
								  Int32GetDatum(attStruct->atttypmod));

	/* update the tuple - set atthasmissing and attmissingval */
	valuesAtt[Anum_pg_attribute_atthasmissing - 1] = BoolGetDatum(true);
	replacesAtt[Anum_pg_attribute_atthasmissing - 1] = true;
	valuesAtt[Anum_pg_attribute_attmissingval - 1] = missingval;
	replacesAtt[Anum_pg_attribute_attmissingval - 1] = true;

	newtup = heap_modify_tuple(atttup, RelationGetDescr(attrrel),
							   valuesAtt, nullsAtt, replacesAtt);
	CatalogTupleUpdate(attrrel, &newtup->t_self, newtup);

	/* clean up */
	ReleaseSysCache(atttup);
	table_close(attrrel, RowExclusiveLock);
	table_close(tablerel, AccessExclusiveLock);
}

/*
 * Store a check-constraint expression for the given relation.
 *
 * Caller is responsible for updating the count of constraints
 * in the pg_class entry for the relation.
 *
 * The OID of the new constraint is returned.
 */
static Oid
StoreRelCheck(Relation rel, const char *ccname, Node *expr,
			  bool is_validated, bool is_local, int inhcount,
			  bool is_no_inherit, bool is_internal)
{
	char	   *ccbin;
	List	   *varList;
	int			keycount;
	int16	   *attNos;
	Oid			constrOid;

	/*
	 * Flatten expression to string form for storage.
	 */
	ccbin = nodeToString(expr);

	/*
	 * Find columns of rel that are used in expr
	 *
	 * NB: pull_var_clause is okay here only because we don't allow subselects
	 * in check constraints; it would fail to examine the contents of
	 * subselects.
	 */
	varList = pull_var_clause(expr, 0);
	keycount = list_length(varList);

	if (keycount > 0)
	{
		ListCell   *vl;
		int			i = 0;

		attNos = (int16 *) palloc(keycount * sizeof(int16));
		foreach(vl, varList)
		{
			Var		   *var = (Var *) lfirst(vl);
			int			j;

			for (j = 0; j < i; j++)
				if (attNos[j] == var->varattno)
					break;
			if (j == i)
				attNos[i++] = var->varattno;
		}
		keycount = i;
	}
	else
		attNos = NULL;

	/*
	 * Partitioned tables do not contain any rows themselves, so a NO INHERIT
	 * constraint makes no sense.
	 */
	if (is_no_inherit &&
		rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
				 errmsg("cannot add NO INHERIT constraint to partitioned table \"%s\"",
						RelationGetRelationName(rel))));

	/*
	 * Create the Check Constraint
	 */
	constrOid =
		CreateConstraintEntry(ccname,	/* Constraint Name */
							  RelationGetNamespace(rel),	/* namespace */
							  CONSTRAINT_CHECK, /* Constraint Type */
							  false,	/* Is Deferrable */
							  false,	/* Is Deferred */
							  is_validated,
							  InvalidOid,	/* no parent constraint */
							  RelationGetRelid(rel),	/* relation */
							  attNos,	/* attrs in the constraint */
							  keycount, /* # key attrs in the constraint */
							  keycount, /* # total attrs in the constraint */
							  InvalidOid,	/* not a domain constraint */
							  InvalidOid,	/* no associated index */
							  InvalidOid,	/* Foreign key fields */
							  NULL,
							  NULL,
							  NULL,
							  NULL,
							  0,
							  ' ',
							  ' ',
							  NULL,
							  0,
							  ' ',
							  NULL, /* not an exclusion constraint */
							  expr, /* Tree form of check constraint */
							  ccbin,	/* Binary form of check constraint */
							  is_local, /* conislocal */
							  inhcount, /* coninhcount */
							  is_no_inherit,	/* connoinherit */
							  false,	/* conwithoutoverlaps */
							  is_internal); /* internally constructed? */

	pfree(ccbin);

	return constrOid;
}
Catalog not-null constraints
We now create contype='n' pg_constraint rows for not-null constraints.
We propagate these constraints to other tables during operations such as
adding inheritance relationships, creating and attaching partitions and
creating tables LIKE other tables. We also spawn not-null constraints
for inheritance child tables when their parents have primary keys.
These related constraints mostly follow the well-known rules of
conislocal and coninhcount that we have for CHECK constraints, with some
adaptations: for example, as opposed to CHECK constraints, we don't
match not-null ones by name when descending a hierarchy to alter it,
instead matching by column name that they apply to. This means we don't
require the constraint names to be identical across a hierarchy.
For now, we omit them for system catalogs. Maybe this is worth
reconsidering. We don't support NOT VALID nor DEFERRABLE clauses
either; these can be added as separate features later (this patch is
already large and complicated enough.)
psql shows these constraints in \d+.
pg_dump requires some ad-hoc hacks, particularly when dumping a primary
key. We now create one "throwaway" not-null constraint for each column
in the PK together with the CREATE TABLE command, and once the PK is
created, all those throwaway constraints are removed. This avoids
having to check each tuple for nullness when the dump restores the
primary key creation.
pg_upgrading from an older release requires a somewhat brittle procedure
to create a constraint state that matches what would be created if the
database were being created fresh in Postgres 17. I have tested all the
scenarios I could think of, and it works correctly as far as I can tell,
but I could have neglected weird cases.
This patch has been very long in the making. The first patch was
written by Bernd Helmle in 2010 to add a new pg_constraint.contype value
('n'), which I (Álvaro) then hijacked in 2011 and 2012, until that one
was killed by the realization that we ought to use contype='c' instead:
manufactured CHECK constraints. However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again. During Postgres 16 this had
already been introduced by commit e056c557aef4, but there were some
problems mainly with the pg_upgrade procedure that couldn't be fixed in
reasonable time, so it was reverted.
In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring an additional pg_attribute column
to track the OID of the not-null constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
2023-08-25 13:31:24 +02:00
/*
 * Store a not-null constraint for the given relation
 *
 * The OID of the new constraint is returned.
 */
static Oid
StoreRelNotNull(Relation rel, const char *nnname, AttrNumber attnum,
				bool is_validated, bool is_local, int inhcount,
				bool is_no_inherit)
{
	Oid			constrOid;

	constrOid =
		CreateConstraintEntry(nnname,
							  RelationGetNamespace(rel),
							  CONSTRAINT_NOTNULL,
							  false,
							  false,
							  is_validated,
							  InvalidOid,
							  RelationGetRelid(rel),
							  &attnum,
							  1,
							  1,
							  InvalidOid,	/* not a domain constraint */
							  InvalidOid,	/* no associated index */
							  InvalidOid,	/* foreign key fields */
							  NULL,
							  NULL,
							  NULL,
							  NULL,
							  0,
							  ' ',
							  ' ',
							  NULL,
							  0,
							  ' ',
							  NULL, /* not an exclusion constraint */
							  NULL,
							  NULL,
							  is_local,
							  inhcount,
							  is_no_inherit,
							  false,	/* conwithoutoverlaps */
							  false);

	return constrOid;
}

/*
 * Store defaults and constraints (passed as a list of CookedConstraint).
 *
 * Each CookedConstraint struct is modified to store the new catalog tuple OID.
 *
 * NOTE: only pre-cooked expressions will be passed this way, which is to
 * say expressions inherited from an existing relation.  Newly parsed
 * expressions can be added later, by direct calls to StoreAttrDefault
 * and StoreRelCheck (see AddRelationNewConstraints()).
 */
static void
StoreConstraints(Relation rel, List *cooked_constraints, bool is_internal)
{
	int			numchecks = 0;
	ListCell   *lc;

	if (cooked_constraints == NIL)
		return;					/* nothing to do */

	/*
	 * Deparsing of constraint expressions will fail unless the just-created
	 * pg_attribute tuples for this relation are made visible.  So, bump the
	 * command counter.  CAUTION: this will cause a relcache entry rebuild.
	 */
	CommandCounterIncrement();

	foreach(lc, cooked_constraints)
	{
		CookedConstraint *con = (CookedConstraint *) lfirst(lc);

		switch (con->contype)
		{
			case CONSTR_DEFAULT:
				con->conoid = StoreAttrDefault(rel, con->attnum, con->expr,
											   is_internal, false);
				break;
			case CONSTR_CHECK:
				con->conoid =
					StoreRelCheck(rel, con->name, con->expr,
								  !con->skip_validation, con->is_local,
								  con->inhcount, con->is_no_inherit,
								  is_internal);
				numchecks++;
				break;

			case CONSTR_NOTNULL:
				con->conoid =
					StoreRelNotNull(rel, con->name, con->attnum,
									!con->skip_validation, con->is_local,
									con->inhcount, con->is_no_inherit);
				break;

			default:
				elog(ERROR, "unrecognized constraint type: %d",
					 (int) con->contype);
		}
	}

	if (numchecks > 0)
		SetRelationNumChecks(rel, numchecks);
}

/*
 * AddRelationNewConstraints
 *
 * Add new column default expressions and/or constraint check expressions
 * to an existing relation.  This is defined to do both for efficiency in
 * DefineRelation, but of course you can do just one or the other by passing
 * empty lists.
 *
 * rel: relation to be modified
 * newColDefaults: list of RawColumnDefault structures
 * newConstraints: list of Constraint nodes
 * allow_merge: true if check constraints may be merged with existing ones
 * is_local: true if definition is local, false if it's inherited
 * is_internal: true if result of some internal process, not a user request
 * queryString: used during expression transformation of default values and
 *		cooked CHECK constraints
 *
 * All entries in newColDefaults will be processed.  Entries in newConstraints
 * will be processed only if they are CONSTR_CHECK or CONSTR_NOTNULL type.
 *
 * Returns a list of CookedConstraint nodes that shows the cooked form of
 * the default and constraint expressions added to the relation.
 *
 * NB: caller should have opened rel with some self-conflicting lock mode,
 * and should hold that lock till end of transaction; for normal cases that'll
 * be AccessExclusiveLock, but if caller knows that the constraint is already
 * enforced by some other means, it can be ShareUpdateExclusiveLock.  Also, we
 * assume the caller has done a CommandCounterIncrement if necessary to make
 * the relation's catalog tuples visible.
 */
List *
AddRelationNewConstraints(Relation rel,
						  List *newColDefaults,
						  List *newConstraints,
						  bool allow_merge,
						  bool is_local,
						  bool is_internal,
						  const char *queryString)
{
	List	   *cookedConstraints = NIL;
	TupleDesc	tupleDesc;
	TupleConstr *oldconstr;
	int			numoldchecks;
	ParseState *pstate;
	ParseNamespaceItem *nsitem;
	int			numchecks;
	List	   *checknames;
	List	   *nnnames;
	ListCell   *cell;
	Node	   *expr;
	CookedConstraint *cooked;

	/*
	 * Get info about existing constraints.
	 */
	tupleDesc = RelationGetDescr(rel);
	oldconstr = tupleDesc->constr;
	if (oldconstr)
		numoldchecks = oldconstr->num_check;
	else
		numoldchecks = 0;

	/*
	 * Create a dummy ParseState and insert the target relation as its sole
	 * rangetable entry.  We need a ParseState for transformExpr.
	 */
	pstate = make_parsestate(NULL);
	pstate->p_sourcetext = queryString;
	nsitem = addRangeTableEntryForRelation(pstate,
										   rel,
										   AccessShareLock,
										   NULL,
										   false,
										   true);
	addNSItemToQuery(pstate, nsitem, true, true, true);

	/*
	 * Process column default expressions.
	 */
	foreach(cell, newColDefaults)
	{
		RawColumnDefault *colDef = (RawColumnDefault *) lfirst(cell);
		Form_pg_attribute atp = TupleDescAttr(rel->rd_att, colDef->attnum - 1);
		Oid			defOid;

		expr = cookDefault(pstate, colDef->raw_default,
						   atp->atttypid, atp->atttypmod,
						   NameStr(atp->attname),
						   atp->attgenerated);

		/*
		 * If the expression is just a NULL constant, we do not bother to make
		 * an explicit pg_attrdef entry, since the default behavior is
		 * equivalent.  This applies to column defaults, but not to
		 * generation expressions.
		 *
		 * Note a nonobvious property of this test: if the column is of a
		 * domain type, what we'll get is not a bare null Const but a
		 * CoerceToDomain expr, so we will not discard the default.  This is
		 * critical because the column default needs to be retained to
		 * override any default that the domain might have.
		 */
		if (expr == NULL ||
			(!colDef->generated &&
			 IsA(expr, Const) &&
			 castNode(Const, expr)->constisnull))
			continue;

		/* If the DEFAULT is volatile we cannot use a missing value */
		if (colDef->missingMode &&
			contain_volatile_functions_after_planning((Expr *) expr))
			colDef->missingMode = false;

		defOid = StoreAttrDefault(rel, colDef->attnum, expr, is_internal,
								  colDef->missingMode);

		cooked = (CookedConstraint *) palloc(sizeof(CookedConstraint));
		cooked->contype = CONSTR_DEFAULT;
		cooked->conoid = defOid;
		cooked->name = NULL;
		cooked->attnum = colDef->attnum;
		cooked->expr = expr;
		cooked->skip_validation = false;
		cooked->is_local = is_local;
		cooked->inhcount = is_local ? 0 : 1;
		cooked->is_no_inherit = false;
		cookedConstraints = lappend(cookedConstraints, cooked);
	}

	/*
	 * Process constraint expressions.
	 */
	numchecks = numoldchecks;
	checknames = NIL;
	nnnames = NIL;
	foreach(cell, newConstraints)
	{
		Constraint *cdef = (Constraint *) lfirst(cell);
		Oid			constrOid;
manufactured CHECK constraints. However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again. During Postgres 16 this had
already been introduced by commit e056c557aef4, but there were some
problems mainly with the pg_upgrade procedure that couldn't be fixed in
reasonable time, so it was reverted.
In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring an additional pg_attribute column
to track the OID of the not-null constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
2023-08-25 13:31:24 +02:00
|
|
|
if (cdef->contype == CONSTR_CHECK)
|
2008-05-10 01:32:05 +02:00
|
|
|
{
|
			char	   *ccname;

			if (cdef->raw_expr != NULL)
			{
				Assert(cdef->cooked_expr == NULL);

				/*
				 * Transform raw parsetree to executable expression, and
				 * verify it's valid as a CHECK constraint.
				 */
				expr = cookConstraint(pstate, cdef->raw_expr,
									  RelationGetRelationName(rel));
			}
			else
			{
				Assert(cdef->cooked_expr != NULL);

				/*
				 * Here, we assume the parser will only pass us valid CHECK
				 * expressions, so we do no particular checking.
				 */
				expr = stringToNode(cdef->cooked_expr);
			}

			/*
			 * Check name uniqueness, or generate a name if none was given.
			 */
			if (cdef->conname != NULL)
			{
				ListCell   *cell2;

				ccname = cdef->conname;

				/* Check against other new constraints */
				/* Needed because we don't do CommandCounterIncrement in loop */
				foreach(cell2, checknames)
				{
					if (strcmp((char *) lfirst(cell2), ccname) == 0)
						ereport(ERROR,
								(errcode(ERRCODE_DUPLICATE_OBJECT),
								 errmsg("check constraint \"%s\" already exists",
										ccname)));
				}

				/* save name for future checks */
				checknames = lappend(checknames, ccname);

				/*
				 * Check against pre-existing constraints.  If we are allowed
				 * to merge with an existing constraint, there's no more to do
				 * here.  (We omit the duplicate constraint from the result,
				 * which is what ATAddCheckConstraint wants.)
				 */
				if (MergeWithExistingConstraint(rel, ccname, expr,
												allow_merge, is_local,
												cdef->initially_valid,
												cdef->is_no_inherit))
					continue;
			}
			else
			{
				/*
				 * When generating a name, we want to create "tab_col_check"
				 * for a column constraint and "tab_check" for a table
				 * constraint.  We no longer have any info about the syntactic
				 * positioning of the constraint phrase, so we approximate
				 * this by seeing whether the expression references more than
				 * one column.  (If the user played by the rules, the result
				 * is the same...)
				 *
				 * Note: pull_var_clause() doesn't descend into sublinks, but
				 * we eliminated those above; and anyway this only needs to be
				 * an approximate answer.
				 */
				List	   *vars;
				char	   *colname;

				vars = pull_var_clause(expr, 0);

				/* eliminate duplicates */
				vars = list_union(NIL, vars);

				if (list_length(vars) == 1)
					colname = get_attname(RelationGetRelid(rel),
										  ((Var *) linitial(vars))->varattno,
										  true);
				else
					colname = NULL;

				ccname = ChooseConstraintName(RelationGetRelationName(rel),
											  colname,
											  "check",
											  RelationGetNamespace(rel),
											  checknames);

				/* save name for future checks */
				checknames = lappend(checknames, ccname);
			}

/*
|
Catalog not-null constraints
We now create contype='n' pg_constraint rows for not-null constraints.
We propagate these constraints to other tables during operations such as
adding inheritance relationships, creating and attaching partitions and
creating tables LIKE other tables. We also spawn not-null constraints
for inheritance child tables when their parents have primary keys.
These related constraints mostly follow the well-known rules of
conislocal and coninhcount that we have for CHECK constraints, with some
adaptations: for example, as opposed to CHECK constraints, we don't
match not-null ones by name when descending a hierarchy to alter it,
instead matching by column name that they apply to. This means we don't
require the constraint names to be identical across a hierarchy.
For now, we omit them for system catalogs. Maybe this is worth
reconsidering. We don't support NOT VALID nor DEFERRABLE clauses
either; these can be added as separate features later (this patch is
already large and complicated enough.)
psql shows these constraints in \d+.
pg_dump requires some ad-hoc hacks, particularly when dumping a primary
key. We now create one "throwaway" not-null constraint for each column
in the PK together with the CREATE TABLE command, and once the PK is
created, all those throwaway constraints are removed. This avoids
having to check each tuple for nullness when the dump restores the
primary key creation.
pg_upgrading from an older release requires a somewhat brittle procedure
to create a constraint state that matches what would be created if the
database were being created fresh in Postgres 17. I have tested all the
scenarios I could think of, and it works correctly as far as I can tell,
but I could have neglected weird cases.
This patch has been very long in the making. The first patch was
written by Bernd Helmle in 2010 to add a new pg_constraint.contype value
('n'), which I (Álvaro) then hijacked in 2011 and 2012, until that one
was killed by the realization that we ought to use contype='c' instead:
manufactured CHECK constraints. However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again. During Postgres 16 this had
already been introduced by commit e056c557aef4, but there were some
problems mainly with the pg_upgrade procedure that couldn't be fixed in
reasonable time, so it was reverted.
In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring an additional pg_attribute column
to track the OID of the not-null constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
2023-08-25 13:31:24 +02:00
			 * OK, store it.
			 */
			constrOid =
				StoreRelCheck(rel, ccname, expr, cdef->initially_valid, is_local,
							  is_local ? 0 : 1, cdef->is_no_inherit, is_internal);

			numchecks++;

			cooked = (CookedConstraint *) palloc(sizeof(CookedConstraint));
			cooked->contype = CONSTR_CHECK;
			cooked->conoid = constrOid;
			cooked->name = ccname;
			cooked->attnum = 0;
			cooked->expr = expr;
			cooked->skip_validation = cdef->skip_validation;
			cooked->is_local = is_local;
			cooked->inhcount = is_local ? 0 : 1;
			cooked->is_no_inherit = cdef->is_no_inherit;
			cookedConstraints = lappend(cookedConstraints, cooked);
		}
		else if (cdef->contype == CONSTR_NOTNULL)
		{
			CookedConstraint *nncooked;
			AttrNumber	colnum;
			char	   *nnname;

			/* Determine which column to modify */
			colnum = get_attnum(RelationGetRelid(rel), strVal(linitial(cdef->keys)));
			if (colnum == InvalidAttrNumber)	/* shouldn't happen */
				elog(ERROR, "cache lookup failed for attribute \"%s\" of relation %u",
					 strVal(linitial(cdef->keys)), RelationGetRelid(rel));

			/*
			 * If the column already has a not-null constraint, we need only
			 * update its catalog status and we're done.
			 */
			if (AdjustNotNullInheritance1(RelationGetRelid(rel), colnum,
										  cdef->inhcount, cdef->is_no_inherit))
				continue;

			/*
			 * If a constraint name is specified, check that it isn't already
			 * used.  Otherwise, choose a non-conflicting one ourselves.
			 */
			if (cdef->conname)
			{
				if (ConstraintNameIsUsed(CONSTRAINT_RELATION,
										 RelationGetRelid(rel),
										 cdef->conname))
					ereport(ERROR,
							errcode(ERRCODE_DUPLICATE_OBJECT),
							errmsg("constraint \"%s\" for relation \"%s\" already exists",
								   cdef->conname, RelationGetRelationName(rel)));
				nnname = cdef->conname;
			}
			else
				nnname = ChooseConstraintName(RelationGetRelationName(rel),
											  strVal(linitial(cdef->keys)),
											  "not_null",
											  RelationGetNamespace(rel),
											  nnnames);
			nnnames = lappend(nnnames, nnname);

			constrOid =
				StoreRelNotNull(rel, nnname, colnum,
								cdef->initially_valid,
								cdef->inhcount == 0,
								cdef->inhcount,
								cdef->is_no_inherit);

			nncooked = (CookedConstraint *) palloc(sizeof(CookedConstraint));
			nncooked->contype = CONSTR_NOTNULL;
			nncooked->conoid = constrOid;
			nncooked->name = nnname;
			nncooked->attnum = colnum;
			nncooked->expr = NULL;
			nncooked->skip_validation = cdef->skip_validation;
			nncooked->is_local = is_local;
			nncooked->inhcount = cdef->inhcount;
			nncooked->is_no_inherit = cdef->is_no_inherit;

			cookedConstraints = lappend(cookedConstraints, nncooked);
		}
	}

	/*
	 * Update the count of constraints in the relation's pg_class tuple. We do
	 * this even if there was no change, in order to ensure that an SI update
	 * message is sent out for the pg_class tuple, which will force other
	 * backends to rebuild their relcache entries for the rel. (This is
	 * critical if we added defaults but not constraints.)
	 */
	SetRelationNumChecks(rel, numchecks);

	return cookedConstraints;
}

/*
 * Check for a pre-existing check constraint that conflicts with a proposed
 * new one, and either adjust its conislocal/coninhcount settings or throw
 * error as needed.
 *
 * Returns true if merged (constraint is a duplicate), or false if it's
 * got a so-far-unique name, or throws error if conflict.
 *
 * XXX See MergeConstraintsIntoExisting too if you change this code.
 */
static bool
MergeWithExistingConstraint(Relation rel, const char *ccname, Node *expr,
							bool allow_merge, bool is_local,
Fix two bugs in merging of inherited CHECK constraints.
Historically, we've allowed users to add a CHECK constraint to a child
table and then add an identical CHECK constraint to the parent. This
results in "merging" the two constraints so that the pre-existing
child constraint ends up with both conislocal = true and coninhcount > 0.
However, if you tried to do it in the other order, you got a duplicate
constraint error. This is problematic for pg_dump, which needs to issue
separated ADD CONSTRAINT commands in some cases, but has no good way to
ensure that the constraints will be added in the required order.
And it's more than a bit arbitrary, too. The goal of complaining about
duplicated ADD CONSTRAINT commands can be served if we reject the case of
adding a constraint when the existing one already has conislocal = true;
but if it has conislocal = false, let's just make the ADD CONSTRAINT set
conislocal = true. In this way, either order of adding the constraints
has the same end result.
Another problem was that the code allowed creation of a parent constraint
marked convalidated that is merged with a child constraint that is
!convalidated. In this case, an inheritance scan of the parent table could
emit some rows violating the constraint condition, which would be an
unexpected result given the marking of the parent constraint as validated.
Hence, forbid merging of constraints in this case. (Note: valid child and
not-valid parent seems fine, so continue to allow that.)
Per report from Benedikt Grundmann. Back-patch to 9.2 where we introduced
possibly-not-valid check constraints. The second bug obviously doesn't
apply before that, and I think the first doesn't either, because pg_dump
only gets into this situation when dealing with not-valid constraints.
Report: <CADbMkNPT-Jz5PRSQ4RbUASYAjocV_KHUWapR%2Bg8fNvhUAyRpxA%40mail.gmail.com>
Discussion: <22108.1475874586@sss.pgh.pa.us>
2016-10-09 01:29:27 +02:00
|
|
|
bool is_initially_valid,
|
2012-04-21 04:46:20 +02:00
|
|
|
bool is_no_inherit)
|
2008-05-10 01:32:05 +02:00
|
|
|
{
|
|
|
|
bool found;
|
|
|
|
Relation conDesc;
|
|
|
|
SysScanDesc conscan;
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00
|
|
|
ScanKeyData skey[3];
|
2008-05-10 01:32:05 +02:00
|
|
|
HeapTuple tup;
|
|
|
|
|
|
|
|
/* Search for a pg_constraint entry with same name and relation */
|
2019-01-21 19:32:19 +01:00
|
|
|
conDesc = table_open(ConstraintRelationId, RowExclusiveLock);
|
2008-05-10 01:32:05 +02:00
|
|
|
|
|
|
|
found = false;
|
|
|
|
|
|
|
|
ScanKeyInit(&skey[0],
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00
|
|
|
Anum_pg_constraint_conrelid,
|
|
|
|
BTEqualStrategyNumber, F_OIDEQ,
|
|
|
|
ObjectIdGetDatum(RelationGetRelid(rel)));
|
|
|
|
ScanKeyInit(&skey[1],
|
|
|
|
Anum_pg_constraint_contypid,
|
|
|
|
BTEqualStrategyNumber, F_OIDEQ,
|
|
|
|
ObjectIdGetDatum(InvalidOid));
|
|
|
|
ScanKeyInit(&skey[2],
|
2008-05-10 01:32:05 +02:00
|
|
|
Anum_pg_constraint_conname,
|
|
|
|
BTEqualStrategyNumber, F_NAMEEQ,
|
|
|
|
CStringGetDatum(ccname));
|
|
|
|
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00
|
|
|
conscan = systable_beginscan(conDesc, ConstraintRelidTypidNameIndexId, true,
|
|
|
|
NULL, 3, skey);
|
2008-05-10 01:32:05 +02:00
|
|
|
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00
|
|
|
/* There can be at most one matching row */
|
|
|
|
if (HeapTupleIsValid(tup = systable_getnext(conscan)))
|
2008-05-10 01:32:05 +02:00
|
|
|
{
|
|
|
|
Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tup);
|
|
|
|
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00
|
|
|
/* Found it. Conflicts if not identical check constraint */
|
|
|
|
if (con->contype == CONSTRAINT_CHECK)
|
2008-05-10 01:32:05 +02:00
|
|
|
{
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00
|
|
|
Datum val;
|
|
|
|
bool isnull;
|
|
|
|
|
|
|
|
val = fastgetattr(tup,
|
|
|
|
Anum_pg_constraint_conbin,
|
|
|
|
conDesc->rd_att, &isnull);
|
|
|
|
if (isnull)
|
|
|
|
elog(ERROR, "null conbin for rel %s",
|
|
|
|
RelationGetRelationName(rel));
|
|
|
|
if (equal(expr, stringToNode(TextDatumGetCString(val))))
|
|
|
|
found = true;
|
|
|
|
}
|
Fix two bugs in merging of inherited CHECK constraints.
Historically, we've allowed users to add a CHECK constraint to a child
table and then add an identical CHECK constraint to the parent. This
results in "merging" the two constraints so that the pre-existing
child constraint ends up with both conislocal = true and coninhcount > 0.
However, if you tried to do it in the other order, you got a duplicate
constraint error. This is problematic for pg_dump, which needs to issue
separated ADD CONSTRAINT commands in some cases, but has no good way to
ensure that the constraints will be added in the required order.
And it's more than a bit arbitrary, too. The goal of complaining about
duplicated ADD CONSTRAINT commands can be served if we reject the case of
adding a constraint when the existing one already has conislocal = true;
but if it has conislocal = false, let's just make the ADD CONSTRAINT set
conislocal = true. In this way, either order of adding the constraints
has the same end result.
Another problem was that the code allowed creation of a parent constraint
marked convalidated that is merged with a child constraint that is
!convalidated. In this case, an inheritance scan of the parent table could
emit some rows violating the constraint condition, which would be an
unexpected result given the marking of the parent constraint as validated.
Hence, forbid merging of constraints in this case. (Note: valid child and
not-valid parent seems fine, so continue to allow that.)
Per report from Benedikt Grundmann. Back-patch to 9.2 where we introduced
possibly-not-valid check constraints. The second bug obviously doesn't
apply before that, and I think the first doesn't either, because pg_dump
only gets into this situation when dealing with not-valid constraints.
Report: <CADbMkNPT-Jz5PRSQ4RbUASYAjocV_KHUWapR%2Bg8fNvhUAyRpxA%40mail.gmail.com>
Discussion: <22108.1475874586@sss.pgh.pa.us>
2016-10-09 01:29:27 +02:00
|
|
|
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00
|
|
|
/*
|
|
|
|
* If the existing constraint is purely inherited (no local
|
|
|
|
* definition) then interpret addition of a local constraint as a
|
|
|
|
* legal merge. This allows ALTER ADD CONSTRAINT on parent and child
|
|
|
|
* tables to be given in either order with same end state. However if
|
|
|
|
* the relation is a partition, all inherited constraints are always
|
|
|
|
* non-local, including those that were merged.
|
|
|
|
*/
|
|
|
|
if (is_local && !con->conislocal && !rel->rd_rel->relispartition)
|
|
|
|
allow_merge = true;
|
Fix two bugs in merging of inherited CHECK constraints.
Historically, we've allowed users to add a CHECK constraint to a child
table and then add an identical CHECK constraint to the parent. This
results in "merging" the two constraints so that the pre-existing
child constraint ends up with both conislocal = true and coninhcount > 0.
However, if you tried to do it in the other order, you got a duplicate
constraint error. This is problematic for pg_dump, which needs to issue
separated ADD CONSTRAINT commands in some cases, but has no good way to
ensure that the constraints will be added in the required order.
And it's more than a bit arbitrary, too. The goal of complaining about
duplicated ADD CONSTRAINT commands can be served if we reject the case of
adding a constraint when the existing one already has conislocal = true;
but if it has conislocal = false, let's just make the ADD CONSTRAINT set
conislocal = true. In this way, either order of adding the constraints
has the same end result.
Another problem was that the code allowed creation of a parent constraint
marked convalidated that is merged with a child constraint that is
!convalidated. In this case, an inheritance scan of the parent table could
emit some rows violating the constraint condition, which would be an
unexpected result given the marking of the parent constraint as validated.
Hence, forbid merging of constraints in this case. (Note: valid child and
not-valid parent seems fine, so continue to allow that.)
Per report from Benedikt Grundmann. Back-patch to 9.2 where we introduced
possibly-not-valid check constraints. The second bug obviously doesn't
apply before that, and I think the first doesn't either, because pg_dump
only gets into this situation when dealing with not-valid constraints.
Report: <CADbMkNPT-Jz5PRSQ4RbUASYAjocV_KHUWapR%2Bg8fNvhUAyRpxA%40mail.gmail.com>
Discussion: <22108.1475874586@sss.pgh.pa.us>
2016-10-09 01:29:27 +02:00
|
|
|
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00
|
|
|
if (!found || !allow_merge)
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_DUPLICATE_OBJECT),
|
|
|
|
errmsg("constraint \"%s\" for relation \"%s\" already exists",
|
|
|
|
ccname, RelationGetRelationName(rel))));
|
2012-01-16 23:19:42 +01:00
|
|
|
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00
|
|
|
/* If the child constraint is "no inherit" then cannot merge */
|
|
|
|
if (con->connoinherit)
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
|
|
|
|
errmsg("constraint \"%s\" conflicts with non-inherited constraint on relation \"%s\"",
|
|
|
|
ccname, RelationGetRelationName(rel))));
|
2012-01-16 23:19:42 +01:00
|
|
|
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00
|
|
|
/*
|
|
|
|
* Must not change an existing inherited constraint to "no inherit"
|
|
|
|
* status. That's because inherited constraints should be able to
|
|
|
|
* propagate to lower-level children.
|
|
|
|
*/
|
|
|
|
if (con->coninhcount > 0 && is_no_inherit)
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
|
|
|
|
errmsg("constraint \"%s\" conflicts with inherited constraint on relation \"%s\"",
|
|
|
|
ccname, RelationGetRelationName(rel))));
|
2016-10-13 23:05:14 +02:00
|
|
|
|
Fully enforce uniqueness of constraint names.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
2018-09-04 19:45:35 +02:00

		/*
		 * If the child constraint is "not valid" then cannot merge with a
		 * valid parent constraint.
		 */
		if (is_initially_valid && !con->convalidated)
			ereport(ERROR,
					(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
					 errmsg("constraint \"%s\" conflicts with NOT VALID constraint on relation \"%s\"",
							ccname, RelationGetRelationName(rel))));

Fix two bugs in merging of inherited CHECK constraints.
Historically, we've allowed users to add a CHECK constraint to a child
table and then add an identical CHECK constraint to the parent. This
results in "merging" the two constraints so that the pre-existing
child constraint ends up with both conislocal = true and coninhcount > 0.
However, if you tried to do it in the other order, you got a duplicate
constraint error. This is problematic for pg_dump, which needs to issue
separated ADD CONSTRAINT commands in some cases, but has no good way to
ensure that the constraints will be added in the required order.
And it's more than a bit arbitrary, too. The goal of complaining about
duplicated ADD CONSTRAINT commands can be served if we reject the case of
adding a constraint when the existing one already has conislocal = true;
but if it has conislocal = false, let's just make the ADD CONSTRAINT set
conislocal = true. In this way, either order of adding the constraints
has the same end result.
Another problem was that the code allowed creation of a parent constraint
marked convalidated that is merged with a child constraint that is
!convalidated. In this case, an inheritance scan of the parent table could
emit some rows violating the constraint condition, which would be an
unexpected result given the marking of the parent constraint as validated.
Hence, forbid merging of constraints in this case. (Note: valid child and
not-valid parent seems fine, so continue to allow that.)
Per report from Benedikt Grundmann. Back-patch to 9.2 where we introduced
possibly-not-valid check constraints. The second bug obviously doesn't
apply before that, and I think the first doesn't either, because pg_dump
only gets into this situation when dealing with not-valid constraints.
Report: <CADbMkNPT-Jz5PRSQ4RbUASYAjocV_KHUWapR%2Bg8fNvhUAyRpxA%40mail.gmail.com>
Discussion: <22108.1475874586@sss.pgh.pa.us>
2016-10-09 01:29:27 +02:00

		/* OK to update the tuple */
		ereport(NOTICE,
				(errmsg("merging constraint \"%s\" with inherited definition",
						ccname)));

		tup = heap_copytuple(tup);
		con = (Form_pg_constraint) GETSTRUCT(tup);

		/*
		 * In case of partitions, an inherited constraint must be inherited
		 * only once since it cannot have multiple parents and it is never
		 * considered local.
		 */
		if (rel->rd_rel->relispartition)
		{
			con->coninhcount = 1;
			con->conislocal = false;
		}
		else
		{
			if (is_local)
				con->conislocal = true;
			else
				con->coninhcount++;

			if (con->coninhcount < 0)
				ereport(ERROR,
						errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
						errmsg("too many inheritance parents"));
		}

Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00

		if (is_no_inherit)
		{
			Assert(is_local);
			con->connoinherit = true;
		}

		CatalogTupleUpdate(conDesc, &tup->t_self, tup);
	}

	systable_endscan(conscan);
	table_close(conDesc, RowExclusiveLock);

	return found;
}

Catalog not-null constraints
We now create contype='n' pg_constraint rows for not-null constraints.
We propagate these constraints to other tables during operations such as
adding inheritance relationships, creating and attaching partitions and
creating tables LIKE other tables. We also spawn not-null constraints
for inheritance child tables when their parents have primary keys.
These related constraints mostly follow the well-known rules of
conislocal and coninhcount that we have for CHECK constraints, with some
adaptations: for example, as opposed to CHECK constraints, we don't
match not-null ones by name when descending a hierarchy to alter it,
instead matching by column name that they apply to. This means we don't
require the constraint names to be identical across a hierarchy.
For now, we omit them for system catalogs. Maybe this is worth
reconsidering. We don't support NOT VALID nor DEFERRABLE clauses
either; these can be added as separate features later (this patch is
already large and complicated enough.)
psql shows these constraints in \d+.
pg_dump requires some ad-hoc hacks, particularly when dumping a primary
key. We now create one "throwaway" not-null constraint for each column
in the PK together with the CREATE TABLE command, and once the PK is
created, all those throwaway constraints are removed. This avoids
having to check each tuple for nullness when the dump restores the
primary key creation.
pg_upgrading from an older release requires a somewhat brittle procedure
to create a constraint state that matches what would be created if the
database were being created fresh in Postgres 17. I have tested all the
scenarios I could think of, and it works correctly as far as I can tell,
but I could have neglected weird cases.
This patch has been very long in the making. The first patch was
written by Bernd Helmle in 2010 to add a new pg_constraint.contype value
('n'), which I (Álvaro) then hijacked in 2011 and 2012, until that one
was killed by the realization that we ought to use contype='c' instead:
manufactured CHECK constraints. However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again. During Postgres 16 this had
already been introduced by commit e056c557aef4, but there were some
problems mainly with the pg_upgrade procedure that couldn't be fixed in
reasonable time, so it was reverted.
In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring an additional pg_attribute column
to track the OID of the not-null constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
2023-08-25 13:31:24 +02:00

/* list_sort comparator to sort CookedConstraint by attnum */
static int
list_cookedconstr_attnum_cmp(const ListCell *p1, const ListCell *p2)
{
	AttrNumber	v1 = ((CookedConstraint *) lfirst(p1))->attnum;
	AttrNumber	v2 = ((CookedConstraint *) lfirst(p2))->attnum;

	return pg_cmp_s16(v1, v2);
}

/*
 * Create the not-null constraints when creating a new relation
 *
 * These come from two sources: the 'constraints' list (of Constraint) is
 * specified directly by the user; the 'old_notnulls' list (of
 * CookedConstraint) comes from inheritance.  We create one constraint
 * for each column, giving priority to user-specified ones, and setting
 * inhcount according to how many parents cause each column to get a
 * not-null constraint.  If a user-specified name clashes with another
 * user-specified name, an error is raised.
 *
 * Note that inherited constraints have two shapes: those coming from another
 * not-null constraint in the parent, which have a name already, and those
 * coming from a primary key in the parent, which don't.  Any name specified
 * in a parent is disregarded in case of a conflict.
 *
 * Returns a list of AttrNumber for columns that need to have the attnotnull
 * flag set.
 */
List *
AddRelationNotNullConstraints(Relation rel, List *constraints,
							  List *old_notnulls)
{
	List	   *givennames;
	List	   *nnnames;
	List	   *nncols = NIL;
	ListCell   *lc;

/*
|
|
|
|
* We track two lists of names: nnnames keeps all the constraint names,
|
|
|
|
* givennames tracks user-generated names. The distinction is important,
|
|
|
|
* because we must raise error for user-generated name conflicts, but for
|
|
|
|
* system-generated name conflicts we just generate another.
|
|
|
|
*/
|
|
|
|
nnnames = NIL;
|
|
|
|
givennames = NIL;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* First, create all not-null constraints that are directly specified by
|
|
|
|
* the user. Note that inheritance might have given us another source for
|
|
|
|
* each, so we must scan the old_notnulls list and increment inhcount for
|
|
|
|
* each element with identical attnum. We delete from there any element
|
|
|
|
* that we process.
|
|
|
|
*/
|
|
|
|
foreach(lc, constraints)
|
|
|
|
{
|
|
|
|
Constraint *constr = lfirst_node(Constraint, lc);
|
|
|
|
AttrNumber attnum;
|
|
|
|
char *conname;
|
|
|
|
bool is_local = true;
|
|
|
|
int inhcount = 0;
|
|
|
|
ListCell *lc2;
|
|
|
|
|
|
|
|
Assert(constr->contype == CONSTR_NOTNULL);
|
|
|
|
|
|
|
|
attnum = get_attnum(RelationGetRelid(rel),
|
|
|
|
strVal(linitial(constr->keys)));
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Search in the list of inherited constraints for any entries on the
|
|
|
|
* same column.
|
|
|
|
*/
|
|
|
|
foreach(lc2, old_notnulls)
|
|
|
|
{
|
|
|
|
CookedConstraint *old = (CookedConstraint *) lfirst(lc2);
|
|
|
|
|
|
|
|
if (old->attnum == attnum)
|
|
|
|
{
|
2023-08-29 19:19:24 +02:00
|
|
|
/*
|
|
|
|
* If this column inherits a not-null constraint from a parent, a
|
|
|
|
* local NO INHERIT one cannot be accepted.
|
|
|
|
*/
|
|
|
|
if (constr->is_no_inherit)
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_DATATYPE_MISMATCH),
|
|
|
|
errmsg("cannot define not-null constraint on column \"%s\" with NO INHERIT",
|
|
|
|
strVal(linitial(constr->keys))),
|
|
|
|
errdetail("The column has an inherited not-null constraint.")));
|
|
|
|
|
2023-08-25 13:31:24 +02:00
|
|
|
inhcount++;
|
|
|
|
old_notnulls = foreach_delete_current(old_notnulls, lc2);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Determine a constraint name, which may have been specified by the
|
|
|
|
* user, or raise an error if a conflict exists with another
|
|
|
|
* user-specified name.
|
|
|
|
*/
|
|
|
|
if (constr->conname)
|
|
|
|
{
|
|
|
|
foreach(lc2, givennames)
|
|
|
|
{
|
|
|
|
if (strcmp(lfirst(lc2), constr->conname) == 0)
|
|
|
|
ereport(ERROR,
|
|
|
|
errcode(ERRCODE_DUPLICATE_OBJECT),
|
|
|
|
errmsg("constraint \"%s\" for relation \"%s\" already exists",
|
|
|
|
constr->conname,
|
|
|
|
RelationGetRelationName(rel)));
|
|
|
|
}
|
|
|
|
|
|
|
|
conname = constr->conname;
|
|
|
|
givennames = lappend(givennames, conname);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
conname = ChooseConstraintName(RelationGetRelationName(rel),
|
|
|
|
get_attname(RelationGetRelid(rel),
|
|
|
|
attnum, false),
|
|
|
|
"not_null",
|
|
|
|
RelationGetNamespace(rel),
|
|
|
|
nnnames);
|
|
|
|
nnnames = lappend(nnnames, conname);
|
|
|
|
|
|
|
|
StoreRelNotNull(rel, conname,
|
|
|
|
attnum, true, is_local,
|
|
|
|
inhcount, constr->is_no_inherit);
|
|
|
|
|
|
|
|
nncols = lappend_int(nncols, attnum);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If any column remains in the old_notnulls list, we must create a not-
|
|
|
|
* null constraint marked not-local. Because multiple parents could
|
|
|
|
* specify a not-null constraint for the same column, we must count how
|
|
|
|
* many there are and add to the original inhcount accordingly, deleting
|
|
|
|
* elements we've already processed. We sort the list to make it easy.
|
|
|
|
*
|
|
|
|
* We don't use foreach() here because we have two nested loops over the
|
|
|
|
* constraint list, with possible element deletions in the inner one. If
|
|
|
|
* we used foreach_delete_current() it could only fix up the state of one
|
|
|
|
* of the loops, so it seems cleaner to use looping over list indexes for
|
|
|
|
* both loops. Note that any deletion will happen beyond where the outer
|
|
|
|
* loop is, so its index never needs adjustment.
|
|
|
|
*/
|
|
|
|
list_sort(old_notnulls, list_cookedconstr_attnum_cmp);
|
|
|
|
for (int outerpos = 0; outerpos < list_length(old_notnulls); outerpos++)
|
|
|
|
{
|
|
|
|
CookedConstraint *cooked;
|
|
|
|
char *conname = NULL;
|
|
|
|
int add_inhcount = 0;
|
|
|
|
ListCell *lc2;
|
|
|
|
|
|
|
|
cooked = (CookedConstraint *) list_nth(old_notnulls, outerpos);
|
|
|
|
Assert(cooked->contype == CONSTR_NOTNULL);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Preserve the first non-conflicting constraint name we come across,
|
|
|
|
* if any
|
|
|
|
*/
|
|
|
|
if (conname == NULL && cooked->name)
|
|
|
|
conname = cooked->name;
|
|
|
|
|
|
|
|
for (int restpos = outerpos + 1; restpos < list_length(old_notnulls);)
|
|
|
|
{
|
|
|
|
CookedConstraint *other;
|
|
|
|
|
|
|
|
other = (CookedConstraint *) list_nth(old_notnulls, restpos);
|
|
|
|
if (other->attnum == cooked->attnum)
|
|
|
|
{
|
|
|
|
if (conname == NULL && other->name)
|
|
|
|
conname = other->name;
|
|
|
|
|
|
|
|
add_inhcount++;
|
|
|
|
old_notnulls = list_delete_nth_cell(old_notnulls, restpos);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
restpos++;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* If we got a name, make sure it isn't one we've already used */
|
|
|
|
if (conname != NULL)
|
|
|
|
{
|
|
|
|
foreach(lc2, nnnames)
|
|
|
|
{
|
|
|
|
if (strcmp(lfirst(lc2), conname) == 0)
|
|
|
|
{
|
|
|
|
conname = NULL;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* and choose a name, if needed */
|
|
|
|
if (conname == NULL)
|
|
|
|
conname = ChooseConstraintName(RelationGetRelationName(rel),
|
|
|
|
get_attname(RelationGetRelid(rel),
|
|
|
|
cooked->attnum, false),
|
|
|
|
"not_null",
|
|
|
|
RelationGetNamespace(rel),
|
|
|
|
nnnames);
|
|
|
|
nnnames = lappend(nnnames, conname);
|
|
|
|
|
|
|
|
StoreRelNotNull(rel, conname, cooked->attnum, true,
|
|
|
|
cooked->is_local, cooked->inhcount + add_inhcount,
|
|
|
|
cooked->is_no_inherit);
|
|
|
|
|
|
|
|
nncols = lappend_int(nncols, cooked->attnum);
|
|
|
|
}
|
|
|
|
|
|
|
|
return nncols;
|
|
|
|
}
|
|
|
|
|
2002-03-03 18:47:56 +01:00
|
|
|
/*
|
|
|
|
* Update the count of constraints in the relation's pg_class tuple.
|
|
|
|
*
|
|
|
|
* Caller had better hold exclusive lock on the relation.
|
|
|
|
*
|
|
|
|
* An important side effect is that a SI update message will be sent out for
|
|
|
|
* the pg_class tuple, which will force other backends to rebuild their
|
|
|
|
* relcache entries for the rel. Also, this backend will rebuild its
|
|
|
|
* own relcache entry at the next CommandCounterIncrement.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
SetRelationNumChecks(Relation rel, int numchecks)
|
|
|
|
{
|
|
|
|
Relation relrel;
|
|
|
|
HeapTuple reltup;
|
|
|
|
Form_pg_class relStruct;
|
|
|
|
|
2019-01-21 19:32:19 +01:00
|
|
|
relrel = table_open(RelationRelationId, RowExclusiveLock);
|
2010-02-14 19:42:19 +01:00
|
|
|
reltup = SearchSysCacheCopy1(RELOID,
|
|
|
|
ObjectIdGetDatum(RelationGetRelid(rel)));
|
1999-10-04 01:55:40 +02:00
|
|
|
if (!HeapTupleIsValid(reltup))
|
2003-07-20 23:56:35 +02:00
|
|
|
elog(ERROR, "cache lookup failed for relation %u",
|
2000-11-16 23:30:52 +01:00
|
|
|
RelationGetRelid(rel));
|
1999-10-04 01:55:40 +02:00
|
|
|
relStruct = (Form_pg_class) GETSTRUCT(reltup);
|
|
|
|
|
2002-03-03 18:47:56 +01:00
|
|
|
if (relStruct->relchecks != numchecks)
|
|
|
|
{
|
|
|
|
relStruct->relchecks = numchecks;
|
1999-10-04 01:55:40 +02:00
|
|
|
|
2017-01-31 22:42:24 +01:00
|
|
|
CatalogTupleUpdate(relrel, &reltup->t_self, reltup);
|
2002-03-03 18:47:56 +01:00
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/* Skip the disk update, but force relcache inval anyway */
|
2004-02-10 02:55:27 +01:00
|
|
|
CacheInvalidateRelcache(rel);
|
2002-03-03 18:47:56 +01:00
|
|
|
}
|
1999-10-04 01:55:40 +02:00
|
|
|
|
1999-12-16 23:20:03 +01:00
|
|
|
heap_freetuple(reltup);
|
2019-01-21 19:32:19 +01:00
|
|
|
table_close(relrel, RowExclusiveLock);
|
1997-08-22 16:10:26 +02:00
|
|
|
}
|
|
|
|
|
2019-03-30 08:13:09 +01:00
|
|
|
/*
|
|
|
|
* Check for references to generated columns
|
|
|
|
*/
|
|
|
|
static bool
|
|
|
|
check_nested_generated_walker(Node *node, void *context)
|
|
|
|
{
|
|
|
|
ParseState *pstate = context;
|
|
|
|
|
|
|
|
if (node == NULL)
|
|
|
|
return false;
|
|
|
|
else if (IsA(node, Var))
|
|
|
|
{
|
|
|
|
Var *var = (Var *) node;
|
|
|
|
Oid relid;
|
|
|
|
AttrNumber attnum;
|
|
|
|
|
|
|
|
relid = rt_fetch(var->varno, pstate->p_rtable)->relid;
|
2021-05-21 21:12:08 +02:00
|
|
|
if (!OidIsValid(relid))
|
|
|
|
return false; /* XXX shouldn't we raise an error? */
|
|
|
|
|
2019-03-30 08:13:09 +01:00
|
|
|
attnum = var->varattno;
|
|
|
|
|
2021-05-21 21:12:08 +02:00
|
|
|
if (attnum > 0 && get_attgenerated(relid, attnum))
|
2019-03-30 08:13:09 +01:00
|
|
|
ereport(ERROR,
|
2021-05-21 21:12:08 +02:00
|
|
|
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
|
2019-03-30 08:13:09 +01:00
|
|
|
errmsg("cannot use generated column \"%s\" in column generation expression",
|
|
|
|
get_attname(relid, attnum, false)),
|
|
|
|
errdetail("A generated column cannot reference another generated column."),
|
|
|
|
parser_errposition(pstate, var->location)));
|
2021-05-21 21:12:08 +02:00
|
|
|
/* A whole-row Var is necessarily self-referential, so forbid it */
|
|
|
|
if (attnum == 0)
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
|
|
|
|
errmsg("cannot use whole-row variable in column generation expression"),
|
|
|
|
errdetail("This would cause the generated column to depend on its own value."),
|
|
|
|
parser_errposition(pstate, var->location)));
|
|
|
|
/* System columns were already checked in the parser */
|
2019-03-30 08:13:09 +01:00
|
|
|
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
return expression_tree_walker(node, check_nested_generated_walker,
|
|
|
|
(void *) context);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
check_nested_generated(ParseState *pstate, Node *node)
|
|
|
|
{
|
|
|
|
check_nested_generated_walker(node, pstate);
|
|
|
|
}
|
|
|
|
|
2002-03-19 03:18:25 +01:00
|
|
|
/*
|
|
|
|
* Take a raw default and convert it to a cooked format ready for
|
|
|
|
* storage.
|
|
|
|
*
|
2002-03-20 20:45:13 +01:00
|
|
|
* Parse state should be set up to recognize any vars that might appear
|
|
|
|
* in the expression. (Even though we plan to reject vars, it's more
|
|
|
|
* user-friendly to give the correct error message than "unknown var".)
|
|
|
|
*
|
2003-07-29 19:21:27 +02:00
|
|
|
* If atttypid is not InvalidOid, coerce the expression to the specified
|
|
|
|
* type (and typmod atttypmod). attname is only needed in this case:
|
|
|
|
* it is used in the error message, if any.
|
2002-03-19 03:18:25 +01:00
|
|
|
*/
|
|
|
|
Node *
|
|
|
|
cookDefault(ParseState *pstate,
|
|
|
|
Node *raw_default,
|
|
|
|
Oid atttypid,
|
|
|
|
int32 atttypmod,
|
2019-03-30 08:13:09 +01:00
|
|
|
const char *attname,
|
|
|
|
char attgenerated)
|
2002-03-20 20:45:13 +01:00
|
|
|
{
|
2002-03-19 03:18:25 +01:00
|
|
|
Node *expr;
|
|
|
|
|
|
|
|
Assert(raw_default != NULL);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Transform raw parsetree to executable expression.
|
|
|
|
*/
|
2019-03-30 08:13:09 +01:00
|
|
|
expr = transformExpr(pstate, raw_default, attgenerated ? EXPR_KIND_GENERATED_COLUMN : EXPR_KIND_COLUMN_DEFAULT);
|
2002-03-19 03:18:25 +01:00
|
|
|
|
2019-03-30 08:13:09 +01:00
|
|
|
if (attgenerated)
|
|
|
|
{
|
Ensure we preprocess expressions before checking their volatility.
contain_mutable_functions and contain_volatile_functions give
reliable answers only after expression preprocessing (specifically
eval_const_expressions). Some places understand this, but some did
not get the memo --- which is not entirely their fault, because the
problem is documented only in places far away from those functions.
Introduce wrapper functions that allow doing the right thing easily,
and add commentary in hopes of preventing future mistakes from
copy-and-paste of code that's only conditionally safe.
Two actual bugs of this ilk are fixed here. We failed to preprocess
column GENERATED expressions before checking mutability, so that the
code could fail to detect the use of a volatile function
default-argument expression, or it could reject a polymorphic function
that is actually immutable on the datatype of interest. Likewise,
column DEFAULT expressions weren't preprocessed before determining if
it's safe to apply the attmissingval mechanism. A false negative
would just result in an unnecessary table rewrite, but a false
positive could allow the attmissingval mechanism to be used in a case
where it should not be, resulting in unexpected initial values in a
new column.
In passing, re-order the steps in ComputePartitionAttrs so that its
checks for invalid column references are done before applying
expression_planner, rather than after. The previous coding would
not complain if a partition expression contains a disallowed column
reference that gets optimized away by constant folding, which seems
to me to be a behavior we do not want.
Per bug #18097 from Jim Keener. Back-patch to all supported versions.
Discussion: https://postgr.es/m/18097-ebb179674f22932f@postgresql.org
2023-11-16 16:05:14 +01:00
|
|
|
/* Disallow refs to other generated columns */
|
2019-03-30 08:13:09 +01:00
|
|
|
check_nested_generated(pstate, expr);
|
|
|
|
|
2023-11-16 16:05:14 +01:00
|
|
|
/* Disallow mutable functions */
|
|
|
|
if (contain_mutable_functions_after_planning((Expr *) expr))
|
2019-03-30 08:13:09 +01:00
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
|
|
|
|
errmsg("generation expression is not immutable")));
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* For a default expression, transformExpr() should have rejected
|
|
|
|
* column references.
|
|
|
|
*/
|
|
|
|
Assert(!contain_var_clause(expr));
|
|
|
|
}
|
2002-05-13 01:43:04 +02:00
|
|
|
|
2002-03-19 03:18:25 +01:00
|
|
|
/*
|
2003-07-29 19:21:27 +02:00
|
|
|
* Coerce the expression to the correct type and typmod, if given. This
|
|
|
|
* should match the parser's processing of non-defaulted expressions ---
|
2006-08-02 03:59:48 +02:00
|
|
|
* see transformAssignedExpr().
|
2002-03-19 03:18:25 +01:00
|
|
|
*/
|
2002-03-20 20:45:13 +01:00
|
|
|
if (OidIsValid(atttypid))
|
|
|
|
{
|
|
|
|
Oid type_id = exprType(expr);
|
2002-03-19 03:18:25 +01:00
|
|
|
|
2003-07-29 19:21:27 +02:00
|
|
|
expr = coerce_to_target_type(pstate, expr, type_id,
|
|
|
|
atttypid, atttypmod,
|
|
|
|
COERCION_ASSIGNMENT,
|
2008-08-29 01:09:48 +02:00
|
|
|
COERCE_IMPLICIT_CAST,
|
|
|
|
-1);
|
2003-07-29 19:21:27 +02:00
|
|
|
if (expr == NULL)
|
2003-07-21 03:59:11 +02:00
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_DATATYPE_MISMATCH),
|
|
|
|
errmsg("column \"%s\" is of type %s"
|
|
|
|
" but default expression is of type %s",
|
|
|
|
attname,
|
|
|
|
format_type_be(atttypid),
|
|
|
|
format_type_be(type_id)),
|
|
|
|
errhint("You will need to rewrite or cast the expression.")));
|
2002-03-19 03:18:25 +01:00
|
|
|
}
|
|
|
|
|
2011-03-20 01:29:08 +01:00
|
|
|
/*
|
|
|
|
* Finally, take care of collations in the finished expression.
|
|
|
|
*/
|
|
|
|
assign_expr_collations(pstate, expr);
|
|
|
|
|
2003-07-29 19:21:27 +02:00
|
|
|
return expr;
|
2002-03-19 03:18:25 +01:00
|
|
|
}
|
|
|
|
|
2001-05-30 14:57:36 +02:00
|
|
|
/*
|
2008-05-10 01:32:05 +02:00
|
|
|
* Take a raw CHECK constraint expression and convert it to a cooked format
|
|
|
|
* ready for storage.
|
2002-07-12 20:43:19 +02:00
|
|
|
*
|
2008-05-10 01:32:05 +02:00
|
|
|
* Parse state must be set up to recognize any vars that might appear
|
|
|
|
* in the expression.
|
2001-05-30 14:57:36 +02:00
|
|
|
*/
|
2008-05-10 01:32:05 +02:00
|
|
|
static Node *
|
|
|
|
cookConstraint(ParseState *pstate,
|
|
|
|
Node *raw_constraint,
|
|
|
|
char *relname)
|
2001-05-30 14:57:36 +02:00
|
|
|
{
|
2008-05-10 01:32:05 +02:00
|
|
|
Node *expr;
|
2001-10-25 07:50:21 +02:00
|
|
|
|
2001-05-30 14:57:36 +02:00
|
|
|
/*
|
2008-05-10 01:32:05 +02:00
|
|
|
* Transform raw parsetree to executable expression.
|
2001-05-30 14:57:36 +02:00
|
|
|
*/
|
Centralize the logic for detecting misplaced aggregates, window funcs, etc.
Formerly we relied on checking after-the-fact to see if an expression
contained aggregates, window functions, or sub-selects when it shouldn't.
This is grotty, easily forgotten (indeed, we had forgotten to teach
DefineIndex about rejecting window functions), and none too efficient
since it requires extra traversals of the parse tree. To improve matters,
define an enum type that classifies all SQL sub-expressions, store it in
ParseState to show what kind of expression we are currently parsing, and
make transformAggregateCall, transformWindowFuncCall, and transformSubLink
check the expression type and throw error if the type indicates the
construct is disallowed. This allows removal of a large number of ad-hoc
checks scattered around the code base. The enum type is sufficiently
fine-grained that we can still produce error messages of at least the
same specificity as before.
Bringing these error checks together revealed that we'd been none too
consistent about phrasing of the error messages, so standardize the wording
a bit.
Also, rewrite checking of aggregate arguments so that it requires only one
traversal of the arguments, rather than up to three as before.
In passing, clean up some more comments left over from add_missing_from
support, and annotate some tests that I think are dead code now that that's
gone. (I didn't risk actually removing said dead code, though.)
2012-08-10 17:35:33 +02:00
|
|
|
expr = transformExpr(pstate, raw_constraint, EXPR_KIND_CHECK_CONSTRAINT);
|
2001-10-25 07:50:21 +02:00
|
|
|
|
2008-05-10 01:32:05 +02:00
|
|
|
/*
|
|
|
|
* Make sure it yields a boolean result.
|
|
|
|
*/
|
|
|
|
expr = coerce_to_boolean(pstate, expr, "CHECK");
|
2001-05-30 14:57:36 +02:00
|
|
|
|
2011-03-20 01:29:08 +01:00
|
|
|
/*
|
|
|
|
* Take care of collations.
|
|
|
|
*/
|
|
|
|
assign_expr_collations(pstate, expr);
|
|
|
|
|
2008-05-10 01:32:05 +02:00
|
|
|
/*
|
2012-08-10 17:35:33 +02:00
|
|
|
* Make sure no outside relations are referred to (this is probably dead
|
|
|
|
* code now that add_missing_from is history).
|
2008-05-10 01:32:05 +02:00
|
|
|
*/
|
|
|
|
if (list_length(pstate->p_rtable) != 1)
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
|
|
|
|
errmsg("only table \"%s\" can be referenced in check constraint",
|
|
|
|
relname)));
|
2001-05-30 14:57:36 +02:00
|
|
|
|
2008-05-10 01:32:05 +02:00
|
|
|
return expr;
|
1997-08-22 16:10:26 +02:00
|
|
|
}
|
1997-08-22 04:58:51 +02:00
|
|
|
|
2020-11-01 13:22:07 +01:00
|
|
|
/*
|
|
|
|
* CopyStatistics --- copy entries in pg_statistic from one rel to another
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
CopyStatistics(Oid fromrelid, Oid torelid)
|
|
|
|
{
|
|
|
|
HeapTuple tup;
|
|
|
|
SysScanDesc scan;
|
|
|
|
ScanKeyData key[1];
|
|
|
|
Relation statrel;
|
Avoid some overhead with open and close of catalog indexes
This commit improves two code paths to open and close indexes a
minimum amount of times when doing a series of catalog updates or
inserts. CatalogTupleInsert() is costly when using it for multiple
inserts or updates compared to CatalogTupleInsertWithInfo(), as it would
need to open and close the indexes of the catalog worked each time an
operation is done.
This commit updates the following places:
- REINDEX CONCURRENTLY when copying statistics from one index relation
to the other. Multi-INSERTs are avoided here, as this would begin to
show benefits only for indexes with multiple expressions, for example,
which may not be the most common pattern. This change is noticeable in
profiles with indexes having many expressions, for example, and it would
improve any callers of CopyStatistics().
- Update of statistics on ANALYZE, that mixes inserts and updates.
In each case, the catalog indexes are opened only if at least one
insertion and/or update is required, to minimize the cost of the
operation. Like the previous coding, no indexes are opened as long as
at least one insert or update of pg_statistic has happened.
Author: Ranier Vilela
Reviewed-by: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/CAEudQAqh0F9y6Di_Wc8xW4zkWm_5SDd-nRfVsCn=h0Nm1C_mrg@mail.gmail.com
2022-11-16 02:49:05 +01:00
|
|
|
CatalogIndexState indstate = NULL;
|
2020-11-01 13:22:07 +01:00
|
|
|
|
|
|
|
statrel = table_open(StatisticRelationId, RowExclusiveLock);
|
|
|
|
|
|
|
|
/* Now search for stat records */
|
|
|
|
ScanKeyInit(&key[0],
|
|
|
|
Anum_pg_statistic_starelid,
|
|
|
|
BTEqualStrategyNumber, F_OIDEQ,
|
|
|
|
ObjectIdGetDatum(fromrelid));
|
|
|
|
|
|
|
|
scan = systable_beginscan(statrel, StatisticRelidAttnumInhIndexId,
|
|
|
|
true, NULL, 1, key);
|
|
|
|
|
|
|
|
while (HeapTupleIsValid((tup = systable_getnext(scan))))
|
|
|
|
{
|
|
|
|
Form_pg_statistic statform;
|
|
|
|
|
|
|
|
/* make a modifiable copy */
|
|
|
|
tup = heap_copytuple(tup);
|
|
|
|
statform = (Form_pg_statistic) GETSTRUCT(tup);
|
|
|
|
|
|
|
|
/* update the copy of the tuple and insert it */
|
|
|
|
statform->starelid = torelid;
|
2022-11-16 02:49:05 +01:00
|
|
|
|
|
|
|
/* fetch index information when we know we need it */
|
|
|
|
if (indstate == NULL)
|
|
|
|
indstate = CatalogOpenIndexes(statrel);
|
|
|
|
|
|
|
|
CatalogTupleInsertWithInfo(statrel, tup, indstate);
|
2020-11-01 13:22:07 +01:00
|
|
|
|
|
|
|
heap_freetuple(tup);
|
|
|
|
}
|
|
|
|
|
|
|
|
systable_endscan(scan);
|
|
|
|
|
2022-11-16 02:49:05 +01:00
|
|
|
if (indstate != NULL)
|
|
|
|
CatalogCloseIndexes(indstate);
|
2020-11-01 13:22:07 +01:00
|
|
|
table_close(statrel, RowExclusiveLock);
|
|
|
|
}
|
1999-11-28 03:03:04 +01:00
|
|
|
|
2003-05-12 02:17:03 +02:00
|
|
|
/*
|
|
|
|
* RemoveStatistics --- remove entries in pg_statistic for a rel or column
|
|
|
|
*
|
2009-12-29 21:11:45 +01:00
|
|
|
* If attnum is zero, remove all entries for rel; else remove only the one(s)
|
2003-05-12 02:17:03 +02:00
|
|
|
* for that column.
|
|
|
|
*/
|
2004-02-15 22:01:39 +01:00
|
|
|
void
|
2004-08-28 23:05:26 +02:00
|
|
|
RemoveStatistics(Oid relid, AttrNumber attnum)
|
1999-11-28 03:03:04 +01:00
|
|
|
{
|
|
|
|
Relation pgstatistic;
|
2002-07-20 00:21:17 +02:00
|
|
|
SysScanDesc scan;
|
2003-05-12 02:17:03 +02:00
|
|
|
ScanKeyData key[2];
|
|
|
|
int nkeys;
|
1999-11-28 03:03:04 +01:00
|
|
|
HeapTuple tuple;
|
|
|
|
|
2019-01-21 19:32:19 +01:00
|
|
|
pgstatistic = table_open(StatisticRelationId, RowExclusiveLock);
|
1999-11-28 03:03:04 +01:00
|
|
|
|
2003-11-12 22:15:59 +01:00
|
|
|
ScanKeyInit(&key[0],
|
|
|
|
Anum_pg_statistic_starelid,
|
|
|
|
BTEqualStrategyNumber, F_OIDEQ,
|
2004-08-28 23:05:26 +02:00
|
|
|
ObjectIdGetDatum(relid));
|
1999-11-28 03:03:04 +01:00
|
|
|
|
2003-05-12 02:17:03 +02:00
|
|
|
if (attnum == 0)
|
|
|
|
nkeys = 1;
|
|
|
|
else
|
|
|
|
{
|
2003-11-12 22:15:59 +01:00
|
|
|
ScanKeyInit(&key[1],
|
|
|
|
Anum_pg_statistic_staattnum,
|
|
|
|
BTEqualStrategyNumber, F_INT2EQ,
|
|
|
|
Int16GetDatum(attnum));
|
2003-05-12 02:17:03 +02:00
|
|
|
nkeys = 2;
|
|
|
|
}
|
|
|
|
|
2009-12-29 21:11:45 +01:00
|
|
|
scan = systable_beginscan(pgstatistic, StatisticRelidAttnumInhIndexId, true,
|
Use an MVCC snapshot, rather than SnapshotNow, for catalog scans.
SnapshotNow scans have the undesirable property that, in the face of
concurrent updates, the scan can fail to see either the old or the new
versions of the row. In many cases, we work around this by requiring
DDL operations to hold AccessExclusiveLock on the object being
modified; in some cases, the existing locking is inadequate and random
failures occur as a result. This commit doesn't change anything
related to locking, but will hopefully pave the way to allowing lock
strength reductions in the future.
The major issue that has held us back from making this change in the past
is that taking an MVCC snapshot is significantly more expensive than
using a static special snapshot such as SnapshotNow. However, testing
of various worst-case scenarios reveals that this problem is not
severe except under fairly extreme workloads. To mitigate those
problems, we avoid retaking the MVCC snapshot for each new scan;
instead, we take a new snapshot only when invalidation messages have
been processed. The catcache machinery already requires that
invalidation messages be sent before releasing the related heavyweight
lock; else other backends might rely on locally-cached data rather
than scanning the catalog at all. Thus, making snapshot reuse
dependent on the same guarantees shouldn't break anything that wasn't
already subtly broken.
Patch by me. Review by Michael Paquier and Andres Freund.
2013-07-02 15:47:01 +02:00
|
|
|
NULL, nkeys, key);
|
2002-07-20 00:21:17 +02:00
|
|
|
|
2009-12-29 21:11:45 +01:00
|
|
|
/* we must loop even when attnum != 0, in case of inherited stats */
|
2002-07-20 00:21:17 +02:00
|
|
|
while (HeapTupleIsValid(tuple = systable_getnext(scan)))
|
2017-02-01 22:13:30 +01:00
|
|
|
CatalogTupleDelete(pgstatistic, &tuple->t_self);
|
1999-11-28 03:03:04 +01:00
|
|
|
|
2002-07-20 00:21:17 +02:00
|
|
|
systable_endscan(scan);
|
2003-05-12 02:17:03 +02:00
|
|
|
|
2019-01-21 19:32:19 +01:00
|
|
|
table_close(pgstatistic, RowExclusiveLock);
|
1999-11-28 03:03:04 +01:00
|
|
|
}
|
2002-07-14 23:08:08 +02:00
|
|
|
|
|
|
|
|
|
|
|
/*
|
2006-07-31 22:09:10 +02:00
|
|
|
* RelationTruncateIndexes - truncate all indexes associated
|
|
|
|
* with the heap relation to zero tuples.
|
2002-07-14 23:08:08 +02:00
|
|
|
*
|
2002-08-05 05:29:17 +02:00
|
|
|
* The routine will truncate and then reconstruct the indexes on
|
2006-07-31 22:09:10 +02:00
|
|
|
* the specified relation. Caller must hold exclusive lock on rel.
|
2002-07-14 23:08:08 +02:00
|
|
|
*/
|
|
|
|
static void
|
2006-07-31 22:09:10 +02:00
|
|
|
RelationTruncateIndexes(Relation heapRelation)
|
2002-07-14 23:08:08 +02:00
|
|
|
{
|
2004-05-26 06:41:50 +02:00
|
|
|
ListCell *indlist;
|
2002-07-14 23:08:08 +02:00
|
|
|
|
2003-05-28 18:04:02 +02:00
|
|
|
/* Ask the relcache to produce a list of the indexes of the rel */
|
|
|
|
foreach(indlist, RelationGetIndexList(heapRelation))
|
|
|
|
{
|
2004-05-26 06:41:50 +02:00
|
|
|
Oid indexId = lfirst_oid(indlist);
|
2003-05-28 18:04:02 +02:00
|
|
|
Relation currentIndex;
|
|
|
|
IndexInfo *indexInfo;
|
2002-07-14 23:08:08 +02:00
|
|
|
|
2006-07-31 22:09:10 +02:00
|
|
|
/* Open the index relation; use exclusive lock, just to be sure */
|
|
|
|
currentIndex = index_open(indexId, AccessExclusiveLock);
|
2002-07-14 23:08:08 +02:00
|
|
|
|
Fix misbehavior with expression indexes on ON COMMIT DELETE ROWS tables.
We implement ON COMMIT DELETE ROWS by truncating tables marked that
way, which requires also truncating/rebuilding their indexes. But
RelationTruncateIndexes asks the relcache for up-to-date copies of any
index expressions, which may cause execution of eval_const_expressions
on them, which can result in actual execution of subexpressions.
This is a bad thing to have happening during ON COMMIT. Manuel Rigger
reported that use of a SQL function resulted in crashes due to
expectations that ActiveSnapshot would be set, which it isn't.
The most obvious fix perhaps would be to push a snapshot during
PreCommit_on_commit_actions, but I think that would just open the door
to more problems: CommitTransaction explicitly expects that no
user-defined code can be running at this point.
Fortunately, since we know that no tuples exist to be indexed, there
seems no need to use the real index expressions or predicates during
RelationTruncateIndexes. We can set up dummy index expressions
instead (we do need something that will expose the right data type,
as there are places that build index tupdescs based on this), and
just ignore predicates and exclusion constraints.
In a green field it'd likely be better to reimplement ON COMMIT DELETE
ROWS using the same "init fork" infrastructure used for unlogged
relations. That seems impractical without catalog changes though,
and even without that it'd be too big a change to back-patch.
So for now do it like this.
Per private report from Manuel Rigger. This has been broken forever,
so back-patch to all supported branches.
2019-12-01 19:09:26 +01:00
|
|
|
/*
|
|
|
|
* Fetch info needed for index_build. Since we know there are no
|
|
|
|
* tuples that actually need indexing, we can use a dummy IndexInfo.
|
|
|
|
* This is slightly cheaper to build, but the real point is to avoid
|
|
|
|
* possibly running user-defined code in index expressions or
|
|
|
|
* predicates. We might be getting invoked during ON COMMIT
|
|
|
|
* processing, and we don't want to run any such code then.
|
|
|
|
*/
|
|
|
|
indexInfo = BuildDummyIndexInfo(currentIndex);
|
2003-05-28 18:04:02 +02:00
|
|
|
|
2008-09-30 12:52:14 +02:00
|
|
|
/*
|
2008-11-27 16:59:28 +01:00
|
|
|
* Now truncate the actual file (and discard buffers).
|
2008-09-30 12:52:14 +02:00
|
|
|
*/
|
2004-05-08 21:09:25 +02:00
|
|
|
RelationTruncate(currentIndex, 0);
|
2002-07-14 23:08:08 +02:00
|
|
|
|
|
|
|
/* Initialize the index and rebuild */
|
2006-07-31 03:16:38 +02:00
|
|
|
/* Note: we do not need to re-establish pkey setting */
|
2019-01-23 23:57:09 +01:00
|
|
|
index_build(heapRelation, currentIndex, indexInfo, true, false);
|
2002-07-14 23:08:08 +02:00
|
|
|
|
2006-05-11 01:18:39 +02:00
|
|
|
/* We're done with this index */
|
2006-07-31 22:09:10 +02:00
|
|
|
index_close(currentIndex, NoLock);
|
2002-07-14 23:08:08 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* heap_truncate
|
|
|
|
*
|
2005-01-27 04:19:37 +01:00
|
|
|
* This routine deletes all data within all the specified relations.
|
2003-09-19 23:04:20 +02:00
|
|
|
*
|
|
|
|
* This is not transaction-safe! There is another, transaction-safe
|
2005-01-27 04:19:37 +01:00
|
|
|
* implementation in commands/tablecmds.c. We now use this only for
|
2003-09-19 23:04:20 +02:00
|
|
|
* ON COMMIT truncation of temporary tables, where it doesn't matter.
|
2002-07-14 23:08:08 +02:00
|
|
|
*/
|
|
|
|
void
|
2005-01-27 04:19:37 +01:00
|
|
|
heap_truncate(List *relids)
|
2002-07-14 23:08:08 +02:00
|
|
|
{
|
2005-01-27 04:19:37 +01:00
|
|
|
List *relations = NIL;
|
|
|
|
ListCell *cell;
|
|
|
|
|
|
|
|
/* Open relations for processing, and grab exclusive access on each */
|
|
|
|
foreach(cell, relids)
|
|
|
|
{
|
|
|
|
Oid rid = lfirst_oid(cell);
|
|
|
|
Relation rel;
|
2002-07-14 23:08:08 +02:00
|
|
|
|
2019-01-21 19:32:19 +01:00
|
|
|
rel = table_open(rid, AccessExclusiveLock);
|
2005-01-27 04:19:37 +01:00
|
|
|
relations = lappend(relations, rel);
|
|
|
|
}
|
2002-07-14 23:08:08 +02:00
|
|
|
|
2003-09-19 23:04:20 +02:00
|
|
|
/* Don't allow truncate on tables that are referenced by foreign keys */
|
2005-01-27 04:19:37 +01:00
|
|
|
heap_truncate_check_FKs(relations, true);
|
2003-09-19 23:04:20 +02:00
|
|
|
|
2005-01-27 04:19:37 +01:00
|
|
|
/* OK to do it */
|
|
|
|
foreach(cell, relations)
|
|
|
|
{
|
|
|
|
Relation rel = lfirst(cell);
|
2002-07-14 23:08:08 +02:00
|
|
|
|
2009-08-23 21:23:41 +02:00
|
|
|
/* Truncate the relation */
|
|
|
|
heap_truncate_one_rel(rel);
|
2002-07-14 23:08:08 +02:00
|
|
|
|
2009-08-23 21:23:41 +02:00
|
|
|
/* Close the relation, but keep exclusive lock on it until commit */
|
2019-01-21 19:32:19 +01:00
|
|
|
table_close(rel, NoLock);
|
2005-01-27 04:19:37 +01:00
|
|
|
}
|
2002-07-14 23:08:08 +02:00
|
|
|
}
|
2003-09-19 23:04:20 +02:00
|
|
|
|
2009-08-23 21:23:41 +02:00
|
|
|
/*
|
|
|
|
* heap_truncate_one_rel
|
|
|
|
*
|
|
|
|
* This routine deletes all data within the specified relation.
|
|
|
|
*
|
|
|
|
* This is not transaction-safe, because the truncation is done immediately
|
|
|
|
* and cannot be rolled back later. Caller is responsible for having
|
|
|
|
* checked permissions etc, and must have obtained AccessExclusiveLock.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
heap_truncate_one_rel(Relation rel)
|
|
|
|
{
|
|
|
|
Oid toastrelid;
|
|
|
|
|
2018-11-05 01:14:33 +01:00
|
|
|
/*
|
|
|
|
* Truncate the relation. Partitioned tables have no storage, so there is
|
|
|
|
* nothing to do for them here.
|
|
|
|
*/
|
|
|
|
if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
|
|
|
|
return;
|
|
|
|
|
2019-03-29 04:01:14 +01:00
|
|
|
/* Truncate the underlying relation */
|
|
|
|
table_relation_nontransactional_truncate(rel);
|
2009-08-23 21:23:41 +02:00
|
|
|
|
|
|
|
/* If the relation has indexes, truncate the indexes too */
|
|
|
|
RelationTruncateIndexes(rel);
|
|
|
|
|
|
|
|
/* If there is a toast table, truncate that too */
|
|
|
|
toastrelid = rel->rd_rel->reltoastrelid;
|
|
|
|
if (OidIsValid(toastrelid))
|
|
|
|
{
|
2019-01-21 19:32:19 +01:00
|
|
|
Relation toastrel = table_open(toastrelid, AccessExclusiveLock);
|
2009-08-23 21:23:41 +02:00
|
|
|
|
2019-03-29 04:01:14 +01:00
|
|
|
table_relation_nontransactional_truncate(toastrel);
|
2009-08-23 21:23:41 +02:00
|
|
|
RelationTruncateIndexes(toastrel);
|
|
|
|
/* keep the lock... */
|
2019-01-21 19:32:19 +01:00
|
|
|
table_close(toastrel, NoLock);
|
2009-08-23 21:23:41 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2003-09-19 23:04:20 +02:00
|
|
|
/*
|
|
|
|
* heap_truncate_check_FKs
|
2005-01-27 04:19:37 +01:00
|
|
|
* Check for foreign keys referencing a list of relations that
|
2006-06-29 18:07:29 +02:00
|
|
|
* are to be truncated, and raise an error if there are any
|
2003-09-19 23:04:20 +02:00
|
|
|
*
|
|
|
|
* We disallow such FKs (except self-referential ones) since the whole point
|
|
|
|
* of TRUNCATE is to not scan the individual rows to be thrown away.
|
|
|
|
*
|
|
|
|
* This is split out so it can be shared by both implementations of truncate.
|
2005-01-27 04:19:37 +01:00
|
|
|
* Caller should already hold a suitable lock on the relations.
|
|
|
|
*
|
|
|
|
* tempTables is only used to select an appropriate error message.
|
2003-09-19 23:04:20 +02:00
|
|
|
*/
|
|
|
|
void
|
2005-01-27 04:19:37 +01:00
|
|
|
heap_truncate_check_FKs(List *relations, bool tempTables)
|
2003-09-19 23:04:20 +02:00
|
|
|
{
|
2005-01-27 04:19:37 +01:00
|
|
|
List *oids = NIL;
|
2006-06-29 18:07:29 +02:00
|
|
|
List *dependents;
|
2005-01-27 04:19:37 +01:00
|
|
|
ListCell *cell;
|
2003-09-19 23:04:20 +02:00
|
|
|
|
|
|
|
/*
|
2005-01-27 04:19:37 +01:00
|
|
|
* Build a list of OIDs of the interesting relations.
|
|
|
|
*
|
|
|
|
* If a relation has no triggers, then it can neither have FKs nor be
|
2018-07-12 18:09:08 +02:00
|
|
|
* referenced by a FK from another table, so we can ignore it. For
|
|
|
|
* partitioned tables, FKs have no triggers, so we must include them
|
|
|
|
* anyway.
|
2003-09-19 23:04:20 +02:00
|
|
|
*/
|
2005-01-27 04:19:37 +01:00
|
|
|
foreach(cell, relations)
|
|
|
|
{
|
|
|
|
Relation rel = lfirst(cell);
|
|
|
|
|
2018-07-12 18:09:08 +02:00
|
|
|
if (rel->rd_rel->relhastriggers ||
|
|
|
|
rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
|
2005-01-27 04:19:37 +01:00
|
|
|
oids = lappend_oid(oids, RelationGetRelid(rel));
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Fast path: if no relation has triggers, none has FKs either.
|
|
|
|
*/
|
|
|
|
if (oids == NIL)
|
2003-09-19 23:04:20 +02:00
|
|
|
return;
|
|
|
|
|
|
|
|
/*
|
2006-06-29 18:07:29 +02:00
|
|
|
* Otherwise, must scan pg_constraint. We make one pass with all the
|
|
|
|
* relations considered; if this finds nothing, then all is well.
|
2003-09-19 23:04:20 +02:00
|
|
|
*/
|
2006-06-29 18:07:29 +02:00
|
|
|
dependents = heap_truncate_find_FKs(oids);
|
|
|
|
if (dependents == NIL)
|
|
|
|
return;
|
2003-09-19 23:04:20 +02:00
|
|
|
|
2006-06-29 18:07:29 +02:00
|
|
|
/*
|
|
|
|
* Otherwise we repeat the scan once per relation to identify a particular
|
|
|
|
* pair of relations to complain about. This is pretty slow, but
|
|
|
|
* performance shouldn't matter much in a failure path. The reason for
|
|
|
|
* doing things this way is to ensure that the message produced is not
|
|
|
|
* dependent on chance row locations within pg_constraint.
|
|
|
|
*/
|
|
|
|
foreach(cell, oids)
|
2003-09-19 23:04:20 +02:00
|
|
|
{
|
2006-06-29 18:07:29 +02:00
|
|
|
Oid relid = lfirst_oid(cell);
|
|
|
|
ListCell *cell2;
|
2005-01-27 04:19:37 +01:00
|
|
|
|
2006-06-29 18:07:29 +02:00
|
|
|
dependents = heap_truncate_find_FKs(list_make1_oid(relid));
|
2005-01-27 04:19:37 +01:00
|
|
|
|
2006-06-29 18:07:29 +02:00
|
|
|
foreach(cell2, dependents)
|
2005-01-27 04:19:37 +01:00
|
|
|
{
|
2006-06-29 18:07:29 +02:00
|
|
|
Oid relid2 = lfirst_oid(cell2);
|
|
|
|
|
|
|
|
if (!list_member_oid(oids, relid2))
|
|
|
|
{
|
|
|
|
char *relname = get_rel_name(relid);
|
|
|
|
char *relname2 = get_rel_name(relid2);
|
|
|
|
|
|
|
|
if (tempTables)
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
|
|
|
|
errmsg("unsupported ON COMMIT and foreign key combination"),
|
|
|
|
errdetail("Table \"%s\" references \"%s\", but they do not have the same ON COMMIT setting.",
|
|
|
|
relname2, relname)));
|
|
|
|
else
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
|
|
|
|
errmsg("cannot truncate a table referenced in a foreign key constraint"),
|
|
|
|
errdetail("Table \"%s\" references \"%s\".",
|
|
|
|
relname2, relname),
|
|
|
|
errhint("Truncate table \"%s\" at the same time, "
|
|
|
|
"or use TRUNCATE ... CASCADE.",
|
|
|
|
relname2)));
|
|
|
|
}
|
2005-01-27 04:19:37 +01:00
|
|
|
}
|
2003-09-19 23:04:20 +02:00
|
|
|
}
|
|
|
|
}
|
2006-03-03 04:30:54 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* heap_truncate_find_FKs
|
2006-06-29 18:07:29 +02:00
|
|
|
* Find relations having foreign keys referencing any of the given rels
|
2006-03-03 04:30:54 +01:00
|
|
|
*
|
2006-06-29 18:07:29 +02:00
|
|
|
* Input and result are both lists of relation OIDs. The result contains
|
|
|
|
* no duplicates, does *not* include any rels that were already in the input
|
|
|
|
* list, and is sorted in OID order. (The last property is enforced mainly
|
|
|
|
* to guarantee consistent behavior in the regression tests; we don't want
|
|
|
|
* behavior to change depending on chance locations of rows in pg_constraint.)
|
2006-03-03 04:30:54 +01:00
|
|
|
*
|
2006-06-29 18:07:29 +02:00
|
|
|
* Note: caller should already have appropriate lock on all rels mentioned
|
2006-03-03 04:30:54 +01:00
|
|
|
* in relationIds. Since adding or dropping an FK requires exclusive lock
|
|
|
|
* on both rels, this ensures that the answer will be stable.
|
|
|
|
*/
|
|
|
|
List *
|
|
|
|
heap_truncate_find_FKs(List *relationIds)
|
|
|
|
{
|
|
|
|
List *result = NIL;
|
2020-09-04 04:57:35 +02:00
|
|
|
List *oids;
|
2020-02-07 21:09:36 +01:00
|
|
|
List *parent_cons;
|
|
|
|
ListCell *cell;
|
|
|
|
ScanKeyData key;
|
2006-03-03 04:30:54 +01:00
|
|
|
Relation fkeyRel;
|
|
|
|
SysScanDesc fkeyScan;
|
|
|
|
HeapTuple tuple;
|
2020-02-07 21:09:36 +01:00
|
|
|
bool restart;
|
|
|
|
|
|
|
|
oids = list_copy(relationIds);
|
2006-03-03 04:30:54 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Must scan pg_constraint. Right now, it is a seqscan because there is
|
|
|
|
* no available index on confrelid.
|
|
|
|
*/
|
2019-01-21 19:32:19 +01:00
|
|
|
fkeyRel = table_open(ConstraintRelationId, AccessShareLock);
|
2006-03-03 04:30:54 +01:00
|
|
|
|
2020-02-07 21:09:36 +01:00
|
|
|
restart:
|
|
|
|
restart = false;
|
|
|
|
parent_cons = NIL;
|
|
|
|
|
2006-03-03 04:30:54 +01:00
|
|
|
fkeyScan = systable_beginscan(fkeyRel, InvalidOid, false,
|
Use an MVCC snapshot, rather than SnapshotNow, for catalog scans.
2013-07-02 15:47:01 +02:00
|
|
|
NULL, 0, NULL);
|
2006-03-03 04:30:54 +01:00
|
|
|
|
|
|
|
while (HeapTupleIsValid(tuple = systable_getnext(fkeyScan)))
|
|
|
|
{
|
|
|
|
Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
|
|
|
|
|
|
|
|
/* Not a foreign key */
|
|
|
|
if (con->contype != CONSTRAINT_FOREIGN)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
/* Not referencing one of our list of tables */
|
2020-02-07 21:09:36 +01:00
|
|
|
if (!list_member_oid(oids, con->confrelid))
|
2006-03-03 04:30:54 +01:00
|
|
|
continue;
|
|
|
|
|
2020-02-07 21:09:36 +01:00
|
|
|
/*
|
|
|
|
* If this constraint has a parent constraint which we have not seen
|
|
|
|
* yet, keep track of it for the second loop, below. Tracking parent
|
2021-10-04 15:12:57 +02:00
|
|
|
* constraints allows us to climb up to the top-level constraint and
|
2020-02-07 21:09:36 +01:00
|
|
|
* look for all possible relations referencing the partitioned table.
|
|
|
|
*/
|
|
|
|
if (OidIsValid(con->conparentid) &&
|
|
|
|
!list_member_oid(parent_cons, con->conparentid))
|
|
|
|
parent_cons = lappend_oid(parent_cons, con->conparentid);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Add referencer to result, unless present in input list. (Don't
|
|
|
|
* worry about dupes: we'll fix that below).
|
|
|
|
*/
|
2006-03-03 04:30:54 +01:00
|
|
|
if (!list_member_oid(relationIds, con->conrelid))
|
Clean up some ad-hoc code for sorting and de-duplicating Lists.
heap.c and relcache.c contained nearly identical copies of logic
to insert OIDs into an OID list while preserving the list's OID
ordering (and rejecting duplicates, in one case but not the other).
The comments argue that this is faster than qsort for small numbers
of OIDs, which is at best unproven, and seems even less likely to be
true now that lappend_cell_oid has to move data around. In any case
it's ugly and hard-to-follow code, and if we do have a lot of OIDs
to consider, it's O(N^2).
Hence, replace with simply lappend'ing OIDs to a List, then list_sort
the completed List, then remove adjacent duplicates if necessary.
This is demonstrably O(N log N) and it's much simpler for the
callers. It's possible that this would be somewhat inefficient
if there were a very large number of duplicates, but that seems
unlikely in the existing usage.
This adds list_deduplicate_oid and list_oid_cmp infrastructure
to list.c. I didn't bother with equivalent functionality for
integer or pointer Lists, but such could always be added later
if we find a use for it.
Discussion: https://postgr.es/m/26193.1563228600@sss.pgh.pa.us
2019-07-16 18:04:06 +02:00
|
|
|
result = lappend_oid(result, con->conrelid);
|
2006-03-03 04:30:54 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
systable_endscan(fkeyScan);
|
2020-02-07 21:09:36 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Process each parent constraint we found, adding the relations they
|
|
|
|
* reference to the oids list. If we do add any new such
|
|
|
|
* relations, redo the first loop above. Also, if we see that the parent
|
|
|
|
* constraint in turn has a parent, add that so that we process all
|
|
|
|
* relations in a single additional pass.
|
|
|
|
*/
|
|
|
|
foreach(cell, parent_cons)
|
|
|
|
{
|
|
|
|
Oid parent = lfirst_oid(cell);
|
|
|
|
|
|
|
|
ScanKeyInit(&key,
|
|
|
|
Anum_pg_constraint_oid,
|
|
|
|
BTEqualStrategyNumber, F_OIDEQ,
|
|
|
|
ObjectIdGetDatum(parent));
|
|
|
|
|
|
|
|
fkeyScan = systable_beginscan(fkeyRel, ConstraintOidIndexId,
|
|
|
|
true, NULL, 1, &key);
|
|
|
|
|
|
|
|
tuple = systable_getnext(fkeyScan);
|
|
|
|
if (HeapTupleIsValid(tuple))
|
|
|
|
{
|
|
|
|
Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* pg_constraint rows always appear for partitioned hierarchies
|
|
|
|
* this way: on each side of the constraint, one row appears
|
|
|
|
* for each partition that points to the top-most table on the
|
|
|
|
* other side.
|
|
|
|
*
|
|
|
|
* Because of this arrangement, we can correctly catch all
|
|
|
|
* relevant relations by adding to 'parent_cons' all rows with
|
|
|
|
* valid conparentid, and to the 'oids' list all rows with a zero
|
|
|
|
* conparentid. If any oids are added to 'oids', redo the first
|
|
|
|
* loop above by setting 'restart'.
|
|
|
|
*/
|
|
|
|
if (OidIsValid(con->conparentid))
|
|
|
|
parent_cons = list_append_unique_oid(parent_cons,
|
|
|
|
con->conparentid);
|
|
|
|
else if (!list_member_oid(oids, con->confrelid))
|
|
|
|
{
|
|
|
|
oids = lappend_oid(oids, con->confrelid);
|
|
|
|
restart = true;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
systable_endscan(fkeyScan);
|
|
|
|
}
|
|
|
|
|
|
|
|
list_free(parent_cons);
|
|
|
|
if (restart)
|
|
|
|
goto restart;
|
|
|
|
|
2019-01-21 19:32:19 +01:00
|
|
|
table_close(fkeyRel, AccessShareLock);
|
2020-02-07 21:09:36 +01:00
|
|
|
list_free(oids);
|
2006-03-03 04:30:54 +01:00
|
|
|
|
Clean up some ad-hoc code for sorting and de-duplicating Lists.
2019-07-16 18:04:06 +02:00
|
|
|
/* Now sort and de-duplicate the result list */
|
|
|
|
list_sort(result, list_oid_cmp);
|
|
|
|
list_deduplicate_oid(result);
|
2006-06-29 18:07:29 +02:00
|
|
|
|
Clean up some ad-hoc code for sorting and de-duplicating Lists.
2019-07-16 18:04:06 +02:00
|
|
|
return result;
|
2006-06-29 18:07:29 +02:00
|
|
|
}
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* StorePartitionKey
|
|
|
|
* Store information about the partition key rel into the catalog
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
StorePartitionKey(Relation rel,
|
|
|
|
char strategy,
|
|
|
|
int16 partnatts,
|
|
|
|
AttrNumber *partattrs,
|
|
|
|
List *partexprs,
|
|
|
|
Oid *partopclass,
|
|
|
|
Oid *partcollation)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
int2vector *partattrs_vec;
|
|
|
|
oidvector *partopclass_vec;
|
|
|
|
oidvector *partcollation_vec;
|
|
|
|
Datum partexprDatum;
|
|
|
|
Relation pg_partitioned_table;
|
|
|
|
HeapTuple tuple;
|
|
|
|
Datum values[Natts_pg_partitioned_table];
|
2022-07-16 08:42:15 +02:00
|
|
|
bool nulls[Natts_pg_partitioned_table] = {0};
|
2017-01-24 16:20:02 +01:00
|
|
|
ObjectAddress myself;
|
|
|
|
ObjectAddress referenced;
|
2020-09-05 14:33:53 +02:00
|
|
|
ObjectAddresses *addrs;
|
Implement table partitioning.
2016-12-07 19:17:43 +01:00
|
|
|
|
|
|
|
Assert(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
|
|
|
|
|
|
|
|
/* Copy the partition attribute numbers, opclass OIDs into arrays */
|
|
|
|
partattrs_vec = buildint2vector(partattrs, partnatts);
|
|
|
|
partopclass_vec = buildoidvector(partopclass, partnatts);
|
|
|
|
partcollation_vec = buildoidvector(partcollation, partnatts);
|
|
|
|
|
|
|
|
/* Convert the expressions (if any) to a text datum */
|
|
|
|
if (partexprs)
|
|
|
|
{
|
2017-01-24 16:20:02 +01:00
|
|
|
char *exprString;
|
Implement table partitioning.
2016-12-07 19:17:43 +01:00
|
|
|
|
|
|
|
		exprString = nodeToString(partexprs);
		partexprDatum = CStringGetTextDatum(exprString);
		pfree(exprString);
	}
	else
		partexprDatum = (Datum) 0;

	pg_partitioned_table = table_open(PartitionedRelationId, RowExclusiveLock);

	/* Only this can ever be NULL */
	if (!partexprDatum)
		nulls[Anum_pg_partitioned_table_partexprs - 1] = true;

	values[Anum_pg_partitioned_table_partrelid - 1] = ObjectIdGetDatum(RelationGetRelid(rel));
	values[Anum_pg_partitioned_table_partstrat - 1] = CharGetDatum(strategy);
	values[Anum_pg_partitioned_table_partnatts - 1] = Int16GetDatum(partnatts);

Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
	values[Anum_pg_partitioned_table_partdefid - 1] = ObjectIdGetDatum(InvalidOid);
	values[Anum_pg_partitioned_table_partattrs - 1] = PointerGetDatum(partattrs_vec);
	values[Anum_pg_partitioned_table_partclass - 1] = PointerGetDatum(partopclass_vec);
	values[Anum_pg_partitioned_table_partcollation - 1] = PointerGetDatum(partcollation_vec);
	values[Anum_pg_partitioned_table_partexprs - 1] = partexprDatum;

	tuple = heap_form_tuple(RelationGetDescr(pg_partitioned_table), values, nulls);

	CatalogTupleInsert(pg_partitioned_table, tuple);
	table_close(pg_partitioned_table, RowExclusiveLock);

	/* Mark this relation as dependent on a few things as follows */
	addrs = new_object_addresses();
	ObjectAddressSet(myself, RelationRelationId, RelationGetRelid(rel));

	/* Operator class and collation per key column */
	for (i = 0; i < partnatts; i++)
	{
		ObjectAddressSet(referenced, OperatorClassRelationId, partopclass[i]);
		add_exact_object_address(&referenced, addrs);

		/* The default collation is pinned, so don't bother recording it */
		if (OidIsValid(partcollation[i]) &&
			partcollation[i] != DEFAULT_COLLATION_OID)
		{
			ObjectAddressSet(referenced, CollationRelationId, partcollation[i]);
			add_exact_object_address(&referenced, addrs);
		}
	}

	record_object_address_dependencies(&myself, addrs, DEPENDENCY_NORMAL);
	free_object_addresses(addrs);

	/*
	 * The partitioning columns are made internally dependent on the table,
	 * because we cannot drop any of them without dropping the whole table.
	 * (ATExecDropColumn independently enforces that, but it's not bulletproof
	 * so we need the dependencies too.)
	 */
	for (i = 0; i < partnatts; i++)
	{
		if (partattrs[i] == 0)
			continue;			/* ignore expressions here */

		ObjectAddressSubSet(referenced, RelationRelationId,
							RelationGetRelid(rel), partattrs[i]);
		recordDependencyOn(&referenced, &myself, DEPENDENCY_INTERNAL);
	}

	/*
	 * Also consider anything mentioned in partition expressions.  External
	 * references (e.g. functions) get NORMAL dependencies.  Table columns
	 * mentioned in the expressions are handled the same as plain partitioning
	 * columns, i.e. they become internally dependent on the whole table.
	 */
	if (partexprs)
		recordDependencyOnSingleRelExpr(&myself,
										(Node *) partexprs,
										RelationGetRelid(rel),
										DEPENDENCY_NORMAL,
										DEPENDENCY_INTERNAL,
										true /* reverse the self-deps */ );

	/*
	 * We must invalidate the relcache so that the next
	 * CommandCounterIncrement() will cause the same to be rebuilt using the
	 * information in the just-created catalog entry.
	 */
	CacheInvalidateRelcache(rel);
}

/*
 * RemovePartitionKeyByRelId
 *		Remove pg_partitioned_table entry for a relation
 */
void
RemovePartitionKeyByRelId(Oid relid)
{
	Relation	rel;
	HeapTuple	tuple;

	rel = table_open(PartitionedRelationId, RowExclusiveLock);

	tuple = SearchSysCache1(PARTRELID, ObjectIdGetDatum(relid));
	if (!HeapTupleIsValid(tuple))
		elog(ERROR, "cache lookup failed for partition key of relation %u",
			 relid);

	CatalogTupleDelete(rel, &tuple->t_self);

	ReleaseSysCache(tuple);
	table_close(rel, RowExclusiveLock);
}

/*
 * StorePartitionBound
 *		Update pg_class tuple of rel to store the partition bound and set
 *		relispartition to true
 *
 * If this is the default partition, also update the default partition OID in
 * pg_partitioned_table.
 *
 * Also, invalidate the parent's relcache, so that the next rebuild will load
 * the new partition's info into its partition descriptor.  If there is a
 * default partition, we must invalidate its relcache entry as well.
 */
void
Code review focused on new node types added by partitioning support.
Fix failure to check that we got a plain Const from const-simplification of
a coercion request. This is the cause of bug #14666 from Tian Bing: there
is an int4 to money cast, but it's only stable not immutable (because of
dependence on lc_monetary), resulting in a FuncExpr that the code was
miserably unequipped to deal with, or indeed even to notice that it was
failing to deal with. Add test cases around this coercion behavior.
In view of the above, sprinkle the code liberally with castNode() macros,
in hope of catching the next such bug a bit sooner. Also, change some
functions that were randomly declared to take Node* to take more specific
pointer types. And change some struct fields that were declared Node*
but could be given more specific types, allowing removal of assorted
explicit casts.
Place PARTITION_MAX_KEYS check a bit closer to the code it's protecting.
Likewise check only-one-key-for-list-partitioning restriction in a less
random place.
Avoid not-per-project-style usages like !strcmp(...).
Fix assorted failures to avoid scribbling on the input of parse
transformation. I'm not sure how necessary this is, but it's entirely
silly for these functions to be expending cycles to avoid that and not
getting it right.
Add guards against partitioning on system columns.
Put backend/nodes/ support code into an order that matches handling
of these node types elsewhere.
Annotate the fact that somebody added location fields to PartitionBoundSpec
and PartitionRangeDatum but forgot to handle them in
outfuncs.c/readfuncs.c. This is fairly harmless for production purposes
(since readfuncs.c would just substitute -1 anyway) but it's still bogus.
It's not worth forcing a post-beta1 initdb just to fix this, but if we
have another reason to force initdb before 10.0, we should go back and
clean this up.
Contrariwise, somebody added location fields to PartitionElem and
PartitionSpec but forgot to teach exprLocation() about them.
Consolidate duplicative code in transformPartitionBound().
Improve a couple of error messages.
Improve assorted commentary.
Re-pgindent the files touched by this patch; this affects a few comment
blocks that must have been added quite recently.
Report: https://postgr.es/m/20170524024550.29935.14396@wrigleys.postgresql.org
2017-05-29 05:20:28 +02:00
|
|
|
StorePartitionBound(Relation rel, Relation parent, PartitionBoundSpec *bound)
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
{
|
|
|
|
Relation classRel;
|
|
|
|
HeapTuple tuple,
|
|
|
|
newtuple;
|
2017-01-24 16:20:02 +01:00
|
|
|
Datum new_val[Natts_pg_class];
|
|
|
|
bool new_null[Natts_pg_class],
|
|
|
|
new_repl[Natts_pg_class];
|
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
|
|
|
Oid defaultPartOid;
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
|
|
|
|
/* Update pg_class tuple */
|
2019-01-21 19:32:19 +01:00
|
|
|
classRel = table_open(RelationRelationId, RowExclusiveLock);
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
tuple = SearchSysCacheCopy1(RELOID,
|
|
|
|
ObjectIdGetDatum(RelationGetRelid(rel)));
|
2017-01-04 21:59:00 +01:00
|
|
|
if (!HeapTupleIsValid(tuple))
|
|
|
|
elog(ERROR, "cache lookup failed for relation %u",
|
|
|
|
RelationGetRelid(rel));
|
|
|
|
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
#ifdef USE_ASSERT_CHECKING
|
|
|
|
{
|
2017-01-24 16:20:02 +01:00
|
|
|
Form_pg_class classForm;
|
|
|
|
bool isnull;
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
|
|
|
|
classForm = (Form_pg_class) GETSTRUCT(tuple);
|
|
|
|
Assert(!classForm->relispartition);
|
|
|
|
(void) SysCacheGetAttr(RELOID, tuple, Anum_pg_class_relpartbound,
|
|
|
|
&isnull);
|
|
|
|
Assert(isnull);
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
|
|
|
/* Fill in relpartbound value */
|
|
|
|
memset(new_val, 0, sizeof(new_val));
|
|
|
|
memset(new_null, false, sizeof(new_null));
|
|
|
|
memset(new_repl, false, sizeof(new_repl));
|
|
|
|
new_val[Anum_pg_class_relpartbound - 1] = CStringGetTextDatum(nodeToString(bound));
|
|
|
|
new_null[Anum_pg_class_relpartbound - 1] = false;
|
|
|
|
new_repl[Anum_pg_class_relpartbound - 1] = true;
|
|
|
|
newtuple = heap_modify_tuple(tuple, RelationGetDescr(classRel),
|
|
|
|
new_val, new_null, new_repl);
|
|
|
|
/* Also set the flag */
|
|
|
|
((Form_pg_class) GETSTRUCT(newtuple))->relispartition = true;
|
2017-01-31 22:42:24 +01:00
|
|
|
CatalogTupleUpdate(classRel, &newtuple->t_self, newtuple);
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
heap_freetuple(newtuple);
|
2019-01-21 19:32:19 +01:00
|
|
|
table_close(classRel, RowExclusiveLock);
|
2016-12-20 04:53:30 +01:00
|
|
|
|
2018-03-21 16:03:35 +01:00
|
|
|
/*
|
|
|
|
* If we're storing bounds for the default partition, update
|
|
|
|
* pg_partitioned_table too.
|
|
|
|
*/
|
|
|
|
if (bound->is_default)
|
|
|
|
update_default_partition_oid(RelationGetRelid(parent),
|
|
|
|
RelationGetRelid(rel));
|
|
|
|
|
|
|
|
/* Make these updates visible */
|
2018-03-20 15:19:41 +01:00
|
|
|
CommandCounterIncrement();
|
|
|
|
|
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
|
|
|
/*
|
|
|
|
* The partition constraint for the default partition depends on the
|
|
|
|
* partition bounds of every other partition, so we must invalidate the
|
|
|
|
* relcache entry for that partition every time a partition is added or
|
|
|
|
* removed.
|
|
|
|
*/
|
ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY
Allow a partition be detached from its partitioned table without
blocking concurrent queries, by running in two transactions and only
requiring ShareUpdateExclusive in the partitioned table.
Because it runs in two transactions, it cannot be used in a transaction
block. This is the main reason to use dedicated syntax: so that users
can choose to use the original mode if they need it. But also, it
doesn't work when a default partition exists (because an exclusive lock
would still need to be obtained on it, in order to change its partition
constraint.)
In case the second transaction is cancelled or a crash occurs, there's
ALTER TABLE .. DETACH PARTITION .. FINALIZE, which executes the final
steps.
The main trick to make this work is the addition of column
pg_inherits.inhdetachpending, initially false; can only be set true in
the first part of this command. Once that is committed, concurrent
transactions that use a PartitionDirectory will include or ignore
partitions so marked: in optimizer they are ignored if the row is marked
committed for the snapshot; in executor they are always included. As a
result, and because of the way PartitionDirectory caches partition
descriptors, queries that were planned before the detach will see the
rows in the detached partition and queries that are planned after the
detach, won't.
A CHECK constraint is created that duplicates the partition constraint.
This is probably not strictly necessary, and some users will prefer to
remove it afterwards, but if the partition is re-attached to a
partitioned table, the constraint needn't be rechecked.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200803234854.GA24158@alvherre.pgsql
2021-03-25 22:00:28 +01:00
|
|
|
defaultPartOid =
|
Fix relcache inconsistency hazard in partition detach
During queries coming from ri_triggers.c, we need to omit partitions
that are marked pending detach -- otherwise, the RI query is tricked
into allowing a row into the referencing table whose corresponding row
is in the detached partition. Which is bogus: once the detach operation
completes, the row becomes an orphan.
However, the code was not doing that in repeatable-read transactions,
because relcache kept a copy of the partition descriptor that included
the partition, and used it in the RI query. This commit changes the
partdesc cache code to only keep descriptors that aren't dependent on
a snapshot (namely: those where no detached partition exist, and those
where detached partitions are included). When a partdesc-without-
detached-partitions is requested, we create one afresh each time; also,
those partdescs are stored in PortalContext instead of
CacheMemoryContext.
find_inheritance_children gets a new output *detached_exist boolean,
which indicates whether any partition marked pending-detach is found.
Its "include_detached" input flag is changed to "omit_detached", because
that name captures desired the semantics more naturally.
CreatePartitionDirectory() and RelationGetPartitionDesc() arguments are
identically renamed.
This was noticed because a buildfarm member that runs with relcache
clobbering, which would not keep the improperly cached partdesc, broke
one test, which led us to realize that the expected output of that test
was bogus. This commit also corrects that expected output.
Author: Amit Langote <amitlangote09@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/3269784.1617215412@sss.pgh.pa.us
2021-04-22 21:13:25 +02:00
|
|
|
get_default_oid_from_partdesc(RelationGetPartitionDesc(parent, true));
|
Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.
Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.
Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 23:28:04 +02:00
|
|
|
if (OidIsValid(defaultPartOid))
|
|
|
|
CacheInvalidateRelcacheByRelid(defaultPartOid);
|
|
|
|
|
2016-12-20 04:53:30 +01:00
|
|
|
CacheInvalidateRelcache(parent);
|
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-07 19:17:43 +01:00
|
|
|
}
|