From 49b83f6474349ecc756348f422003f447a5058bf Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Wed, 11 Dec 2019 15:53:36 -0500 Subject: [PATCH] Doc: back-patch documentation about limitations of CHECK constraints. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Back-patch commits 36d442a25 and 1f66c657f into all supported branches. I'd considered doing this when putting in the latter commit, but failed to pull the trigger. Now that we've had an actual field complaint about the lack of such docs, let's do it. Per bug #16158 from Piotr Jander. Original patches by Lætitia Avrot, Patrick Francelle, and me. Discussion: https://postgr.es/m/16158-7ccf2f74b3d655db@postgresql.org --- doc/src/sgml/ddl.sgml | 53 +++++++++++++++++ doc/src/sgml/ref/alter_domain.sgml | 21 ++++++- doc/src/sgml/ref/alter_table.sgml | 91 ++++++++++++++++++----------- doc/src/sgml/ref/create_domain.sgml | 24 ++++++++ doc/src/sgml/ref/create_table.sgml | 3 +- 5 files changed, 155 insertions(+), 37 deletions(-) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 7dbdca6074..06d6e48a35 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -403,6 +403,59 @@ CREATE TABLE products ( ensure that a column does not contain null values, the not-null constraint described in the next section can be used. + + + + PostgreSQL does not support + CHECK constraints that reference table data other than + the new or updated row being checked. While a CHECK + constraint that violates this rule may appear to work in simple + tests, it cannot guarantee that the database will not reach a state + in which the constraint condition is false (due to subsequent changes + of the other row(s) involved). This would cause a database dump and + reload to fail. The reload could fail even when the complete + database state is consistent with the constraint, due to rows not + being loaded in an order that will satisfy the constraint. If + possible, use UNIQUE, EXCLUDE, + or FOREIGN KEY constraints to express + cross-row and cross-table restrictions. + + + + If what you desire is a one-time check against other rows at row + insertion, rather than a continuously-maintained consistency + guarantee, a custom trigger can be used + to implement that. (This approach avoids the dump/reload problem because + pg_dump does not reinstall triggers until after + reloading data, so that the check will not be enforced during a + dump/reload.) + + + + + + PostgreSQL assumes that + CHECK constraints' conditions are immutable, that + is, they will always give the same result for the same input row. + This assumption is what justifies examining CHECK + constraints only when rows are inserted or updated, and not at other + times. (The warning above about not referencing other table data is + really a special case of this restriction.) + + + + An example of a common way to break this assumption is to reference a + user-defined function in a CHECK expression, and + then change the behavior of that + function. PostgreSQL does not disallow + that, but it will not notice if there are rows in the table that now + violate the CHECK constraint. That would cause a + subsequent database dump and reload to fail. + The recommended way to handle such a change is to drop the constraint + (using ALTER TABLE), adjust the function definition, + and re-add the constraint, thereby rechecking it against all table rows. + + diff --git a/doc/src/sgml/ref/alter_domain.sgml b/doc/src/sgml/ref/alter_domain.sgml index 85253e209b..8201cbb65f 100644 --- a/doc/src/sgml/ref/alter_domain.sgml +++ b/doc/src/sgml/ref/alter_domain.sgml @@ -118,8 +118,8 @@ ALTER DOMAIN name This form validates a constraint previously added as - NOT VALID, that is, verify that all data in columns using the - domain satisfy the specified constraint. + NOT VALID, that is, it verifies that all values in + table columns of the domain type satisfy the specified constraint. @@ -202,7 +202,7 @@ ALTER DOMAIN name NOT VALID - Do not verify existing column data for constraint validity. + Do not verify existing stored data for constraint validity. @@ -272,6 +272,21 @@ ALTER DOMAIN name Notes + + Although ALTER DOMAIN ADD CONSTRAINT attempts to verify + that existing stored data satisfies the new constraint, this check is not + bulletproof, because the command cannot see table rows that + are newly inserted or updated and not yet committed. If there is a hazard + that concurrent operations might insert bad data, the way to proceed is to + add the constraint using the NOT VALID option, commit + that command, wait until all transactions started before that commit have + finished, and then issue ALTER DOMAIN VALIDATE + CONSTRAINT to search for data violating the constraint. This + method is reliable because once the constraint is committed, all new + transactions are guaranteed to enforce it against new values of the domain + type. + + Currently, ALTER DOMAIN ADD CONSTRAINT, ALTER DOMAIN VALIDATE CONSTRAINT, and ALTER DOMAIN SET NOT diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index c2f25cc7b6..d63490f310 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -359,27 +359,36 @@ WITH ( MODULUS numeric_literal, REM , plus the option NOT VALID, which is currently only allowed for foreign key and CHECK constraints. - If the constraint is marked NOT VALID, the - potentially-lengthy initial check to verify that all rows in the table - satisfy the constraint is skipped. The constraint will still be - enforced against subsequent inserts or updates (that is, they'll fail - unless there is a matching row in the referenced table, in the case - of foreign keys; and they'll fail unless the new row matches the - specified check constraints). But the - database will not assume that the constraint holds for all rows in - the table, until it is validated by using the VALIDATE - CONSTRAINT option. Foreign key constraints on partitioned - tables may not be declared NOT VALID at present. - The addition of a foreign key constraint requires a - SHARE ROW EXCLUSIVE lock on the referenced table. + Normally, this form will cause a scan of the table to verify that all + existing rows in the table satisfy the new constraint. But if + the NOT VALID option is used, this + potentially-lengthy scan is skipped. The constraint will still be + enforced against subsequent inserts or updates (that is, they'll fail + unless there is a matching row in the referenced table, in the case + of foreign keys, or they'll fail unless the new row matches the + specified check condition). But the + database will not assume that the constraint holds for all rows in + the table, until it is validated by using the VALIDATE + CONSTRAINT option. + See below for more information + about using the NOT VALID option. + + + + Addition of a foreign key constraint requires a + SHARE ROW EXCLUSIVE lock on the referenced table, + in addition to the lock on the table receiving the constraint. Additional restrictions apply when unique or primary key constraints - are added to partitioned tables; see . + are added to partitioned tables; see . + Also, foreign key constraints on partitioned + tables may not be declared NOT VALID at present. @@ -454,23 +463,13 @@ WITH ( MODULUS numeric_literal, REM VALIDATE CONSTRAINT - This form validates a foreign key or check constraint that was previously created - as NOT VALID, by scanning the table to ensure there - are no rows for which the constraint is not satisfied. - Nothing happens if the constraint is already marked valid. - - - Validation can be a long process on larger tables. The value of separating - validation from initial creation is that you can defer validation to less - busy times, or can be used to give additional time to correct pre-existing - errors while preventing new errors. Note also that validation on its own - does not prevent normal write commands against the table while it runs. - - - Validation acquires only a SHARE UPDATE EXCLUSIVE lock - on the table being altered. If the constraint is a foreign key then - a ROW SHARE lock is also required on - the table referenced by the constraint. + This form validates a foreign key or check constraint that was + previously created as NOT VALID, by scanning the + table to ensure there are no rows for which the constraint is not + satisfied. Nothing happens if the constraint is already marked valid. + (See below for an explanation of the + usefulness of this command.) @@ -1190,8 +1189,8 @@ WITH ( MODULUS numeric_literal, REM - - Notes + + Notes The key word COLUMN is noise and can be omitted. @@ -1237,6 +1236,32 @@ WITH ( MODULUS numeric_literal, REM rewrites can thereby be combined into a single pass over the table. + + Scanning a large table to verify a new foreign key or check constraint + can take a long time, and other updates to the table are locked out + until the ALTER TABLE ADD CONSTRAINT command is + committed. The main purpose of the NOT VALID + constraint option is to reduce the impact of adding a constraint on + concurrent updates. With NOT VALID, + the ADD CONSTRAINT command does not scan the table + and can be committed immediately. After that, a VALIDATE + CONSTRAINT command can be issued to verify that existing rows + satisfy the constraint. The validation step does not need to lock out + concurrent updates, since it knows that other transactions will be + enforcing the constraint for rows that they insert or update; only + pre-existing rows need to be checked. Hence, validation acquires only + a SHARE UPDATE EXCLUSIVE lock on the table being + altered. (If the constraint is a foreign key then a ROW + SHARE lock is also required on the table referenced by the + constraint.) In addition to improving concurrency, it can be useful to + use NOT VALID and VALIDATE + CONSTRAINT in cases where the table is known to contain + pre-existing violations. Once the constraint is in place, no new + violations can be inserted, and the existing problems can be corrected + at leisure until VALIDATE CONSTRAINT finally + succeeds. + + The DROP COLUMN form does not physically remove the column, but simply makes it invisible to SQL operations. Subsequent diff --git a/doc/src/sgml/ref/create_domain.sgml b/doc/src/sgml/ref/create_domain.sgml index 49d5304330..81a8924926 100644 --- a/doc/src/sgml/ref/create_domain.sgml +++ b/doc/src/sgml/ref/create_domain.sgml @@ -214,6 +214,30 @@ INSERT INTO tab (domcol) VALUES ((SELECT domcol FROM tab WHERE false)); and then to apply column NOT NULL constraints to columns of the domain type as needed, rather than directly to the domain type. + + + PostgreSQL assumes that + CHECK constraints' conditions are immutable, that is, + they will always give the same result for the same input value. This + assumption is what justifies examining CHECK + constraints only when a value is first converted to be of a domain type, + and not at other times. (This is essentially the same as the treatment + of table CHECK constraints, as described in + .) + + + + An example of a common way to break this assumption is to reference a + user-defined function in a CHECK expression, and then + change the behavior of that + function. PostgreSQL does not disallow that, + but it will not notice if there are stored values of the domain type that + now violate the CHECK constraint. That would cause a + subsequent database dump and reload to fail. The recommended way to + handle such a change is to drop the constraint (using ALTER + DOMAIN), adjust the function definition, and re-add the + constraint, thereby rechecking it against stored data. + diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml index 7cadd1a6ae..4f27daf1ae 100644 --- a/doc/src/sgml/ref/create_table.sgml +++ b/doc/src/sgml/ref/create_table.sgml @@ -698,7 +698,8 @@ WITH ( MODULUS numeric_literal, REM Currently, CHECK expressions cannot contain subqueries nor refer to variables other than columns of the - current row. The system column tableoid + current row (see ). + The system column tableoid may be referenced, but not any other system column.