Doc: document that we expect CHECK constraint conditions to be immutable.

This restriction is implicit in the check-only-once implementation we use
for table and domain constraints, but it wasn't spelled out anywhere, nor
was there any advice about how to alter a constraint's behavior safely.
Improve that.

I was also dissatisfied with the documentation of ALTER DOMAIN VALIDATE
CONSTRAINT, which entirely failed to explain the use of that feature; and
thence decided that ALTER TABLE VALIDATE CONSTRAINT could be documented
better as well.

Perhaps we should back-patch this, along with the related commit 36d442a25,
but for now I refrained.

Discussion: https://postgr.es/m/12539.1544107316@sss.pgh.pa.us
This commit is contained in:
Tom Lane 2018-12-07 16:40:58 -05:00
parent 1464755fc4
commit 1f66c657f2
4 changed files with 127 additions and 37 deletions

View File

@ -427,7 +427,33 @@ CREATE TABLE products (
guarantee, a custom <link linkend="triggers">trigger</link> can be used
to implement that. (This approach avoids the dump/reload problem because
<application>pg_dump</application> does not reinstall triggers until after
reloading data, so that the check will not be enforced during a dump/reload.)
reloading data, so that the check will not be enforced during a
dump/reload.)
</para>
</note>
<note>
<para>
<productname>PostgreSQL</productname> assumes that
<literal>CHECK</literal> constraints' conditions are immutable, that
is, they will always give the same result for the same input row.
This assumption is what justifies examining <literal>CHECK</literal>
constraints only when rows are inserted or updated, and not at other
times. (The warning above about not referencing other table data is
really a special case of this restriction.)
</para>
<para>
An example of a common way to break this assumption is to reference a
user-defined function in a <literal>CHECK</literal> expression, and
then change the behavior of that
function. <productname>PostgreSQL</productname> does not disallow
that, but it will not notice if there are rows in the table that now
violate the <literal>CHECK</literal> constraint. That would cause a
subsequent database dump and reload to fail.
The recommended way to handle such a change is to drop the constraint
(using <command>ALTER TABLE</command>), adjust the function definition,
and re-add the constraint, thereby rechecking it against all table rows.
</para>
</note>
</sect2>

View File

@ -118,8 +118,8 @@ ALTER DOMAIN <replaceable class="parameter">name</replaceable>
<listitem>
<para>
This form validates a constraint previously added as
<literal>NOT VALID</literal>, that is, verify that all data in columns using the
domain satisfy the specified constraint.
<literal>NOT VALID</literal>, that is, it verifies that all values in
table columns of the domain type satisfy the specified constraint.
</para>
</listitem>
</varlistentry>
@ -202,7 +202,7 @@ ALTER DOMAIN <replaceable class="parameter">name</replaceable>
<term><literal>NOT VALID</literal></term>
<listitem>
<para>
Do not verify existing column data for constraint validity.
Do not verify existing stored data for constraint validity.
</para>
</listitem>
</varlistentry>
@ -272,6 +272,21 @@ ALTER DOMAIN <replaceable class="parameter">name</replaceable>
<refsect1>
<title>Notes</title>
<para>
Although <command>ALTER DOMAIN ADD CONSTRAINT</command> attempts to verify
that existing stored data satisfies the new constraint, this check is not
bulletproof, because the command cannot <quote>see</quote> table rows that
are newly inserted or updated and not yet committed. If there is a hazard
that concurrent operations might insert bad data, the way to proceed is to
add the constraint using the <literal>NOT VALID</literal> option, commit
that command, wait until all transactions started before that commit have
finished, and then issue <command>ALTER DOMAIN VALIDATE
CONSTRAINT</command> to search for data violating the constraint. This
method is reliable because once the constraint is committed, all new
transactions are guaranteed to enforce it against new values of the domain
type.
</para>
<para>
Currently, <command>ALTER DOMAIN ADD CONSTRAINT</command>, <command>ALTER
DOMAIN VALIDATE CONSTRAINT</command>, and <command>ALTER DOMAIN SET NOT

View File

@ -358,27 +358,36 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
<xref linkend="sql-createtable"/>, plus the option <literal>NOT
VALID</literal>, which is currently only allowed for foreign key
and CHECK constraints.
If the constraint is marked <literal>NOT VALID</literal>, the
potentially-lengthy initial check to verify that all rows in the table
satisfy the constraint is skipped. The constraint will still be
enforced against subsequent inserts or updates (that is, they'll fail
unless there is a matching row in the referenced table, in the case
of foreign keys; and they'll fail unless the new row matches the
specified check constraints). But the
database will not assume that the constraint holds for all rows in
the table, until it is validated by using the <literal>VALIDATE
CONSTRAINT</literal> option. Foreign key constraints on partitioned
tables may not be declared <literal>NOT VALID</literal> at present.
</para>
<para>
The addition of a foreign key constraint requires a
<literal>SHARE ROW EXCLUSIVE</literal> lock on the referenced table.
Normally, this form will cause a scan of the table to verify that all
existing rows in the table satisfy the new constraint. But if
the <literal>NOT VALID</literal> option is used, this
potentially-lengthy scan is skipped. The constraint will still be
enforced against subsequent inserts or updates (that is, they'll fail
unless there is a matching row in the referenced table, in the case
of foreign keys, or they'll fail unless the new row matches the
specified check condition). But the
database will not assume that the constraint holds for all rows in
the table, until it is validated by using the <literal>VALIDATE
CONSTRAINT</literal> option.
See <xref linkend="sql-altertable-notes"
endterm="sql-altertable-notes-title"/> below for more information
about using the <literal>NOT VALID</literal> option.
</para>
<para>
Addition of a foreign key constraint requires a
<literal>SHARE ROW EXCLUSIVE</literal> lock on the referenced table,
in addition to the lock on the table receiving the constraint.
</para>
<para>
Additional restrictions apply when unique or primary key constraints
are added to partitioned tables; see <xref linkend="sql-createtable" />.
are added to partitioned tables; see <xref linkend="sql-createtable"/>.
Also, foreign key constraints on partitioned
tables may not be declared <literal>NOT VALID</literal> at present.
</para>
</listitem>
@ -453,23 +462,13 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
<term><literal>VALIDATE CONSTRAINT</literal></term>
<listitem>
<para>
This form validates a foreign key or check constraint that was previously created
as <literal>NOT VALID</literal>, by scanning the table to ensure there
are no rows for which the constraint is not satisfied.
Nothing happens if the constraint is already marked valid.
</para>
<para>
Validation can be a long process on larger tables. The value of separating
validation from initial creation is that you can defer validation to less
busy times, or can be used to give additional time to correct pre-existing
errors while preventing new errors. Note also that validation on its own
does not prevent normal write commands against the table while it runs.
</para>
<para>
Validation acquires only a <literal>SHARE UPDATE EXCLUSIVE</literal> lock
on the table being altered. If the constraint is a foreign key then
a <literal>ROW SHARE</literal> lock is also required on
the table referenced by the constraint.
This form validates a foreign key or check constraint that was
previously created as <literal>NOT VALID</literal>, by scanning the
table to ensure there are no rows for which the constraint is not
satisfied. Nothing happens if the constraint is already marked valid.
(See <xref linkend="sql-altertable-notes"
endterm="sql-altertable-notes-title"/> below for an explanation of the
usefulness of this command.)
</para>
</listitem>
</varlistentry>
@ -1153,8 +1152,8 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</variablelist>
</refsect1>
<refsect1>
<title>Notes</title>
<refsect1 id="sql-altertable-notes">
<title id="sql-altertable-notes-title">Notes</title>
<para>
The key word <literal>COLUMN</literal> is noise and can be omitted.
@ -1199,6 +1198,32 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
rewrites can thereby be combined into a single pass over the table.
</para>
<para>
Scanning a large table to verify a new foreign key or check constraint
can take a long time, and other updates to the table are locked out
until the <command>ALTER TABLE ADD CONSTRAINT</command> command is
committed. The main purpose of the <literal>NOT VALID</literal>
constraint option is to reduce the impact of adding a constraint on
concurrent updates. With <literal>NOT VALID</literal>,
the <command>ADD CONSTRAINT</command> command does not scan the table
and can be committed immediately. After that, a <literal>VALIDATE
CONSTRAINT</literal> command can be issued to verify that existing rows
satisfy the constraint. The validation step does not need to lock out
concurrent updates, since it knows that other transactions will be
enforcing the constraint for rows that they insert or update; only
pre-existing rows need to be checked. Hence, validation acquires only
a <literal>SHARE UPDATE EXCLUSIVE</literal> lock on the table being
altered. (If the constraint is a foreign key then a <literal>ROW
SHARE</literal> lock is also required on the table referenced by the
constraint.) In addition to improving concurrency, it can be useful to
use <literal>NOT VALID</literal> and <literal>VALIDATE
CONSTRAINT</literal> in cases where the table is known to contain
pre-existing violations. Once the constraint is in place, no new
violations can be inserted, and the existing problems can be corrected
at leisure until <literal>VALIDATE CONSTRAINT</literal> finally
succeeds.
</para>
<para>
The <literal>DROP COLUMN</literal> form does not physically remove
the column, but simply makes it invisible to SQL operations. Subsequent

View File

@ -214,6 +214,30 @@ INSERT INTO tab (domcol) VALUES ((SELECT domcol FROM tab WHERE false));
and then to apply column <literal>NOT NULL</literal> constraints to columns of
the domain type as needed, rather than directly to the domain type.
</para>
<para>
<productname>PostgreSQL</productname> assumes that
<literal>CHECK</literal> constraints' conditions are immutable, that is,
they will always give the same result for the same input value. This
assumption is what justifies examining <literal>CHECK</literal>
constraints only when a value is first converted to be of a domain type,
and not at other times. (This is essentially the same as the treatment
of table <literal>CHECK</literal> constraints, as described in
<xref linkend="ddl-constraints-check-constraints"/>.)
</para>
<para>
An example of a common way to break this assumption is to reference a
user-defined function in a <literal>CHECK</literal> expression, and then
change the behavior of that
function. <productname>PostgreSQL</productname> does not disallow that,
but it will not notice if there are stored values of the domain type that
now violate the <literal>CHECK</literal> constraint. That would cause a
subsequent database dump and reload to fail. The recommended way to
handle such a change is to drop the constraint (using <command>ALTER
DOMAIN</command>), adjust the function definition, and re-add the
constraint, thereby rechecking it against stored data.
</para>
</refsect1>
<refsect1>