Fix constant-folding of ROW(...) IS [NOT] NULL with composite fields.

The SQL standard appears to specify that IS [NOT] NULL's tests of field
nullness are non-recursive, ie, we shouldn't consider that a composite
field with value ROW(NULL,NULL) is null for this purpose.
ExecEvalNullTest got this right, but eval_const_expressions did not,
leading to weird inconsistencies depending on whether the expression
was such that the planner could apply constant folding.

Also, adjust the docs to mention that IS [NOT] DISTINCT FROM NULL can be
used as a substitute test if a simple null check is wanted for a rowtype
argument.  That motivated reordering things so that IS [NOT] DISTINCT FROM
is described before IS [NOT] NULL.  In HEAD, I went a bit further and added
a table showing all the comparison-related predicates.

Per bug #14235.  Back-patch to all supported branches, since it's certainly
undesirable that constant-folding should change the semantics.

Report and patch by Andrew Gierth; assorted wordsmithing and revised
regression test cases by me.

Report: <20160708024746.1410.57282@wrigleys.postgresql.org>
This commit is contained in:
Tom Lane 2016-07-26 15:25:02 -04:00
parent c1a9542578
commit 4452000f31
5 changed files with 250 additions and 55 deletions

View File

@ -256,12 +256,111 @@
<literal>3</literal>).
</para>
<para>
There are also some comparison predicates, as shown in <xref
linkend="functions-comparison-pred-table">. These behave much like
operators, but have special syntax mandated by the SQL standard.
</para>
<table id="functions-comparison-pred-table">
<title>Comparison Predicates</title>
<tgroup cols="2">
<thead>
<row>
<entry>Predicate</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry> <replaceable>a</> <literal>BETWEEN</> <replaceable>x</> <literal>AND</> <replaceable>y</> </entry>
<entry>between</entry>
</row>
<row>
<entry> <replaceable>a</> <literal>NOT BETWEEN</> <replaceable>x</> <literal>AND</> <replaceable>y</> </entry>
<entry>not between</entry>
</row>
<row>
<entry> <replaceable>a</> <literal>BETWEEN SYMMETRIC</> <replaceable>x</> <literal>AND</> <replaceable>y</> </entry>
<entry>between, after sorting the comparison values</entry>
</row>
<row>
<entry> <replaceable>a</> <literal>NOT BETWEEN SYMMETRIC</> <replaceable>x</> <literal>AND</> <replaceable>y</> </entry>
<entry>not between, after sorting the comparison values</entry>
</row>
<row>
<entry> <replaceable>a</> <literal>IS DISTINCT FROM</> <replaceable>b</> </entry>
<entry>not equal, treating null like an ordinary value</entry>
</row>
<row>
<entry><replaceable>a</> <literal>IS NOT DISTINCT FROM</> <replaceable>b</></entry>
<entry>equal, treating null like an ordinary value</entry>
</row>
<row>
<entry> <replaceable>expression</> <literal>IS NULL</> </entry>
<entry>is null</entry>
</row>
<row>
<entry> <replaceable>expression</> <literal>IS NOT NULL</> </entry>
<entry>is not null</entry>
</row>
<row>
<entry> <replaceable>expression</> <literal>ISNULL</> </entry>
<entry>is null (nonstandard syntax)</entry>
</row>
<row>
<entry> <replaceable>expression</> <literal>NOTNULL</> </entry>
<entry>is not null (nonstandard syntax)</entry>
</row>
<row>
<entry> <replaceable>boolean_expression</> <literal>IS TRUE</> </entry>
<entry>is true</entry>
</row>
<row>
<entry> <replaceable>boolean_expression</> <literal>IS NOT TRUE</> </entry>
<entry>is false or unknown</entry>
</row>
<row>
<entry> <replaceable>boolean_expression</> <literal>IS FALSE</> </entry>
<entry>is false</entry>
</row>
<row>
<entry> <replaceable>boolean_expression</> <literal>IS NOT FALSE</> </entry>
<entry>is true or unknown</entry>
</row>
<row>
<entry> <replaceable>boolean_expression</> <literal>IS UNKNOWN</> </entry>
<entry>is unknown</entry>
</row>
<row>
<entry> <replaceable>boolean_expression</> <literal>IS NOT UNKNOWN</> </entry>
<entry>is true or false</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
<indexterm>
<primary>BETWEEN</primary>
</indexterm>
In addition to the comparison operators, the special
<token>BETWEEN</token> construct is available:
The <token>BETWEEN</token> predicate simplifies range tests:
<synopsis>
<replaceable>a</replaceable> BETWEEN <replaceable>x</replaceable> AND <replaceable>y</replaceable>
</synopsis>
@ -282,13 +381,39 @@
<indexterm>
<primary>BETWEEN SYMMETRIC</primary>
</indexterm>
<literal>BETWEEN SYMMETRIC</> is the same as <literal>BETWEEN</>
<literal>BETWEEN SYMMETRIC</> is like <literal>BETWEEN</>
except there is no requirement that the argument to the left of
<literal>AND</> be less than or equal to the argument on the right.
If it is not, those two arguments are automatically swapped, so that
a nonempty range is always implied.
</para>
<para>
<indexterm>
<primary>IS DISTINCT FROM</primary>
</indexterm>
<indexterm>
<primary>IS NOT DISTINCT FROM</primary>
</indexterm>
Ordinary comparison operators yield null (signifying <quote>unknown</>),
not true or false, when either input is null. For example,
<literal>7 = NULL</> yields null, as does <literal>7 &lt;&gt; NULL</>. When
this behavior is not suitable, use the
<literal>IS <optional> NOT </> DISTINCT FROM</literal> predicates:
<synopsis>
<replaceable>a</replaceable> IS DISTINCT FROM <replaceable>b</replaceable>
<replaceable>a</replaceable> IS NOT DISTINCT FROM <replaceable>b</replaceable>
</synopsis>
For non-null inputs, <literal>IS DISTINCT FROM</literal> is
the same as the <literal>&lt;&gt;</> operator. However, if both
inputs are null it returns false, and if only one input is
null it returns true. Similarly, <literal>IS NOT DISTINCT
FROM</literal> is identical to <literal>=</literal> for non-null
inputs, but it returns true when both inputs are null, and false when only
one input is null. Thus, these predicates effectively act as though null
were a normal data value, rather than <quote>unknown</>.
</para>
<para>
<indexterm>
<primary>IS NULL</primary>
@ -302,12 +427,12 @@
<indexterm>
<primary>NOTNULL</primary>
</indexterm>
To check whether a value is or is not null, use the constructs:
To check whether a value is or is not null, use the predicates:
<synopsis>
<replaceable>expression</replaceable> IS NULL
<replaceable>expression</replaceable> IS NOT NULL
</synopsis>
or the equivalent, but nonstandard, constructs:
or the equivalent, but nonstandard, predicates:
<synopsis>
<replaceable>expression</replaceable> ISNULL
<replaceable>expression</replaceable> NOTNULL
@ -320,8 +445,7 @@
<literal><replaceable>expression</replaceable> = NULL</literal>
because <literal>NULL</> is not <quote>equal to</quote>
<literal>NULL</>. (The null value represents an unknown value,
and it is not known whether two unknown values are equal.) This
behavior conforms to the SQL standard.
and it is not known whether two unknown values are equal.)
</para>
<tip>
@ -338,7 +462,6 @@
</para>
</tip>
<note>
<para>
If the <replaceable>expression</replaceable> is row-valued, then
<literal>IS NULL</> is true when the row expression itself is null
@ -346,39 +469,13 @@
<literal>IS NOT NULL</> is true when the row expression itself is non-null
and all the row's fields are non-null. Because of this behavior,
<literal>IS NULL</> and <literal>IS NOT NULL</> do not always return
inverse results for row-valued expressions, i.e., a row-valued
expression that contains both NULL and non-null values will return false
for both tests.
This definition conforms to the SQL standard, and is a change from the
inconsistent behavior exhibited by <productname>PostgreSQL</productname>
versions prior to 8.2.
</para>
</note>
<para>
<indexterm>
<primary>IS DISTINCT FROM</primary>
</indexterm>
<indexterm>
<primary>IS NOT DISTINCT FROM</primary>
</indexterm>
Ordinary comparison operators yield null (signifying <quote>unknown</>),
not true or false, when either input is null. For example,
<literal>7 = NULL</> yields null, as does <literal>7 &lt;&gt; NULL</>. When
this behavior is not suitable, use the
<literal>IS <optional> NOT </> DISTINCT FROM</literal> constructs:
<synopsis>
<replaceable>expression</replaceable> IS DISTINCT FROM <replaceable>expression</replaceable>
<replaceable>expression</replaceable> IS NOT DISTINCT FROM <replaceable>expression</replaceable>
</synopsis>
For non-null inputs, <literal>IS DISTINCT FROM</literal> is
the same as the <literal>&lt;&gt;</> operator. However, if both
inputs are null it returns false, and if only one input is
null it returns true. Similarly, <literal>IS NOT DISTINCT
FROM</literal> is identical to <literal>=</literal> for non-null
inputs, but it returns true when both inputs are null, and false when only
one input is null. Thus, these constructs effectively act as though null
were a normal data value, rather than <quote>unknown</>.
inverse results for row-valued expressions; in particular, a row-valued
expression that contains both null and non-null fields will return false
for both tests. In some cases, it may be preferable to
write <replaceable>row</replaceable> <literal>IS DISTINCT FROM NULL</>
or <replaceable>row</replaceable> <literal>IS NOT DISTINCT FROM NULL</>,
which will simply check whether the overall row value is null without any
additional tests on the row fields.
</para>
<para>
@ -400,14 +497,14 @@
<indexterm>
<primary>IS NOT UNKNOWN</primary>
</indexterm>
Boolean values can also be tested using the constructs
Boolean values can also be tested using the predicates
<synopsis>
<replaceable>expression</replaceable> IS TRUE
<replaceable>expression</replaceable> IS NOT TRUE
<replaceable>expression</replaceable> IS FALSE
<replaceable>expression</replaceable> IS NOT FALSE
<replaceable>expression</replaceable> IS UNKNOWN
<replaceable>expression</replaceable> IS NOT UNKNOWN
<replaceable>boolean_expression</replaceable> IS TRUE
<replaceable>boolean_expression</replaceable> IS NOT TRUE
<replaceable>boolean_expression</replaceable> IS FALSE
<replaceable>boolean_expression</replaceable> IS NOT FALSE
<replaceable>boolean_expression</replaceable> IS UNKNOWN
<replaceable>boolean_expression</replaceable> IS NOT UNKNOWN
</synopsis>
These will always return true or false, never a null value, even when the
operand is null.
@ -427,7 +524,7 @@
<primary>IS NOT OF</primary>
</indexterm>
It is possible to check the data type of an expression using the
constructs
predicates
<synopsis>
<replaceable>expression</replaceable> IS OF (typename, ...)
<replaceable>expression</replaceable> IS NOT OF (typename, ...)
@ -461,7 +558,7 @@
</indexterm>
<literal>num_nonnulls(VARIADIC "any")</literal>
</entry>
<entry>returns the number of non-NULL arguments</entry>
<entry>returns the number of non-null arguments</entry>
<entry><literal>num_nonnulls(1, NULL, 2)</literal></entry>
<entry><literal>2</literal></entry>
</row>
@ -472,7 +569,7 @@
</indexterm>
<literal>num_nulls(VARIADIC "any")</literal>
</entry>
<entry>returns the number of NULL arguments</entry>
<entry>returns the number of null arguments</entry>
<entry><literal>num_nulls(1, NULL, 2)</literal></entry>
<entry><literal>1</literal></entry>
</row>

View File

@ -3815,6 +3815,21 @@ ExecEvalNullTest(NullTestState *nstate,
if (ntest->argisrow && !(*isNull))
{
/*
* The SQL standard defines IS [NOT] NULL for a non-null rowtype
* argument as:
*
* "R IS NULL" is true if every field is the null value.
*
* "R IS NOT NULL" is true if no field is the null value.
*
* This definition is (apparently intentionally) not recursive; so our
* tests on the fields are primitive attisnull tests, not recursive
* checks to see if they are all-nulls or no-nulls rowtypes.
*
* The standard does not consider the possibility of zero-field rows,
* but here we consider them to vacuously satisfy both predicates.
*/
HeapTupleHeader tuple;
Oid tupType;
int32 tupTypmod;

View File

@ -3273,7 +3273,7 @@ eval_const_expressions_mutator(Node *node,
arg = eval_const_expressions_mutator((Node *) ntest->arg,
context);
if (arg && IsA(arg, RowExpr))
if (ntest->argisrow && arg && IsA(arg, RowExpr))
{
/*
* We break ROW(...) IS [NOT] NULL into separate tests on
@ -3285,8 +3285,6 @@ eval_const_expressions_mutator(Node *node,
List *newargs = NIL;
ListCell *l;
Assert(ntest->argisrow);
foreach(l, rarg->args)
{
Node *relem = (Node *) lfirst(l);
@ -3305,10 +3303,17 @@ eval_const_expressions_mutator(Node *node,
return makeBoolConst(false, false);
continue;
}
/*
* Else, make a scalar (argisrow == false) NullTest
* for this field. Scalar semantics are required
* because IS [NOT] NULL doesn't recurse; see comments
* in ExecEvalNullTest().
*/
newntest = makeNode(NullTest);
newntest->arg = (Expr *) relem;
newntest->nulltesttype = ntest->nulltesttype;
newntest->argisrow = type_is_rowtype(exprType(relem));
newntest->argisrow = false;
newntest->location = ntest->location;
newargs = lappend(newargs, newntest);
}

View File

@ -657,3 +657,57 @@ select row_to_json(r) from (select q2,q1 from tt1 offset 0) r;
{"q2":0,"q1":0}
(3 rows)
--
-- IS [NOT] NULL should not recurse into nested composites (bug #14235)
--
explain (verbose, costs off)
select r, r is null as isnull, r is not null as isnotnull
from (values (1,row(1,2)), (1,row(null,null)), (1,null),
(null,row(1,2)), (null,row(null,null)), (null,null) ) r(a,b);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Values Scan on "*VALUES*"
Output: ROW("*VALUES*".column1, "*VALUES*".column2), (("*VALUES*".column1 IS NULL) AND ("*VALUES*".column2 IS NULL)), (("*VALUES*".column1 IS NOT NULL) AND ("*VALUES*".column2 IS NOT NULL))
(2 rows)
select r, r is null as isnull, r is not null as isnotnull
from (values (1,row(1,2)), (1,row(null,null)), (1,null),
(null,row(1,2)), (null,row(null,null)), (null,null) ) r(a,b);
r | isnull | isnotnull
-------------+--------+-----------
(1,"(1,2)") | f | t
(1,"(,)") | f | t
(1,) | f | f
(,"(1,2)") | f | f
(,"(,)") | f | f
(,) | t | f
(6 rows)
explain (verbose, costs off)
with r(a,b) as
(values (1,row(1,2)), (1,row(null,null)), (1,null),
(null,row(1,2)), (null,row(null,null)), (null,null) )
select r, r is null as isnull, r is not null as isnotnull from r;
QUERY PLAN
----------------------------------------------------------
CTE Scan on r
Output: r.*, (r.* IS NULL), (r.* IS NOT NULL)
CTE r
-> Values Scan on "*VALUES*"
Output: "*VALUES*".column1, "*VALUES*".column2
(5 rows)
with r(a,b) as
(values (1,row(1,2)), (1,row(null,null)), (1,null),
(null,row(1,2)), (null,row(null,null)), (null,null) )
select r, r is null as isnull, r is not null as isnotnull from r;
r | isnull | isnotnull
-------------+--------+-----------
(1,"(1,2)") | f | t
(1,"(,)") | f | t
(1,) | f | f
(,"(1,2)") | f | f
(,"(,)") | f | f
(,) | t | f
(6 rows)

View File

@ -286,3 +286,27 @@ create temp table tt1 as select * from int8_tbl limit 2;
create temp table tt2 () inherits(tt1);
insert into tt2 values(0,0);
select row_to_json(r) from (select q2,q1 from tt1 offset 0) r;
--
-- IS [NOT] NULL should not recurse into nested composites (bug #14235)
--
explain (verbose, costs off)
select r, r is null as isnull, r is not null as isnotnull
from (values (1,row(1,2)), (1,row(null,null)), (1,null),
(null,row(1,2)), (null,row(null,null)), (null,null) ) r(a,b);
select r, r is null as isnull, r is not null as isnotnull
from (values (1,row(1,2)), (1,row(null,null)), (1,null),
(null,row(1,2)), (null,row(null,null)), (null,null) ) r(a,b);
explain (verbose, costs off)
with r(a,b) as
(values (1,row(1,2)), (1,row(null,null)), (1,null),
(null,row(1,2)), (null,row(null,null)), (null,null) )
select r, r is null as isnull, r is not null as isnotnull from r;
with r(a,b) as
(values (1,row(1,2)), (1,row(null,null)), (1,null),
(null,row(1,2)), (null,row(null,null)), (null,null) )
select r, r is null as isnull, r is not null as isnotnull from r;