2016-08-04 03:29:42 +02:00
--
-- tsrf - targetlist set returning function tests
--
-- simple srf
SELECT generate_series(1, 3);
 generate_series
-----------------
               1
               2
               3
(3 rows)

-- parallel iteration
SELECT generate_series(1, 3), generate_series(3,5);
 generate_series | generate_series
-----------------+-----------------
               1 |               3
               2 |               4
               3 |               5
(3 rows)

-- parallel iteration, different number of rows
SELECT generate_series(1, 2), generate_series(1,4);
 generate_series | generate_series
-----------------+-----------------
               1 |               1
               2 |               2
Move targetlist SRF handling from expression evaluation to new executor node.
Evaluation of set-returning functions (SRFs) in the targetlist (like SELECT
generate_series(1,5)) so far was done in the expression evaluation (i.e.
ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code.
This meant that most executor nodes performing projection, and most
expression evaluation functions, had to deal with the possibility that an
evaluated expression could return a set of return values.
That's bad because it leads to repeated code in a lot of places. It also,
and that's my (Andres's) motivation, made it a lot harder to implement a
more efficient way of doing expression evaluation.
To fix this, introduce a new executor node (ProjectSet) that can evaluate
targetlists containing one or more SRFs. To avoid the complexity of the old
way of handling nested expressions returning sets (e.g. having to pass up
ExprDoneCond, and dealing with arguments to functions returning sets etc.),
those SRFs can only be at the top level of the node's targetlist. The
planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is
only necessary in ProjectSet nodes and that SRFs are only present at the
top level of the node's targetlist. If there are nested SRFs the planner
creates multiple stacked ProjectSet nodes. The ProjectSet nodes always get
input from an underlying node.
We also discussed and prototyped evaluating targetlist SRFs using ROWS
FROM(), but that turned out to be more complicated than we'd hoped.
While moving SRF evaluation to ProjectSet would allow us to retain the old
"least common multiple" behavior when multiple SRFs are present in one
targetlist (i.e. continue returning rows until all SRFs are at the end of
their input at the same time), we decided to instead only return rows till
all SRFs are exhausted, returning NULL for already exhausted ones. We
deemed the previous behavior to be too confusing, unexpected and actually
not particularly useful.
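
To make the behavior change concrete, here is a small Python model that contrasts the new lockstep rule with the old "least common multiple" rule. It is only a sketch, not PostgreSQL code: each SRF is treated as a plain list of its results, and the function names project_set_rows and old_lcm_rows are made up for illustration.

```python
from functools import reduce
from itertools import zip_longest
from math import gcd


def project_set_rows(*srfs):
    """New rule: emit rows until every SRF is exhausted.

    An SRF that runs out early contributes None (standing in for NULL).
    """
    return list(zip_longest(*srfs, fillvalue=None))


def old_lcm_rows(*srfs):
    """Old rule: restart each SRF until all of them end on the same row.

    Assumption of this model: an SRF with no results yields no rows at all.
    """
    if any(len(s) == 0 for s in srfs):
        return []
    total = reduce(lambda a, b: a * b // gcd(a, b), (len(s) for s in srfs), 1)
    return [tuple(s[i % len(s)] for s in srfs) for i in range(total)]


if __name__ == "__main__":
    # generate_series(1, 2), generate_series(1, 4) under the new rule
    for row in project_set_rows([1, 2], [1, 2, 3, 4]):
        print(row)
    # A three-row and a two-row SRF under the old rule give lcm(3, 2) = 6 rows
    print(len(old_lcm_rows([1, 2, 3], [1, 2])))
```

With inputs [1, 2] and [1, 2, 3, 4] the lockstep model yields four rows, the last two carrying None in the first column, which is the shape of the "parallel iteration, different number of rows" test in this file.
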
As a side effect, the previously prohibited case of multiple set-returning
arguments to a function is now allowed. Not because it's particularly
desirable, but because it ends up working and there seems to be no argument
for adding code to prohibit it.
Currently the behavior for COALESCE and CASE containing SRFs has changed,
returning multiple rows from the expression, even when the SRF containing
"arm" of the expression is not evaluated. That's because the SRFs are
evaluated in a separate ProjectSet node. As that's quite confusing, we're
likely to instead prohibit SRFs in those places. But that's still being
discussed, and the code would reside in places not touched here, so that's
a task for later.
There's a lot of now-superfluous code dealing with set-returning expressions
around. But as the changes to get rid of those are verbose and largely boring,
it seems better for readability to keep the cleanup as a separate commit.
Author: Tom Lane and Andres Freund
Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
2017-01-18 21:46:50 +01:00
                 |               3
                 |               4
2016-08-04 03:29:42 +02:00
(4 rows)

-- srf, with SRF argument
SELECT generate_series(1, generate_series(1, 3));
 generate_series
-----------------
               1
               1
               2
               1
               2
               3
(6 rows)
Disallow set-returning functions inside CASE or COALESCE.
When we reimplemented SRFs in commit 69f4b9c85, our initial choice was
to allow the behavior to vary from historical practice in cases where a
SRF call appeared within a conditional-execution construct (currently,
only CASE or COALESCE). But that was controversial to begin with, and
subsequent discussion has resulted in a consensus that it's better to
throw an error instead of executing the query differently from before,
so long as we can provide a reasonably clear error message and a way to
rewrite the query.
Hence, add a parser mechanism to allow detection of such cases during
parse analysis. The mechanism just requires storing, in the ParseState,
a pointer to the set-returning FuncExpr or OpExpr most recently emitted
by parse analysis. Then the parsing functions for CASE and COALESCE can
detect the presence of a SRF in their arguments by noting whether this
pointer changes while analyzing their arguments. Furthermore, if it does,
it provides a suitable error cursor location for the complaint. (This
means that if there's more than one SRF in the arguments, the error will
point at the last one to be analyzed, not the first. While connoisseurs of
parsing behavior might find that odd, it's unlikely the average user would
ever notice.)
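
The detection trick described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the parser's actual code: Node, ParseState, ParseError, analyze and the last_srf field are hypothetical names, and only the idea is taken from the text (remember the most recently analyzed SRF, then check whether that pointer changed while the arguments of CASE or COALESCE were analyzed).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # identity comparison, like comparing node pointers
class Node:
    kind: str                  # "func", "case", "coalesce" or "const"
    name: str = ""
    returns_set: bool = False
    args: tuple = ()
    location: int = -1         # character offset used as the error cursor


class ParseError(Exception):
    def __init__(self, message, location):
        super().__init__(message)
        self.location = location


@dataclass
class ParseState:
    last_srf: Optional[Node] = None   # most recently analyzed SRF, if any


def analyze(pstate, node):
    """Analyze node; reject an SRF nested in a conditional construct."""
    if node.kind in ("case", "coalesce"):
        before = pstate.last_srf
        for arg in node.args:
            analyze(pstate, arg)
        if pstate.last_srf is not before:
            # The pointer moved, so some argument contained an SRF.
            raise ParseError(
                "set-returning functions are not allowed in " + node.kind.upper(),
                pstate.last_srf.location,
            )
        return
    for arg in node.args:
        analyze(pstate, arg)
    if node.kind == "func" and node.returns_set:
        pstate.last_srf = node
```

Because only the latest SRF is remembered, an expression with two SRFs inside one CASE reports the location of the one analyzed last, which is the quirk the parenthetical above mentions.
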
While at it, we can also provide more specific error messages than before
about some pre-existing restrictions, such as no-SRFs-within-aggregates.
Also, reject at parse time cases where a NULLIF or IS DISTINCT FROM
construct would need to return a set. We've never supported that, but the
restriction is depended on in more subtle ways now, so it seems wise to
detect it at the start.
Also, provide some documentation about how to rewrite a SRF-within-CASE
query using a custom wrapper SRF.
It turns out that the information_schema.user_mapping_options view
contained an instance of exactly the behavior we're now forbidding; but
rewriting it makes it more clear and safer too.
initdb forced because of user_mapping_options change.
Patch by me, with error message suggestions from Alvaro Herrera and
Andres Freund, pursuant to a complaint from Regina Obe.
Discussion: https://postgr.es/m/000001d2d5de$d8d66170$8a832450$@pcorp.us
2017-06-14 05:46:39 +02:00
-- but we've traditionally rejected the same in FROM
SELECT * FROM generate_series(1, generate_series(1, 3));
ERROR:  set-returning functions must appear at top level of FROM
LINE 1: SELECT * FROM generate_series(1, generate_series(1, 3));
                                         ^
2016-08-04 03:29:42 +02:00
-- srf, with two SRF arguments
SELECT generate_series(generate_series(1,3), generate_series(2, 4));
2017-01-18 21:46:50 +01:00
 generate_series
-----------------
               1
               2
               2
               3
               3
               4
(6 rows)
Fix mishandling of tSRFs at different nesting levels.
Given a targetlist like "srf(x), f(srf(x))", split_pathtarget_at_srfs()
decided that it needed two levels of ProjectSet nodes, failing to notice
that the two SRF calls are textually equal(). Because of that, setrefs.c
would convert the upper ProjectSet's tlist to "Var1, f(Var1)" (where Var1
represents a reference to the srf(x) output of the lower ProjectSet).
This triggered an assertion in nodeProjectSet.c complaining that it found
no SRFs to evaluate, as reported by Erik Rijkers.
What we want in such a case is to evaluate srf(x) only once and use a plain
Result node to compute "Var1, f(Var1)"; that gives results similar to what
previous versions produced, whereas allowing srf(x) to be evaluated again
in an upper ProjectSet would square the number of rows emitted.
Furthermore, even if the SRF calls aren't textually identical, we want them
to be evaluated in lockstep, because that's what happened in the old
implementation. But split_pathtarget_at_srfs() got this completely wrong,
using two levels of ProjectSet for a case like "srf(x), f(srf(y))".
Hence, rewrite split_pathtarget_at_srfs() from the ground up so that it
groups SRFs according to the depth of nesting of SRFs in their arguments.
This is pretty much how we envisioned that working originally, but I blew
it when it came to implementation.
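
As a rough illustration of grouping SRFs by nesting depth, the following Python sketch assigns each SRF a level (1 for an SRF with no SRFs in its arguments, 2 for an SRF whose arguments contain a level-1 SRF, and so on) and collects equal SRF calls only once per level. It is a simplified model, not the planner's implementation; Expr, srf_depth and split_by_srf_level are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    name: str
    is_srf: bool = False
    args: tuple = ()


def srf_depth(expr):
    """Depth of SRF nesting in expr: 0 if it contains no SRF at all."""
    inner = max((srf_depth(a) for a in expr.args), default=0)
    return inner + 1 if expr.is_srf else inner


def _collect(expr, levels):
    for arg in expr.args:
        _collect(arg, levels)
    if expr.is_srf:
        depth = srf_depth(expr)
        if expr not in levels[depth]:   # equal calls are evaluated once
            levels[depth].append(expr)


def split_by_srf_level(targetlist):
    """Return one list of SRFs per level, lowest level first."""
    levels = defaultdict(list)
    for expr in targetlist:
        _collect(expr, levels)
    return [levels[d] for d in sorted(levels)]


def gs(*args):
    """Shorthand for a generate_series call in this model."""
    return Expr("generate_series", is_srf=True, args=args)


def const(v):
    return Expr(str(v))
```

In this model each returned list corresponds to one ProjectSet node, so the target list generate_series(1, generate_series(1, 3)), generate_series(2, 4) produces two levels, and a target list like srf(x), f(srf(x)) produces only one.
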
In passing, optimize the case of target == input_target, which I noticed
is not only possible but quite common.
Discussion: https://postgr.es/m/dcbd2853c05d22088766553d60dc78c6@xs4all.nl
2017-02-02 22:38:13 +01:00
-- check proper nesting of SRFs in different expressions
explain (verbose, costs off)
SELECT generate_series(1, generate_series(1, 3)), generate_series(2, 4);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 ProjectSet
   Output: generate_series(1, (generate_series(1, 3))), (generate_series(2, 4))
   ->  ProjectSet
         Output: generate_series(1, 3), generate_series(2, 4)
         ->  Result
(5 rows)

SELECT generate_series(1, generate_series(1, 3)), generate_series(2, 4);
 generate_series | generate_series
-----------------+-----------------
               1 |               2
               1 |               3
               2 |               3
               1 |               4
               2 |               4
               3 |               4
(6 rows)
2016-08-04 03:29:42 +02:00
CREATE TABLE few(id int, dataa text, datab text);
INSERT INTO few VALUES(1, 'a', 'foo'),(2, 'a', 'bar'),(3, 'b', 'bar');
Fix handling of targetlist SRFs when scan/join relation is known empty.
When we introduced separate ProjectSetPath nodes for application of
set-returning functions in v10, we inadvertently broke some cases where
we're supposed to recognize that the result of a subquery is known to be
empty (contain zero rows). That's because IS_DUMMY_REL was just looking
for a childless AppendPath without allowing for a ProjectSetPath being
possibly stuck on top. In itself, this didn't do anything much worse
than produce slightly worse plans for some corner cases.
Then in v11, commit 11cf92f6e rearranged things to allow the scan/join
targetlist to be applied directly to partial paths before they get
gathered. But it inserted a short-circuit path for dummy relations
that was a little too short: it failed to insert a ProjectSetPath node
at all for a targetlist containing set-returning functions, resulting in
bogus "set-valued function called in context that cannot accept a set"
errors, as reported in bug #15669 from Madelaine Thibaut.
The best way to fix this mess seems to be to reimplement IS_DUMMY_REL
so that it drills down through any ProjectSetPath nodes that might be
there (and it seems like we'd better allow for ProjectionPath as well).
While we're at it, make it look at rel->pathlist not cheapest_total_path,
so that it gives the right answer independently of whether set_cheapest
has been done lately. That dependency looks pretty shaky in the context
of code like apply_scanjoin_target_to_paths, and even if it's not broken
today it'd certainly bite us at some point. (Nastily, unsafe use of the
old coding would almost always work; the hazard comes down to possibly
looking through a dangling pointer, and only once in a blue moon would
you find something there that resulted in the wrong answer.)
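
The "drill down" idea can be shown with a short Python model. It is a sketch under assumptions rather than the planner's C code: the Path class, its kind strings, and the helper names are invented for the example, and only the logic comes from the paragraph above (start from the relation's path list, skip over any ProjectSet or Projection wrappers, then test for a childless Append).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Path:
    kind: str                              # e.g. "Append", "ProjectSet", "Projection", "SeqScan"
    subpath: Optional["Path"] = None       # single input, used by wrapper nodes
    subpaths: List["Path"] = field(default_factory=list)  # inputs of an Append


def is_dummy_append(path):
    """A childless Append stands for a relation proven to be empty."""
    return path.kind == "Append" and not path.subpaths


def is_dummy_rel(pathlist):
    """Look through projection wrappers on the first path of the list."""
    if not pathlist:
        return False
    path = pathlist[0]
    while path.kind in ("ProjectSet", "Projection") and path.subpath is not None:
        path = path.subpath
    return is_dummy_append(path)
```

Checking only the top node would miss the dummy Append hidden under a ProjectSet wrapper, which is the failure mode described for the old macro.
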
It now looks like it was a mistake for IS_DUMMY_REL to be a macro: if
there are any extensions using it, they'll continue to use the old
inadequate logic until they're recompiled, after which they'll fail
to load into server versions predating this fix. Hopefully there are
few such extensions.
Having fixed IS_DUMMY_REL, the special path for dummy rels in
apply_scanjoin_target_to_paths is unnecessary as well as being wrong,
so we can just drop it.
Also change a few places that were testing for partitioned-ness of a
planner relation but not using IS_PARTITIONED_REL for the purpose; that
seems unsafe as well as inconsistent, plus it required an ugly hack in
apply_scanjoin_target_to_paths.
In passing, save a few cycles in apply_scanjoin_target_to_paths by
skipping processing of pre-existing paths for partitioned rels,
and do some cosmetic cleanup and comment adjustment in that function.
I renamed IS_DUMMY_PATH to IS_DUMMY_APPEND with the intention of breaking
any code that might be using it, since in almost every case that would
be wrong; IS_DUMMY_REL is what to be using instead.
In HEAD, also make set_dummy_rel_pathlist static (since it's no longer
used from outside allpaths.c), and delete is_dummy_plan, since it's no
longer used anywhere.
Back-patch as appropriate into v11 and v10.
Tom Lane and Julien Rouhaud
Discussion: https://postgr.es/m/15669-02fb3296cca26203@postgresql.org
2019-03-07 20:21:52 +01:00
-- SRF with a provably-dummy relation
explain (verbose, costs off)
SELECT unnest(ARRAY[1, 2]) FROM few WHERE false;
              QUERY PLAN
--------------------------------------
 ProjectSet
   Output: unnest('{1,2}'::integer[])
   ->  Result
         One-Time Filter: false
(4 rows)

SELECT unnest(ARRAY[1, 2]) FROM few WHERE false;
 unnest
--------
(0 rows)

-- SRF shouldn't prevent upper query from recognizing lower as dummy
explain (verbose, costs off)
SELECT * FROM few f1,
  (SELECT unnest(ARRAY[1,2]) FROM few f2 WHERE false OFFSET 0) ss;
                   QUERY PLAN
------------------------------------------------
 Result
   Output: f1.id, f1.dataa, f1.datab, ss.unnest
   One-Time Filter: false
(3 rows)

SELECT * FROM few f1,
  (SELECT unnest(ARRAY[1,2]) FROM few f2 WHERE false OFFSET 0) ss;
 id | dataa | datab | unnest
----+-------+-------+--------
(0 rows)
2016-08-04 03:29:42 +02:00
-- SRF output order of sorting is maintained, if SRF is not referenced
SELECT few.id, generate_series(1,3) g FROM few ORDER BY id DESC;
 id | g
----+---
  3 | 1
  3 | 2
  3 | 3
  2 | 1
  2 | 2
  2 | 3
  1 | 1
  1 | 2
  1 | 3
(9 rows)

-- but SRFs can be referenced in sort
SELECT few.id, generate_series(1,3) g FROM few ORDER BY id, g DESC;
 id | g
----+---
  1 | 3
  1 | 2
  1 | 1
  2 | 3
  2 | 2
  2 | 1
  3 | 3
  3 | 2
  3 | 1
(9 rows)

SELECT few.id, generate_series(1,3) g FROM few ORDER BY id, generate_series(1,3) DESC;
 id | g
----+---
  1 | 3
  1 | 2
  1 | 1
  2 | 3
  2 | 2
  2 | 1
  3 | 3
  3 | 2
  3 | 1
(9 rows)

-- it's weird to have ORDER BYs that increase the number of results
SELECT few.id FROM few ORDER BY id, generate_series(1,3) DESC;
 id
----
  1
  1
  1
  2
  2
  2
  3
  3
  3
(9 rows)

-- SRFs are computed after aggregation
2017-01-19 23:21:26 +01:00
SET enable_hashagg TO 0; -- stable output order
2016-08-04 03:29:42 +02:00
SELECT few.dataa, count(*), min(id), max(id), unnest('{1,1,3}'::int[]) FROM few WHERE few.id = 1 GROUP BY few.dataa;
 dataa | count | min | max | unnest
-------+-------+-----+-----+--------
 a     |     1 |   1 |   1 |      1
 a     |     1 |   1 |   1 |      1
 a     |     1 |   1 |   1 |      3
(3 rows)

-- unless referenced in GROUP BY clause
SELECT few.dataa, count(*), min(id), max(id), unnest('{1,1,3}'::int[]) FROM few WHERE few.id = 1 GROUP BY few.dataa, unnest('{1,1,3}'::int[]);
 dataa | count | min | max | unnest
-------+-------+-----+-----+--------
2017-01-18 21:46:50 +01:00
 a     |     2 |   1 |   1 |      1
2017-01-19 23:21:26 +01:00
 a     |     1 |   1 |   1 |      3
2016-08-04 03:29:42 +02:00
(2 rows)

SELECT few.dataa, count(*), min(id), max(id), unnest('{1,1,3}'::int[]) FROM few WHERE few.id = 1 GROUP BY few.dataa, 5;
 dataa | count | min | max | unnest
-------+-------+-----+-----+--------
2017-01-18 21:46:50 +01:00
 a     |     2 |   1 |   1 |      1
2017-01-19 23:21:26 +01:00
 a     |     1 |   1 |   1 |      3
2016-08-04 03:29:42 +02:00
(2 rows)
2017-01-19 23:21:26 +01:00
RESET enable_hashagg;
2016-08-04 03:29:42 +02:00
-- check HAVING works when GROUP BY does [not] reference SRF output
2016-09-13 03:15:10 +02:00
SELECT dataa, generate_series(1,1), count(*) FROM few GROUP BY 1 HAVING count(*) > 1;
2016-08-04 03:29:42 +02:00
 dataa | generate_series | count
-------+-----------------+-------
 a     |               1 |     2
2016-09-13 03:15:10 +02:00
(1 row)
2016-08-04 03:29:42 +02:00
2016-09-13 03:15:10 +02:00
SELECT dataa, generate_series(1,1), count(*) FROM few GROUP BY 1, 2 HAVING count(*) > 1;
2016-08-04 03:29:42 +02:00
 dataa | generate_series | count
-------+-----------------+-------
 a     |               1 |     2
2016-09-13 03:15:10 +02:00
(1 row)
2016-08-04 03:29:42 +02:00
-- it's weird to have GROUP BYs that increase the number of results
2016-09-13 03:15:10 +02:00
SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa ORDER BY 2;
 dataa | count
-------+-------
 a     |     2
(1 row)

SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa, unnest('{1,1,3}'::int[]) ORDER BY 2;
 dataa | count
-------+-------
 a     |     2
 a     |     4
2016-08-04 03:29:42 +02:00
(2 rows)
2017-06-14 05:46:39 +02:00
-- SRFs are not allowed if they'd need to be conditionally executed
SELECT q1, case when q1 > 0 then generate_series(1,3) else 0 end FROM int8_tbl;
ERROR:  set-returning functions are not allowed in CASE
LINE 1: SELECT q1, case when q1 > 0 then generate_series(1,3) else 0...
                                         ^
HINT:  You might be able to move the set-returning function into a LATERAL FROM item.
SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
ERROR:  set-returning functions are not allowed in COALESCE
LINE 1: SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
                            ^
HINT:  You might be able to move the set-returning function into a LATERAL FROM item.
2016-08-04 03:29:42 +02:00
-- SRFs are not allowed in aggregate arguments
SELECT min(generate_series(1, 3)) FROM few;
2017-06-14 05:46:39 +02:00
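The wrapper-SRF rewrite that the commit message above refers to can be sketched roughly as follows. This is an illustration only: the table tab, its columns x, y and z, and the function name case_generate_series are assumed names, not objects defined in this test file.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE tab (x int, y int, z int);

-- Rejected at parse time once SRFs inside CASE are disallowed:
--   SELECT x, CASE WHEN y > 0 THEN generate_series(1, z) ELSE 5 END FROM tab;

-- Hypothetical wrapper SRF: the conditional test moves inside the function,
-- so the SRF call sits at the top level of the targetlist.
CREATE FUNCTION case_generate_series(cond bool, start int, fin int, els int)
  RETURNS SETOF int AS $$
BEGIN
  IF cond THEN
    -- condition holds: emit the whole series
    RETURN QUERY SELECT generate_series(start, fin);
  ELSE
    -- condition fails: emit the single fallback value
    RETURN QUERY SELECT els;
  END IF;
END
$$ LANGUAGE plpgsql;

SELECT x, case_generate_series(y > 0, 1, z, 5) FROM tab;
```

The wrapper makes the per-row choice explicit, so the number of rows produced for each input row no longer depends on how the executor treats conditional evaluation.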
|
|
|
ERROR: aggregate function calls cannot contain set-returning function calls
|
2017-04-18 19:20:59 +02:00
|
|
|
LINE 1: SELECT min(generate_series(1, 3)) FROM few;
|
|
|
|
^
|
2017-06-14 05:46:39 +02:00
|
|
|
HINT: You might be able to move the set-returning function into a LATERAL FROM item.
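A sketch of what the HINT suggests, assuming the intent was to aggregate over every generated value for every row of few (the alias g is an illustrative name):

```sql
-- The SRF becomes a FROM item, so the aggregate sees ordinary rows.
-- LATERAL is optional for function calls in FROM; it is spelled out to match the HINT.
SELECT min(g) FROM few, LATERAL generate_series(1, 3) AS g;

-- The same idea applies to the window-function form.
SELECT min(g) OVER () FROM few, LATERAL generate_series(1, 3) AS g;
```

Whether this matches what the original query meant is a judgment call: the FROM-item form aggregates over the cross product of few and the series.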
|
2017-06-27 23:51:11 +02:00
|
|
|
-- ... unless they're within a sub-select
|
|
|
|
SELECT sum((3 = ANY(SELECT generate_series(1,4)))::int);
|
|
|
|
sum
|
|
|
|
-----
|
|
|
|
1
|
|
|
|
(1 row)
|
|
|
|
|
|
|
|
SELECT sum((3 = ANY(SELECT lag(x) over(order by x)
|
|
|
|
FROM generate_series(1,4) x))::int);
|
|
|
|
sum
|
|
|
|
-----
|
|
|
|
1
|
|
|
|
(1 row)
|
|
|
|
|
2016-09-14 20:30:40 +02:00
|
|
|
-- SRFs are not allowed in window function arguments, either
|
|
|
|
SELECT min(generate_series(1, 3)) OVER() FROM few;
|
2017-06-14 05:46:39 +02:00
|
|
|
ERROR: window function calls cannot contain set-returning function calls
|
2017-04-18 19:20:59 +02:00
|
|
|
LINE 1: SELECT min(generate_series(1, 3)) OVER() FROM few;
|
|
|
|
^
|
2017-06-14 05:46:39 +02:00
|
|
|
HINT: You might be able to move the set-returning function into a LATERAL FROM item.
|
2016-08-04 03:29:42 +02:00
|
|
|
-- SRFs are normally computed after window functions
|
|
|
|
SELECT id,lag(id) OVER(), count(*) OVER(), generate_series(1,3) FROM few;
|
|
|
|
id | lag | count | generate_series
|
|
|
|
----+-----+-------+-----------------
|
|
|
|
1 | | 3 | 1
|
|
|
|
1 | | 3 | 2
|
|
|
|
1 | | 3 | 3
|
|
|
|
2 | 1 | 3 | 1
|
|
|
|
2 | 1 | 3 | 2
|
|
|
|
2 | 1 | 3 | 3
|
|
|
|
3 | 2 | 3 | 1
|
|
|
|
3 | 2 | 3 | 2
|
|
|
|
3 | 2 | 3 | 3
|
|
|
|
(9 rows)
|
|
|
|
|
|
|
|
-- unless referencing SRFs
|
|
|
|
SELECT SUM(count(*)) OVER(PARTITION BY generate_series(1,3) ORDER BY generate_series(1,3)), generate_series(1,3) g FROM few GROUP BY g;
|
|
|
|
sum | g
|
|
|
|
-----+---
|
|
|
|
3 | 1
|
|
|
|
3 | 2
|
|
|
|
3 | 3
|
|
|
|
(3 rows)
|
|
|
|
|
|
|
|
-- sorting + grouping
|
2016-10-10 22:41:57 +02:00
|
|
|
SELECT few.dataa, count(*), min(id), max(id), generate_series(1,3) FROM few GROUP BY few.dataa ORDER BY 5, 1;
|
2016-08-04 03:29:42 +02:00
|
|
|
dataa | count | min | max | generate_series
|
|
|
|
-------+-------+-----+-----+-----------------
|
|
|
|
a | 2 | 1 | 2 | 1
|
2016-10-10 22:41:57 +02:00
|
|
|
b | 1 | 3 | 3 | 1
|
2016-08-04 03:29:42 +02:00
|
|
|
a | 2 | 1 | 2 | 2
|
2016-10-10 22:41:57 +02:00
|
|
|
b | 1 | 3 | 3 | 2
|
2016-08-04 03:29:42 +02:00
|
|
|
a | 2 | 1 | 2 | 3
|
2016-10-10 22:41:57 +02:00
|
|
|
b | 1 | 3 | 3 | 3
|
2016-08-04 03:29:42 +02:00
|
|
|
(6 rows)
|
|
|
|
|
|
|
|
-- grouping sets are a bit special, they produce NULLs in columns not actually NULL
|
2017-03-27 05:20:54 +02:00
|
|
|
set enable_hashagg = false;
|
2016-08-04 03:29:42 +02:00
|
|
|
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab);
|
|
|
|
dataa | b | g | count
|
|
|
|
-------+-----+---+-------
|
|
|
|
a | bar | 1 | 1
|
|
|
|
a | bar | 2 | 1
|
|
|
|
a | foo | 1 | 1
|
|
|
|
a | foo | 2 | 1
|
|
|
|
a | | 1 | 2
|
|
|
|
a | | 2 | 2
|
|
|
|
b | bar | 1 | 1
|
|
|
|
b | bar | 2 | 1
|
|
|
|
b | | 1 | 1
|
|
|
|
b | | 2 | 1
|
|
|
|
| | 1 | 3
|
|
|
|
| | 2 | 3
|
|
|
|
| bar | 1 | 2
|
|
|
|
| bar | 2 | 2
|
|
|
|
| foo | 1 | 1
|
|
|
|
| foo | 2 | 1
|
|
|
|
(16 rows)
|
|
|
|
|
|
|
|
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab) ORDER BY dataa;
|
|
|
|
dataa | b | g | count
|
|
|
|
-------+-----+---+-------
|
|
|
|
a | bar | 1 | 1
|
|
|
|
a | bar | 2 | 1
|
|
|
|
a | foo | 1 | 1
|
|
|
|
a | foo | 2 | 1
|
|
|
|
a | | 1 | 2
|
|
|
|
a | | 2 | 2
|
|
|
|
b | bar | 1 | 1
|
|
|
|
b | bar | 2 | 1
|
|
|
|
b | | 1 | 1
|
|
|
|
b | | 2 | 1
|
|
|
|
| | 1 | 3
|
|
|
|
| | 2 | 3
|
|
|
|
| bar | 1 | 2
|
|
|
|
| bar | 2 | 2
|
|
|
|
| foo | 1 | 1
|
|
|
|
| foo | 2 | 1
|
|
|
|
(16 rows)
|
|
|
|
|
|
|
|
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab) ORDER BY g;
|
|
|
|
dataa | b | g | count
|
|
|
|
-------+-----+---+-------
|
|
|
|
a | bar | 1 | 1
|
|
|
|
a | foo | 1 | 1
|
|
|
|
a | | 1 | 2
|
|
|
|
b | bar | 1 | 1
|
|
|
|
b | | 1 | 1
|
|
|
|
| | 1 | 3
|
|
|
|
| bar | 1 | 2
|
|
|
|
| foo | 1 | 1
|
|
|
|
| foo | 2 | 1
|
|
|
|
a | bar | 2 | 1
|
|
|
|
b | | 2 | 1
|
|
|
|
a | foo | 2 | 1
|
|
|
|
| bar | 2 | 2
|
|
|
|
a | | 2 | 2
|
|
|
|
| | 2 | 3
|
|
|
|
b | bar | 2 | 1
|
|
|
|
(16 rows)
|
|
|
|
|
|
|
|
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab, g);
|
|
|
|
dataa | b | g | count
|
|
|
|
-------+-----+---+-------
|
|
|
|
a | bar | 1 | 1
|
|
|
|
a | bar | 2 | 1
|
|
|
|
a | bar | | 2
|
|
|
|
a | foo | 1 | 1
|
|
|
|
a | foo | 2 | 1
|
|
|
|
a | foo | | 2
|
|
|
|
a | | | 4
|
|
|
|
b | bar | 1 | 1
|
|
|
|
b | bar | 2 | 1
|
|
|
|
b | bar | | 2
|
|
|
|
b | | | 2
|
|
|
|
| | | 6
|
|
|
|
| bar | 1 | 2
|
|
|
|
| bar | 2 | 2
|
|
|
|
| bar | | 4
|
|
|
|
| foo | 1 | 1
|
|
|
|
| foo | 2 | 1
|
|
|
|
| foo | | 2
|
2017-03-27 05:20:54 +02:00
|
|
|
a | | 1 | 2
|
|
|
|
b | | 1 | 1
|
|
|
|
| | 1 | 3
|
|
|
|
a | | 2 | 2
|
|
|
|
b | | 2 | 1
|
|
|
|
| | 2 | 3
|
2016-08-04 03:29:42 +02:00
|
|
|
(24 rows)
|
|
|
|
|
|
|
|
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab, g) ORDER BY dataa;
|
|
|
|
dataa | b | g | count
|
|
|
|
-------+-----+---+-------
|
2017-03-27 05:20:54 +02:00
|
|
|
a | foo | | 2
|
|
|
|
a | | | 4
|
|
|
|
a | | 2 | 2
|
2016-08-04 03:29:42 +02:00
|
|
|
a | bar | 1 | 1
|
|
|
|
a | bar | 2 | 1
|
|
|
|
a | bar | | 2
|
|
|
|
a | foo | 1 | 1
|
|
|
|
a | foo | 2 | 1
|
|
|
|
a | | 1 | 2
|
2017-03-27 05:20:54 +02:00
|
|
|
b | bar | 1 | 1
|
2016-08-04 03:29:42 +02:00
|
|
|
b | | | 2
|
|
|
|
b | | 1 | 1
|
2017-03-27 05:20:54 +02:00
|
|
|
b | bar | 2 | 1
|
2016-08-04 03:29:42 +02:00
|
|
|
b | bar | | 2
|
2017-03-27 05:20:54 +02:00
|
|
|
b | | 2 | 1
|
2016-08-04 03:29:42 +02:00
|
|
|
| | 2 | 3
|
2017-03-27 05:20:54 +02:00
|
|
|
| | | 6
|
2016-08-04 03:29:42 +02:00
|
|
|
| bar | 1 | 2
|
|
|
|
| bar | 2 | 2
|
|
|
|
| bar | | 4
|
2017-03-27 05:20:54 +02:00
|
|
|
| foo | 1 | 1
|
|
|
|
| foo | 2 | 1
|
|
|
|
| foo | | 2
|
2016-08-04 03:29:42 +02:00
|
|
|
| | 1 | 3
|
|
|
|
(24 rows)
|
|
|
|
|
|
|
|
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab, g) ORDER BY g;
|
|
|
|
dataa | b | g | count
|
|
|
|
-------+-----+---+-------
|
|
|
|
a | bar | 1 | 1
|
|
|
|
a | foo | 1 | 1
|
|
|
|
b | bar | 1 | 1
|
2017-03-27 05:20:54 +02:00
|
|
|
| bar | 1 | 2
|
|
|
|
| foo | 1 | 1
|
2016-08-04 03:29:42 +02:00
|
|
|
a | | 1 | 2
|
|
|
|
b | | 1 | 1
|
|
|
|
| | 1 | 3
|
|
|
|
a | | 2 | 2
|
|
|
|
b | | 2 | 1
|
2017-03-27 05:20:54 +02:00
|
|
|
| bar | 2 | 2
|
2016-08-04 03:29:42 +02:00
|
|
|
| | 2 | 3
|
2017-03-27 05:20:54 +02:00
|
|
|
| foo | 2 | 1
|
|
|
|
a | bar | 2 | 1
|
2016-08-04 03:29:42 +02:00
|
|
|
a | foo | 2 | 1
|
|
|
|
b | bar | 2 | 1
|
2017-03-27 05:20:54 +02:00
|
|
|
a | | | 4
|
2016-08-04 03:29:42 +02:00
|
|
|
b | bar | | 2
|
|
|
|
b | | | 2
|
|
|
|
| | | 6
|
2017-03-27 05:20:54 +02:00
|
|
|
a | foo | | 2
|
|
|
|
a | bar | | 2
|
2016-08-04 03:29:42 +02:00
|
|
|
| bar | | 4
|
|
|
|
| foo | | 2
|
|
|
|
(24 rows)
|
|
|
|
|
2017-03-27 05:20:54 +02:00
|
|
|
reset enable_hashagg;
|
2018-07-11 21:25:28 +02:00
|
|
|
-- case with degenerate ORDER BY
|
|
|
|
explain (verbose, costs off)
|
|
|
|
select 'foo' as f, generate_series(1,2) as g from few order by 1;
|
|
|
|
QUERY PLAN
|
|
|
|
----------------------------------------------
|
|
|
|
ProjectSet
|
|
|
|
Output: 'foo'::text, generate_series(1, 2)
|
|
|
|
-> Seq Scan on public.few
|
|
|
|
Output: id, dataa, datab
|
|
|
|
(4 rows)
|
|
|
|
|
|
|
|
select 'foo' as f, generate_series(1,2) as g from few order by 1;
|
|
|
|
f | g
|
|
|
|
-----+---
|
|
|
|
foo | 1
|
|
|
|
foo | 2
|
|
|
|
foo | 1
|
|
|
|
foo | 2
|
|
|
|
foo | 1
|
|
|
|
foo | 2
|
|
|
|
(6 rows)
|
|
|
|
|
2016-08-04 03:29:42 +02:00
|
|
|
-- data modification
|
|
|
|
CREATE TABLE fewmore AS SELECT generate_series(1,3) AS data;
|
|
|
|
INSERT INTO fewmore VALUES(generate_series(4,5));
|
|
|
|
SELECT * FROM fewmore;
|
|
|
|
data
|
|
|
|
------
|
|
|
|
1
|
|
|
|
2
|
|
|
|
3
|
|
|
|
4
|
|
|
|
5
|
|
|
|
(5 rows)
|
|
|
|
|
2016-09-13 19:54:24 +02:00
|
|
|
-- SRFs are not allowed in UPDATE (they once were, but it was nonsense)
|
2016-08-04 03:29:42 +02:00
|
|
|
UPDATE fewmore SET data = generate_series(4,9);
|
2016-09-13 19:54:24 +02:00
|
|
|
ERROR: set-returning functions are not allowed in UPDATE
|
|
|
|
LINE 1: UPDATE fewmore SET data = generate_series(4,9);
|
|
|
|
^
|
2016-08-04 03:29:42 +02:00
|
|
|
-- SRFs are not allowed in RETURNING
|
|
|
|
INSERT INTO fewmore VALUES(1) RETURNING generate_series(1,3);
|
2016-09-13 19:54:24 +02:00
|
|
|
ERROR: set-returning functions are not allowed in RETURNING
|
|
|
|
LINE 1: INSERT INTO fewmore VALUES(1) RETURNING generate_series(1,3)...
|
|
|
|
^
|
|
|
|
-- nor standalone VALUES (but surely this is a bug?)
|
2016-08-04 03:29:42 +02:00
|
|
|
VALUES(1, generate_series(1,2));
|
2017-01-16 21:23:11 +01:00
|
|
|
ERROR: set-returning functions are not allowed in VALUES
|
|
|
|
LINE 1: VALUES(1, generate_series(1,2));
|
|
|
|
^
|
2016-09-15 01:48:42 +02:00
|
|
|
-- We allow tSRFs that are not at top level
|
|
|
|
SELECT int4mul(generate_series(1,2), 10);
|
|
|
|
int4mul
|
|
|
|
---------
|
|
|
|
10
|
|
|
|
20
|
|
|
|
(2 rows)
|
|
|
|
|
2017-06-14 17:10:05 +02:00
|
|
|
SELECT generate_series(1,3) IS DISTINCT FROM 2;
|
|
|
|
?column?
|
|
|
|
----------
|
|
|
|
t
|
|
|
|
f
|
|
|
|
t
|
|
|
|
(3 rows)
|
|
|
|
|
2016-09-15 01:48:42 +02:00
|
|
|
-- but SRFs in function RTEs must be at top level (annoying restriction)
|
|
|
|
SELECT * FROM int4mul(generate_series(1,2), 10);
|
2017-06-14 05:46:39 +02:00
|
|
|
ERROR: set-returning functions must appear at top level of FROM
|
2017-04-18 19:20:59 +02:00
|
|
|
LINE 1: SELECT * FROM int4mul(generate_series(1,2), 10);
|
|
|
|
^
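One way around this restriction is to give the SRF its own top-level FROM item and apply the scalar function to its output in the targetlist. A minimal sketch (the alias g is an illustrative name):

```sql
-- generate_series() is now at the top level of FROM; int4mul() is an
-- ordinary scalar call evaluated once per generated row.
SELECT int4mul(g, 10) FROM generate_series(1,2) AS g;
```

This should produce the same two rows (10 and 20) as the targetlist form SELECT int4mul(generate_series(1,2), 10) shown earlier.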
|
2016-08-04 03:29:42 +02:00
|
|
|
-- DISTINCT ON is evaluated before tSRF evaluation if SRF is not
|
|
|
|
-- referenced either in ORDER BY or in the DISTINCT ON list. The ORDER
|
|
|
|
-- BY reference can be implicitly generated, if there's no other ORDER BY.
|
|
|
|
-- implicit reference (via implicit ORDER) to all columns
|
|
|
|
SELECT DISTINCT ON (a) a, b, generate_series(1,3) g
|
|
|
|
FROM (VALUES (3, 2), (3,1), (1,1), (1,4), (5,3), (5,1)) AS t(a, b);
|
|
|
|
a | b | g
|
|
|
|
---+---+---
|
|
|
|
1 | 1 | 1
|
|
|
|
3 | 2 | 1
|
|
|
|
5 | 3 | 1
|
|
|
|
(3 rows)
|
|
|
|
|
|
|
|
-- unreferenced in DISTINCT ON or ORDER BY
|
|
|
|
SELECT DISTINCT ON (a) a, b, generate_series(1,3) g
|
|
|
|
FROM (VALUES (3, 2), (3,1), (1,1), (1,4), (5,3), (5,1)) AS t(a, b)
|
|
|
|
ORDER BY a, b DESC;
|
|
|
|
a | b | g
|
|
|
|
---+---+---
|
|
|
|
1 | 4 | 1
|
|
|
|
1 | 4 | 2
|
|
|
|
1 | 4 | 3
|
|
|
|
3 | 2 | 1
|
|
|
|
3 | 2 | 2
|
|
|
|
3 | 2 | 3
|
|
|
|
5 | 3 | 1
|
|
|
|
5 | 3 | 2
|
|
|
|
5 | 3 | 3
|
|
|
|
(9 rows)
|
|
|
|
|
|
|
|
-- referenced in ORDER BY
|
|
|
|
SELECT DISTINCT ON (a) a, b, generate_series(1,3) g
|
|
|
|
FROM (VALUES (3, 2), (3,1), (1,1), (1,4), (5,3), (5,1)) AS t(a, b)
|
|
|
|
ORDER BY a, b DESC, g DESC;
|
|
|
|
a | b | g
|
|
|
|
---+---+---
|
|
|
|
1 | 4 | 3
|
|
|
|
3 | 2 | 3
|
|
|
|
5 | 3 | 3
|
|
|
|
(3 rows)
|
|
|
|
|
|
|
|
-- referenced in ORDER BY and DISTINCT ON
|
|
|
|
SELECT DISTINCT ON (a, b, g) a, b, generate_series(1,3) g
|
|
|
|
FROM (VALUES (3, 2), (3,1), (1,1), (1,4), (5,3), (5,1)) AS t(a, b)
|
|
|
|
ORDER BY a, b DESC, g DESC;
|
|
|
|
a | b | g
|
|
|
|
---+---+---
|
|
|
|
1 | 4 | 3
|
|
|
|
1 | 4 | 2
|
|
|
|
1 | 4 | 1
|
|
|
|
1 | 1 | 3
|
|
|
|
1 | 1 | 2
|
|
|
|
1 | 1 | 1
|
|
|
|
3 | 2 | 3
|
|
|
|
3 | 2 | 2
|
|
|
|
3 | 2 | 1
|
|
|
|
3 | 1 | 3
|
|
|
|
3 | 1 | 2
|
|
|
|
3 | 1 | 1
|
|
|
|
5 | 3 | 3
|
|
|
|
5 | 3 | 2
|
|
|
|
5 | 3 | 1
|
|
|
|
5 | 1 | 3
|
|
|
|
5 | 1 | 2
|
|
|
|
5 | 1 | 1
|
|
|
|
(18 rows)
|
|
|
|
|
|
|
|
-- only SRF mentioned in DISTINCT ON
|
|
|
|
SELECT DISTINCT ON (g) a, b, generate_series(1,3) g
|
|
|
|
FROM (VALUES (3, 2), (3,1), (1,1), (1,4), (5,3), (5,1)) AS t(a, b);
|
|
|
|
a | b | g
|
|
|
|
---+---+---
|
|
|
|
3 | 2 | 1
|
|
|
|
5 | 1 | 2
|
|
|
|
3 | 1 | 3
|
|
|
|
(3 rows)
|
|
|
|
|
|
|
|
-- LIMIT / OFFSET is evaluated after SRF evaluation
|
|
|
|
SELECT a, generate_series(1,2) FROM (VALUES(1),(2),(3)) r(a) LIMIT 2 OFFSET 2;
|
|
|
|
a | generate_series
|
|
|
|
---+-----------------
|
|
|
|
2 | 1
|
|
|
|
2 | 2
|
|
|
|
(2 rows)
|
|
|
|
|
|
|
|
-- SRFs are not allowed in LIMIT.
|
|
|
|
SELECT 1 LIMIT generate_series(1,3);
|
2016-09-13 19:54:24 +02:00
|
|
|
ERROR: set-returning functions are not allowed in LIMIT
|
2016-08-04 03:29:42 +02:00
|
|
|
LINE 1: SELECT 1 LIMIT generate_series(1,3);
|
|
|
|
^
|
|
|
|
-- tSRF in correlated subquery, referencing table outside
|
|
|
|
SELECT (SELECT generate_series(1,3) LIMIT 1 OFFSET few.id) FROM few;
|
|
|
|
generate_series
|
|
|
|
-----------------
|
|
|
|
2
|
|
|
|
3
|
|
|
|
|
|
|
|
(3 rows)
|
|
|
|
|
|
|
|
-- tSRF in correlated subquery, referencing SRF outside
|
|
|
|
SELECT (SELECT generate_series(1,3) LIMIT 1 OFFSET g.i) FROM generate_series(0,3) g(i);
|
|
|
|
generate_series
|
|
|
|
-----------------
|
|
|
|
1
|
|
|
|
2
|
|
|
|
3
|
|
|
|
|
|
|
|
(4 rows)
|
|
|
|
|
|
|
|
-- Operators can return sets too
|
|
|
|
CREATE OPERATOR |@| (PROCEDURE = unnest, RIGHTARG = ANYARRAY);
|
|
|
|
SELECT |@|ARRAY[1,2,3];
|
|
|
|
?column?
|
|
|
|
----------
|
|
|
|
1
|
|
|
|
2
|
|
|
|
3
|
|
|
|
(3 rows)
|
|
|
|
|
Fix mishandling of tSRFs at different nesting levels.
Given a targetlist like "srf(x), f(srf(x))", split_pathtarget_at_srfs()
decided that it needed two levels of ProjectSet nodes, failing to notice
that the two SRF calls are textually equal(). Because of that, setrefs.c
would convert the upper ProjectSet's tlist to "Var1, f(Var1)" (where Var1
represents a reference to the srf(x) output of the lower ProjectSet).
This triggered an assertion in nodeProjectSet.c complaining that it found
no SRFs to evaluate, as reported by Erik Rijkers.
What we want in such a case is to evaluate srf(x) only once and use a plain
Result node to compute "Var1, f(Var1)"; that gives results similar to what
previous versions produced, whereas allowing srf(x) to be evaluated again
in an upper ProjectSet would square the number of rows emitted.
Furthermore, even if the SRF calls aren't textually identical, we want them
to be evaluated in lockstep, because that's what happened in the old
implementation. But split_pathtarget_at_srfs() got this completely wrong,
using two levels of ProjectSet for a case like "srf(x), f(srf(y))".
Hence, rewrite split_pathtarget_at_srfs() from the ground up so that it
groups SRFs according to the depth of nesting of SRFs in their arguments.
This is pretty much how we envisioned that working originally, but I blew
it when it came to implementation.
In passing, optimize the case of target == input_target, which I noticed
is not only possible but quite common.
Discussion: https://postgr.es/m/dcbd2853c05d22088766553d60dc78c6@xs4all.nl
2017-02-02 22:38:13 +01:00
|
|
|
-- Some fun cases involving duplicate SRF calls
|
|
|
|
explain (verbose, costs off)
|
|
|
|
select generate_series(1,3) as x, generate_series(1,3) + 1 as xp1;
|
|
|
|
QUERY PLAN
|
|
|
|
------------------------------------------------------------------
|
|
|
|
Result
|
|
|
|
Output: (generate_series(1, 3)), ((generate_series(1, 3)) + 1)
|
|
|
|
-> ProjectSet
|
|
|
|
Output: generate_series(1, 3)
|
|
|
|
-> Result
|
|
|
|
(5 rows)
|
|
|
|
|
|
|
|
select generate_series(1,3) as x, generate_series(1,3) + 1 as xp1;
|
|
|
|
x | xp1
|
|
|
|
---+-----
|
|
|
|
1 | 2
|
|
|
|
2 | 3
|
|
|
|
3 | 4
|
|
|
|
(3 rows)
|
|
|
|
|
|
|
|
explain (verbose, costs off)
|
|
|
|
select generate_series(1,3)+1 order by generate_series(1,3);
|
|
|
|
QUERY PLAN
|
|
|
|
------------------------------------------------------------------------
|
|
|
|
Sort
|
|
|
|
Output: (((generate_series(1, 3)) + 1)), (generate_series(1, 3))
|
|
|
|
Sort Key: (generate_series(1, 3))
|
|
|
|
-> Result
|
|
|
|
Output: ((generate_series(1, 3)) + 1), (generate_series(1, 3))
|
|
|
|
-> ProjectSet
|
|
|
|
Output: generate_series(1, 3)
|
|
|
|
-> Result
|
|
|
|
(8 rows)
|
|
|
|
|
|
|
|
select generate_series(1,3)+1 order by generate_series(1,3);
|
|
|
|
?column?
|
|
|
|
----------
|
|
|
|
2
|
|
|
|
3
|
|
|
|
4
|
|
|
|
(3 rows)
|
|
|
|
|
|
|
|
-- Check that SRFs of same nesting level run in lockstep
|
|
|
|
explain (verbose, costs off)
|
|
|
|
select generate_series(1,3) as x, generate_series(3,6) + 1 as y;
|
|
|
|
QUERY PLAN
|
|
|
|
------------------------------------------------------------------
|
|
|
|
Result
|
|
|
|
Output: (generate_series(1, 3)), ((generate_series(3, 6)) + 1)
|
|
|
|
-> ProjectSet
|
|
|
|
Output: generate_series(1, 3), generate_series(3, 6)
|
|
|
|
-> Result
|
|
|
|
(5 rows)
|
|
|
|
|
|
|
|
select generate_series(1,3) as x, generate_series(3,6) + 1 as y;
|
|
|
|
x | y
|
|
|
|
---+---
|
|
|
|
1 | 4
|
|
|
|
2 | 5
|
|
|
|
3 | 6
|
|
|
|
| 7
|
|
|
|
(4 rows)
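The lockstep behaviour checked above can also be requested explicitly in FROM, which may be easier to reason about than relying on targetlist SRF semantics. The following is a sketch under that assumption; the aliases t, t1, t2, x and y are illustrative names.

```sql
-- Lockstep evaluation: ROWS FROM pairs up the outputs of both functions
-- and pads the shorter one with NULL, giving 4 rows here.
SELECT x, y + 1 AS y
FROM ROWS FROM (generate_series(1,3), generate_series(3,6)) AS t(x, y);

-- For contrast, separate FROM items give the cross product (3 * 4 = 12 rows).
SELECT x, y + 1 AS y
FROM generate_series(1,3) AS t1(x), generate_series(3,6) AS t2(y);
```

The first query should return the same rows as the targetlist version above: (1,4), (2,5), (3,6), and a final row with a NULL x and y = 7.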
|
|
|
|
|
2016-08-04 03:29:42 +02:00
|
|
|
-- Clean up
|
|
|
|
DROP TABLE few;
|
|
|
|
DROP TABLE fewmore;
|