postgresql/src/test/regress/expected/tsrf.out

--
-- tsrf - targetlist set returning function tests
--
-- simple srf
SELECT generate_series(1, 3);
generate_series
-----------------
1
2
3
(3 rows)
-- parallel iteration
SELECT generate_series(1, 3), generate_series(3,5);
generate_series | generate_series
-----------------+-----------------
1 | 3
2 | 4
3 | 5
(3 rows)
-- parallel iteration, different number of rows
SELECT generate_series(1, 2), generate_series(1,4);
generate_series | generate_series
-----------------+-----------------
1 | 1
2 | 2
| 3
| 4
(4 rows)
Move targetlist SRF handling from expression evaluation to new executor node. Evaluation of set returning functions (SRFs) in the targetlist (like SELECT generate_series(1,5)) so far was done in the expression evaluation (i.e. ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code. This meant that most executor nodes performing projection, and most expression evaluation functions, had to deal with the possibility that an evaluated expression could return a set of return values. That's bad because it leads to repeated code in a lot of places. It also, and that's my (Andres's) motivation, made it a lot harder to implement a more efficient way of doing expression evaluation. To fix this, introduce a new executor node (ProjectSet) that can evaluate targetlists containing one or more SRFs. To avoid the complexity of the old way of handling nested expressions returning sets (e.g. having to pass up ExprDoneCond, and dealing with arguments to functions returning sets etc.), those SRFs can only be at the top level of the node's targetlist. The planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is only necessary in ProjectSet nodes and that SRFs are only present at the top level of the node's targetlist. If there are nested SRFs the planner creates multiple stacked ProjectSet nodes. The ProjectSet nodes always get input from an underlying node. We also discussed and prototyped evaluating targetlist SRFs using ROWS FROM(), but that turned out to be more complicated than we'd hoped. While moving SRF evaluation to ProjectSet would allow to retain the old "least common multiple" behavior when multiple SRFs are present in one targetlist (i.e. continue returning rows until all SRFs are at the end of their input at the same time), we decided to instead only return rows till all SRFs are exhausted, returning NULL for already exhausted ones. We deemed the previous behavior to be too confusing, unexpected and actually not particularly useful.
As a side effect, the previously prohibited case of multiple set returning arguments to a function, is now allowed. Not because it's particularly desirable, but because it ends up working and there seems to be no argument for adding code to prohibit it. Currently the behavior for COALESCE and CASE containing SRFs has changed, returning multiple rows from the expression, even when the SRF containing "arm" of the expression is not evaluated. That's because the SRFs are evaluated in a separate ProjectSet node. As that's quite confusing, we're likely to instead prohibit SRFs in those places. But that's still being discussed, and the code would reside in places not touched here, so that's a task for later. There's a lot of, now superfluous, code dealing with set return expressions around. But as the changes to get rid of those are verbose and largely boring, it seems better for readability to keep the cleanup as a separate commit. Author: Tom Lane and Andres Freund Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
2017-01-18 21:46:50 +01:00
-- srf, with SRF argument
SELECT generate_series(1, generate_series(1, 3));
generate_series
-----------------
1
1
2
1
2
3
(6 rows)
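-- Illustrative sketch, not part of the upstream test: the nested call
-- generate_series(1, generate_series(1, 3)) can also be written with the
-- inner SRF in FROM and the outer one as an (implicitly LATERAL) function
-- call that references it. The aliases g1 and g2 are made up for this
-- example; the ORDER BY is only there to make the row order explicit.
SELECT g2 FROM generate_series(1, 3) AS g1, generate_series(1, g1) AS g2 ORDER BY g1, g2;
g2
----
1
1
2
1
2
3
(6 rows)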
-- srf, with two SRF arguments
SELECT generate_series(generate_series(1,3), generate_series(2, 4));
generate_series
-----------------
1
2
2
3
3
4
(6 rows)
Fix mishandling of tSRFs at different nesting levels. Given a targetlist like "srf(x), f(srf(x))", split_pathtarget_at_srfs() decided that it needed two levels of ProjectSet nodes, failing to notice that the two SRF calls are textually equal(). Because of that, setrefs.c would convert the upper ProjectSet's tlist to "Var1, f(Var1)" (where Var1 represents a reference to the srf(x) output of the lower ProjectSet). This triggered an assertion in nodeProjectSet.c complaining that it found no SRFs to evaluate, as reported by Erik Rijkers. What we want in such a case is to evaluate srf(x) only once and use a plain Result node to compute "Var1, f(Var1)"; that gives results similar to what previous versions produced, whereas allowing srf(x) to be evaluated again in an upper ProjectSet would square the number of rows emitted. Furthermore, even if the SRF calls aren't textually identical, we want them to be evaluated in lockstep, because that's what happened in the old implementation. But split_pathtarget_at_srfs() got this completely wrong, using two levels of ProjectSet for a case like "srf(x), f(srf(y))". Hence, rewrite split_pathtarget_at_srfs() from the ground up so that it groups SRFs according to the depth of nesting of SRFs in their arguments. This is pretty much how we envisioned that working originally, but I blew it when it came to implementation. In passing, optimize the case of target == input_target, which I noticed is not only possible but quite common. Discussion: https://postgr.es/m/dcbd2853c05d22088766553d60dc78c6@xs4all.nl
2017-02-02 22:38:13 +01:00
-- check proper nesting of SRFs in different expressions
explain (verbose, costs off)
SELECT generate_series(1, generate_series(1, 3)), generate_series(2, 4);
QUERY PLAN
--------------------------------------------------------------------------------
ProjectSet
  Output: generate_series(1, (generate_series(1, 3))), (generate_series(2, 4))
  ->  ProjectSet
        Output: generate_series(1, 3), generate_series(2, 4)
        ->  Result
(5 rows)
SELECT generate_series(1, generate_series(1, 3)), generate_series(2, 4);
generate_series | generate_series
-----------------+-----------------
1 | 2
1 | 3
2 | 3
1 | 4
2 | 4
3 | 4
(6 rows)
CREATE TABLE few(id int, dataa text, datab text);
INSERT INTO few VALUES(1, 'a', 'foo'),(2, 'a', 'bar'),(3, 'b', 'bar');
-- SRF output order of sorting is maintained, if SRF is not referenced
SELECT few.id, generate_series(1,3) g FROM few ORDER BY id DESC;
id | g
----+---
3 | 1
3 | 2
3 | 3
2 | 1
2 | 2
2 | 3
1 | 1
1 | 2
1 | 3
(9 rows)
-- but SRFs can be referenced in sort
SELECT few.id, generate_series(1,3) g FROM few ORDER BY id, g DESC;
id | g
----+---
1 | 3
1 | 2
1 | 1
2 | 3
2 | 2
2 | 1
3 | 3
3 | 2
3 | 1
(9 rows)
SELECT few.id, generate_series(1,3) g FROM few ORDER BY id, generate_series(1,3) DESC;
id | g
----+---
1 | 3
1 | 2
1 | 1
2 | 3
2 | 2
2 | 1
3 | 3
3 | 2
3 | 1
(9 rows)
-- it's weird to have ORDER BYs that increase the number of results
SELECT few.id FROM few ORDER BY id, generate_series(1,3) DESC;
id
----
1
1
1
2
2
2
3
3
3
(9 rows)
-- SRFs are computed after aggregation
SET enable_hashagg TO 0; -- stable output order
SELECT few.dataa, count(*), min(id), max(id), unnest('{1,1,3}'::int[]) FROM few WHERE few.id = 1 GROUP BY few.dataa;
dataa | count | min | max | unnest
-------+-------+-----+-----+--------
a | 1 | 1 | 1 | 1
a | 1 | 1 | 1 | 1
a | 1 | 1 | 1 | 3
(3 rows)
-- unless referenced in GROUP BY clause
SELECT few.dataa, count(*), min(id), max(id), unnest('{1,1,3}'::int[]) FROM few WHERE few.id = 1 GROUP BY few.dataa, unnest('{1,1,3}'::int[]);
dataa | count | min | max | unnest
-------+-------+-----+-----+--------
a | 2 | 1 | 1 | 1
a | 1 | 1 | 1 | 3
(2 rows)
SELECT few.dataa, count(*), min(id), max(id), unnest('{1,1,3}'::int[]) FROM few WHERE few.id = 1 GROUP BY few.dataa, 5;
dataa | count | min | max | unnest
-------+-------+-----+-----+--------
a | 2 | 1 | 1 | 1
a | 1 | 1 | 1 | 3
(2 rows)
RESET enable_hashagg;
-- check HAVING works when GROUP BY does [not] reference SRF output
SELECT dataa, generate_series(1,1), count(*) FROM few GROUP BY 1 HAVING count(*) > 1;
dataa | generate_series | count
-------+-----------------+-------
a | 1 | 2
(1 row)
SELECT dataa, generate_series(1,1), count(*) FROM few GROUP BY 1, 2 HAVING count(*) > 1;
dataa | generate_series | count
-------+-----------------+-------
a | 1 | 2
(1 row)
-- it's weird to have GROUP BYs that increase the number of results
SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa ORDER BY 2;
dataa | count
-------+-------
a | 2
(1 row)
SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa, unnest('{1,1,3}'::int[]) ORDER BY 2;
dataa | count
-------+-------
a | 2
a | 4
(2 rows)
-- SRFs are not allowed in aggregate arguments
SELECT min(generate_series(1, 3)) FROM few;
ERROR: set-valued function called in context that cannot accept a set
-- SRFs are not allowed in window function arguments, either
SELECT min(generate_series(1, 3)) OVER() FROM few;
ERROR: set-valued function called in context that cannot accept a set
-- SRFs are normally computed after window functions
SELECT id,lag(id) OVER(), count(*) OVER(), generate_series(1,3) FROM few;
id | lag | count | generate_series
----+-----+-------+-----------------
1 | | 3 | 1
1 | | 3 | 2
1 | | 3 | 3
2 | 1 | 3 | 1
2 | 1 | 3 | 2
2 | 1 | 3 | 3
3 | 2 | 3 | 1
3 | 2 | 3 | 2
3 | 2 | 3 | 3
(9 rows)
-- unless referencing SRFs
SELECT SUM(count(*)) OVER(PARTITION BY generate_series(1,3) ORDER BY generate_series(1,3)), generate_series(1,3) g FROM few GROUP BY g;
sum | g
-----+---
3 | 1
3 | 2
3 | 3
(3 rows)
-- sorting + grouping
SELECT few.dataa, count(*), min(id), max(id), generate_series(1,3) FROM few GROUP BY few.dataa ORDER BY 5, 1;
dataa | count | min | max | generate_series
-------+-------+-----+-----+-----------------
a | 2 | 1 | 2 | 1
b | 1 | 3 | 3 | 1
a | 2 | 1 | 2 | 2
b | 1 | 3 | 3 | 2
a | 2 | 1 | 2 | 3
b | 1 | 3 | 3 | 3
(6 rows)
-- grouping sets are a bit special, they produce NULLs in columns not actually NULL
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab);
dataa | b | g | count
-------+-----+---+-------
a | bar | 1 | 1
a | bar | 2 | 1
a | foo | 1 | 1
a | foo | 2 | 1
a | | 1 | 2
a | | 2 | 2
b | bar | 1 | 1
b | bar | 2 | 1
b | | 1 | 1
b | | 2 | 1
| | 1 | 3
| | 2 | 3
| bar | 1 | 2
| bar | 2 | 2
| foo | 1 | 1
| foo | 2 | 1
(16 rows)
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab) ORDER BY dataa;
dataa | b | g | count
-------+-----+---+-------
a | bar | 1 | 1
a | bar | 2 | 1
a | foo | 1 | 1
a | foo | 2 | 1
a | | 1 | 2
a | | 2 | 2
b | bar | 1 | 1
b | bar | 2 | 1
b | | 1 | 1
b | | 2 | 1
| | 1 | 3
| | 2 | 3
| bar | 1 | 2
| bar | 2 | 2
| foo | 1 | 1
| foo | 2 | 1
(16 rows)
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab) ORDER BY g;
dataa | b | g | count
-------+-----+---+-------
a | bar | 1 | 1
a | foo | 1 | 1
a | | 1 | 2
b | bar | 1 | 1
b | | 1 | 1
| | 1 | 3
| bar | 1 | 2
| foo | 1 | 1
| foo | 2 | 1
a | bar | 2 | 1
b | | 2 | 1
a | foo | 2 | 1
| bar | 2 | 2
a | | 2 | 2
| | 2 | 3
b | bar | 2 | 1
(16 rows)
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab, g);
dataa | b | g | count
-------+-----+---+-------
a | bar | 1 | 1
a | bar | 2 | 1
a | bar | | 2
a | foo | 1 | 1
a | foo | 2 | 1
a | foo | | 2
a | | | 4
b | bar | 1 | 1
b | bar | 2 | 1
b | bar | | 2
b | | | 2
| | | 6
a | | 1 | 2
b | | 1 | 1
| | 1 | 3
a | | 2 | 2
b | | 2 | 1
| | 2 | 3
| bar | 1 | 2
| bar | 2 | 2
| bar | | 4
| foo | 1 | 1
| foo | 2 | 1
| foo | | 2
(24 rows)
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab, g) ORDER BY dataa;
dataa | b | g | count
-------+-----+---+-------
a | bar | 1 | 1
a | bar | 2 | 1
a | bar | | 2
a | foo | 1 | 1
a | foo | 2 | 1
a | foo | | 2
a | | | 4
a | | 1 | 2
a | | 2 | 2
b | bar | 2 | 1
b | | | 2
b | | 1 | 1
b | | 2 | 1
b | bar | 1 | 1
b | bar | | 2
| foo | | 2
| foo | 1 | 1
| | 2 | 3
| bar | 1 | 2
| bar | 2 | 2
| | | 6
| foo | 2 | 1
| bar | | 4
| | 1 | 3
(24 rows)
SELECT dataa, datab b, generate_series(1,2) g, count(*) FROM few GROUP BY CUBE(dataa, datab, g) ORDER BY g;
dataa | b | g | count
-------+-----+---+-------
a | bar | 1 | 1
a | foo | 1 | 1
b | bar | 1 | 1
a | | 1 | 2
b | | 1 | 1
| | 1 | 3
| bar | 1 | 2
| foo | 1 | 1
| foo | 2 | 1
| bar | 2 | 2
a | | 2 | 2
b | | 2 | 1
a | bar | 2 | 1
| | 2 | 3
a | foo | 2 | 1
b | bar | 2 | 1
a | foo | | 2
b | bar | | 2
b | | | 2
| | | 6
a | | | 4
| bar | | 4
| foo | | 2
a | bar | | 2
(24 rows)
-- data modification
CREATE TABLE fewmore AS SELECT generate_series(1,3) AS data;
INSERT INTO fewmore VALUES(generate_series(4,5));
SELECT * FROM fewmore;
data
------
1
2
3
4
5
(5 rows)
Improve parser's and planner's handling of set-returning functions. Teach the parser to reject misplaced set-returning functions during parse analysis using p_expr_kind, in much the same way as we do for aggregates and window functions (cf commit eaccfded9). While this isn't complete (it misses nesting-based restrictions), it's much better than the previous error reporting for such cases, and it allows elimination of assorted ad-hoc expression_returns_set() error checks. We could add nesting checks later if it seems important to catch all cases at parse time. There is one case the parser will now throw error for although previous versions allowed it, which is SRFs in the tlist of an UPDATE. That never behaved sensibly (since it's ill-defined which generated row should be used to perform the update) and it's hard to see why it should not be treated as an error. It's a release-note-worthy change though. Also, add a new Query field hasTargetSRFs reporting whether there are any SRFs in the targetlist (including GROUP BY/ORDER BY expressions). The parser can now set that basically for free during parse analysis, and we can use it in a number of places to avoid expression_returns_set searches. (There will be more such checks soon.) In some places, this allows decontorting the logic since it's no longer expensive to check for SRFs in the tlist --- so I made the checks parallel to the handling of hasAggs/hasWindowFuncs wherever it seemed appropriate. catversion bump because adding a Query field changes stored rules. Andres Freund and Tom Lane Discussion: <24639.1473782855@sss.pgh.pa.us>
2016-09-13 19:54:24 +02:00
-- SRFs are not allowed in UPDATE (they once were, but it was nonsense)
UPDATE fewmore SET data = generate_series(4,9);
ERROR: set-returning functions are not allowed in UPDATE
LINE 1: UPDATE fewmore SET data = generate_series(4,9);
                                  ^
-- SRFs are not allowed in RETURNING
INSERT INTO fewmore VALUES(1) RETURNING generate_series(1,3);
ERROR: set-returning functions are not allowed in RETURNING
LINE 1: INSERT INTO fewmore VALUES(1) RETURNING generate_series(1,3)...
                                                ^
-- nor standalone VALUES (but surely this is a bug?)
VALUES(1, generate_series(1,2));
ERROR: set-returning functions are not allowed in VALUES
LINE 1: VALUES(1, generate_series(1,2));
                  ^
-- We allow tSRFs that are not at top level
SELECT int4mul(generate_series(1,2), 10);
int4mul
---------
10
20
(2 rows)
-- but SRFs in function RTEs must be at top level (annoying restriction)
SELECT * FROM int4mul(generate_series(1,2), 10);
ERROR: set-valued function called in context that cannot accept a set
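-- Illustrative sketch, not part of the upstream test: one way around the
-- function-RTE restriction is to put the SRF itself in FROM and apply the
-- scalar function to its output in the targetlist. The alias g is made up
-- for this example.
SELECT int4mul(g, 10) FROM generate_series(1, 2) AS g;
int4mul
---------
10
20
(2 rows)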
-- DISTINCT ON is evaluated before tSRF evaluation if SRF is not
-- referenced either in ORDER BY or in the DISTINCT ON list. The ORDER
-- BY reference can be implicitly generated, if there's no other ORDER BY.
-- implicit reference (via implicit ORDER) to all columns
SELECT DISTINCT ON (a) a, b, generate_series(1,3) g
FROM (VALUES (3, 2), (3,1), (1,1), (1,4), (5,3), (5,1)) AS t(a, b);
a | b | g
---+---+---
1 | 1 | 1
3 | 2 | 1
5 | 3 | 1
(3 rows)
-- unreferenced in DISTINCT ON or ORDER BY
SELECT DISTINCT ON (a) a, b, generate_series(1,3) g
FROM (VALUES (3, 2), (3,1), (1,1), (1,4), (5,3), (5,1)) AS t(a, b)
ORDER BY a, b DESC;
a | b | g
---+---+---
1 | 4 | 1
1 | 4 | 2
1 | 4 | 3
3 | 2 | 1
3 | 2 | 2
3 | 2 | 3
5 | 3 | 1
5 | 3 | 2
5 | 3 | 3
(9 rows)
-- referenced in ORDER BY
SELECT DISTINCT ON (a) a, b, generate_series(1,3) g
FROM (VALUES (3, 2), (3,1), (1,1), (1,4), (5,3), (5,1)) AS t(a, b)
ORDER BY a, b DESC, g DESC;
a | b | g
---+---+---
1 | 4 | 3
3 | 2 | 3
5 | 3 | 3
(3 rows)
-- referenced in ORDER BY and DISTINCT ON
SELECT DISTINCT ON (a, b, g) a, b, generate_series(1,3) g
FROM (VALUES (3, 2), (3,1), (1,1), (1,4), (5,3), (5,1)) AS t(a, b)
ORDER BY a, b DESC, g DESC;
a | b | g
---+---+---
1 | 4 | 3
1 | 4 | 2
1 | 4 | 1
1 | 1 | 3
1 | 1 | 2
1 | 1 | 1
3 | 2 | 3
3 | 2 | 2
3 | 2 | 1
3 | 1 | 3
3 | 1 | 2
3 | 1 | 1
5 | 3 | 3
5 | 3 | 2
5 | 3 | 1
5 | 1 | 3
5 | 1 | 2
5 | 1 | 1
(18 rows)
-- only SRF mentioned in DISTINCT ON
SELECT DISTINCT ON (g) a, b, generate_series(1,3) g
FROM (VALUES (3, 2), (3,1), (1,1), (1,4), (5,3), (5,1)) AS t(a, b);
a | b | g
---+---+---
3 | 2 | 1
5 | 1 | 2
3 | 1 | 3
(3 rows)
-- LIMIT / OFFSET is evaluated after SRF evaluation
SELECT a, generate_series(1,2) FROM (VALUES(1),(2),(3)) r(a) LIMIT 2 OFFSET 2;
a | generate_series
---+-----------------
2 | 1
2 | 2
(2 rows)
-- SRFs are not allowed in LIMIT.
SELECT 1 LIMIT generate_series(1,3);
ERROR: set-returning functions are not allowed in LIMIT
LINE 1: SELECT 1 LIMIT generate_series(1,3);
                       ^
-- tSRF in correlated subquery, referencing table outside
SELECT (SELECT generate_series(1,3) LIMIT 1 OFFSET few.id) FROM few;
generate_series
-----------------
2
3

(3 rows)
-- tSRF in correlated subquery, referencing SRF outside
SELECT (SELECT generate_series(1,3) LIMIT 1 OFFSET g.i) FROM generate_series(0,3) g(i);
generate_series
-----------------
1
2
3

(4 rows)
-- Operators can return sets too
CREATE OPERATOR |@| (PROCEDURE = unnest, RIGHTARG = ANYARRAY);
SELECT |@|ARRAY[1,2,3];
?column?
----------
1
2
3
(3 rows)
-- Some fun cases involving duplicate SRF calls
explain (verbose, costs off)
select generate_series(1,3) as x, generate_series(1,3) + 1 as xp1;
QUERY PLAN
------------------------------------------------------------------
Result
  Output: (generate_series(1, 3)), ((generate_series(1, 3)) + 1)
  ->  ProjectSet
        Output: generate_series(1, 3)
        ->  Result
(5 rows)
select generate_series(1,3) as x, generate_series(1,3) + 1 as xp1;
x | xp1
---+-----
1 | 2
2 | 3
3 | 4
(3 rows)
explain (verbose, costs off)
select generate_series(1,3)+1 order by generate_series(1,3);
QUERY PLAN
------------------------------------------------------------------------
Sort
  Output: (((generate_series(1, 3)) + 1)), (generate_series(1, 3))
  Sort Key: (generate_series(1, 3))
  ->  Result
        Output: ((generate_series(1, 3)) + 1), (generate_series(1, 3))
        ->  ProjectSet
              Output: generate_series(1, 3)
              ->  Result
(8 rows)
select generate_series(1,3)+1 order by generate_series(1,3);
?column?
----------
2
3
4
(3 rows)
-- Check that SRFs of same nesting level run in lockstep
explain (verbose, costs off)
select generate_series(1,3) as x, generate_series(3,6) + 1 as y;
QUERY PLAN
------------------------------------------------------------------
Result
  Output: (generate_series(1, 3)), ((generate_series(3, 6)) + 1)
  ->  ProjectSet
        Output: generate_series(1, 3), generate_series(3, 6)
        ->  Result
(5 rows)
select generate_series(1,3) as x, generate_series(3,6) + 1 as y;
x | y
---+---
1 | 4
2 | 5
3 | 6
| 7
(4 rows)
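-- Illustrative sketch, not part of the upstream test: the lockstep
-- behavior with NULL padding for an exhausted SRF matches what ROWS FROM
-- does for functions in FROM, so the query
--   select generate_series(1,3) as x, generate_series(3,6) + 1 as y
-- should be expressible as follows. The alias t(x, y) is made up for this
-- example.
SELECT x, y + 1 AS y FROM ROWS FROM (generate_series(1, 3), generate_series(3, 6)) AS t(x, y);
x | y
---+---
1 | 4
2 | 5
3 | 6
| 7
(4 rows)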
-- Clean up
DROP TABLE few;
DROP TABLE fewmore;