diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 09427bbed2..255bfddad7 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -998,7 +998,7 @@ SELECT name, listchildren(name) FROM nodes;
- If there is more than one set-returning function in the same select
+ If there is more than one set-returning function in the query's select
  list, the behavior is similar to what you get from putting the functions
  into a single LATERAL ROWS FROM( ... ) FROM-clause item. For each row
  from the underlying query, there is an output row
@@ -1007,21 +1007,53 @@ SELECT name, listchildren(name) FROM nodes;
  produce fewer outputs than others, null values are substituted for the
  missing data, so that the total number of rows emitted for one
  underlying row is the same as for the set-returning function that
- produced the most outputs.
+ produced the most outputs. Thus the set-returning functions
+ run in lockstep until they are all exhausted, and then
+ execution continues with the next underlying row.
  Set-returning functions can be nested in a select list, although that is
  not allowed in FROM-clause items. In such cases, each level of nesting
  is treated separately, as though it were
- another LATERAL ROWS FROM( ... ) item. For example, in
+ a separate LATERAL ROWS FROM( ... ) item. For example, in
-SELECT srf1(srf2(x), srf3(y)), srf4(srf5(z)) FROM ...
+SELECT srf1(srf2(x), srf3(y)), srf4(srf5(z)) FROM tab;
  the set-returning functions srf2, srf3,
- and srf5 would be run in lockstep for each row of the
- underlying query, and then srf1 and srf4 would
- be applied in lockstep to each row produced by the lower functions.
+ and srf5 would be run in lockstep for each row
+ of tab, and then srf1 and srf4
+ would be applied in lockstep to each row produced by the lower
+ functions.
+
+
+
+ This behavior also means that set-returning functions will be evaluated
+ even when it might appear that they should be skipped because of a
+ conditional-evaluation construct, such as CASE
+ or COALESCE. For example, consider
+
+SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
+
+ It might seem that this should produce five repetitions of input
+ rows that have x > 0, and a single repetition of those
+ that do not; but actually it will produce five repetitions of every
+ input row. This is because generate_series() is run first,
+ and then the CASE expression is applied to its result rows.
+ The behavior is thus comparable to
+
+SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END
+ FROM tab, LATERAL generate_series(1,5) AS g;
+
+ It would be exactly the same, except that in this specific example,
+ the planner could choose to put g on the outside of the
+ nestloop join, since g has no actual lateral dependency
+ on tab. That would result in a different output row
+ order. Set-returning functions in the select list are always evaluated
+ as though they are on the inside of a nestloop join with the rest of
+ the FROM clause, so that the function(s) are run to
+ completion before the next row from the FROM clause is
+ considered.
@@ -1043,9 +1075,14 @@ SELECT srf1(srf2(x), srf3(y)), srf4(srf5(z)) FROM ...
  sensibly unless they always produced equal numbers of rows. Otherwise,
  what you got was a number of output rows equal to the least common
  multiple of the numbers of rows produced by the set-returning
- functions. Furthermore, nested set-returning functions did not work at
- all. Use of the LATERAL syntax is recommended when writing
- queries that need to work in older PostgreSQL versions.
+ functions. Also, nested set-returning functions did not work as
+ described above; instead, a set-returning function could have at most
+ one set-returning argument, and each nest of set-returning functions
+ was run independently. The behavior for conditional execution
+ (set-returning functions inside CASE etc) was different too.
+ Use of the LATERAL syntax is recommended when writing
+ queries that need to work in older PostgreSQL versions,
+ because that will give consistent results across different versions.
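
Note on the patched text: the two rules it describes (lockstep evaluation with null padding, and the set-returning function running before CASE is applied) can be illustrated with a small toy model. This is only a sketch in Python, not PostgreSQL code. The helper names `lockstep` and `case_over_srf` are hypothetical, and plain lists stand in for the function result sets and for the rows of `tab`.

```python
from itertools import zip_longest


def lockstep(*srf_results):
    """Combine the result sets of several set-returning functions for ONE
    underlying row: shorter sets are padded with None (standing in for SQL
    NULL), so the row count equals that of the longest set."""
    return list(zip_longest(*srf_results, fillvalue=None))


def case_over_srf(xs):
    """Toy model of
        SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
    where xs plays the role of the x column of tab. The series is produced
    first for every input row; CASE is then applied to each result row."""
    rows = []
    for x in xs:
        for g in range(1, 6):  # generate_series(1, 5) runs regardless of x
            rows.append((x, g if x > 0 else 0))
    return rows


if __name__ == "__main__":
    print(lockstep([1, 2, 3], ["a", "b"]))
    print(case_over_srf([7, -7]))
```

In this model an input row with a non-positive `x` still yields five output rows, each carrying 0 in the second column, which matches the "five repetitions of every input row" statement in the added paragraph.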