2003-01-20 19:55:07 +01:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
|
|
|
* prepjointree.c
|
|
|
|
* Planner preprocessing for subqueries and join tree manipulation.
|
|
|
|
*
|
2003-02-10 00:57:19 +01:00
|
|
|
* NOTE: the intended sequence for invoking these operations is
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* replace_empty_jointree
|
2008-08-14 20:48:00 +02:00
|
|
|
* pull_up_sublinks
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
* preprocess_function_rtes
|
2003-02-10 00:57:19 +01:00
|
|
|
* pull_up_subqueries
|
2010-11-08 21:15:02 +01:00
|
|
|
* flatten_simple_union_all
|
2003-02-10 00:57:19 +01:00
|
|
|
* do expression preprocessing (including flattening JOIN alias vars)
|
|
|
|
* reduce_outer_joins
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* remove_useless_result_rtes
|
2003-02-10 00:57:19 +01:00
|
|
|
*
|
2003-01-20 19:55:07 +01:00
|
|
|
*
|
2024-01-04 02:49:05 +01:00
|
|
|
* Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
|
2003-01-20 19:55:07 +01:00
|
|
|
* Portions Copyright (c) 1994, Regents of the University of California
|
|
|
|
*
|
|
|
|
*
|
|
|
|
* IDENTIFICATION
|
2010-09-20 22:08:53 +02:00
|
|
|
* src/backend/optimizer/prep/prepjointree.c
|
2003-01-20 19:55:07 +01:00
|
|
|
*
|
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
#include "postgres.h"
|
|
|
|
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
#include "catalog/pg_type.h"
|
2019-09-24 18:11:32 +02:00
|
|
|
#include "funcapi.h"
|
2022-12-22 16:35:02 +01:00
|
|
|
#include "miscadmin.h"
|
2006-02-03 22:08:49 +01:00
|
|
|
#include "nodes/makefuncs.h"
|
Invent "multibitmapsets", and use them to speed up antijoin detection.
Implement a data structure that is a List of Bitmapsets, which is
essentially a 2-D boolean array except that the rows need not all
be the same width. Operations such as union and intersection are
meaningful for these, just as they are for Bitmapsets. Eventually
we might build many of the same operations that we have written for
Bitmapsets, but for the first use-case we just need a few.
That first use-case is for antijoin detection: reduce_outer_joins
needs to find the set of Vars that are certain to be non-null in a
successfully joined (not null-extended) left join row, and also
find the set of Vars subject to higher-level IS NULL constraints,
and intersect them. We had been doing this by making Lists of
the Var nodes and then using list_intersect, which works but is
pretty inefficient compared to a bitmapset-like intersection.
Potentially it's O(N^2) if there are a lot of Vars involved,
which fortunately there generally aren't; still it's not great.
Moreover, that method requires the Vars of interest to be exactly
equal() in the join condition and the upper IS NULL condition,
which is problematic for my WIP patch that labels Vars according
to which outer joins have possibly nulled them.
Discussion: https://postgr.es/m/892228.1668437838@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAMbWs4-mvPPCJ1W6iK6dD5HiNwoJdi6mZp=-7mE8N9Sh+cd0tQ@mail.gmail.com
2022-11-16 19:58:42 +01:00
|
|
|
#include "nodes/multibitmapset.h"
|
2008-08-26 00:42:34 +02:00
|
|
|
#include "nodes/nodeFuncs.h"
|
2003-01-20 19:55:07 +01:00
|
|
|
#include "optimizer/clauses.h"
|
2019-01-29 21:48:51 +01:00
|
|
|
#include "optimizer/optimizer.h"
|
2008-10-21 22:42:53 +02:00
|
|
|
#include "optimizer/placeholder.h"
|
2003-01-20 19:55:07 +01:00
|
|
|
#include "optimizer/prep.h"
|
|
|
|
#include "optimizer/subselect.h"
|
2006-02-03 22:08:49 +01:00
|
|
|
#include "optimizer/tlist.h"
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
#include "parser/parse_relation.h"
|
2003-01-20 19:55:07 +01:00
|
|
|
#include "parser/parsetree.h"
|
|
|
|
#include "rewrite/rewriteManip.h"
|
|
|
|
|
|
|
|
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
typedef struct pullup_replace_vars_context
|
|
|
|
{
|
|
|
|
PlannerInfo *root;
|
|
|
|
List *targetlist; /* tlist of subquery being pulled up */
|
|
|
|
RangeTblEntry *target_rte; /* RTE of subquery */
|
2013-08-18 02:22:37 +02:00
|
|
|
Relids relids; /* relids within subquery, as numbered after
|
|
|
|
* pullup (set only if target_rte->lateral) */
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
bool *outer_hasSubLinks; /* -> outer query's hasSubLinks */
|
|
|
|
int varno; /* varno of subquery */
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
bool wrap_non_vars; /* do we need all non-Var outputs to be PHVs? */
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
Node **rv_cache; /* cache for results with PHVs */
|
|
|
|
} pullup_replace_vars_context;
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
typedef struct reduce_outer_joins_pass1_state
|
2003-02-10 00:57:19 +01:00
|
|
|
{
|
|
|
|
Relids relids; /* base relids within this subtree */
|
|
|
|
bool contains_outer; /* does subtree contain outer join(s)? */
|
|
|
|
List *sub_states; /* List of states for subtree components */
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
} reduce_outer_joins_pass1_state;
|
|
|
|
|
|
|
|
typedef struct reduce_outer_joins_pass2_state
|
|
|
|
{
|
|
|
|
Relids inner_reduced; /* OJ relids reduced to plain inner joins */
|
|
|
|
List *partial_reduced; /* List of partially reduced FULL joins */
|
|
|
|
} reduce_outer_joins_pass2_state;
|
|
|
|
|
|
|
|
typedef struct reduce_outer_joins_partial_state
|
|
|
|
{
|
|
|
|
int full_join_rti; /* RT index of a formerly-FULL join */
|
|
|
|
Relids unreduced_side; /* relids in its still-nullable side */
|
|
|
|
} reduce_outer_joins_partial_state;
|
2003-02-10 00:57:19 +01:00
|
|
|
|
2008-08-17 03:20:00 +02:00
|
|
|
static Node *pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
|
|
|
|
Relids *relids);
|
|
|
|
static Node *pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
|
2012-01-28 01:46:41 +01:00
|
|
|
Node **jtlink1, Relids available_rels1,
|
|
|
|
Node **jtlink2, Relids available_rels2);
|
2012-08-12 22:01:26 +02:00
|
|
|
static Node *pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
|
|
|
|
JoinExpr *lowest_outer_join,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
AppendRelInfo *containing_appendrel);
|
2006-02-03 22:08:49 +01:00
|
|
|
static Node *pull_up_simple_subquery(PlannerInfo *root, Node *jtnode,
|
|
|
|
RangeTblEntry *rte,
|
2009-04-28 23:31:16 +02:00
|
|
|
JoinExpr *lowest_outer_join,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
AppendRelInfo *containing_appendrel);
|
2006-02-03 22:08:49 +01:00
|
|
|
static Node *pull_up_simple_union_all(PlannerInfo *root, Node *jtnode,
|
|
|
|
RangeTblEntry *rte);
|
|
|
|
static void pull_up_union_leaf_queries(Node *setOp, PlannerInfo *root,
|
2008-08-14 22:31:29 +02:00
|
|
|
int parentRTindex, Query *setOpQuery,
|
|
|
|
int childRToffset);
|
Remove arbitrary 64K-or-so limit on rangetable size.
Up to now the size of a query's rangetable has been limited by the
constants INNER_VAR et al, which mustn't be equal to any real
rangetable index. 65000 doubtless seemed like enough for anybody,
and it still is orders of magnitude larger than the number of joins
we can realistically handle. However, we need a rangetable entry
for each child partition that is (or might be) processed by a query.
Queries with a few thousand partitions are getting more realistic,
so that the day when that limit becomes a problem is in sight,
even if it's not here yet. Hence, let's raise the limit.
Rather than just increase the values of INNER_VAR et al, this patch
adopts the approach of making them small negative values, so that
rangetables could theoretically become as long as INT_MAX.
The bulk of the patch is concerned with changing Var.varno and some
related variables from "Index" (unsigned int) to plain "int". This
is basically cosmetic, with little actual effect other than to help
debuggers print their values nicely. As such, I've only bothered
with changing places that could actually see INNER_VAR et al, which
the parser and most of the planner don't. We do have to be careful
in places that are performing less/greater comparisons on varnos,
but there are very few such places, other than the IS_SPECIAL_VARNO
macro itself.
A notable side effect of this patch is that while it used to be
possible to add INNER_VAR et al to a Bitmapset, that will now
draw an error. I don't see any likelihood that it wouldn't be a
bug to include these fake varnos in a bitmapset of real varnos,
so I think this is all to the good.
Although this touches outfuncs/readfuncs, I don't think a catversion
bump is required, since stored rules would never contain Vars
with these fake varnos.
Andrey Lepikhov and Tom Lane, after a suggestion by Peter Eisentraut
Discussion: https://postgr.es/m/43c7f2f5-1e27-27aa-8c65-c91859d15190@postgrespro.ru
2021-09-15 20:11:21 +02:00
|
|
|
static void make_setop_translation_list(Query *query, int newvarno,
|
2019-12-03 00:05:29 +01:00
|
|
|
AppendRelInfo *appinfo);
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
static bool is_simple_subquery(PlannerInfo *root, Query *subquery,
|
|
|
|
RangeTblEntry *rte,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
JoinExpr *lowest_outer_join);
|
2015-03-12 04:18:03 +01:00
|
|
|
static Node *pull_up_simple_values(PlannerInfo *root, Node *jtnode,
|
|
|
|
RangeTblEntry *rte);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
static bool is_simple_values(PlannerInfo *root, RangeTblEntry *rte);
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
static Node *pull_up_constant_function(PlannerInfo *root, Node *jtnode,
|
|
|
|
RangeTblEntry *rte,
|
|
|
|
AppendRelInfo *containing_appendrel);
|
2006-02-03 22:08:49 +01:00
|
|
|
static bool is_simple_union_all(Query *subquery);
|
|
|
|
static bool is_simple_union_all_recurse(Node *setOp, Query *setOpQuery,
|
|
|
|
List *colTypes);
|
|
|
|
static bool is_safe_append_member(Query *subquery);
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
static bool jointree_contains_lateral_outer_refs(PlannerInfo *root,
|
|
|
|
Node *jtnode, bool restricted,
|
2013-08-19 19:19:25 +02:00
|
|
|
Relids safe_upper_varnos);
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
static void perform_pullup_replace_vars(PlannerInfo *root,
|
|
|
|
pullup_replace_vars_context *rvcontext,
|
|
|
|
AppendRelInfo *containing_appendrel);
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
static void replace_vars_in_jointree(Node *jtnode,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
pullup_replace_vars_context *context);
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
static Node *pullup_replace_vars(Node *expr,
|
|
|
|
pullup_replace_vars_context *context);
|
|
|
|
static Node *pullup_replace_vars_callback(Var *var,
|
|
|
|
replace_rte_variables_context *context);
|
2012-08-18 20:10:17 +02:00
|
|
|
static Query *pullup_replace_vars_subquery(Query *query,
|
|
|
|
pullup_replace_vars_context *context);
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
static reduce_outer_joins_pass1_state *reduce_outer_joins_pass1(Node *jtnode);
|
2003-02-10 00:57:19 +01:00
|
|
|
static void reduce_outer_joins_pass2(Node *jtnode,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass1_state *state1,
|
|
|
|
reduce_outer_joins_pass2_state *state2,
|
2005-06-06 00:32:58 +02:00
|
|
|
PlannerInfo *root,
|
2008-08-14 20:48:00 +02:00
|
|
|
Relids nonnullable_rels,
|
|
|
|
List *forced_null_vars);
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
static void report_reduced_full_join(reduce_outer_joins_pass2_state *state2,
|
|
|
|
int rtindex, Relids relids);
|
|
|
|
static Node *remove_useless_results_recurse(PlannerInfo *root, Node *jtnode,
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
Node **parent_quals,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
Relids *dropped_outer_joins);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
static int get_result_relid(PlannerInfo *root, Node *jtnode);
|
|
|
|
static void remove_result_refs(PlannerInfo *root, int varno, Node *newjtloc);
|
2019-12-14 19:49:15 +01:00
|
|
|
static bool find_dependent_phvs(PlannerInfo *root, int varno);
|
|
|
|
static bool find_dependent_phvs_in_jointree(PlannerInfo *root,
|
|
|
|
Node *node, int varno);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
static void substitute_phv_relids(Node *node,
|
|
|
|
int varno, Relids subrelids);
|
2022-12-22 17:02:03 +01:00
|
|
|
static void fix_append_rel_relids(PlannerInfo *root, int varno,
|
2006-02-03 22:08:49 +01:00
|
|
|
Relids subrelids);
|
2003-01-20 19:55:07 +01:00
|
|
|
static Node *find_jointree_node_for_rel(Node *jtnode, int relid);
|
|
|
|
|
|
|
|
|
2022-03-28 16:45:58 +02:00
|
|
|
/*
|
|
|
|
* transform_MERGE_to_join
|
|
|
|
* Replace a MERGE's jointree to also include the target relation.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
transform_MERGE_to_join(Query *parse)
|
|
|
|
{
|
|
|
|
RangeTblEntry *joinrte;
|
|
|
|
JoinExpr *joinexpr;
|
2024-03-30 11:00:26 +01:00
|
|
|
bool have_action[3];
|
2022-03-28 16:45:58 +02:00
|
|
|
JoinType jointype;
|
|
|
|
int joinrti;
|
|
|
|
List *vars;
|
2024-03-30 11:00:26 +01:00
|
|
|
RangeTblRef *rtr;
|
2022-03-28 16:45:58 +02:00
|
|
|
|
|
|
|
if (parse->commandType != CMD_MERGE)
|
|
|
|
return;
|
|
|
|
|
|
|
|
/* XXX probably bogus */
|
|
|
|
vars = NIL;
|
|
|
|
|
|
|
|
/*
|
2024-03-30 11:00:26 +01:00
|
|
|
* Work out what kind of join is required. If there any WHEN NOT MATCHED
|
|
|
|
* BY SOURCE/TARGET actions, an outer join is required so that we process
|
|
|
|
* all unmatched tuples from the source and/or target relations.
|
|
|
|
* Otherwise, we can use an inner join.
|
2022-03-28 16:45:58 +02:00
|
|
|
*/
|
2024-03-30 11:00:26 +01:00
|
|
|
have_action[MERGE_WHEN_MATCHED] = false;
|
|
|
|
have_action[MERGE_WHEN_NOT_MATCHED_BY_SOURCE] = false;
|
|
|
|
have_action[MERGE_WHEN_NOT_MATCHED_BY_TARGET] = false;
|
|
|
|
|
|
|
|
foreach_node(MergeAction, action, parse->mergeActionList)
|
|
|
|
{
|
|
|
|
if (action->commandType != CMD_NOTHING)
|
|
|
|
have_action[action->matchKind] = true;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (have_action[MERGE_WHEN_NOT_MATCHED_BY_SOURCE] &&
|
|
|
|
have_action[MERGE_WHEN_NOT_MATCHED_BY_TARGET])
|
|
|
|
jointype = JOIN_FULL;
|
|
|
|
else if (have_action[MERGE_WHEN_NOT_MATCHED_BY_SOURCE])
|
|
|
|
jointype = JOIN_LEFT;
|
|
|
|
else if (have_action[MERGE_WHEN_NOT_MATCHED_BY_TARGET])
|
2022-03-28 16:45:58 +02:00
|
|
|
jointype = JOIN_RIGHT;
|
|
|
|
else
|
|
|
|
jointype = JOIN_INNER;
|
|
|
|
|
|
|
|
/* Manufacture a join RTE to use. */
|
|
|
|
joinrte = makeNode(RangeTblEntry);
|
|
|
|
joinrte->rtekind = RTE_JOIN;
|
|
|
|
joinrte->jointype = jointype;
|
|
|
|
joinrte->joinmergedcols = 0;
|
|
|
|
joinrte->joinaliasvars = vars;
|
|
|
|
joinrte->joinleftcols = NIL; /* MERGE does not allow JOIN USING */
|
|
|
|
joinrte->joinrightcols = NIL; /* ditto */
|
|
|
|
joinrte->join_using_alias = NULL;
|
|
|
|
|
|
|
|
joinrte->alias = NULL;
|
|
|
|
joinrte->eref = makeAlias("*MERGE*", NIL);
|
|
|
|
joinrte->lateral = false;
|
|
|
|
joinrte->inh = false;
|
|
|
|
joinrte->inFromCl = true;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Add completed RTE to pstate's range table list, so that we know its
|
|
|
|
* index.
|
|
|
|
*/
|
|
|
|
parse->rtable = lappend(parse->rtable, joinrte);
|
|
|
|
joinrti = list_length(parse->rtable);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Create a JOIN between the target and the source relation.
|
2024-02-29 16:56:59 +01:00
|
|
|
*
|
|
|
|
* Here the target is identified by parse->mergeTargetRelation. For a
|
|
|
|
* regular table, this will equal parse->resultRelation, but for a
|
|
|
|
* trigger-updatable view, it will be the expanded view subquery that we
|
|
|
|
* need to pull data from.
|
2024-03-30 11:00:26 +01:00
|
|
|
*
|
|
|
|
* The source relation is in parse->jointree->fromlist, but any quals in
|
|
|
|
* parse->jointree->quals are restrictions on the target relation (if the
|
|
|
|
* target relation is an auto-updatable view).
|
2022-03-28 16:45:58 +02:00
|
|
|
*/
|
2024-03-30 11:00:26 +01:00
|
|
|
rtr = makeNode(RangeTblRef);
|
|
|
|
rtr->rtindex = parse->mergeTargetRelation;
|
2022-03-28 16:45:58 +02:00
|
|
|
joinexpr = makeNode(JoinExpr);
|
|
|
|
joinexpr->jointype = jointype;
|
|
|
|
joinexpr->isNatural = false;
|
2024-03-30 11:00:26 +01:00
|
|
|
joinexpr->larg = (Node *) makeFromExpr(list_make1(rtr), parse->jointree->quals);
|
|
|
|
joinexpr->rarg = linitial(parse->jointree->fromlist); /* source rel */
|
2022-03-28 16:45:58 +02:00
|
|
|
joinexpr->usingClause = NIL;
|
|
|
|
joinexpr->join_using_alias = NULL;
|
2024-03-30 11:00:26 +01:00
|
|
|
joinexpr->quals = parse->mergeJoinCondition;
|
2022-03-28 16:45:58 +02:00
|
|
|
joinexpr->alias = NULL;
|
|
|
|
joinexpr->rtindex = joinrti;
|
|
|
|
|
|
|
|
/* Make the new join be the sole entry in the query's jointree */
|
|
|
|
parse->jointree->fromlist = list_make1(joinexpr);
|
|
|
|
parse->jointree->quals = NULL;
|
2024-02-29 16:56:59 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If necessary, mark parse->targetlist entries that refer to the target
|
|
|
|
* as nullable by the join. Normally the targetlist will be empty for a
|
|
|
|
* MERGE, but if the target is a trigger-updatable view, it will contain a
|
|
|
|
* whole-row Var referring to the expanded view query.
|
|
|
|
*/
|
|
|
|
if (parse->targetList != NIL &&
|
|
|
|
(jointype == JOIN_RIGHT || jointype == JOIN_FULL))
|
|
|
|
parse->targetList = (List *)
|
|
|
|
add_nulling_relids((Node *) parse->targetList,
|
|
|
|
bms_make_singleton(parse->mergeTargetRelation),
|
|
|
|
bms_make_singleton(joinrti));
|
2024-03-30 11:00:26 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If there are any WHEN NOT MATCHED BY SOURCE actions, the executor will
|
|
|
|
* use the join condition to distinguish between MATCHED and NOT MATCHED
|
|
|
|
* BY SOURCE cases. Otherwise, it's no longer needed, and we set it to
|
|
|
|
* NULL, saving cycles during planning and execution.
|
|
|
|
*/
|
|
|
|
if (!have_action[MERGE_WHEN_NOT_MATCHED_BY_SOURCE])
|
|
|
|
parse->mergeJoinCondition = NULL;
|
2022-03-28 16:45:58 +02:00
|
|
|
}
|
|
|
|
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
/*
|
|
|
|
* replace_empty_jointree
|
|
|
|
* If the Query's jointree is empty, replace it with a dummy RTE_RESULT
|
|
|
|
* relation.
|
|
|
|
*
|
|
|
|
* By doing this, we can avoid a bunch of corner cases that formerly existed
|
|
|
|
* for SELECTs with omitted FROM clauses. An example is that a subquery
|
|
|
|
* with empty jointree previously could not be pulled up, because that would
|
|
|
|
* have resulted in an empty relid set, making the subquery not uniquely
|
|
|
|
* identifiable for join or PlaceHolderVar processing.
|
|
|
|
*
|
|
|
|
* Unlike most other functions in this file, this function doesn't recurse;
|
|
|
|
* we rely on other processing to invoke it on sub-queries at suitable times.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
replace_empty_jointree(Query *parse)
|
|
|
|
{
|
|
|
|
RangeTblEntry *rte;
|
|
|
|
Index rti;
|
|
|
|
RangeTblRef *rtr;
|
|
|
|
|
|
|
|
/* Nothing to do if jointree is already nonempty */
|
|
|
|
if (parse->jointree->fromlist != NIL)
|
|
|
|
return;
|
|
|
|
|
|
|
|
/* We mustn't change it in the top level of a setop tree, either */
|
|
|
|
if (parse->setOperations)
|
|
|
|
return;
|
|
|
|
|
|
|
|
/* Create suitable RTE */
|
|
|
|
rte = makeNode(RangeTblEntry);
|
|
|
|
rte->rtekind = RTE_RESULT;
|
|
|
|
rte->eref = makeAlias("*RESULT*", NIL);
|
|
|
|
|
|
|
|
/* Add it to rangetable */
|
|
|
|
parse->rtable = lappend(parse->rtable, rte);
|
|
|
|
rti = list_length(parse->rtable);
|
|
|
|
|
|
|
|
/* And jam a reference into the jointree */
|
|
|
|
rtr = makeNode(RangeTblRef);
|
|
|
|
rtr->rtindex = rti;
|
|
|
|
parse->jointree->fromlist = list_make1(rtr);
|
|
|
|
}
|
|
|
|
|
2003-01-20 19:55:07 +01:00
|
|
|
/*
|
2008-08-14 20:48:00 +02:00
|
|
|
* pull_up_sublinks
|
2008-08-17 03:20:00 +02:00
|
|
|
* Attempt to pull up ANY and EXISTS SubLinks to be treated as
|
|
|
|
* semijoins or anti-semijoins.
|
2003-01-20 19:55:07 +01:00
|
|
|
*
|
2008-08-17 03:20:00 +02:00
|
|
|
* A clause "foo op ANY (sub-SELECT)" can be processed by pulling the
|
|
|
|
* sub-SELECT up to become a rangetable entry and treating the implied
|
|
|
|
* comparisons as quals of a semijoin. However, this optimization *only*
|
|
|
|
* works at the top level of WHERE or a JOIN/ON clause, because we cannot
|
|
|
|
* distinguish whether the ANY ought to return FALSE or NULL in cases
|
|
|
|
* involving NULL inputs. Also, in an outer join's ON clause we can only
|
|
|
|
* do this if the sublink is degenerate (ie, references only the nullable
|
2009-02-25 04:30:38 +01:00
|
|
|
* side of the join). In that case it is legal to push the semijoin
|
2008-08-17 03:20:00 +02:00
|
|
|
* down into the nullable side of the join. If the sublink references any
|
|
|
|
* nonnullable-side variables then it would have to be evaluated as part
|
|
|
|
* of the outer join, which makes things way too complicated.
|
|
|
|
*
|
|
|
|
* Under similar conditions, EXISTS and NOT EXISTS clauses can be handled
|
|
|
|
* by pulling up the sub-SELECT and creating a semijoin or anti-semijoin.
|
2008-08-14 20:48:00 +02:00
|
|
|
*
|
|
|
|
* This routine searches for such clauses and does the necessary parsetree
|
|
|
|
* transformations if any are found.
|
2003-01-20 19:55:07 +01:00
|
|
|
*
|
2008-08-17 03:20:00 +02:00
|
|
|
* This routine has to run before preprocess_expression(), so the quals
|
2014-06-16 21:55:05 +02:00
|
|
|
* clauses are not yet reduced to implicit-AND format, and are not guaranteed
|
|
|
|
* to be AND/OR-flat either. That means we need to recursively search through
|
|
|
|
* explicit AND clauses. We stop as soon as we hit a non-AND item.
|
2008-08-17 03:20:00 +02:00
|
|
|
*/
|
|
|
|
void
|
|
|
|
pull_up_sublinks(PlannerInfo *root)
|
|
|
|
{
|
2009-02-25 04:30:38 +01:00
|
|
|
Node *jtnode;
|
2008-08-17 03:20:00 +02:00
|
|
|
Relids relids;
|
|
|
|
|
|
|
|
/* Begin recursion through the jointree */
|
2009-02-25 04:30:38 +01:00
|
|
|
jtnode = pull_up_sublinks_jointree_recurse(root,
|
|
|
|
(Node *) root->parse->jointree,
|
|
|
|
&relids);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* root->parse->jointree must always be a FromExpr, so insert a dummy one
|
|
|
|
* if we got a bare RangeTblRef or JoinExpr out of the recursion.
|
|
|
|
*/
|
|
|
|
if (IsA(jtnode, FromExpr))
|
|
|
|
root->parse->jointree = (FromExpr *) jtnode;
|
|
|
|
else
|
|
|
|
root->parse->jointree = makeFromExpr(list_make1(jtnode), NULL);
|
2008-08-17 03:20:00 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Recurse through jointree nodes for pull_up_sublinks()
|
2003-01-20 19:55:07 +01:00
|
|
|
*
|
2008-08-17 03:20:00 +02:00
|
|
|
* In addition to returning the possibly-modified jointree node, we return
|
|
|
|
* a relids set of the contained rels into *relids.
|
2003-01-20 19:55:07 +01:00
|
|
|
*/
|
2008-08-17 03:20:00 +02:00
|
|
|
static Node *
|
|
|
|
pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
|
|
|
|
Relids *relids)
|
|
|
|
{
|
2022-12-22 16:35:02 +01:00
|
|
|
/* Since this function recurses, it could be driven to stack overflow. */
|
|
|
|
check_stack_depth();
|
|
|
|
|
2008-08-17 03:20:00 +02:00
|
|
|
if (jtnode == NULL)
|
|
|
|
{
|
|
|
|
*relids = NULL;
|
|
|
|
}
|
|
|
|
else if (IsA(jtnode, RangeTblRef))
|
|
|
|
{
|
|
|
|
int varno = ((RangeTblRef *) jtnode)->rtindex;
|
|
|
|
|
|
|
|
*relids = bms_make_singleton(varno);
|
|
|
|
/* jtnode is returned unmodified */
|
|
|
|
}
|
|
|
|
else if (IsA(jtnode, FromExpr))
|
|
|
|
{
|
|
|
|
FromExpr *f = (FromExpr *) jtnode;
|
|
|
|
List *newfromlist = NIL;
|
|
|
|
Relids frelids = NULL;
|
2009-02-25 04:30:38 +01:00
|
|
|
FromExpr *newf;
|
|
|
|
Node *jtlink;
|
2008-08-17 03:20:00 +02:00
|
|
|
ListCell *l;
|
|
|
|
|
|
|
|
/* First, recurse to process children and collect their relids */
|
|
|
|
foreach(l, f->fromlist)
|
|
|
|
{
|
|
|
|
Node *newchild;
|
|
|
|
Relids childrelids;
|
|
|
|
|
|
|
|
newchild = pull_up_sublinks_jointree_recurse(root,
|
|
|
|
lfirst(l),
|
|
|
|
&childrelids);
|
|
|
|
newfromlist = lappend(newfromlist, newchild);
|
|
|
|
frelids = bms_join(frelids, childrelids);
|
|
|
|
}
|
2009-02-25 04:30:38 +01:00
|
|
|
/* Build the replacement FromExpr; no quals yet */
|
|
|
|
newf = makeFromExpr(newfromlist, NULL);
|
|
|
|
/* Set up a link representing the rebuilt jointree */
|
|
|
|
jtlink = (Node *) newf;
|
2008-08-17 03:20:00 +02:00
|
|
|
/* Now process qual --- all children are available for use */
|
2012-01-28 01:46:41 +01:00
|
|
|
newf->quals = pull_up_sublinks_qual_recurse(root, f->quals,
|
|
|
|
&jtlink, frelids,
|
|
|
|
NULL, NULL);
|
2008-08-17 03:20:00 +02:00
|
|
|
|
|
|
|
/*
|
2009-02-25 04:30:38 +01:00
|
|
|
* Note that the result will be either newf, or a stack of JoinExprs
|
|
|
|
* with newf at the base. We rely on subsequent optimization steps to
|
|
|
|
* flatten this and rearrange the joins as needed.
|
|
|
|
*
|
2008-08-17 03:20:00 +02:00
|
|
|
* Although we could include the pulled-up subqueries in the returned
|
|
|
|
* relids, there's no need since upper quals couldn't refer to their
|
|
|
|
* outputs anyway.
|
|
|
|
*/
|
|
|
|
*relids = frelids;
|
2009-02-25 04:30:38 +01:00
|
|
|
jtnode = jtlink;
|
2008-08-17 03:20:00 +02:00
|
|
|
}
|
|
|
|
else if (IsA(jtnode, JoinExpr))
|
|
|
|
{
|
|
|
|
JoinExpr *j;
|
|
|
|
Relids leftrelids;
|
|
|
|
Relids rightrelids;
|
2009-02-25 04:30:38 +01:00
|
|
|
Node *jtlink;
|
2008-08-17 03:20:00 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Make a modifiable copy of join node, but don't bother copying its
|
|
|
|
* subnodes (yet).
|
|
|
|
*/
|
|
|
|
j = (JoinExpr *) palloc(sizeof(JoinExpr));
|
|
|
|
memcpy(j, jtnode, sizeof(JoinExpr));
|
2009-02-25 04:30:38 +01:00
|
|
|
jtlink = (Node *) j;
|
2008-08-17 03:20:00 +02:00
|
|
|
|
|
|
|
/* Recurse to process children and collect their relids */
|
|
|
|
j->larg = pull_up_sublinks_jointree_recurse(root, j->larg,
|
|
|
|
&leftrelids);
|
|
|
|
j->rarg = pull_up_sublinks_jointree_recurse(root, j->rarg,
|
|
|
|
&rightrelids);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Now process qual, showing appropriate child relids as available,
|
2009-02-25 04:30:38 +01:00
|
|
|
* and attach any pulled-up jointree items at the right place. In the
|
|
|
|
* inner-join case we put new JoinExprs above the existing one (much
|
|
|
|
* as for a FromExpr-style join). In outer-join cases the new
|
|
|
|
* JoinExprs must go into the nullable side of the outer join. The
|
|
|
|
* point of the available_rels machinations is to ensure that we only
|
|
|
|
* pull up quals for which that's okay.
|
|
|
|
*
|
2023-04-05 22:59:00 +02:00
|
|
|
* We don't expect to see any pre-existing JOIN_SEMI, JOIN_ANTI, or
|
|
|
|
* JOIN_RIGHT_ANTI jointypes here.
|
2008-08-17 03:20:00 +02:00
|
|
|
*/
|
|
|
|
switch (j->jointype)
|
|
|
|
{
|
|
|
|
case JOIN_INNER:
|
|
|
|
j->quals = pull_up_sublinks_qual_recurse(root, j->quals,
|
2012-01-28 01:46:41 +01:00
|
|
|
&jtlink,
|
2008-08-17 03:20:00 +02:00
|
|
|
bms_union(leftrelids,
|
|
|
|
rightrelids),
|
2012-01-28 01:46:41 +01:00
|
|
|
NULL, NULL);
|
2008-08-17 03:20:00 +02:00
|
|
|
break;
|
|
|
|
case JOIN_LEFT:
|
|
|
|
j->quals = pull_up_sublinks_qual_recurse(root, j->quals,
|
2012-01-28 01:46:41 +01:00
|
|
|
&j->rarg,
|
2008-08-17 03:20:00 +02:00
|
|
|
rightrelids,
|
2012-01-28 01:46:41 +01:00
|
|
|
NULL, NULL);
|
2008-08-17 03:20:00 +02:00
|
|
|
break;
|
|
|
|
case JOIN_FULL:
|
|
|
|
/* can't do anything with full-join quals */
|
|
|
|
break;
|
|
|
|
case JOIN_RIGHT:
|
|
|
|
j->quals = pull_up_sublinks_qual_recurse(root, j->quals,
|
2012-01-28 01:46:41 +01:00
|
|
|
&j->larg,
|
2008-08-17 03:20:00 +02:00
|
|
|
leftrelids,
|
2012-01-28 01:46:41 +01:00
|
|
|
NULL, NULL);
|
2008-08-17 03:20:00 +02:00
|
|
|
break;
|
|
|
|
default:
|
|
|
|
elog(ERROR, "unrecognized join type: %d",
|
|
|
|
(int) j->jointype);
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Although we could include the pulled-up subqueries in the returned
|
|
|
|
* relids, there's no need since upper quals couldn't refer to their
|
|
|
|
* outputs anyway. But we *do* need to include the join's own rtindex
|
|
|
|
* because we haven't yet collapsed join alias variables, so upper
|
|
|
|
* levels would mistakenly think they couldn't use references to this
|
|
|
|
* join.
|
|
|
|
*/
|
2009-02-25 04:30:38 +01:00
|
|
|
*relids = bms_join(leftrelids, rightrelids);
|
|
|
|
if (j->rtindex)
|
|
|
|
*relids = bms_add_member(*relids, j->rtindex);
|
|
|
|
jtnode = jtlink;
|
2008-08-17 03:20:00 +02:00
|
|
|
}
|
|
|
|
else
|
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(jtnode));
|
|
|
|
return jtnode;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Recurse through top-level qual nodes for pull_up_sublinks()
|
|
|
|
*
|
2012-01-28 01:46:41 +01:00
|
|
|
* jtlink1 points to the link in the jointree where any new JoinExprs should
|
|
|
|
* be inserted if they reference available_rels1 (i.e., available_rels1
|
|
|
|
* denotes the relations present underneath jtlink1). Optionally, jtlink2 can
|
|
|
|
* point to a second link where new JoinExprs should be inserted if they
|
|
|
|
* reference available_rels2 (pass NULL for both those arguments if not used).
|
|
|
|
* Note that SubLinks referencing both sets of variables cannot be optimized.
|
|
|
|
* If we find multiple pull-up-able SubLinks, they'll get stacked onto jtlink1
|
|
|
|
* and/or jtlink2 in the order we encounter them. We rely on subsequent
|
|
|
|
* optimization to rearrange the stack if appropriate.
|
|
|
|
*
|
|
|
|
* Returns the replacement qual node, or NULL if the qual should be removed.
|
2008-08-17 03:20:00 +02:00
|
|
|
*/
|
|
|
|
static Node *
|
|
|
|
pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
|
2012-01-28 01:46:41 +01:00
|
|
|
Node **jtlink1, Relids available_rels1,
|
|
|
|
Node **jtlink2, Relids available_rels2)
|
2003-01-20 19:55:07 +01:00
|
|
|
{
|
|
|
|
if (node == NULL)
|
|
|
|
return NULL;
|
|
|
|
if (IsA(node, SubLink))
|
|
|
|
{
|
|
|
|
SubLink *sublink = (SubLink *) node;
|
2009-02-25 04:30:38 +01:00
|
|
|
JoinExpr *j;
|
2011-05-02 21:56:36 +02:00
|
|
|
Relids child_rels;
|
2003-01-20 19:55:07 +01:00
|
|
|
|
2008-08-14 20:48:00 +02:00
|
|
|
/* Is it a convertible ANY or EXISTS clause? */
|
|
|
|
if (sublink->subLinkType == ANY_SUBLINK)
|
|
|
|
{
|
2012-01-28 01:46:41 +01:00
|
|
|
if ((j = convert_ANY_sublink_to_join(root, sublink,
|
|
|
|
available_rels1)) != NULL)
|
|
|
|
{
|
|
|
|
/* Yes; insert the new join node into the join tree */
|
|
|
|
j->larg = *jtlink1;
|
|
|
|
*jtlink1 = (Node *) j;
|
|
|
|
/* Recursively process pulled-up jointree nodes */
|
|
|
|
j->rarg = pull_up_sublinks_jointree_recurse(root,
|
|
|
|
j->rarg,
|
|
|
|
&child_rels);
|
2012-06-10 21:20:04 +02:00
|
|
|
|
2012-01-28 01:46:41 +01:00
|
|
|
/*
|
|
|
|
* Now recursively process the pulled-up quals. Any inserted
|
|
|
|
* joins can get stacked onto either j->larg or j->rarg,
|
|
|
|
* depending on which rels they reference.
|
|
|
|
*/
|
|
|
|
j->quals = pull_up_sublinks_qual_recurse(root,
|
|
|
|
j->quals,
|
|
|
|
&j->larg,
|
|
|
|
available_rels1,
|
|
|
|
&j->rarg,
|
|
|
|
child_rels);
|
|
|
|
/* Return NULL representing constant TRUE */
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
if (available_rels2 != NULL &&
|
|
|
|
(j = convert_ANY_sublink_to_join(root, sublink,
|
|
|
|
available_rels2)) != NULL)
|
2008-08-17 03:20:00 +02:00
|
|
|
{
|
2012-01-28 01:46:41 +01:00
|
|
|
/* Yes; insert the new join node into the join tree */
|
|
|
|
j->larg = *jtlink2;
|
|
|
|
*jtlink2 = (Node *) j;
|
|
|
|
/* Recursively process pulled-up jointree nodes */
|
2011-05-02 21:56:36 +02:00
|
|
|
j->rarg = pull_up_sublinks_jointree_recurse(root,
|
|
|
|
j->rarg,
|
|
|
|
&child_rels);
|
2012-06-10 21:20:04 +02:00
|
|
|
|
2012-01-28 01:46:41 +01:00
|
|
|
/*
|
|
|
|
* Now recursively process the pulled-up quals. Any inserted
|
|
|
|
* joins can get stacked onto either j->larg or j->rarg,
|
|
|
|
* depending on which rels they reference.
|
|
|
|
*/
|
2011-05-02 21:56:36 +02:00
|
|
|
j->quals = pull_up_sublinks_qual_recurse(root,
|
|
|
|
j->quals,
|
2012-01-28 01:46:41 +01:00
|
|
|
&j->larg,
|
|
|
|
available_rels2,
|
|
|
|
&j->rarg,
|
|
|
|
child_rels);
|
|
|
|
/* Return NULL representing constant TRUE */
|
2009-02-25 04:30:38 +01:00
|
|
|
return NULL;
|
2008-08-17 03:20:00 +02:00
|
|
|
}
|
2008-08-14 20:48:00 +02:00
|
|
|
}
|
|
|
|
else if (sublink->subLinkType == EXISTS_SUBLINK)
|
|
|
|
{
|
2012-01-28 01:46:41 +01:00
|
|
|
if ((j = convert_EXISTS_sublink_to_join(root, sublink, false,
|
|
|
|
available_rels1)) != NULL)
|
2008-08-17 03:20:00 +02:00
|
|
|
{
|
2012-01-28 01:46:41 +01:00
|
|
|
/* Yes; insert the new join node into the join tree */
|
|
|
|
j->larg = *jtlink1;
|
|
|
|
*jtlink1 = (Node *) j;
|
|
|
|
/* Recursively process pulled-up jointree nodes */
|
2011-05-02 21:56:36 +02:00
|
|
|
j->rarg = pull_up_sublinks_jointree_recurse(root,
|
|
|
|
j->rarg,
|
|
|
|
&child_rels);
|
2012-06-10 21:20:04 +02:00
|
|
|
|
2012-01-28 01:46:41 +01:00
|
|
|
/*
|
|
|
|
* Now recursively process the pulled-up quals. Any inserted
|
|
|
|
* joins can get stacked onto either j->larg or j->rarg,
|
|
|
|
* depending on which rels they reference.
|
|
|
|
*/
|
2011-05-02 21:56:36 +02:00
|
|
|
j->quals = pull_up_sublinks_qual_recurse(root,
|
|
|
|
j->quals,
|
2012-01-28 01:46:41 +01:00
|
|
|
&j->larg,
|
|
|
|
available_rels1,
|
|
|
|
&j->rarg,
|
|
|
|
child_rels);
|
|
|
|
/* Return NULL representing constant TRUE */
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
if (available_rels2 != NULL &&
|
|
|
|
(j = convert_EXISTS_sublink_to_join(root, sublink, false,
|
|
|
|
available_rels2)) != NULL)
|
|
|
|
{
|
|
|
|
/* Yes; insert the new join node into the join tree */
|
|
|
|
j->larg = *jtlink2;
|
|
|
|
*jtlink2 = (Node *) j;
|
|
|
|
/* Recursively process pulled-up jointree nodes */
|
|
|
|
j->rarg = pull_up_sublinks_jointree_recurse(root,
|
|
|
|
j->rarg,
|
|
|
|
&child_rels);
|
2012-06-10 21:20:04 +02:00
|
|
|
|
2012-01-28 01:46:41 +01:00
|
|
|
/*
|
|
|
|
* Now recursively process the pulled-up quals. Any inserted
|
|
|
|
* joins can get stacked onto either j->larg or j->rarg,
|
|
|
|
* depending on which rels they reference.
|
|
|
|
*/
|
|
|
|
j->quals = pull_up_sublinks_qual_recurse(root,
|
|
|
|
j->quals,
|
|
|
|
&j->larg,
|
|
|
|
available_rels2,
|
|
|
|
&j->rarg,
|
|
|
|
child_rels);
|
|
|
|
/* Return NULL representing constant TRUE */
|
2009-02-25 04:30:38 +01:00
|
|
|
return NULL;
|
2008-08-17 03:20:00 +02:00
|
|
|
}
|
2008-08-14 20:48:00 +02:00
|
|
|
}
|
|
|
|
/* Else return it unmodified */
|
|
|
|
return node;
|
|
|
|
}
|
2019-01-29 21:26:44 +01:00
|
|
|
if (is_notclause(node))
|
2008-08-14 20:48:00 +02:00
|
|
|
{
|
|
|
|
/* If the immediate argument of NOT is EXISTS, try to convert */
|
|
|
|
SubLink *sublink = (SubLink *) get_notclausearg((Expr *) node);
|
2009-02-25 04:30:38 +01:00
|
|
|
JoinExpr *j;
|
2012-01-28 01:46:41 +01:00
|
|
|
Relids child_rels;
|
2008-08-14 20:48:00 +02:00
|
|
|
|
|
|
|
if (sublink && IsA(sublink, SubLink))
|
|
|
|
{
|
|
|
|
if (sublink->subLinkType == EXISTS_SUBLINK)
|
|
|
|
{
|
2012-01-28 01:46:41 +01:00
|
|
|
if ((j = convert_EXISTS_sublink_to_join(root, sublink, true,
|
|
|
|
available_rels1)) != NULL)
|
2008-08-17 03:20:00 +02:00
|
|
|
{
|
2012-01-28 01:46:41 +01:00
|
|
|
/* Yes; insert the new join node into the join tree */
|
|
|
|
j->larg = *jtlink1;
|
|
|
|
*jtlink1 = (Node *) j;
|
|
|
|
/* Recursively process pulled-up jointree nodes */
|
|
|
|
j->rarg = pull_up_sublinks_jointree_recurse(root,
|
|
|
|
j->rarg,
|
|
|
|
&child_rels);
|
2012-06-10 21:20:04 +02:00
|
|
|
|
2011-06-20 20:33:20 +02:00
|
|
|
/*
|
2012-01-28 01:46:41 +01:00
|
|
|
* Now recursively process the pulled-up quals. Because
|
|
|
|
* we are underneath a NOT, we can't pull up sublinks that
|
|
|
|
* reference the left-hand stuff, but it's still okay to
|
|
|
|
* pull up sublinks referencing j->rarg.
|
2011-06-20 20:33:20 +02:00
|
|
|
*/
|
2012-01-28 01:46:41 +01:00
|
|
|
j->quals = pull_up_sublinks_qual_recurse(root,
|
|
|
|
j->quals,
|
|
|
|
&j->rarg,
|
|
|
|
child_rels,
|
|
|
|
NULL, NULL);
|
|
|
|
/* Return NULL representing constant TRUE */
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
if (available_rels2 != NULL &&
|
|
|
|
(j = convert_EXISTS_sublink_to_join(root, sublink, true,
|
|
|
|
available_rels2)) != NULL)
|
|
|
|
{
|
|
|
|
/* Yes; insert the new join node into the join tree */
|
|
|
|
j->larg = *jtlink2;
|
|
|
|
*jtlink2 = (Node *) j;
|
|
|
|
/* Recursively process pulled-up jointree nodes */
|
2011-05-02 21:56:36 +02:00
|
|
|
j->rarg = pull_up_sublinks_jointree_recurse(root,
|
|
|
|
j->rarg,
|
|
|
|
&child_rels);
|
2012-06-10 21:20:04 +02:00
|
|
|
|
2012-01-28 01:46:41 +01:00
|
|
|
/*
|
|
|
|
* Now recursively process the pulled-up quals. Because
|
|
|
|
* we are underneath a NOT, we can't pull up sublinks that
|
|
|
|
* reference the left-hand stuff, but it's still okay to
|
|
|
|
* pull up sublinks referencing j->rarg.
|
|
|
|
*/
|
2011-05-02 21:56:36 +02:00
|
|
|
j->quals = pull_up_sublinks_qual_recurse(root,
|
|
|
|
j->quals,
|
2012-01-28 01:46:41 +01:00
|
|
|
&j->rarg,
|
2011-05-02 21:56:36 +02:00
|
|
|
child_rels,
|
2012-01-28 01:46:41 +01:00
|
|
|
NULL, NULL);
|
|
|
|
/* Return NULL representing constant TRUE */
|
2009-02-25 04:30:38 +01:00
|
|
|
return NULL;
|
2008-08-17 03:20:00 +02:00
|
|
|
}
|
2008-08-14 20:48:00 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
/* Else return it unmodified */
|
|
|
|
return node;
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
2019-01-29 21:26:44 +01:00
|
|
|
if (is_andclause(node))
|
2003-01-20 19:55:07 +01:00
|
|
|
{
|
2008-08-17 03:20:00 +02:00
|
|
|
/* Recurse into AND clause */
|
2003-01-20 19:55:07 +01:00
|
|
|
List *newclauses = NIL;
|
2004-05-26 06:41:50 +02:00
|
|
|
ListCell *l;
|
2003-01-20 19:55:07 +01:00
|
|
|
|
2004-05-26 06:41:50 +02:00
|
|
|
foreach(l, ((BoolExpr *) node)->args)
|
2003-01-20 19:55:07 +01:00
|
|
|
{
|
2004-05-26 06:41:50 +02:00
|
|
|
Node *oldclause = (Node *) lfirst(l);
|
2009-02-25 04:30:38 +01:00
|
|
|
Node *newclause;
|
|
|
|
|
|
|
|
newclause = pull_up_sublinks_qual_recurse(root,
|
|
|
|
oldclause,
|
2012-01-28 01:46:41 +01:00
|
|
|
jtlink1,
|
|
|
|
available_rels1,
|
|
|
|
jtlink2,
|
|
|
|
available_rels2);
|
2009-02-25 04:30:38 +01:00
|
|
|
if (newclause)
|
|
|
|
newclauses = lappend(newclauses, newclause);
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
2009-02-25 04:30:38 +01:00
|
|
|
/* We might have got back fewer clauses than we started with */
|
|
|
|
if (newclauses == NIL)
|
|
|
|
return NULL;
|
|
|
|
else if (list_length(newclauses) == 1)
|
|
|
|
return (Node *) linitial(newclauses);
|
|
|
|
else
|
|
|
|
return (Node *) make_andclause(newclauses);
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
/* Stop if not an AND */
|
|
|
|
return node;
|
|
|
|
}
|
|
|
|
|
2008-03-18 23:04:14 +01:00
|
|
|
/*
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
* preprocess_function_rtes
|
|
|
|
* Constant-simplify any FUNCTION RTEs in the FROM clause, and then
|
|
|
|
* attempt to "inline" any that are set-returning functions.
|
2008-03-18 23:04:14 +01:00
|
|
|
*
|
|
|
|
* If an RTE_FUNCTION rtable entry invokes a set-returning function that
|
|
|
|
* contains just a simple SELECT, we can convert the rtable entry to an
|
|
|
|
* RTE_SUBQUERY entry exposing the SELECT directly. This is especially
|
|
|
|
* useful if the subquery can then be "pulled up" for further optimization,
|
|
|
|
* but we do it even if not, to reduce executor overhead.
|
|
|
|
*
|
|
|
|
* This has to be done before we have started to do any optimization of
|
|
|
|
* subqueries, else any such steps wouldn't get applied to subqueries
|
2008-08-14 20:48:00 +02:00
|
|
|
* obtained via inlining. However, we do it after pull_up_sublinks
|
|
|
|
* so that we can inline any functions used in SubLink subselects.
|
2008-03-18 23:04:14 +01:00
|
|
|
*
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
* The reason for applying const-simplification at this stage is that
|
|
|
|
* (a) we'd need to do it anyway to inline a SRF, and (b) by doing it now,
|
|
|
|
* we can be sure that pull_up_constant_function() will see constants
|
|
|
|
* if there are constants to be seen. This approach also guarantees
|
|
|
|
* that every FUNCTION RTE has been const-simplified, allowing planner.c's
|
|
|
|
* preprocess_expression() to skip doing it again.
|
|
|
|
*
|
2008-03-18 23:04:14 +01:00
|
|
|
* Like most of the planner, this feels free to scribble on its input data
|
|
|
|
* structure.
|
|
|
|
*/
|
|
|
|
void
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
preprocess_function_rtes(PlannerInfo *root)
|
2008-03-18 23:04:14 +01:00
|
|
|
{
|
|
|
|
ListCell *rt;
|
|
|
|
|
|
|
|
foreach(rt, root->parse->rtable)
|
|
|
|
{
|
|
|
|
RangeTblEntry *rte = (RangeTblEntry *) lfirst(rt);
|
|
|
|
|
|
|
|
if (rte->rtekind == RTE_FUNCTION)
|
|
|
|
{
|
|
|
|
Query *funcquery;
|
|
|
|
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
/* Apply const-simplification */
|
|
|
|
rte->functions = (List *)
|
|
|
|
eval_const_expressions(root, (Node *) rte->functions);
|
|
|
|
|
2008-03-18 23:04:14 +01:00
|
|
|
/* Check safety of expansion, and expand if possible */
|
2008-10-09 21:27:40 +02:00
|
|
|
funcquery = inline_set_returning_function(root, rte);
|
2008-03-18 23:04:14 +01:00
|
|
|
if (funcquery)
|
|
|
|
{
|
2018-09-18 21:08:28 +02:00
|
|
|
/* Successful expansion, convert the RTE to a subquery */
|
2008-03-18 23:04:14 +01:00
|
|
|
rte->rtekind = RTE_SUBQUERY;
|
|
|
|
rte->subquery = funcquery;
|
2018-09-18 21:08:28 +02:00
|
|
|
rte->security_barrier = false;
|
|
|
|
/* Clear fields that should not be set in a subquery RTE */
|
Support multi-argument UNNEST(), and TABLE() syntax for multiple functions.
This patch adds the ability to write TABLE( function1(), function2(), ...)
as a single FROM-clause entry. The result is the concatenation of the
first row from each function, followed by the second row from each
function, etc; with NULLs inserted if any function produces fewer rows than
others. This is believed to be a much more useful behavior than what
Postgres currently does with multiple SRFs in a SELECT list.
This syntax also provides a reasonable way to combine use of column
definition lists with WITH ORDINALITY: put the column definition list
inside TABLE(), where it's clear that it doesn't control the ordinality
column as well.
Also implement SQL-compliant multiple-argument UNNEST(), by turning
UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).
The SQL standard specifies TABLE() with only a single function, not
multiple functions, and it seems to require an implicit UNNEST() which is
not what this patch does. There may be something wrong with that reading
of the spec, though, because if it's right then the spec's TABLE() is just
a pointless alternative spelling of UNNEST(). After further review of
that, we might choose to adopt a different syntax for what this patch does,
but in any case this functionality seems clearly worthwhile.
Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
significantly revised by me
2013-11-22 01:37:02 +01:00
|
|
|
rte->functions = NIL;
|
2018-09-18 21:08:28 +02:00
|
|
|
rte->funcordinality = false;
|
2008-03-18 23:04:14 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2003-01-20 19:55:07 +01:00
|
|
|
/*
|
|
|
|
* pull_up_subqueries
|
|
|
|
* Look for subqueries in the rangetable that can be pulled up into
|
|
|
|
* the parent query. If the subquery has no special features like
|
|
|
|
* grouping/aggregation then we can merge it into the parent's jointree.
|
2006-02-03 22:08:49 +01:00
|
|
|
* Also, subqueries that are simple UNION ALL structures can be
|
|
|
|
* converted into "append relations".
|
2012-08-12 22:01:26 +02:00
|
|
|
*/
|
2015-03-12 04:18:03 +01:00
|
|
|
void
|
|
|
|
pull_up_subqueries(PlannerInfo *root)
|
2012-08-12 22:01:26 +02:00
|
|
|
{
|
2015-03-12 04:18:03 +01:00
|
|
|
/* Top level of jointree must always be a FromExpr */
|
|
|
|
Assert(IsA(root->parse->jointree, FromExpr));
|
|
|
|
/* Recursion starts with no containing join nor appendrel */
|
|
|
|
root->parse->jointree = (FromExpr *)
|
|
|
|
pull_up_subqueries_recurse(root, (Node *) root->parse->jointree,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
NULL, NULL);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
/* We should still have a FromExpr */
|
2015-03-12 04:18:03 +01:00
|
|
|
Assert(IsA(root->parse->jointree, FromExpr));
|
2012-08-12 22:01:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* pull_up_subqueries_recurse
|
|
|
|
* Recursive guts of pull_up_subqueries.
|
|
|
|
*
|
2015-03-12 04:18:03 +01:00
|
|
|
* This recursively processes the jointree and returns a modified jointree.
|
|
|
|
*
|
2012-08-12 22:01:26 +02:00
|
|
|
* If this jointree node is within either side of an outer join, then
|
|
|
|
* lowest_outer_join references the lowest such JoinExpr node; otherwise
|
|
|
|
* it is NULL. We use this to constrain the effects of LATERAL subqueries.
|
|
|
|
*
|
2009-04-28 23:31:16 +02:00
|
|
|
* If we are looking at a member subquery of an append relation,
|
|
|
|
* containing_appendrel describes that relation; else it is NULL.
|
|
|
|
* This forces use of the PlaceHolderVar mechanism for all non-Var targetlist
|
|
|
|
* items, and puts some additional restrictions on what can be pulled up.
|
2006-02-03 22:08:49 +01:00
|
|
|
*
|
2003-01-20 19:55:07 +01:00
|
|
|
* A tricky aspect of this code is that if we pull up a subquery we have
|
|
|
|
* to replace Vars that reference the subquery's outputs throughout the
|
|
|
|
* parent query, including quals attached to jointree nodes above the one
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* we are currently processing! We handle this by being careful to maintain
|
|
|
|
* validity of the jointree structure while recursing, in the following sense:
|
|
|
|
* whenever we recurse, all qual expressions in the tree must be reachable
|
|
|
|
* from the top level, in case the recursive call needs to modify them.
|
|
|
|
*
|
|
|
|
* Notice also that we can't turn pullup_replace_vars loose on the whole
|
|
|
|
* jointree, because it'd return a mutated copy of the tree; we have to
|
2015-03-12 04:18:03 +01:00
|
|
|
* invoke it just on the quals, instead. This behavior is what makes it
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* reasonable to pass lowest_outer_join as a pointer rather than some
|
|
|
|
* more-indirect way of identifying the lowest OJ. Likewise, we don't
|
|
|
|
* replace append_rel_list members but only their substructure, so the
|
|
|
|
* containing_appendrel reference is safe to use.
|
2003-01-20 19:55:07 +01:00
|
|
|
*/
|
2012-08-12 22:01:26 +02:00
|
|
|
static Node *
|
|
|
|
pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
|
|
|
|
JoinExpr *lowest_outer_join,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
AppendRelInfo *containing_appendrel)
|
2003-01-20 19:55:07 +01:00
|
|
|
{
|
2022-12-22 16:35:02 +01:00
|
|
|
/* Since this function recurses, it could be driven to stack overflow. */
|
|
|
|
check_stack_depth();
|
|
|
|
/* Also, since it's a bit expensive, let's check for query cancel. */
|
|
|
|
CHECK_FOR_INTERRUPTS();
|
|
|
|
|
2015-03-12 04:18:03 +01:00
|
|
|
Assert(jtnode != NULL);
|
2003-01-20 19:55:07 +01:00
|
|
|
if (IsA(jtnode, RangeTblRef))
|
|
|
|
{
|
|
|
|
int varno = ((RangeTblRef *) jtnode)->rtindex;
|
2006-02-03 22:08:49 +01:00
|
|
|
RangeTblEntry *rte = rt_fetch(varno, root->parse->rtable);
|
2003-01-20 19:55:07 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Is this a subquery RTE, and if so, is the subquery simple enough to
|
2008-10-21 22:42:53 +02:00
|
|
|
* pull up?
|
2006-02-03 22:08:49 +01:00
|
|
|
*
|
|
|
|
* If we are looking at an append-relation member, we can't pull it up
|
|
|
|
* unless is_safe_append_member says so.
|
2003-01-20 19:55:07 +01:00
|
|
|
*/
|
2004-05-11 00:44:49 +02:00
|
|
|
if (rte->rtekind == RTE_SUBQUERY &&
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
is_simple_subquery(root, rte->subquery, rte, lowest_outer_join) &&
|
2009-04-28 23:31:16 +02:00
|
|
|
(containing_appendrel == NULL ||
|
|
|
|
is_safe_append_member(rte->subquery)))
|
2006-02-03 22:08:49 +01:00
|
|
|
return pull_up_simple_subquery(root, jtnode, rte,
|
2009-04-28 23:31:16 +02:00
|
|
|
lowest_outer_join,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
containing_appendrel);
|
2003-01-20 19:55:07 +01:00
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
/*
|
|
|
|
* Alternatively, is it a simple UNION ALL subquery? If so, flatten
|
2008-10-21 22:42:53 +02:00
|
|
|
* into an "append relation".
|
2006-02-03 22:08:49 +01:00
|
|
|
*
|
2008-10-21 22:42:53 +02:00
|
|
|
* It's safe to do this regardless of whether this query is itself an
|
2006-02-03 22:08:49 +01:00
|
|
|
* appendrel member. (If you're thinking we should try to flatten the
|
|
|
|
* two levels of appendrel together, you're right; but we handle that
|
|
|
|
* in set_append_rel_pathlist, not here.)
|
|
|
|
*/
|
|
|
|
if (rte->rtekind == RTE_SUBQUERY &&
|
|
|
|
is_simple_union_all(rte->subquery))
|
|
|
|
return pull_up_simple_union_all(root, jtnode, rte);
|
2008-10-21 22:42:53 +02:00
|
|
|
|
2015-03-12 04:18:03 +01:00
|
|
|
/*
|
|
|
|
* Or perhaps it's a simple VALUES RTE?
|
|
|
|
*
|
|
|
|
* We don't allow VALUES pullup below an outer join nor into an
|
|
|
|
* appendrel (such cases are impossible anyway at the moment).
|
|
|
|
*/
|
|
|
|
if (rte->rtekind == RTE_VALUES &&
|
|
|
|
lowest_outer_join == NULL &&
|
|
|
|
containing_appendrel == NULL &&
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
is_simple_values(root, rte))
|
2015-03-12 04:18:03 +01:00
|
|
|
return pull_up_simple_values(root, jtnode, rte);
|
|
|
|
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
/*
|
|
|
|
* Or perhaps it's a FUNCTION RTE that we could inline?
|
|
|
|
*/
|
|
|
|
if (rte->rtekind == RTE_FUNCTION)
|
|
|
|
return pull_up_constant_function(root, jtnode, rte,
|
|
|
|
containing_appendrel);
|
|
|
|
|
2008-10-21 22:42:53 +02:00
|
|
|
/* Otherwise, do nothing at this node. */
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
else if (IsA(jtnode, FromExpr))
|
|
|
|
{
|
|
|
|
FromExpr *f = (FromExpr *) jtnode;
|
2004-05-26 06:41:50 +02:00
|
|
|
ListCell *l;
|
2003-01-20 19:55:07 +01:00
|
|
|
|
2009-04-28 23:31:16 +02:00
|
|
|
Assert(containing_appendrel == NULL);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
/* Recursively transform all the child nodes */
|
2003-01-20 19:55:07 +01:00
|
|
|
foreach(l, f->fromlist)
|
2015-03-12 04:18:03 +01:00
|
|
|
{
|
2012-08-12 22:01:26 +02:00
|
|
|
lfirst(l) = pull_up_subqueries_recurse(root, lfirst(l),
|
|
|
|
lowest_outer_join,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
NULL);
|
2015-03-12 04:18:03 +01:00
|
|
|
}
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
else if (IsA(jtnode, JoinExpr))
|
|
|
|
{
|
|
|
|
JoinExpr *j = (JoinExpr *) jtnode;
|
|
|
|
|
2009-04-28 23:31:16 +02:00
|
|
|
Assert(containing_appendrel == NULL);
|
2003-01-20 19:55:07 +01:00
|
|
|
/* Recurse, being careful to tell myself when inside outer join */
|
|
|
|
switch (j->jointype)
|
|
|
|
{
|
|
|
|
case JOIN_INNER:
|
2012-08-12 22:01:26 +02:00
|
|
|
j->larg = pull_up_subqueries_recurse(root, j->larg,
|
|
|
|
lowest_outer_join,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
NULL);
|
2012-08-12 22:01:26 +02:00
|
|
|
j->rarg = pull_up_subqueries_recurse(root, j->rarg,
|
|
|
|
lowest_outer_join,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
NULL);
|
2003-01-20 19:55:07 +01:00
|
|
|
break;
|
|
|
|
case JOIN_LEFT:
|
2009-02-25 04:30:38 +01:00
|
|
|
case JOIN_SEMI:
|
|
|
|
case JOIN_ANTI:
|
2012-08-12 22:01:26 +02:00
|
|
|
j->larg = pull_up_subqueries_recurse(root, j->larg,
|
|
|
|
j,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
NULL);
|
2012-08-12 22:01:26 +02:00
|
|
|
j->rarg = pull_up_subqueries_recurse(root, j->rarg,
|
|
|
|
j,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
NULL);
|
2003-01-20 19:55:07 +01:00
|
|
|
break;
|
|
|
|
case JOIN_FULL:
|
2012-08-12 22:01:26 +02:00
|
|
|
j->larg = pull_up_subqueries_recurse(root, j->larg,
|
|
|
|
j,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
NULL);
|
2012-08-12 22:01:26 +02:00
|
|
|
j->rarg = pull_up_subqueries_recurse(root, j->rarg,
|
|
|
|
j,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
NULL);
|
2003-01-20 19:55:07 +01:00
|
|
|
break;
|
|
|
|
case JOIN_RIGHT:
|
2012-08-12 22:01:26 +02:00
|
|
|
j->larg = pull_up_subqueries_recurse(root, j->larg,
|
|
|
|
j,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
NULL);
|
2012-08-12 22:01:26 +02:00
|
|
|
j->rarg = pull_up_subqueries_recurse(root, j->rarg,
|
|
|
|
j,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
NULL);
|
2003-01-20 19:55:07 +01:00
|
|
|
break;
|
|
|
|
default:
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "unrecognized join type: %d",
|
|
|
|
(int) j->jointype);
|
2003-01-20 19:55:07 +01:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(jtnode));
|
2003-01-20 19:55:07 +01:00
|
|
|
return jtnode;
|
|
|
|
}
|
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
/*
|
|
|
|
* pull_up_simple_subquery
|
|
|
|
* Attempt to pull up a single simple subquery.
|
|
|
|
*
|
|
|
|
* jtnode is a RangeTblRef that has been tentatively identified as a simple
|
|
|
|
* subquery by pull_up_subqueries. We return the replacement jointree node,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* or jtnode itself if we determine that the subquery can't be pulled up
|
|
|
|
* after all.
|
2009-04-28 23:31:16 +02:00
|
|
|
*
|
|
|
|
* rte is the RangeTblEntry referenced by jtnode. Remaining parameters are
|
2012-08-12 22:01:26 +02:00
|
|
|
* as for pull_up_subqueries_recurse.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
static Node *
|
|
|
|
pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
|
2009-04-28 23:31:16 +02:00
|
|
|
JoinExpr *lowest_outer_join,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
AppendRelInfo *containing_appendrel)
|
2006-02-03 22:08:49 +01:00
|
|
|
{
|
|
|
|
Query *parse = root->parse;
|
|
|
|
int varno = ((RangeTblRef *) jtnode)->rtindex;
|
|
|
|
Query *subquery;
|
|
|
|
PlannerInfo *subroot;
|
|
|
|
int rtoffset;
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
pullup_replace_vars_context rvcontext;
|
2009-04-28 23:31:16 +02:00
|
|
|
ListCell *lc;
|
2006-02-03 22:08:49 +01:00
|
|
|
|
|
|
|
/*
|
2021-07-06 20:23:09 +02:00
|
|
|
* Make a modifiable copy of the subquery to hack on, so that the RTE will
|
|
|
|
* be left unchanged in case we decide below that we can't pull it up
|
|
|
|
* after all.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
subquery = copyObject(rte->subquery);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Create a PlannerInfo data structure for this subquery.
|
|
|
|
*
|
|
|
|
* NOTE: the next few steps should match the first processing in
|
|
|
|
* subquery_planner(). Can we refactor to avoid code duplication, or
|
|
|
|
* would that just make things uglier?
|
|
|
|
*/
|
|
|
|
subroot = makeNode(PlannerInfo);
|
|
|
|
subroot->parse = subquery;
|
2007-02-19 08:03:34 +01:00
|
|
|
subroot->glob = root->glob;
|
|
|
|
subroot->query_level = root->query_level;
|
2008-10-04 23:56:55 +02:00
|
|
|
subroot->parent_root = root->parent_root;
|
Fix PARAM_EXEC assignment mechanism to be safe in the presence of WITH.
The planner previously assumed that parameter Vars having the same absolute
query level, varno, and varattno could safely be assigned the same runtime
PARAM_EXEC slot, even though they might be different Vars appearing in
different subqueries. This was (probably) safe before the introduction of
CTEs, but the lazy-evalution mechanism used for CTEs means that a CTE can
be executed during execution of some other subquery, causing the lifespan
of Params at the same syntactic nesting level as the CTE to overlap with
use of the same slots inside the CTE. In 9.1 we created additional hazards
by using the same parameter-assignment technology for nestloop inner scan
parameters, but it was broken before that, as illustrated by the added
regression test.
To fix, restructure the planner's management of PlannerParamItems so that
items having different semantic lifespans are kept rigorously separated.
This will probably result in complex queries using more runtime PARAM_EXEC
slots than before, but the slots are cheap enough that this hardly matters.
Also, stop generating PlannerParamItems containing Params for subquery
outputs: all we really need to do is reserve the PARAM_EXEC slot number,
and that now only takes incrementing a counter. The planning code is
simpler and probably faster than before, as well as being more correct.
Per report from Vik Reykja.
These changes will mostly also need to be made in the back branches, but
I'm going to hold off on that until after 9.2.0 wraps.
2012-09-05 18:54:03 +02:00
|
|
|
subroot->plan_params = NIL;
|
2015-08-12 05:48:37 +02:00
|
|
|
subroot->outer_params = NULL;
|
2007-01-20 21:45:41 +01:00
|
|
|
subroot->planner_cxt = CurrentMemoryContext;
|
2007-02-19 08:03:34 +01:00
|
|
|
subroot->init_plans = NIL;
|
2008-10-04 23:56:55 +02:00
|
|
|
subroot->cte_plan_ids = NIL;
|
Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 19:22:25 +02:00
|
|
|
subroot->multiexpr_params = NIL;
|
2023-01-30 19:50:25 +01:00
|
|
|
subroot->join_domains = NIL;
|
2008-03-18 23:04:14 +01:00
|
|
|
subroot->eq_classes = NIL;
|
Speed up finding EquivalenceClasses for a given set of rels
Previously in order to determine which ECs a relation had members in, we
had to loop over all ECs stored in PlannerInfo's eq_classes and check if
ec_relids mentioned the relation. For the most part, this was fine, as
generally, unless queries were fairly complex, the overhead of performing
the lookup would have not been that significant. However, when queries
contained large numbers of joins and ECs, the overhead to find the set of
classes matching a given set of relations could become a significant
portion of the overall planning effort.
Here we allow a much more efficient method to access the ECs which match a
given relation or set of relations. A new Bitmapset field in RelOptInfo
now exists to store the indexes into PlannerInfo's eq_classes list which
each relation is mentioned in. This allows very fast lookups to find all
ECs belonging to a single relation. When we need to lookup ECs belonging
to a given pair of relations, we can simply bitwise-AND the Bitmapsets from
each relation and use the result to perform the lookup.
We also take the opportunity to write a new implementation of
generate_join_implied_equalities which makes use of the new indexes.
generate_join_implied_equalities_for_ecs must remain as is as it can be
given a custom list of ECs, which we can't easily determine the indexes of.
This was originally intended to fix the performance penalty of looking up
foreign keys matching a join condition which was introduced by 100340e2d.
However, we're speeding up much more than just that here.
Author: David Rowley, Tom Lane
Reviewed-by: Tom Lane, Tomas Vondra
Discussion: https://postgr.es/m/6970.1545327857@sss.pgh.pa.us
2019-07-21 07:30:58 +02:00
|
|
|
subroot->ec_merging_done = false;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
subroot->last_rinfo_serial = 0;
|
Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
|
|
|
subroot->all_result_relids = NULL;
|
|
|
|
subroot->leaf_result_relids = NULL;
|
2006-02-03 22:08:49 +01:00
|
|
|
subroot->append_rel_list = NIL;
|
Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
|
|
|
subroot->row_identity_vars = NIL;
|
Re-implement EvalPlanQual processing to improve its performance and eliminate
a lot of strange behaviors that occurred in join cases. We now identify the
"current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the
appropriate row into each scan node in the rechecking plan, forcing it to emit
only that one row. The former behavior could rescan the whole of each joined
relation for each recheck, which was terrible for performance, and what's much
worse could result in duplicated output tuples.
Also, the original implementation of EvalPlanQual could not re-use the recheck
execution tree --- it had to go through a full executor init and shutdown for
every row to be tested. To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param. Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan.
This patch also adds a prohibition on set-returning functions in the
targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the
duplicate-output-tuple problem. It seems fairly reasonable since the
other restrictions on SELECT FOR UPDATE are meant to ensure that there
is a unique correspondence between source tuples and result tuples,
which an output SRF destroys as much as anything else does.
2009-10-26 03:26:45 +01:00
|
|
|
subroot->rowMarks = NIL;
|
Make the upper part of the planner work by generating and comparing Paths.
I've been saying we needed to do this for more than five years, and here it
finally is. This patch removes the ever-growing tangle of spaghetti logic
that grouping_planner() used to use to try to identify the best plan for
post-scan/join query steps. Now, there is (nearly) independent
consideration of each execution step, and entirely separate construction of
Paths to represent each of the possible ways to do that step. We choose
the best Path or set of Paths using the same add_path() logic that's been
used inside query_planner() for years.
In addition, this patch removes the old restriction that subquery_planner()
could return only a single Plan. It now returns a RelOptInfo containing a
set of Paths, just as query_planner() does, and the parent query level can
use each of those Paths as the basis of a SubqueryScanPath at its level.
This allows finding some optimizations that we missed before, wherein a
subquery was capable of returning presorted data and thereby avoiding a
sort in the parent level, making the overall cost cheaper even though
delivering sorted output was not the cheapest plan for the subquery in
isolation. (A couple of regression test outputs change in consequence of
that. However, there is very little change in visible planner behavior
overall, because the point of this patch is not to get immediate planning
benefits but to create the infrastructure for future improvements.)
There is a great deal left to do here. This patch unblocks a lot of
planner work that was basically impractical in the old code structure,
such as allowing FDWs to implement remote aggregation, or rewriting
plan_set_operations() to allow consideration of multiple implementation
orders for set operations. (The latter will likely require a full
rewrite of plan_set_operations(); what I've done here is only to fix it
to return Paths not Plans.) I have also left unfinished some localized
refactoring in createplan.c and planner.c, because it was not necessary
to get this patch to a working state.
Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
2016-03-07 21:58:22 +01:00
|
|
|
memset(subroot->upper_rels, 0, sizeof(subroot->upper_rels));
|
2016-03-15 00:23:29 +01:00
|
|
|
memset(subroot->upper_targets, 0, sizeof(subroot->upper_targets));
|
Remove redundant grouping and DISTINCT columns.
Avoid explicitly grouping by columns that we know are redundant
for sorting, for example we need group by only one of x and y in
SELECT ... WHERE x = y GROUP BY x, y
This comes up more often than you might think, as shown by the
changes in the regression tests. It's nearly free to detect too,
since we are just piggybacking on the existing logic that detects
redundant pathkeys. (In some of the existing plans that change,
it's visible that a sort step preceding the grouping step already
didn't bother to sort by the redundant column, making the old plan
a bit silly-looking.)
To do this, build processed_groupClause and processed_distinctClause
lists that omit any provably-redundant sort items, and consult those
not the originals where relevant. This means that within the
planner, one should usually consult root->processed_groupClause or
root->processed_distinctClause if one wants to know which columns
are to be grouped on; but to check whether grouping or distinct-ing
is happening at all, check non-NIL-ness of parse->groupClause or
parse->distinctClause. This is comparable to longstanding rules
about handling the HAVING clause, so I don't think it'll be a huge
maintenance problem.
nodeAgg.c also needs minor mods, because it's now possible to generate
AGG_PLAIN and AGG_SORTED Agg nodes with zero grouping columns.
Patch by me; thanks to Richard Guo and David Rowley for review.
Discussion: https://postgr.es/m/185315.1672179489@sss.pgh.pa.us
2023-01-18 18:37:57 +01:00
|
|
|
subroot->processed_groupClause = NIL;
|
|
|
|
subroot->processed_distinctClause = NIL;
|
Make the upper part of the planner work by generating and comparing Paths.
I've been saying we needed to do this for more than five years, and here it
finally is. This patch removes the ever-growing tangle of spaghetti logic
that grouping_planner() used to use to try to identify the best plan for
post-scan/join query steps. Now, there is (nearly) independent
consideration of each execution step, and entirely separate construction of
Paths to represent each of the possible ways to do that step. We choose
the best Path or set of Paths using the same add_path() logic that's been
used inside query_planner() for years.
In addition, this patch removes the old restriction that subquery_planner()
could return only a single Plan. It now returns a RelOptInfo containing a
set of Paths, just as query_planner() does, and the parent query level can
use each of those Paths as the basis of a SubqueryScanPath at its level.
This allows finding some optimizations that we missed before, wherein a
subquery was capable of returning presorted data and thereby avoiding a
sort in the parent level, making the overall cost cheaper even though
delivering sorted output was not the cheapest plan for the subquery in
isolation. (A couple of regression test outputs change in consequence of
that. However, there is very little change in visible planner behavior
overall, because the point of this patch is not to get immediate planning
benefits but to create the infrastructure for future improvements.)
There is a great deal left to do here. This patch unblocks a lot of
planner work that was basically impractical in the old code structure,
such as allowing FDWs to implement remote aggregation, or rewriting
plan_set_operations() to allow consideration of multiple implementation
orders for set operations. (The latter will likely require a full
rewrite of plan_set_operations(); what I've done here is only to fix it
to return Paths not Plans.) I have also left unfinished some localized
refactoring in createplan.c and planner.c, because it was not necessary
to get this patch to a working state.
Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
2016-03-07 21:58:22 +01:00
|
|
|
subroot->processed_tlist = NIL;
|
Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
2021-03-31 17:52:34 +02:00
|
|
|
subroot->update_colnos = NIL;
|
Make the upper part of the planner work by generating and comparing Paths.
I've been saying we needed to do this for more than five years, and here it
finally is. This patch removes the ever-growing tangle of spaghetti logic
that grouping_planner() used to use to try to identify the best plan for
post-scan/join query steps. Now, there is (nearly) independent
consideration of each execution step, and entirely separate construction of
Paths to represent each of the possible ways to do that step. We choose
the best Path or set of Paths using the same add_path() logic that's been
used inside query_planner() for years.
In addition, this patch removes the old restriction that subquery_planner()
could return only a single Plan. It now returns a RelOptInfo containing a
set of Paths, just as query_planner() does, and the parent query level can
use each of those Paths as the basis of a SubqueryScanPath at its level.
This allows finding some optimizations that we missed before, wherein a
subquery was capable of returning presorted data and thereby avoiding a
sort in the parent level, making the overall cost cheaper even though
delivering sorted output was not the cheapest plan for the subquery in
isolation. (A couple of regression test outputs change in consequence of
that. However, there is very little change in visible planner behavior
overall, because the point of this patch is not to get immediate planning
benefits but to create the infrastructure for future improvements.)
There is a great deal left to do here. This patch unblocks a lot of
planner work that was basically impractical in the old code structure,
such as allowing FDWs to implement remote aggregation, or rewriting
plan_set_operations() to allow consideration of multiple implementation
orders for set operations. (The latter will likely require a full
rewrite of plan_set_operations(); what I've done here is only to fix it
to return Paths not Plans.) I have also left unfinished some localized
refactoring in createplan.c and planner.c, because it was not necessary
to get this patch to a working state.
Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
2016-03-07 21:58:22 +01:00
|
|
|
subroot->grouping_map = NULL;
|
|
|
|
subroot->minmax_aggs = NIL;
|
Improve RLS planning by marking individual quals with security levels.
In an RLS query, we must ensure that security filter quals are evaluated
before ordinary query quals, in case the latter contain "leaky" functions
that could expose the contents of sensitive rows. The original
implementation of RLS planning ensured this by pushing the scan of a
secured table into a sub-query that it marked as a security-barrier view.
Unfortunately this results in very inefficient plans in many cases, because
the sub-query cannot be flattened and gets planned independently of the
rest of the query.
To fix, drop the use of sub-queries to enforce RLS qual order, and instead
mark each qual (RestrictInfo) with a security_level field establishing its
priority for evaluation. Quals must be evaluated in security_level order,
except that "leakproof" quals can be allowed to go ahead of quals of lower
security_level, if it's helpful to do so. This has to be enforced within
the ordering of any one list of quals to be evaluated at a table scan node,
and we also have to ensure that quals are not chosen for early evaluation
(i.e., use as an index qual or TID scan qual) if they're not allowed to go
ahead of other quals at the scan node.
This is sufficient to fix the problem for RLS quals, since we only support
RLS policies on simple tables and thus RLS quals will always exist at the
table scan level only. Eventually these qual ordering rules should be
enforced for join quals as well, which would permit improving planning for
explicit security-barrier views; but that's a task for another patch.
Note that FDWs would need to be aware of these rules --- and not, for
example, send an insecure qual for remote execution --- but since we do
not yet allow RLS policies on foreign tables, the case doesn't arise.
This will need to be addressed before we can allow such policies.
Patch by me, reviewed by Stephen Frost and Dean Rasheed.
Discussion: https://postgr.es/m/8185.1477432701@sss.pgh.pa.us
2017-01-18 18:58:20 +01:00
|
|
|
subroot->qual_security_level = 0;
|
2022-08-17 21:52:53 +02:00
|
|
|
subroot->placeholdersFrozen = false;
|
2008-10-04 23:56:55 +02:00
|
|
|
subroot->hasRecursion = false;
|
|
|
|
subroot->wt_param_id = -1;
|
Make the upper part of the planner work by generating and comparing Paths.
I've been saying we needed to do this for more than five years, and here it
finally is. This patch removes the ever-growing tangle of spaghetti logic
that grouping_planner() used to use to try to identify the best plan for
post-scan/join query steps. Now, there is (nearly) independent
consideration of each execution step, and entirely separate construction of
Paths to represent each of the possible ways to do that step. We choose
the best Path or set of Paths using the same add_path() logic that's been
used inside query_planner() for years.
In addition, this patch removes the old restriction that subquery_planner()
could return only a single Plan. It now returns a RelOptInfo containing a
set of Paths, just as query_planner() does, and the parent query level can
use each of those Paths as the basis of a SubqueryScanPath at its level.
This allows finding some optimizations that we missed before, wherein a
subquery was capable of returning presorted data and thereby avoiding a
sort in the parent level, making the overall cost cheaper even though
delivering sorted output was not the cheapest plan for the subquery in
isolation. (A couple of regression test outputs change in consequence of
that. However, there is very little change in visible planner behavior
overall, because the point of this patch is not to get immediate planning
benefits but to create the infrastructure for future improvements.)
There is a great deal left to do here. This patch unblocks a lot of
planner work that was basically impractical in the old code structure,
such as allowing FDWs to implement remote aggregation, or rewriting
plan_set_operations() to allow consideration of multiple implementation
orders for set operations. (The latter will likely require a full
rewrite of plan_set_operations(); what I've done here is only to fix it
to return Paths not Plans.) I have also left unfinished some localized
refactoring in createplan.c and planner.c, because it was not necessary
to get this patch to a working state.
Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
2016-03-07 21:58:22 +01:00
|
|
|
subroot->non_recursive_path = NULL;
|
2023-01-30 19:50:25 +01:00
|
|
|
/* We don't currently need a top JoinDomain for the subroot */
|
2008-10-04 23:56:55 +02:00
|
|
|
|
|
|
|
/* No CTEs to worry about */
|
|
|
|
Assert(subquery->cteList == NIL);
|
2006-02-03 22:08:49 +01:00
|
|
|
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
/*
|
|
|
|
* If the FROM clause is empty, replace it with a dummy RTE_RESULT RTE, so
|
|
|
|
* that we don't need so many special cases to deal with that situation.
|
|
|
|
*/
|
|
|
|
replace_empty_jointree(subquery);
|
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
/*
|
2008-08-17 03:20:00 +02:00
|
|
|
* Pull up any SubLinks within the subquery's quals, so that we don't
|
2008-08-14 20:48:00 +02:00
|
|
|
* leave unoptimized SubLinks behind.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
if (subquery->hasSubLinks)
|
2008-08-17 03:20:00 +02:00
|
|
|
pull_up_sublinks(subroot);
|
2006-02-03 22:08:49 +01:00
|
|
|
|
2008-03-18 23:04:14 +01:00
|
|
|
/*
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
* Similarly, preprocess its function RTEs to inline any set-returning
|
|
|
|
* functions in its rangetable.
|
2008-03-18 23:04:14 +01:00
|
|
|
*/
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
preprocess_function_rtes(subroot);
|
2008-03-18 23:04:14 +01:00
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
/*
|
|
|
|
* Recursively pull up the subquery's subqueries, so that
|
|
|
|
* pull_up_subqueries' processing is complete for its jointree and
|
|
|
|
* rangetable.
|
|
|
|
*
|
2015-03-12 04:18:03 +01:00
|
|
|
* Note: it's okay that the subquery's recursion starts with NULL for
|
|
|
|
* containing-join info, even if we are within an outer join in the upper
|
|
|
|
* query; the lower query starts with a clean slate for outer-join
|
|
|
|
* semantics. Likewise, we needn't pass down appendrel state.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
2015-03-12 04:18:03 +01:00
|
|
|
pull_up_subqueries(subroot);
|
2006-02-03 22:08:49 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Now we must recheck whether the subquery is still simple enough to pull
|
|
|
|
* up. If not, abandon processing it.
|
|
|
|
*
|
|
|
|
* We don't really need to recheck all the conditions involved, but it's
|
|
|
|
* easier just to keep this "if" looking the same as the one in
|
2012-08-12 22:01:26 +02:00
|
|
|
* pull_up_subqueries_recurse.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
if (is_simple_subquery(root, subquery, rte, lowest_outer_join) &&
|
2009-04-28 23:31:16 +02:00
|
|
|
(containing_appendrel == NULL || is_safe_append_member(subquery)))
|
2006-02-03 22:08:49 +01:00
|
|
|
{
|
|
|
|
/* good to go */
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Give up, return unmodified RangeTblRef.
|
|
|
|
*
|
|
|
|
* Note: The work we just did will be redone when the subquery gets
|
|
|
|
* planned on its own. Perhaps we could avoid that by storing the
|
|
|
|
* modified subquery back into the rangetable, but I'm not gonna risk
|
|
|
|
* it now.
|
|
|
|
*/
|
|
|
|
return jtnode;
|
|
|
|
}
|
|
|
|
|
Flatten join alias Vars before pulling up targetlist items from a subquery.
pullup_replace_vars()'s decisions about whether a pulled-up replacement
expression needs to be wrapped in a PlaceHolderVar depend on the assumption
that what looks like a Var behaves like a Var. However, if the Var is a
join alias reference, later flattening of join aliases might replace the
Var with something that's not a Var at all, and should have been wrapped.
To fix, do a forcible pass of flatten_join_alias_vars() on the subquery
targetlist before we start to copy items out of it. We'll re-run that
processing on the pulled-up expressions later, but that's harmless.
Per report from Ken Tanzer; the added regression test case is based on his
example. This bug has been there since the PlaceHolderVar mechanism was
invented, but has escaped detection because the circumstances that trigger
it are fairly narrow. You need a flattenable query underneath an outer
join, which contains another flattenable query inside a join of its own,
with a dangerous expression (a constant or something else non-strict)
in that one's targetlist.
Having seen this, I'm wondering if it wouldn't be prudent to do all
alias-variable flattening earlier, perhaps even in the rewriter.
But that would probably not be a back-patchable change.
2013-11-22 20:37:21 +01:00
|
|
|
/*
|
|
|
|
* We must flatten any join alias Vars in the subquery's targetlist,
|
|
|
|
* because pulling up the subquery's subqueries might have changed their
|
|
|
|
* expansions into arbitrary expressions, which could affect
|
|
|
|
* pullup_replace_vars' decisions about whether PlaceHolderVar wrappers
|
|
|
|
* are needed for tlist entries. (Likely it'd be better to do
|
|
|
|
* flatten_join_alias_vars on the whole query tree at some earlier stage,
|
|
|
|
* maybe even in the rewriter; but for now let's just fix this case here.)
|
|
|
|
*/
|
|
|
|
subquery->targetList = (List *)
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
flatten_join_alias_vars(subroot, subroot->parse,
|
|
|
|
(Node *) subquery->targetList);
|
Flatten join alias Vars before pulling up targetlist items from a subquery.
pullup_replace_vars()'s decisions about whether a pulled-up replacement
expression needs to be wrapped in a PlaceHolderVar depend on the assumption
that what looks like a Var behaves like a Var. However, if the Var is a
join alias reference, later flattening of join aliases might replace the
Var with something that's not a Var at all, and should have been wrapped.
To fix, do a forcible pass of flatten_join_alias_vars() on the subquery
targetlist before we start to copy items out of it. We'll re-run that
processing on the pulled-up expressions later, but that's harmless.
Per report from Ken Tanzer; the added regression test case is based on his
example. This bug has been there since the PlaceHolderVar mechanism was
invented, but has escaped detection because the circumstances that trigger
it are fairly narrow. You need a flattenable query underneath an outer
join, which contains another flattenable query inside a join of its own,
with a dangerous expression (a constant or something else non-strict)
in that one's targetlist.
Having seen this, I'm wondering if it wouldn't be prudent to do all
alias-variable flattening earlier, perhaps even in the rewriter.
But that would probably not be a back-patchable change.
2013-11-22 20:37:21 +01:00
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
/*
|
|
|
|
* Adjust level-0 varnos in subquery so that we can append its rangetable
|
2008-08-14 20:48:00 +02:00
|
|
|
* to upper query's. We have to fix the subquery's append_rel_list as
|
2008-10-22 22:17:52 +02:00
|
|
|
* well.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
rtoffset = list_length(parse->rtable);
|
|
|
|
OffsetVarNodes((Node *) subquery, rtoffset, 0);
|
|
|
|
OffsetVarNodes((Node *) subroot->append_rel_list, rtoffset, 0);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Upper-level vars in subquery are now one level closer to their parent
|
|
|
|
* than before.
|
|
|
|
*/
|
|
|
|
IncrementVarSublevelsUp((Node *) subquery, -1, 1);
|
|
|
|
IncrementVarSublevelsUp((Node *) subroot->append_rel_list, -1, 1);
|
2008-10-21 22:42:53 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* The subquery's targetlist items are now in the appropriate form to
|
2018-01-12 18:24:50 +01:00
|
|
|
* insert into the top query, except that we may need to wrap them in
|
|
|
|
* PlaceHolderVars. Set up required context data for pullup_replace_vars.
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* (Note that we should include the subquery's inner joins in relids,
|
|
|
|
* since it may include join alias vars referencing them.)
|
2008-10-21 22:42:53 +02:00
|
|
|
*/
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
rvcontext.root = root;
|
|
|
|
rvcontext.targetlist = subquery->targetList;
|
|
|
|
rvcontext.target_rte = rte;
|
2013-08-18 02:22:37 +02:00
|
|
|
if (rte->lateral)
|
|
|
|
rvcontext.relids = get_relids_in_jointree((Node *) subquery->jointree,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
true, true);
|
2013-08-18 02:22:37 +02:00
|
|
|
else /* won't need relids */
|
|
|
|
rvcontext.relids = NULL;
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
rvcontext.outer_hasSubLinks = &parse->hasSubLinks;
|
|
|
|
rvcontext.varno = varno;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* this flag will be set below, if needed */
|
2018-01-12 18:24:50 +01:00
|
|
|
rvcontext.wrap_non_vars = false;
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
/* initialize cache array with indexes 0 .. length(tlist) */
|
|
|
|
rvcontext.rv_cache = palloc0((list_length(subquery->targetList) + 1) *
|
|
|
|
sizeof(Node *));
|
2006-02-03 22:08:49 +01:00
|
|
|
|
2018-01-12 18:24:50 +01:00
|
|
|
/*
|
|
|
|
* If we are dealing with an appendrel member then anything that's not a
|
|
|
|
* simple Var has to be turned into a PlaceHolderVar. We force this to
|
|
|
|
* ensure that what we pull up doesn't get merged into a surrounding
|
|
|
|
* expression during later processing and then fail to match the
|
|
|
|
* expression actually available from the appendrel.
|
|
|
|
*/
|
|
|
|
if (containing_appendrel != NULL)
|
|
|
|
rvcontext.wrap_non_vars = true;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the parent query uses grouping sets, we need a PlaceHolderVar for
|
|
|
|
* anything that's not a simple Var. Again, this ensures that expressions
|
|
|
|
* retain their separate identity so that they will match grouping set
|
|
|
|
* columns when appropriate. (It'd be sufficient to wrap values used in
|
|
|
|
* grouping set columns, and do so only in non-aggregated portions of the
|
|
|
|
* tlist and havingQual, but that would require a lot of infrastructure
|
|
|
|
* that pullup_replace_vars hasn't currently got.)
|
|
|
|
*/
|
|
|
|
if (parse->groupingSets)
|
|
|
|
rvcontext.wrap_non_vars = true;
|
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
/*
|
|
|
|
* Replace all of the top query's references to the subquery's outputs
|
|
|
|
* with copies of the adjusted subtlist items, being careful not to
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
* replace any of the jointree structure.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
perform_pullup_replace_vars(root, &rvcontext,
|
|
|
|
containing_appendrel);
|
2006-02-03 22:08:49 +01:00
|
|
|
|
2012-08-12 22:01:26 +02:00
|
|
|
/*
|
|
|
|
* If the subquery had a LATERAL marker, propagate that to any of its
|
|
|
|
* child RTEs that could possibly now contain lateral cross-references.
|
|
|
|
* The children might or might not contain any actual lateral
|
|
|
|
* cross-references, but we have to mark the pulled-up child RTEs so that
|
|
|
|
* later planner stages will check for such.
|
|
|
|
*/
|
|
|
|
if (rte->lateral)
|
|
|
|
{
|
|
|
|
foreach(lc, subquery->rtable)
|
|
|
|
{
|
|
|
|
RangeTblEntry *child_rte = (RangeTblEntry *) lfirst(lc);
|
|
|
|
|
|
|
|
switch (child_rte->rtekind)
|
|
|
|
{
|
Redesign tablesample method API, and do extensive code review.
The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM. (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.) Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type. This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.
Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.
Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.
Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).
Improve EXPLAIN's output of tablesample plans, too.
Back-patch to 9.5 so that we don't have to support the original API
in production.
2015-07-25 20:39:00 +02:00
|
|
|
case RTE_RELATION:
|
|
|
|
if (child_rte->tablesample)
|
|
|
|
child_rte->lateral = true;
|
|
|
|
break;
|
2012-08-12 22:01:26 +02:00
|
|
|
case RTE_SUBQUERY:
|
|
|
|
case RTE_FUNCTION:
|
|
|
|
case RTE_VALUES:
|
2017-03-08 16:39:37 +01:00
|
|
|
case RTE_TABLEFUNC:
|
2012-08-12 22:01:26 +02:00
|
|
|
child_rte->lateral = true;
|
|
|
|
break;
|
|
|
|
case RTE_JOIN:
|
|
|
|
case RTE_CTE:
|
2017-04-01 06:17:18 +02:00
|
|
|
case RTE_NAMEDTUPLESTORE:
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
case RTE_RESULT:
|
2012-08-12 22:01:26 +02:00
|
|
|
/* these can't contain any lateral references */
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
/*
|
2022-12-06 16:09:24 +01:00
|
|
|
* Now append the adjusted rtable entries and their perminfos to upper
|
|
|
|
* query. (We hold off until after fixing the upper rtable entries; no
|
|
|
|
* point in running that code on the subquery ones too.)
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
2022-12-06 16:09:24 +01:00
|
|
|
CombineRangeTables(&parse->rtable, &parse->rteperminfos,
|
|
|
|
subquery->rtable, subquery->rteperminfos);
|
2006-02-03 22:08:49 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Pull up any FOR UPDATE/SHARE markers, too. (OffsetVarNodes already
|
2006-04-30 20:30:40 +02:00
|
|
|
* adjusted the marker rtindexes, so just concat the lists.)
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
parse->rowMarks = list_concat(parse->rowMarks, subquery->rowMarks);
|
|
|
|
|
|
|
|
/*
|
2009-02-25 04:30:38 +01:00
|
|
|
* We also have to fix the relid sets of any PlaceHolderVar nodes in the
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
* parent query. (This could perhaps be done by pullup_replace_vars(),
|
|
|
|
* but it seems cleaner to use two passes.) Note in particular that any
|
|
|
|
* PlaceHolderVar nodes just created by pullup_replace_vars() will be
|
2009-02-25 04:30:38 +01:00
|
|
|
* adjusted, so having created them with the subquery's varno is correct.
|
2006-02-03 22:08:49 +01:00
|
|
|
*
|
2008-11-11 20:05:21 +01:00
|
|
|
* Likewise, relids appearing in AppendRelInfo nodes have to be fixed. We
|
|
|
|
* already checked that this won't require introducing multiple subrelids
|
|
|
|
* into the single-slot AppendRelInfo structs.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
2022-12-22 17:02:03 +01:00
|
|
|
if (root->glob->lastPHId != 0 || root->append_rel_list)
|
2006-02-03 22:08:49 +01:00
|
|
|
{
|
|
|
|
Relids subrelids;
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
subrelids = get_relids_in_jointree((Node *) subquery->jointree,
|
|
|
|
true, false);
|
2022-12-22 17:02:03 +01:00
|
|
|
if (root->glob->lastPHId != 0)
|
|
|
|
substitute_phv_relids((Node *) parse, varno, subrelids);
|
|
|
|
fix_append_rel_relids(root, varno, subrelids);
|
2006-02-03 22:08:49 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2008-10-22 22:17:52 +02:00
|
|
|
* And now add subquery's AppendRelInfos to our list.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
root->append_rel_list = list_concat(root->append_rel_list,
|
|
|
|
subroot->append_rel_list);
|
|
|
|
|
|
|
|
/*
|
2015-12-11 21:52:16 +01:00
|
|
|
* We don't have to do the equivalent bookkeeping for outer-join info,
|
|
|
|
* because that hasn't been set up yet. placeholder_list likewise.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
2008-08-14 20:48:00 +02:00
|
|
|
Assert(root->join_info_list == NIL);
|
|
|
|
Assert(subroot->join_info_list == NIL);
|
2008-10-22 22:17:52 +02:00
|
|
|
Assert(root->placeholder_list == NIL);
|
|
|
|
Assert(subroot->placeholder_list == NIL);
|
2006-02-03 22:08:49 +01:00
|
|
|
|
2021-07-06 20:23:09 +02:00
|
|
|
/*
|
|
|
|
* We no longer need the RTE's copy of the subquery's query tree. Getting
|
|
|
|
* rid of it saves nothing in particular so far as this level of query is
|
|
|
|
* concerned; but if this query level is in turn pulled up into a parent,
|
|
|
|
* we'd waste cycles copying the now-unused query tree.
|
|
|
|
*/
|
|
|
|
rte->subquery = NULL;
|
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
/*
|
|
|
|
* Miscellaneous housekeeping.
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
*
|
|
|
|
* Although replace_rte_variables() faithfully updated parse->hasSubLinks
|
|
|
|
* if it copied any SubLinks out of the subquery's targetlist, we still
|
|
|
|
* could have SubLinks added to the query in the expressions of FUNCTION
|
|
|
|
* and VALUES RTEs copied up from the subquery. So it's necessary to copy
|
|
|
|
* subquery->hasSubLinks anyway. Perhaps this can be improved someday.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
parse->hasSubLinks |= subquery->hasSubLinks;
|
2009-06-11 16:49:15 +02:00
|
|
|
|
2016-11-10 22:16:33 +01:00
|
|
|
/* If subquery had any RLS conditions, now main query does too */
|
|
|
|
parse->hasRowSecurity |= subquery->hasRowSecurity;
|
|
|
|
|
2008-12-28 19:54:01 +01:00
|
|
|
/*
|
2016-09-13 19:54:24 +02:00
|
|
|
* subquery won't be pulled up if it hasAggs, hasWindowFuncs, or
|
|
|
|
* hasTargetSRFs, so no work needed on those flags
|
2008-12-28 19:54:01 +01:00
|
|
|
*/
|
2006-02-03 22:08:49 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Return the adjusted subquery jointree to replace the RangeTblRef entry
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* in parent's jointree; or, if the FromExpr is degenerate, just return
|
|
|
|
* its single member.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
Assert(IsA(subquery->jointree, FromExpr));
|
|
|
|
Assert(subquery->jointree->fromlist != NIL);
|
|
|
|
if (subquery->jointree->quals == NULL &&
|
|
|
|
list_length(subquery->jointree->fromlist) == 1)
|
|
|
|
return (Node *) linitial(subquery->jointree->fromlist);
|
2015-03-12 04:18:03 +01:00
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
return (Node *) subquery->jointree;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* pull_up_simple_union_all
|
|
|
|
* Pull up a single simple UNION ALL subquery.
|
|
|
|
*
|
|
|
|
* jtnode is a RangeTblRef that has been identified as a simple UNION ALL
|
|
|
|
* subquery by pull_up_subqueries. We pull up the leaf subqueries and
|
|
|
|
* build an "append relation" for the union set. The result value is just
|
|
|
|
* jtnode, since we don't actually need to change the query jointree.
|
|
|
|
*/
|
|
|
|
static Node *
|
|
|
|
pull_up_simple_union_all(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte)
|
|
|
|
{
|
|
|
|
int varno = ((RangeTblRef *) jtnode)->rtindex;
|
|
|
|
Query *subquery = rte->subquery;
|
2012-08-12 00:42:20 +02:00
|
|
|
int rtoffset = list_length(root->parse->rtable);
|
2008-08-14 22:31:29 +02:00
|
|
|
List *rtable;
|
|
|
|
|
|
|
|
/*
|
2012-08-12 00:42:20 +02:00
|
|
|
* Make a modifiable copy of the subquery's rtable, so we can adjust
|
|
|
|
* upper-level Vars in it. There are no such Vars in the setOperations
|
|
|
|
* tree proper, so fixing the rtable should be sufficient.
|
|
|
|
*/
|
|
|
|
rtable = copyObject(subquery->rtable);
|
|
|
|
|
|
|
|
/*
|
2008-08-14 22:31:29 +02:00
|
|
|
* Upper-level vars in subquery are now one level closer to their parent
|
|
|
|
* than before. We don't have to worry about offsetting varnos, though,
|
2012-08-12 00:42:20 +02:00
|
|
|
* because the UNION leaf queries can't cross-reference each other.
|
2008-08-14 22:31:29 +02:00
|
|
|
*/
|
|
|
|
IncrementVarSublevelsUp_rtable(rtable, -1, 1);
|
2012-08-12 00:42:20 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If the UNION ALL subquery had a LATERAL marker, propagate that to all
|
|
|
|
* its children. The individual children might or might not contain any
|
|
|
|
* actual lateral cross-references, but we have to mark the pulled-up
|
|
|
|
* child RTEs so that later planner stages will check for such.
|
|
|
|
*/
|
|
|
|
if (rte->lateral)
|
|
|
|
{
|
|
|
|
ListCell *rt;
|
|
|
|
|
|
|
|
foreach(rt, rtable)
|
|
|
|
{
|
|
|
|
RangeTblEntry *child_rte = (RangeTblEntry *) lfirst(rt);
|
|
|
|
|
|
|
|
Assert(child_rte->rtekind == RTE_SUBQUERY);
|
|
|
|
child_rte->lateral = true;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2022-12-06 16:09:24 +01:00
|
|
|
* Append child RTEs (and their perminfos) to parent rtable.
|
2012-08-12 00:42:20 +02:00
|
|
|
*/
|
2022-12-06 16:09:24 +01:00
|
|
|
CombineRangeTables(&root->parse->rtable, &root->parse->rteperminfos,
|
|
|
|
rtable, subquery->rteperminfos);
|
2006-02-03 22:08:49 +01:00
|
|
|
|
|
|
|
/*
|
2008-08-14 22:31:29 +02:00
|
|
|
* Recursively scan the subquery's setOperations tree and add
|
|
|
|
* AppendRelInfo nodes for leaf subqueries to the parent's
|
2010-11-08 21:15:02 +01:00
|
|
|
* append_rel_list. Also apply pull_up_subqueries to the leaf subqueries.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
Assert(subquery->setOperations);
|
2008-08-14 22:31:29 +02:00
|
|
|
pull_up_union_leaf_queries(subquery->setOperations, root, varno, subquery,
|
|
|
|
rtoffset);
|
2006-02-03 22:08:49 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Mark the parent as an append relation.
|
|
|
|
*/
|
|
|
|
rte->inh = true;
|
|
|
|
|
|
|
|
return jtnode;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* pull_up_union_leaf_queries -- recursive guts of pull_up_simple_union_all
|
|
|
|
*
|
2010-11-08 21:15:02 +01:00
|
|
|
* Build an AppendRelInfo for each leaf query in the setop tree, and then
|
|
|
|
* apply pull_up_subqueries to the leaf query.
|
|
|
|
*
|
|
|
|
* Note that setOpQuery is the Query containing the setOp node, whose tlist
|
|
|
|
* contains references to all the setop output columns. When called from
|
|
|
|
* pull_up_simple_union_all, this is *not* the same as root->parse, which is
|
|
|
|
* the parent Query we are pulling up into.
|
2008-08-14 22:31:29 +02:00
|
|
|
*
|
2006-02-03 22:08:49 +01:00
|
|
|
* parentRTindex is the appendrel parent's index in root->parse->rtable.
|
2008-08-14 22:31:29 +02:00
|
|
|
*
|
2010-11-08 21:15:02 +01:00
|
|
|
* The child RTEs have already been copied to the parent. childRToffset
|
|
|
|
* tells us where in the parent's range table they were copied. When called
|
|
|
|
* from flatten_simple_union_all, childRToffset is 0 since the child RTEs
|
|
|
|
* were already in root->parse->rtable and no RT index adjustment is needed.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
static void
|
|
|
|
pull_up_union_leaf_queries(Node *setOp, PlannerInfo *root, int parentRTindex,
|
2008-08-14 22:31:29 +02:00
|
|
|
Query *setOpQuery, int childRToffset)
|
2006-02-03 22:08:49 +01:00
|
|
|
{
|
|
|
|
if (IsA(setOp, RangeTblRef))
|
|
|
|
{
|
|
|
|
RangeTblRef *rtr = (RangeTblRef *) setOp;
|
|
|
|
int childRTindex;
|
|
|
|
AppendRelInfo *appinfo;
|
|
|
|
|
|
|
|
/*
|
2008-08-14 22:31:29 +02:00
|
|
|
* Calculate the index in the parent's range table
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
2008-08-14 22:31:29 +02:00
|
|
|
childRTindex = childRToffset + rtr->rtindex;
|
2006-02-03 22:08:49 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Build a suitable AppendRelInfo, and attach to parent's list.
|
|
|
|
*/
|
|
|
|
appinfo = makeNode(AppendRelInfo);
|
|
|
|
appinfo->parent_relid = parentRTindex;
|
|
|
|
appinfo->child_relid = childRTindex;
|
|
|
|
appinfo->parent_reltype = InvalidOid;
|
|
|
|
appinfo->child_reltype = InvalidOid;
|
2019-12-03 00:05:29 +01:00
|
|
|
make_setop_translation_list(setOpQuery, childRTindex, appinfo);
|
2006-02-03 22:08:49 +01:00
|
|
|
appinfo->parent_reloid = InvalidOid;
|
|
|
|
root->append_rel_list = lappend(root->append_rel_list, appinfo);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Recursively apply pull_up_subqueries to the new child RTE. (We
|
2022-12-22 17:02:03 +01:00
|
|
|
* must build the AppendRelInfo first, because this will modify it;
|
|
|
|
* indeed, that's the only part of the upper query where Vars
|
|
|
|
* referencing childRTindex can exist at this point.)
|
|
|
|
*
|
2009-04-28 23:31:16 +02:00
|
|
|
* Note that we can pass NULL for containing-join info even if we're
|
2006-02-03 22:08:49 +01:00
|
|
|
* actually under an outer join, because the child's expressions
|
2015-03-12 04:18:03 +01:00
|
|
|
* aren't going to propagate up to the join. Also, we ignore the
|
|
|
|
* possibility that pull_up_subqueries_recurse() returns a different
|
|
|
|
* jointree node than what we pass it; if it does, the important thing
|
|
|
|
* is that it replaced the child relid in the AppendRelInfo node.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
rtr = makeNode(RangeTblRef);
|
|
|
|
rtr->rtindex = childRTindex;
|
2012-08-12 22:01:26 +02:00
|
|
|
(void) pull_up_subqueries_recurse(root, (Node *) rtr,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
NULL, appinfo);
|
2006-02-03 22:08:49 +01:00
|
|
|
}
|
|
|
|
else if (IsA(setOp, SetOperationStmt))
|
|
|
|
{
|
|
|
|
SetOperationStmt *op = (SetOperationStmt *) setOp;
|
|
|
|
|
|
|
|
/* Recurse to reach leaf queries */
|
2008-08-14 22:31:29 +02:00
|
|
|
pull_up_union_leaf_queries(op->larg, root, parentRTindex, setOpQuery,
|
|
|
|
childRToffset);
|
|
|
|
pull_up_union_leaf_queries(op->rarg, root, parentRTindex, setOpQuery,
|
|
|
|
childRToffset);
|
2006-02-03 22:08:49 +01:00
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(setOp));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2008-11-11 19:13:32 +01:00
|
|
|
* make_setop_translation_list
|
|
|
|
* Build the list of translations from parent Vars to child Vars for
|
|
|
|
* a UNION ALL member. (At this point it's just a simple list of
|
|
|
|
* referencing Vars, but if we succeed in pulling up the member
|
|
|
|
* subquery, the Vars will get replaced by pulled-up expressions.)
|
2019-12-03 00:05:29 +01:00
|
|
|
* Also create the rather trivial reverse-translation array.
|
2006-02-03 22:08:49 +01:00
|
|
|
*/
|
|
|
|
static void
|
Remove arbitrary 64K-or-so limit on rangetable size.
Up to now the size of a query's rangetable has been limited by the
constants INNER_VAR et al, which mustn't be equal to any real
rangetable index. 65000 doubtless seemed like enough for anybody,
and it still is orders of magnitude larger than the number of joins
we can realistically handle. However, we need a rangetable entry
for each child partition that is (or might be) processed by a query.
Queries with a few thousand partitions are getting more realistic,
so that the day when that limit becomes a problem is in sight,
even if it's not here yet. Hence, let's raise the limit.
Rather than just increase the values of INNER_VAR et al, this patch
adopts the approach of making them small negative values, so that
rangetables could theoretically become as long as INT_MAX.
The bulk of the patch is concerned with changing Var.varno and some
related variables from "Index" (unsigned int) to plain "int". This
is basically cosmetic, with little actual effect other than to help
debuggers print their values nicely. As such, I've only bothered
with changing places that could actually see INNER_VAR et al, which
the parser and most of the planner don't. We do have to be careful
in places that are performing less/greater comparisons on varnos,
but there are very few such places, other than the IS_SPECIAL_VARNO
macro itself.
A notable side effect of this patch is that while it used to be
possible to add INNER_VAR et al to a Bitmapset, that will now
draw an error. I don't see any likelihood that it wouldn't be a
bug to include these fake varnos in a bitmapset of real varnos,
so I think this is all to the good.
Although this touches outfuncs/readfuncs, I don't think a catversion
bump is required, since stored rules would never contain Vars
with these fake varnos.
Andrey Lepikhov and Tom Lane, after a suggestion by Peter Eisentraut
Discussion: https://postgr.es/m/43c7f2f5-1e27-27aa-8c65-c91859d15190@postgrespro.ru
2021-09-15 20:11:21 +02:00
|
|
|
make_setop_translation_list(Query *query, int newvarno,
|
2019-12-03 00:05:29 +01:00
|
|
|
AppendRelInfo *appinfo)
|
2006-02-03 22:08:49 +01:00
|
|
|
{
|
|
|
|
List *vars = NIL;
|
2019-12-03 00:05:29 +01:00
|
|
|
AttrNumber *pcolnos;
|
2006-02-03 22:08:49 +01:00
|
|
|
ListCell *l;
|
|
|
|
|
2019-12-03 00:05:29 +01:00
|
|
|
/* Initialize reverse-translation array with all entries zero */
|
|
|
|
/* (entries for resjunk columns will stay that way) */
|
|
|
|
appinfo->num_child_cols = list_length(query->targetList);
|
|
|
|
appinfo->parent_colnos = pcolnos =
|
|
|
|
(AttrNumber *) palloc0(appinfo->num_child_cols * sizeof(AttrNumber));
|
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
foreach(l, query->targetList)
|
|
|
|
{
|
|
|
|
TargetEntry *tle = (TargetEntry *) lfirst(l);
|
|
|
|
|
|
|
|
if (tle->resjunk)
|
|
|
|
continue;
|
|
|
|
|
2010-08-27 22:30:08 +02:00
|
|
|
vars = lappend(vars, makeVarFromTargetEntry(newvarno, tle));
|
2019-12-03 00:05:29 +01:00
|
|
|
pcolnos[tle->resno - 1] = tle->resno;
|
2006-02-03 22:08:49 +01:00
|
|
|
}
|
|
|
|
|
2019-12-03 00:05:29 +01:00
|
|
|
appinfo->translated_vars = vars;
|
2006-02-03 22:08:49 +01:00
|
|
|
}
|
|
|
|
|
2003-01-20 19:55:07 +01:00
|
|
|
/*
|
|
|
|
* is_simple_subquery
|
|
|
|
* Check a subquery in the range table to see if it's simple enough
|
|
|
|
* to pull up into the parent query.
|
2012-08-12 22:01:26 +02:00
|
|
|
*
|
|
|
|
* rte is the RTE_SUBQUERY RangeTblEntry that contained the subquery.
|
|
|
|
* (Note subquery is not necessarily equal to rte->subquery; it could be a
|
|
|
|
* processed copy of that.)
|
|
|
|
* lowest_outer_join is the lowest outer join above the subquery, or NULL.
|
2003-01-20 19:55:07 +01:00
|
|
|
*/
|
|
|
|
static bool
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
is_simple_subquery(PlannerInfo *root, Query *subquery, RangeTblEntry *rte,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
JoinExpr *lowest_outer_join)
|
2003-01-20 19:55:07 +01:00
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Let's just make sure it's a valid subselect ...
|
|
|
|
*/
|
|
|
|
if (!IsA(subquery, Query) ||
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
subquery->commandType != CMD_SELECT)
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "subquery is bogus");
|
2003-01-20 19:55:07 +01:00
|
|
|
|
|
|
|
/*
|
2006-02-03 22:08:49 +01:00
|
|
|
* Can't currently pull up a query with setops (unless it's simple UNION
|
|
|
|
* ALL, which is handled by a different code path). Maybe after querytree
|
2003-01-20 19:55:07 +01:00
|
|
|
* redesign...
|
|
|
|
*/
|
|
|
|
if (subquery->setOperations)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
/*
|
2016-09-13 19:54:24 +02:00
|
|
|
* Can't pull up a subquery involving grouping, aggregation, SRFs,
|
|
|
|
* sorting, limiting, or WITH. (XXX WITH could possibly be allowed later)
|
2009-10-28 15:55:47 +01:00
|
|
|
*
|
|
|
|
* We also don't pull up a subquery that has explicit FOR UPDATE/SHARE
|
|
|
|
* clauses, because pullup would cause the locking to occur semantically
|
|
|
|
* higher than it should. Implicit FOR UPDATE/SHARE is okay because in
|
|
|
|
* that case the locking was originally declared in the upper query
|
|
|
|
* anyway.
|
2003-01-20 19:55:07 +01:00
|
|
|
*/
|
|
|
|
if (subquery->hasAggs ||
|
2008-12-28 19:54:01 +01:00
|
|
|
subquery->hasWindowFuncs ||
|
2016-09-13 19:54:24 +02:00
|
|
|
subquery->hasTargetSRFs ||
|
2003-01-20 19:55:07 +01:00
|
|
|
subquery->groupClause ||
|
Support GROUPING SETS, CUBE and ROLLUP.
This SQL standard functionality allows to aggregate data by different
GROUP BY clauses at once. Each grouping set returns rows with columns
grouped by in other sets set to NULL.
This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.
The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step; thus avoiding repeated scans of the source data.
The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.
Instead of, as in an earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage. The optimizer still builds Sort/Agg node to describe each
phase, but they're not part of the plan tree, but instead additional
data for the aggregation node. They're a convenient and preexisting way
to describe aggregation and sorting. The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and easily allows to
avoid the sorting step if the underlying data is sorted by other means.
A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension it has not been renamed, neither has cube been made a
reserved keyword. Instead precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.
Needs a catversion bump because stored rules may change.
Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
|
|
|
subquery->groupingSets ||
|
2003-01-20 19:55:07 +01:00
|
|
|
subquery->havingQual ||
|
|
|
|
subquery->sortClause ||
|
|
|
|
subquery->distinctClause ||
|
|
|
|
subquery->limitOffset ||
|
2008-10-04 23:56:55 +02:00
|
|
|
subquery->limitCount ||
|
2009-10-28 15:55:47 +01:00
|
|
|
subquery->hasForUpdate ||
|
2008-10-04 23:56:55 +02:00
|
|
|
subquery->cteList)
|
2003-01-20 19:55:07 +01:00
|
|
|
return false;
|
|
|
|
|
2012-08-08 01:02:54 +02:00
|
|
|
/*
|
2012-08-12 22:01:26 +02:00
|
|
|
* Don't pull up if the RTE represents a security-barrier view; we
|
|
|
|
* couldn't prevent information leakage once the RTE's Vars are scattered
|
|
|
|
* about in the upper query.
|
2012-08-08 01:02:54 +02:00
|
|
|
*/
|
2012-08-12 22:01:26 +02:00
|
|
|
if (rte->security_barrier)
|
2012-08-08 01:02:54 +02:00
|
|
|
return false;
|
|
|
|
|
2012-08-12 22:01:26 +02:00
|
|
|
/*
|
2013-08-19 19:19:25 +02:00
|
|
|
* If the subquery is LATERAL, check for pullup restrictions from that.
|
2012-08-12 22:01:26 +02:00
|
|
|
*/
|
2013-08-19 19:19:25 +02:00
|
|
|
if (rte->lateral)
|
2012-08-12 22:01:26 +02:00
|
|
|
{
|
2013-08-19 19:19:25 +02:00
|
|
|
bool restricted;
|
|
|
|
Relids safe_upper_varnos;
|
2012-08-12 22:01:26 +02:00
|
|
|
|
2013-08-19 19:19:25 +02:00
|
|
|
/*
|
|
|
|
* The subquery's WHERE and JOIN/ON quals mustn't contain any lateral
|
|
|
|
* references to rels outside a higher outer join (including the case
|
|
|
|
* where the outer join is within the subquery itself). In such a
|
|
|
|
* case, pulling up would result in a situation where we need to
|
|
|
|
* postpone quals from below an outer join to above it, which is
|
|
|
|
* probably completely wrong and in any case is a complication that
|
|
|
|
* doesn't seem worth addressing at the moment.
|
|
|
|
*/
|
|
|
|
if (lowest_outer_join != NULL)
|
|
|
|
{
|
|
|
|
restricted = true;
|
|
|
|
safe_upper_varnos = get_relids_in_jointree((Node *) lowest_outer_join,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
true, true);
|
2013-08-19 19:19:25 +02:00
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
restricted = false;
|
|
|
|
safe_upper_varnos = NULL; /* doesn't matter */
|
|
|
|
}
|
|
|
|
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
if (jointree_contains_lateral_outer_refs(root,
|
|
|
|
(Node *) subquery->jointree,
|
2013-08-19 19:19:25 +02:00
|
|
|
restricted, safe_upper_varnos))
|
2012-08-12 22:01:26 +02:00
|
|
|
return false;
|
2013-08-19 19:19:25 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If there's an outer join above the LATERAL subquery, also disallow
|
|
|
|
* pullup if the subquery's targetlist has any references to rels
|
|
|
|
* outside the outer join, since these might get pulled into quals
|
|
|
|
* above the subquery (but in or below the outer join) and then lead
|
|
|
|
* to qual-postponement issues similar to the case checked for above.
|
|
|
|
* (We wouldn't need to prevent pullup if no such references appear in
|
|
|
|
* outer-query quals, but we don't have enough info here to check
|
|
|
|
* that. Also, maybe this restriction could be removed if we forced
|
|
|
|
* such refs to be wrapped in PlaceHolderVars, even when they're below
|
|
|
|
* the nearest outer join? But it's a pretty hokey usage, so not
|
|
|
|
* clear this is worth sweating over.)
|
|
|
|
*/
|
|
|
|
if (lowest_outer_join != NULL)
|
|
|
|
{
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
Relids lvarnos = pull_varnos_of_level(root,
|
|
|
|
(Node *) subquery->targetList,
|
|
|
|
1);
|
2013-08-19 19:19:25 +02:00
|
|
|
|
|
|
|
if (!bms_is_subset(lvarnos, safe_upper_varnos))
|
|
|
|
return false;
|
|
|
|
}
|
2012-08-12 22:01:26 +02:00
|
|
|
}
|
|
|
|
|
2006-08-19 04:48:53 +02:00
|
|
|
/*
|
|
|
|
* Don't pull up a subquery that has any volatile functions in its
|
|
|
|
* targetlist. Otherwise we might introduce multiple evaluations of these
|
|
|
|
* functions, if they get copied to multiple places in the upper query,
|
2008-10-21 22:42:53 +02:00
|
|
|
* leading to surprising results. (Note: the PlaceHolderVar mechanism
|
|
|
|
* doesn't quite guarantee single evaluation; else we could pull up anyway
|
|
|
|
* and just wrap such items in PlaceHolderVars ...)
|
2006-08-19 04:48:53 +02:00
|
|
|
*/
|
|
|
|
if (contain_volatile_functions((Node *) subquery->targetList))
|
2003-01-20 19:55:07 +01:00
|
|
|
return false;
|
|
|
|
|
2015-03-12 04:18:03 +01:00
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* pull_up_simple_values
|
|
|
|
* Pull up a single simple VALUES RTE.
|
|
|
|
*
|
|
|
|
* jtnode is a RangeTblRef that has been identified as a simple VALUES RTE
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* by pull_up_subqueries. We always return a RangeTblRef representing a
|
|
|
|
* RESULT RTE to replace it (all failure cases should have been detected by
|
|
|
|
* is_simple_values()). Actually, what we return is just jtnode, because
|
|
|
|
* we replace the VALUES RTE in the rangetable with the RESULT RTE.
|
2015-03-12 04:18:03 +01:00
|
|
|
*
|
|
|
|
* rte is the RangeTblEntry referenced by jtnode. Because of the limited
|
|
|
|
* possible usage of VALUES RTEs, we do not need the remaining parameters
|
|
|
|
* of pull_up_subqueries_recurse.
|
|
|
|
*/
|
|
|
|
static Node *
|
|
|
|
pull_up_simple_values(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte)
|
|
|
|
{
|
|
|
|
Query *parse = root->parse;
|
|
|
|
int varno = ((RangeTblRef *) jtnode)->rtindex;
|
|
|
|
List *values_list;
|
|
|
|
List *tlist;
|
|
|
|
AttrNumber attrno;
|
|
|
|
pullup_replace_vars_context rvcontext;
|
|
|
|
ListCell *lc;
|
|
|
|
|
|
|
|
Assert(rte->rtekind == RTE_VALUES);
|
|
|
|
Assert(list_length(rte->values_lists) == 1);
|
|
|
|
|
2003-01-20 19:55:07 +01:00
|
|
|
/*
|
2015-03-12 04:18:03 +01:00
|
|
|
* Need a modifiable copy of the VALUES list to hack on, just in case it's
|
|
|
|
* multiply referenced.
|
2003-01-20 19:55:07 +01:00
|
|
|
*/
|
2017-03-09 21:18:59 +01:00
|
|
|
values_list = copyObject(linitial(rte->values_lists));
|
2015-03-12 04:18:03 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* The VALUES RTE can't contain any Vars of level zero, let alone any that
|
|
|
|
* are join aliases, so no need to flatten join alias Vars.
|
|
|
|
*/
|
|
|
|
Assert(!contain_vars_of_level((Node *) values_list, 0));
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Set up required context data for pullup_replace_vars. In particular,
|
|
|
|
* we have to make the VALUES list look like a subquery targetlist.
|
|
|
|
*/
|
|
|
|
tlist = NIL;
|
|
|
|
attrno = 1;
|
|
|
|
foreach(lc, values_list)
|
|
|
|
{
|
|
|
|
tlist = lappend(tlist,
|
|
|
|
makeTargetEntry((Expr *) lfirst(lc),
|
|
|
|
attrno,
|
|
|
|
NULL,
|
|
|
|
false));
|
|
|
|
attrno++;
|
|
|
|
}
|
|
|
|
rvcontext.root = root;
|
|
|
|
rvcontext.targetlist = tlist;
|
|
|
|
rvcontext.target_rte = rte;
|
|
|
|
rvcontext.relids = NULL;
|
|
|
|
rvcontext.outer_hasSubLinks = &parse->hasSubLinks;
|
|
|
|
rvcontext.varno = varno;
|
|
|
|
rvcontext.wrap_non_vars = false;
|
|
|
|
/* initialize cache array with indexes 0 .. length(tlist) */
|
|
|
|
rvcontext.rv_cache = palloc0((list_length(tlist) + 1) *
|
|
|
|
sizeof(Node *));
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Replace all of the top query's references to the RTE's outputs with
|
|
|
|
* copies of the adjusted VALUES expressions, being careful not to replace
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
* any of the jointree structure. We can assume there's no outer joins or
|
|
|
|
* appendrels in the dummy Query that surrounds a VALUES RTE.
|
2015-03-12 04:18:03 +01:00
|
|
|
*/
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
perform_pullup_replace_vars(root, &rvcontext, NULL);
|
2015-03-12 04:18:03 +01:00
|
|
|
|
|
|
|
/*
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
* There should be no appendrels to fix, nor any outer joins and hence no
|
|
|
|
* PlaceHolderVars.
|
2015-03-12 04:18:03 +01:00
|
|
|
*/
|
|
|
|
Assert(root->append_rel_list == NIL);
|
|
|
|
Assert(root->join_info_list == NIL);
|
|
|
|
Assert(root->placeholder_list == NIL);
|
|
|
|
|
|
|
|
/*
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* Replace the VALUES RTE with a RESULT RTE. The VALUES RTE is the only
|
|
|
|
* rtable entry in the current query level, so this is easy.
|
2015-03-12 04:18:03 +01:00
|
|
|
*/
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
Assert(list_length(parse->rtable) == 1);
|
|
|
|
|
|
|
|
/* Create suitable RTE */
|
|
|
|
rte = makeNode(RangeTblEntry);
|
|
|
|
rte->rtekind = RTE_RESULT;
|
|
|
|
rte->eref = makeAlias("*RESULT*", NIL);
|
|
|
|
|
|
|
|
/* Replace rangetable */
|
|
|
|
parse->rtable = list_make1(rte);
|
|
|
|
|
|
|
|
/* We could manufacture a new RangeTblRef, but the one we have is fine */
|
|
|
|
Assert(varno == 1);
|
|
|
|
|
|
|
|
return jtnode;
|
2015-03-12 04:18:03 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* is_simple_values
|
|
|
|
* Check a VALUES RTE in the range table to see if it's simple enough
|
|
|
|
* to pull up into the parent query.
|
|
|
|
*
|
|
|
|
* rte is the RTE_VALUES RangeTblEntry to check.
|
|
|
|
*/
|
|
|
|
static bool
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
is_simple_values(PlannerInfo *root, RangeTblEntry *rte)
|
2015-03-12 04:18:03 +01:00
|
|
|
{
|
|
|
|
Assert(rte->rtekind == RTE_VALUES);
|
|
|
|
|
|
|
|
/*
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* There must be exactly one VALUES list, else it's not semantically
|
|
|
|
* correct to replace the VALUES RTE with a RESULT RTE, nor would we have
|
|
|
|
* a unique set of expressions to substitute into the parent query.
|
2015-03-12 04:18:03 +01:00
|
|
|
*/
|
|
|
|
if (list_length(rte->values_lists) != 1)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Because VALUES can't appear under an outer join (or at least, we won't
|
2015-07-26 23:44:27 +02:00
|
|
|
* try to pull it up if it does), we need not worry about LATERAL, nor
|
|
|
|
* about validity of PHVs for the VALUES' outputs.
|
2015-03-12 04:18:03 +01:00
|
|
|
*/
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Don't pull up a VALUES that contains any set-returning or volatile
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* functions. The considerations here are basically identical to the
|
|
|
|
* restrictions on a pull-able subquery's targetlist.
|
2015-03-12 04:18:03 +01:00
|
|
|
*/
|
|
|
|
if (expression_returns_set((Node *) rte->values_lists) ||
|
|
|
|
contain_volatile_functions((Node *) rte->values_lists))
|
|
|
|
return false;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Do not pull up a VALUES that's not the only RTE in its parent query.
|
|
|
|
* This is actually the only case that the parser will generate at the
|
|
|
|
* moment, and assuming this is true greatly simplifies
|
|
|
|
* pull_up_simple_values().
|
|
|
|
*/
|
|
|
|
if (list_length(root->parse->rtable) != 1 ||
|
|
|
|
rte != (RangeTblEntry *) linitial(root->parse->rtable))
|
2003-01-20 19:55:07 +01:00
|
|
|
return false;
|
|
|
|
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
/*
|
|
|
|
* pull_up_constant_function
|
|
|
|
* Pull up an RTE_FUNCTION expression that was simplified to a constant.
|
|
|
|
*
|
|
|
|
* jtnode is a RangeTblRef that has been identified as a FUNCTION RTE by
|
|
|
|
* pull_up_subqueries. If its expression is just a Const, hoist that value
|
|
|
|
* up into the parent query, and replace the RTE_FUNCTION with RTE_RESULT.
|
|
|
|
*
|
|
|
|
* In principle we could pull up any immutable expression, but we don't.
|
|
|
|
* That might result in multiple evaluations of the expression, which could
|
|
|
|
* be costly if it's not just a Const. Also, the main value of this is
|
|
|
|
* to let the constant participate in further const-folding, and of course
|
|
|
|
* that won't happen for a non-Const.
|
|
|
|
*
|
|
|
|
* The pulled-up value might need to be wrapped in a PlaceHolderVar if the
|
|
|
|
* RTE is below an outer join or is part of an appendrel; the extra
|
|
|
|
* parameters show whether that's needed.
|
|
|
|
*/
|
|
|
|
static Node *
|
|
|
|
pull_up_constant_function(PlannerInfo *root, Node *jtnode,
|
|
|
|
RangeTblEntry *rte,
|
|
|
|
AppendRelInfo *containing_appendrel)
|
|
|
|
{
|
|
|
|
Query *parse = root->parse;
|
|
|
|
RangeTblFunction *rtf;
|
2019-09-24 18:11:32 +02:00
|
|
|
TypeFuncClass functypclass;
|
|
|
|
Oid funcrettype;
|
|
|
|
TupleDesc tupdesc;
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
pullup_replace_vars_context rvcontext;
|
|
|
|
|
|
|
|
/* Fail if the RTE has ORDINALITY - we don't implement that here. */
|
|
|
|
if (rte->funcordinality)
|
|
|
|
return jtnode;
|
|
|
|
|
|
|
|
/* Fail if RTE isn't a single, simple Const expr */
|
|
|
|
if (list_length(rte->functions) != 1)
|
|
|
|
return jtnode;
|
|
|
|
rtf = linitial_node(RangeTblFunction, rte->functions);
|
|
|
|
if (!IsA(rtf->funcexpr, Const))
|
|
|
|
return jtnode;
|
|
|
|
|
2019-09-24 18:11:32 +02:00
|
|
|
/*
|
|
|
|
* If the function's result is not a scalar, we punt. In principle we
|
|
|
|
* could break the composite constant value apart into per-column
|
|
|
|
* constants, but for now it seems not worth the work.
|
|
|
|
*/
|
|
|
|
if (rtf->funccolcount != 1)
|
|
|
|
return jtnode; /* definitely composite */
|
|
|
|
|
|
|
|
functypclass = get_expr_result_type(rtf->funcexpr,
|
|
|
|
&funcrettype,
|
|
|
|
&tupdesc);
|
|
|
|
if (functypclass != TYPEFUNC_SCALAR)
|
|
|
|
return jtnode; /* must be a one-column composite type */
|
|
|
|
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
/* Create context for applying pullup_replace_vars */
|
|
|
|
rvcontext.root = root;
|
|
|
|
rvcontext.targetlist = list_make1(makeTargetEntry((Expr *) rtf->funcexpr,
|
|
|
|
1, /* resno */
|
|
|
|
NULL, /* resname */
|
|
|
|
false)); /* resjunk */
|
|
|
|
rvcontext.target_rte = rte;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Since this function was reduced to a Const, it doesn't contain any
|
|
|
|
* lateral references, even if it's marked as LATERAL. This means we
|
|
|
|
* don't need to fill relids.
|
|
|
|
*/
|
|
|
|
rvcontext.relids = NULL;
|
|
|
|
|
|
|
|
rvcontext.outer_hasSubLinks = &parse->hasSubLinks;
|
|
|
|
rvcontext.varno = ((RangeTblRef *) jtnode)->rtindex;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* this flag will be set below, if needed */
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
rvcontext.wrap_non_vars = false;
|
|
|
|
/* initialize cache array with indexes 0 .. length(tlist) */
|
|
|
|
rvcontext.rv_cache = palloc0((list_length(rvcontext.targetlist) + 1) *
|
|
|
|
sizeof(Node *));
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we are dealing with an appendrel member then anything that's not a
|
|
|
|
* simple Var has to be turned into a PlaceHolderVar. (See comments in
|
|
|
|
* pull_up_simple_subquery().)
|
|
|
|
*/
|
|
|
|
if (containing_appendrel != NULL)
|
|
|
|
rvcontext.wrap_non_vars = true;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the parent query uses grouping sets, we need a PlaceHolderVar for
|
|
|
|
* anything that's not a simple Var.
|
|
|
|
*/
|
|
|
|
if (parse->groupingSets)
|
|
|
|
rvcontext.wrap_non_vars = true;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Replace all of the top query's references to the RTE's output with
|
|
|
|
* copies of the funcexpr, being careful not to replace any of the
|
|
|
|
* jointree structure.
|
|
|
|
*/
|
|
|
|
perform_pullup_replace_vars(root, &rvcontext,
|
|
|
|
containing_appendrel);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We don't need to bother with changing PlaceHolderVars in the parent
|
|
|
|
* query. Their references to the RT index are still good for now, and
|
|
|
|
* will get removed later if we're able to drop the RTE_RESULT.
|
|
|
|
*/
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Convert the RTE to be RTE_RESULT type, signifying that we don't need to
|
2021-07-09 19:38:24 +02:00
|
|
|
* scan it anymore, and zero out RTE_FUNCTION-specific fields. Also make
|
|
|
|
* sure the RTE is not marked LATERAL, since elsewhere we don't expect
|
|
|
|
* RTE_RESULTs to be LATERAL.
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
*/
|
|
|
|
rte->rtekind = RTE_RESULT;
|
|
|
|
rte->functions = NIL;
|
2021-07-09 19:38:24 +02:00
|
|
|
rte->lateral = false;
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* We can reuse the RangeTblRef node.
|
|
|
|
*/
|
|
|
|
return jtnode;
|
|
|
|
}
|
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
/*
|
|
|
|
* is_simple_union_all
|
|
|
|
* Check a subquery to see if it's a simple UNION ALL.
|
|
|
|
*
|
|
|
|
* We require all the setops to be UNION ALL (no mixing) and there can't be
|
|
|
|
* any datatype coercions involved, ie, all the leaf queries must emit the
|
|
|
|
* same datatypes.
|
|
|
|
*/
|
|
|
|
static bool
|
|
|
|
is_simple_union_all(Query *subquery)
|
|
|
|
{
|
|
|
|
SetOperationStmt *topop;
|
|
|
|
|
|
|
|
/* Let's just make sure it's a valid subselect ... */
|
|
|
|
if (!IsA(subquery, Query) ||
|
Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 22:02:35 +01:00
|
|
|
subquery->commandType != CMD_SELECT)
|
2006-02-03 22:08:49 +01:00
|
|
|
elog(ERROR, "subquery is bogus");
|
|
|
|
|
|
|
|
/* Is it a set-operation query at all? */
|
2017-02-21 17:33:07 +01:00
|
|
|
topop = castNode(SetOperationStmt, subquery->setOperations);
|
2006-02-03 22:08:49 +01:00
|
|
|
if (!topop)
|
|
|
|
return false;
|
|
|
|
|
2008-10-04 23:56:55 +02:00
|
|
|
/* Can't handle ORDER BY, LIMIT/OFFSET, locking, or WITH */
|
2006-02-03 22:08:49 +01:00
|
|
|
if (subquery->sortClause ||
|
|
|
|
subquery->limitOffset ||
|
|
|
|
subquery->limitCount ||
|
2008-10-04 23:56:55 +02:00
|
|
|
subquery->rowMarks ||
|
|
|
|
subquery->cteList)
|
2006-02-03 22:08:49 +01:00
|
|
|
return false;
|
|
|
|
|
|
|
|
/* Recursively check the tree of set operations */
|
|
|
|
return is_simple_union_all_recurse((Node *) topop, subquery,
|
|
|
|
topop->colTypes);
|
|
|
|
}
|
|
|
|
|
|
|
|
static bool
|
|
|
|
is_simple_union_all_recurse(Node *setOp, Query *setOpQuery, List *colTypes)
|
|
|
|
{
|
2022-12-22 16:35:02 +01:00
|
|
|
/* Since this function recurses, it could be driven to stack overflow. */
|
|
|
|
check_stack_depth();
|
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
if (IsA(setOp, RangeTblRef))
|
|
|
|
{
|
|
|
|
RangeTblRef *rtr = (RangeTblRef *) setOp;
|
|
|
|
RangeTblEntry *rte = rt_fetch(rtr->rtindex, setOpQuery->rtable);
|
|
|
|
Query *subquery = rte->subquery;
|
|
|
|
|
|
|
|
Assert(subquery != NULL);
|
|
|
|
|
|
|
|
/* Leaf nodes are OK if they match the toplevel column types */
|
2011-04-16 22:39:50 +02:00
|
|
|
/* We don't have to compare typmods or collations here */
|
2006-02-03 22:08:49 +01:00
|
|
|
return tlist_same_datatypes(subquery->targetList, colTypes, true);
|
|
|
|
}
|
|
|
|
else if (IsA(setOp, SetOperationStmt))
|
|
|
|
{
|
|
|
|
SetOperationStmt *op = (SetOperationStmt *) setOp;
|
|
|
|
|
|
|
|
/* Must be UNION ALL */
|
|
|
|
if (op->op != SETOP_UNION || !op->all)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
/* Recurse to check inputs */
|
|
|
|
return is_simple_union_all_recurse(op->larg, setOpQuery, colTypes) &&
|
|
|
|
is_simple_union_all_recurse(op->rarg, setOpQuery, colTypes);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(setOp));
|
|
|
|
return false; /* keep compiler quiet */
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* is_safe_append_member
|
|
|
|
* Check a subquery that is a leaf of a UNION ALL appendrel to see if it's
|
|
|
|
* safe to pull up.
|
|
|
|
*/
|
|
|
|
static bool
|
|
|
|
is_safe_append_member(Query *subquery)
|
|
|
|
{
|
|
|
|
FromExpr *jtnode;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* It's only safe to pull up the child if its jointree contains exactly
|
|
|
|
* one RTE, else the AppendRelInfo data structure breaks. The one base RTE
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* could be buried in several levels of FromExpr, however. Also, if the
|
|
|
|
* child's jointree is completely empty, we can pull up because
|
|
|
|
* pull_up_simple_subquery will insert a single RTE_RESULT RTE instead.
|
2006-02-03 22:08:49 +01:00
|
|
|
*
|
|
|
|
* Also, the child can't have any WHERE quals because there's no place to
|
|
|
|
* put them in an appendrel. (This is a bit annoying...) If we didn't
|
|
|
|
* need to check this, we'd just test whether get_relids_in_jointree()
|
|
|
|
* yields a singleton set, to be more consistent with the coding of
|
|
|
|
* fix_append_rel_relids().
|
|
|
|
*/
|
|
|
|
jtnode = subquery->jointree;
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
Assert(IsA(jtnode, FromExpr));
|
|
|
|
/* Check the completely-empty case */
|
|
|
|
if (jtnode->fromlist == NIL && jtnode->quals == NULL)
|
|
|
|
return true;
|
|
|
|
/* Check the more general case */
|
2006-02-03 22:08:49 +01:00
|
|
|
while (IsA(jtnode, FromExpr))
|
|
|
|
{
|
|
|
|
if (jtnode->quals != NULL)
|
|
|
|
return false;
|
|
|
|
if (list_length(jtnode->fromlist) != 1)
|
|
|
|
return false;
|
|
|
|
jtnode = linitial(jtnode->fromlist);
|
|
|
|
}
|
|
|
|
if (!IsA(jtnode, RangeTblRef))
|
|
|
|
return false;
|
|
|
|
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2013-08-19 19:19:25 +02:00
|
|
|
/*
|
|
|
|
* jointree_contains_lateral_outer_refs
|
|
|
|
* Check for disallowed lateral references in a jointree's quals
|
|
|
|
*
|
|
|
|
* If restricted is false, all level-1 Vars are allowed (but we still must
|
|
|
|
* search the jointree, since it might contain outer joins below which there
|
2017-08-16 06:22:32 +02:00
|
|
|
* will be restrictions). If restricted is true, return true when any qual
|
2013-08-19 19:19:25 +02:00
|
|
|
* in the jointree contains level-1 Vars coming from outside the rels listed
|
|
|
|
* in safe_upper_varnos.
|
|
|
|
*/
|
|
|
|
static bool
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
jointree_contains_lateral_outer_refs(PlannerInfo *root, Node *jtnode,
|
|
|
|
bool restricted,
|
2013-08-19 19:19:25 +02:00
|
|
|
Relids safe_upper_varnos)
|
|
|
|
{
|
|
|
|
if (jtnode == NULL)
|
|
|
|
return false;
|
|
|
|
if (IsA(jtnode, RangeTblRef))
|
|
|
|
return false;
|
|
|
|
else if (IsA(jtnode, FromExpr))
|
|
|
|
{
|
|
|
|
FromExpr *f = (FromExpr *) jtnode;
|
|
|
|
ListCell *l;
|
|
|
|
|
|
|
|
/* First, recurse to check child joins */
|
|
|
|
foreach(l, f->fromlist)
|
|
|
|
{
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
if (jointree_contains_lateral_outer_refs(root,
|
|
|
|
lfirst(l),
|
2013-08-19 19:19:25 +02:00
|
|
|
restricted,
|
|
|
|
safe_upper_varnos))
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Then check the top-level quals */
|
|
|
|
if (restricted &&
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
!bms_is_subset(pull_varnos_of_level(root, f->quals, 1),
|
2013-08-19 19:19:25 +02:00
|
|
|
safe_upper_varnos))
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
else if (IsA(jtnode, JoinExpr))
|
|
|
|
{
|
|
|
|
JoinExpr *j = (JoinExpr *) jtnode;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If this is an outer join, we mustn't allow any upper lateral
|
|
|
|
* references in or below it.
|
|
|
|
*/
|
|
|
|
if (j->jointype != JOIN_INNER)
|
|
|
|
{
|
|
|
|
restricted = true;
|
|
|
|
safe_upper_varnos = NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Check the child joins */
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
if (jointree_contains_lateral_outer_refs(root,
|
|
|
|
j->larg,
|
2013-08-19 19:19:25 +02:00
|
|
|
restricted,
|
|
|
|
safe_upper_varnos))
|
|
|
|
return true;
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
if (jointree_contains_lateral_outer_refs(root,
|
|
|
|
j->rarg,
|
2013-08-19 19:19:25 +02:00
|
|
|
restricted,
|
|
|
|
safe_upper_varnos))
|
|
|
|
return true;
|
|
|
|
|
|
|
|
/* Check the JOIN's qual clauses */
|
|
|
|
if (restricted &&
|
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 21:37:23 +01:00
|
|
|
!bms_is_subset(pull_varnos_of_level(root, j->quals, 1),
|
2013-08-19 19:19:25 +02:00
|
|
|
safe_upper_varnos))
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(jtnode));
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2003-01-20 19:55:07 +01:00
|
|
|
/*
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
* Perform pullup_replace_vars everyplace it's needed in the query tree.
|
|
|
|
*
|
|
|
|
* Caller has already filled *rvcontext with data describing what to
|
|
|
|
* substitute for Vars referencing the target subquery. In addition
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* we need the identity of the containing appendrel if any.
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
*/
|
|
|
|
static void
|
|
|
|
perform_pullup_replace_vars(PlannerInfo *root,
|
|
|
|
pullup_replace_vars_context *rvcontext,
|
|
|
|
AppendRelInfo *containing_appendrel)
|
|
|
|
{
|
|
|
|
Query *parse = root->parse;
|
|
|
|
ListCell *lc;
|
|
|
|
|
2022-12-22 17:02:03 +01:00
|
|
|
/*
|
|
|
|
* If we are considering an appendrel child subquery (that is, a UNION ALL
|
|
|
|
* member query that we're pulling up), then the only part of the upper
|
|
|
|
* query that could reference the child yet is the translated_vars list of
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* the associated AppendRelInfo. Furthermore, we do not want to force use
|
|
|
|
* of PHVs in the AppendRelInfo --- there isn't any outer join between.
|
2022-12-22 17:02:03 +01:00
|
|
|
*/
|
|
|
|
if (containing_appendrel)
|
|
|
|
{
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
bool save_wrap_non_vars = rvcontext->wrap_non_vars;
|
2022-12-22 17:02:03 +01:00
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
rvcontext->wrap_non_vars = false;
|
2022-12-22 17:02:03 +01:00
|
|
|
containing_appendrel->translated_vars = (List *)
|
|
|
|
pullup_replace_vars((Node *) containing_appendrel->translated_vars,
|
|
|
|
rvcontext);
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
rvcontext->wrap_non_vars = save_wrap_non_vars;
|
2022-12-22 17:02:03 +01:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
/*
|
|
|
|
* Replace all of the top query's references to the subquery's outputs
|
|
|
|
* with copies of the adjusted subtlist items, being careful not to
|
|
|
|
* replace any of the jointree structure. (This'd be a lot cleaner if we
|
|
|
|
* could use query_tree_mutator.) We have to use PHVs in the targetList,
|
|
|
|
* returningList, and havingQual, since those are certainly above any
|
|
|
|
* outer join. replace_vars_in_jointree tracks its location in the
|
|
|
|
* jointree and uses PHVs or not appropriately.
|
|
|
|
*/
|
|
|
|
parse->targetList = (List *)
|
|
|
|
pullup_replace_vars((Node *) parse->targetList, rvcontext);
|
|
|
|
parse->returningList = (List *)
|
|
|
|
pullup_replace_vars((Node *) parse->returningList, rvcontext);
|
2022-12-10 07:27:20 +01:00
|
|
|
|
|
|
|
foreach(lc, parse->windowClause)
|
|
|
|
{
|
|
|
|
WindowClause *wc = lfirst_node(WindowClause, lc);
|
|
|
|
|
|
|
|
if (wc->runCondition != NIL)
|
|
|
|
wc->runCondition = (List *)
|
|
|
|
pullup_replace_vars((Node *) wc->runCondition, rvcontext);
|
|
|
|
}
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
if (parse->onConflict)
|
|
|
|
{
|
|
|
|
parse->onConflict->onConflictSet = (List *)
|
|
|
|
pullup_replace_vars((Node *) parse->onConflict->onConflictSet,
|
|
|
|
rvcontext);
|
|
|
|
parse->onConflict->onConflictWhere =
|
|
|
|
pullup_replace_vars(parse->onConflict->onConflictWhere,
|
|
|
|
rvcontext);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We assume ON CONFLICT's arbiterElems, arbiterWhere, exclRelTlist
|
|
|
|
* can't contain any references to a subquery.
|
|
|
|
*/
|
|
|
|
}
|
2022-03-28 16:45:58 +02:00
|
|
|
if (parse->mergeActionList)
|
|
|
|
{
|
|
|
|
foreach(lc, parse->mergeActionList)
|
|
|
|
{
|
|
|
|
MergeAction *action = lfirst(lc);
|
|
|
|
|
|
|
|
action->qual = pullup_replace_vars(action->qual, rvcontext);
|
|
|
|
action->targetList = (List *)
|
|
|
|
pullup_replace_vars((Node *) action->targetList, rvcontext);
|
|
|
|
}
|
|
|
|
}
|
2024-03-30 11:00:26 +01:00
|
|
|
parse->mergeJoinCondition = pullup_replace_vars(parse->mergeJoinCondition,
|
|
|
|
rvcontext);
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
replace_vars_in_jointree((Node *) parse->jointree, rvcontext);
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
Assert(parse->setOperations == NULL);
|
|
|
|
parse->havingQual = pullup_replace_vars(parse->havingQual, rvcontext);
|
|
|
|
|
|
|
|
/*
|
2022-12-22 17:02:03 +01:00
|
|
|
* Replace references in the translated_vars lists of appendrels.
|
Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.
In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save. (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)
To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs. I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes(). A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.
In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it. We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions. Might as well make pull_up_simple_values()
use it too.
(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)
Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.
Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me
Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-02 00:50:22 +02:00
|
|
|
*/
|
|
|
|
foreach(lc, root->append_rel_list)
|
|
|
|
{
|
|
|
|
AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
|
|
|
|
|
|
|
|
appinfo->translated_vars = (List *)
|
|
|
|
pullup_replace_vars((Node *) appinfo->translated_vars, rvcontext);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Replace references in the joinaliasvars lists of join RTEs.
|
|
|
|
*/
|
|
|
|
foreach(lc, parse->rtable)
|
|
|
|
{
|
|
|
|
RangeTblEntry *otherrte = (RangeTblEntry *) lfirst(lc);
|
|
|
|
|
|
|
|
if (otherrte->rtekind == RTE_JOIN)
|
|
|
|
otherrte->joinaliasvars = (List *)
|
|
|
|
pullup_replace_vars((Node *) otherrte->joinaliasvars,
|
|
|
|
rvcontext);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Helper routine for perform_pullup_replace_vars: do pullup_replace_vars on
|
|
|
|
* every expression in the jointree, without changing the jointree structure
|
|
|
|
* itself. Ugly, but there's no other way...
|
2003-01-20 19:55:07 +01:00
|
|
|
*/
|
|
|
|
static void
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
replace_vars_in_jointree(Node *jtnode,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
pullup_replace_vars_context *context)
|
2003-01-20 19:55:07 +01:00
|
|
|
{
|
|
|
|
if (jtnode == NULL)
|
|
|
|
return;
|
|
|
|
if (IsA(jtnode, RangeTblRef))
|
|
|
|
{
|
2012-08-18 20:10:17 +02:00
|
|
|
/*
|
|
|
|
* If the RangeTblRef refers to a LATERAL subquery (that isn't the
|
|
|
|
* same subquery we're pulling up), it might contain references to the
|
|
|
|
* target subquery, which we must replace. We drive this from the
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* jointree scan, rather than a scan of the rtable, so that we can
|
|
|
|
* avoid processing no-longer-referenced RTEs.
|
2012-08-18 20:10:17 +02:00
|
|
|
*/
|
|
|
|
int varno = ((RangeTblRef *) jtnode)->rtindex;
|
|
|
|
|
|
|
|
if (varno != context->varno) /* ignore target subquery itself */
|
|
|
|
{
|
|
|
|
RangeTblEntry *rte = rt_fetch(varno, context->root->parse->rtable);
|
|
|
|
|
|
|
|
Assert(rte != context->target_rte);
|
|
|
|
if (rte->lateral)
|
|
|
|
{
|
|
|
|
switch (rte->rtekind)
|
|
|
|
{
|
Redesign tablesample method API, and do extensive code review.
The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM. (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.) Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type. This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.
Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.
Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.
Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).
Improve EXPLAIN's output of tablesample plans, too.
Back-patch to 9.5 so that we don't have to support the original API
in production.
2015-07-25 20:39:00 +02:00
|
|
|
case RTE_RELATION:
|
|
|
|
/* shouldn't be marked LATERAL unless tablesample */
|
|
|
|
Assert(rte->tablesample);
|
|
|
|
rte->tablesample = (TableSampleClause *)
|
|
|
|
pullup_replace_vars((Node *) rte->tablesample,
|
|
|
|
context);
|
|
|
|
break;
|
2012-08-18 20:10:17 +02:00
|
|
|
case RTE_SUBQUERY:
|
|
|
|
rte->subquery =
|
|
|
|
pullup_replace_vars_subquery(rte->subquery,
|
|
|
|
context);
|
|
|
|
break;
|
|
|
|
case RTE_FUNCTION:
|
Support multi-argument UNNEST(), and TABLE() syntax for multiple functions.
This patch adds the ability to write TABLE( function1(), function2(), ...)
as a single FROM-clause entry. The result is the concatenation of the
first row from each function, followed by the second row from each
function, etc; with NULLs inserted if any function produces fewer rows than
others. This is believed to be a much more useful behavior than what
Postgres currently does with multiple SRFs in a SELECT list.
This syntax also provides a reasonable way to combine use of column
definition lists with WITH ORDINALITY: put the column definition list
inside TABLE(), where it's clear that it doesn't control the ordinality
column as well.
Also implement SQL-compliant multiple-argument UNNEST(), by turning
UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).
The SQL standard specifies TABLE() with only a single function, not
multiple functions, and it seems to require an implicit UNNEST() which is
not what this patch does. There may be something wrong with that reading
of the spec, though, because if it's right then the spec's TABLE() is just
a pointless alternative spelling of UNNEST(). After further review of
that, we might choose to adopt a different syntax for what this patch does,
but in any case this functionality seems clearly worthwhile.
Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
significantly revised by me
2013-11-22 01:37:02 +01:00
|
|
|
rte->functions = (List *)
|
|
|
|
pullup_replace_vars((Node *) rte->functions,
|
2012-08-18 20:10:17 +02:00
|
|
|
context);
|
|
|
|
break;
|
2017-03-08 16:39:37 +01:00
|
|
|
case RTE_TABLEFUNC:
|
|
|
|
rte->tablefunc = (TableFunc *)
|
|
|
|
pullup_replace_vars((Node *) rte->tablefunc,
|
|
|
|
context);
|
|
|
|
break;
|
2012-08-18 20:10:17 +02:00
|
|
|
case RTE_VALUES:
|
|
|
|
rte->values_lists = (List *)
|
|
|
|
pullup_replace_vars((Node *) rte->values_lists,
|
|
|
|
context);
|
|
|
|
break;
|
|
|
|
case RTE_JOIN:
|
|
|
|
case RTE_CTE:
|
2017-04-01 06:17:18 +02:00
|
|
|
case RTE_NAMEDTUPLESTORE:
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
case RTE_RESULT:
|
2012-08-18 20:10:17 +02:00
|
|
|
/* these shouldn't be marked LATERAL */
|
|
|
|
Assert(false);
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
else if (IsA(jtnode, FromExpr))
|
|
|
|
{
|
|
|
|
FromExpr *f = (FromExpr *) jtnode;
|
2004-05-26 06:41:50 +02:00
|
|
|
ListCell *l;
|
2003-01-20 19:55:07 +01:00
|
|
|
|
|
|
|
foreach(l, f->fromlist)
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
replace_vars_in_jointree(lfirst(l), context);
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
f->quals = pullup_replace_vars(f->quals, context);
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
else if (IsA(jtnode, JoinExpr))
|
|
|
|
{
|
|
|
|
JoinExpr *j = (JoinExpr *) jtnode;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
bool save_wrap_non_vars = context->wrap_non_vars;
|
2003-01-20 19:55:07 +01:00
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
replace_vars_in_jointree(j->larg, context);
|
|
|
|
replace_vars_in_jointree(j->rarg, context);
|
2018-10-14 19:07:29 +02:00
|
|
|
|
|
|
|
/*
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* Use PHVs within the join quals of a full join. Otherwise, we
|
|
|
|
* cannot identify which side of the join a pulled-up var-free
|
|
|
|
* expression came from, which can lead to failure to make a plan at
|
|
|
|
* all because none of the quals appear to be mergeable or hashable
|
|
|
|
* conditions.
|
2018-10-14 19:07:29 +02:00
|
|
|
*/
|
|
|
|
if (j->jointype == JOIN_FULL)
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
context->wrap_non_vars = true;
|
2018-10-14 19:07:29 +02:00
|
|
|
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
j->quals = pullup_replace_vars(j->quals, context);
|
2003-01-20 19:55:07 +01:00
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
context->wrap_non_vars = save_wrap_non_vars;
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
else
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(jtnode));
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
/*
|
|
|
|
* Apply pullup variable replacement throughout an expression tree
|
|
|
|
*
|
|
|
|
* Returns a modified copy of the tree, so this can't be used where we
|
|
|
|
* need to do in-place replacement.
|
|
|
|
*/
|
|
|
|
static Node *
|
|
|
|
pullup_replace_vars(Node *expr, pullup_replace_vars_context *context)
|
|
|
|
{
|
|
|
|
return replace_rte_variables(expr,
|
|
|
|
context->varno, 0,
|
|
|
|
pullup_replace_vars_callback,
|
|
|
|
(void *) context,
|
|
|
|
context->outer_hasSubLinks);
|
|
|
|
}
|
|
|
|
|
|
|
|
static Node *
|
|
|
|
pullup_replace_vars_callback(Var *var,
|
|
|
|
replace_rte_variables_context *context)
|
|
|
|
{
|
|
|
|
pullup_replace_vars_context *rcon = (pullup_replace_vars_context *) context->callback_arg;
|
|
|
|
int varattno = var->varattno;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
bool need_phv;
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
Node *newnode;
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/*
|
|
|
|
* We need a PlaceHolderVar if the Var-to-be-replaced has nonempty
|
|
|
|
* varnullingrels (unless we find below that the replacement expression is
|
|
|
|
* a Var or PlaceHolderVar that we can just add the nullingrels to). We
|
|
|
|
* also need one if the caller has instructed us that all non-Var/PHV
|
|
|
|
* replacements need to be wrapped for identification purposes.
|
|
|
|
*/
|
|
|
|
need_phv = (var->varnullingrels != NULL) || rcon->wrap_non_vars;
|
|
|
|
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
/*
|
|
|
|
* If PlaceHolderVars are needed, we cache the modified expressions in
|
|
|
|
* rcon->rv_cache[]. This is not in hopes of any material speed gain
|
|
|
|
* within this function, but to avoid generating identical PHVs with
|
|
|
|
* different IDs. That would result in duplicate evaluations at runtime,
|
|
|
|
* and possibly prevent optimizations that rely on recognizing different
|
|
|
|
* references to the same subquery output as being equal(). So it's worth
|
|
|
|
* a bit of extra effort to avoid it.
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
*
|
|
|
|
* The cached items have phlevelsup = 0 and phnullingrels = NULL; we'll
|
|
|
|
* copy them and adjust those values for this reference site below.
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
*/
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
if (need_phv &&
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
varattno >= InvalidAttrNumber &&
|
|
|
|
varattno <= list_length(rcon->targetlist) &&
|
|
|
|
rcon->rv_cache[varattno] != NULL)
|
|
|
|
{
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* Just copy the entry and fall through to adjust phlevelsup etc */
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
newnode = copyObject(rcon->rv_cache[varattno]);
|
|
|
|
}
|
|
|
|
else if (varattno == InvalidAttrNumber)
|
|
|
|
{
|
|
|
|
/* Must expand whole-tuple reference into RowExpr */
|
|
|
|
RowExpr *rowexpr;
|
|
|
|
List *colnames;
|
|
|
|
List *fields;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
bool save_wrap_non_vars = rcon->wrap_non_vars;
|
2010-06-21 02:14:48 +02:00
|
|
|
int save_sublevelsup = context->sublevels_up;
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If generating an expansion for a var of a named rowtype (ie, this
|
|
|
|
* is a plain relation RTE), then we must include dummy items for
|
|
|
|
* dropped columns. If the var is RECORD (ie, this is a JOIN), then
|
2022-03-17 23:25:44 +01:00
|
|
|
* omit dropped columns. In the latter case, attach column names to
|
|
|
|
* the RowExpr for use of the executor and ruleutils.c.
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
*
|
|
|
|
* In order to be able to cache the results, we always generate the
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* expansion with varlevelsup = 0, and then adjust below if needed.
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
*/
|
|
|
|
expandRTE(rcon->target_rte,
|
|
|
|
var->varno, 0 /* not varlevelsup */ , var->location,
|
|
|
|
(var->vartype != RECORDOID),
|
|
|
|
&colnames, &fields);
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* Expand the generated per-field Vars, but don't insert PHVs there */
|
|
|
|
rcon->wrap_non_vars = false;
|
2010-06-21 02:14:48 +02:00
|
|
|
context->sublevels_up = 0; /* to match the expandRTE output */
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
fields = (List *) replace_rte_variables_mutator((Node *) fields,
|
|
|
|
context);
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
rcon->wrap_non_vars = save_wrap_non_vars;
|
2010-06-21 02:14:48 +02:00
|
|
|
context->sublevels_up = save_sublevelsup;
|
|
|
|
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
rowexpr = makeNode(RowExpr);
|
|
|
|
rowexpr->args = fields;
|
|
|
|
rowexpr->row_typeid = var->vartype;
|
|
|
|
rowexpr->row_format = COERCE_IMPLICIT_CAST;
|
2022-03-17 23:25:44 +01:00
|
|
|
rowexpr->colnames = (var->vartype == RECORDOID) ? colnames : NIL;
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
rowexpr->location = var->location;
|
|
|
|
newnode = (Node *) rowexpr;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Insert PlaceHolderVar if needed. Notice that we are wrapping one
|
|
|
|
* PlaceHolderVar around the whole RowExpr, rather than putting one
|
|
|
|
* around each element of the row. This is because we need the
|
|
|
|
* expression to yield NULL, not ROW(NULL,NULL,...) when it is forced
|
|
|
|
* to null by an outer join.
|
|
|
|
*/
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
if (need_phv)
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
{
|
|
|
|
newnode = (Node *)
|
|
|
|
make_placeholder_expr(rcon->root,
|
|
|
|
(Expr *) newnode,
|
|
|
|
bms_make_singleton(rcon->varno));
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* cache it with the PHV, and with phlevelsup etc not set yet */
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
rcon->rv_cache[InvalidAttrNumber] = copyObject(newnode);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/* Normal case referencing one targetlist element */
|
|
|
|
TargetEntry *tle = get_tle_by_resno(rcon->targetlist, varattno);
|
|
|
|
|
|
|
|
if (tle == NULL) /* shouldn't happen */
|
|
|
|
elog(ERROR, "could not find attribute %d in subquery targetlist",
|
|
|
|
varattno);
|
|
|
|
|
|
|
|
/* Make a copy of the tlist item to return */
|
2017-03-09 21:18:59 +01:00
|
|
|
newnode = (Node *) copyObject(tle->expr);
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
|
|
|
|
/* Insert PlaceHolderVar if needed */
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
if (need_phv)
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
{
|
|
|
|
bool wrap;
|
|
|
|
|
|
|
|
if (newnode && IsA(newnode, Var) &&
|
|
|
|
((Var *) newnode)->varlevelsup == 0)
|
|
|
|
{
|
2013-08-18 02:22:37 +02:00
|
|
|
/*
|
|
|
|
* Simple Vars always escape being wrapped, unless they are
|
|
|
|
* lateral references to something outside the subquery being
|
|
|
|
* pulled up. (Even then, we could omit the PlaceHolderVar if
|
|
|
|
* the referenced rel is under the same lowest outer join, but
|
|
|
|
* it doesn't seem worth the trouble to check that.)
|
|
|
|
*/
|
|
|
|
if (rcon->target_rte->lateral &&
|
|
|
|
!bms_is_member(((Var *) newnode)->varno, rcon->relids))
|
|
|
|
wrap = true;
|
|
|
|
else
|
|
|
|
wrap = false;
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
}
|
2011-08-09 17:33:46 +02:00
|
|
|
else if (newnode && IsA(newnode, PlaceHolderVar) &&
|
|
|
|
((PlaceHolderVar *) newnode)->phlevelsup == 0)
|
|
|
|
{
|
2024-01-11 21:28:13 +01:00
|
|
|
/* The same rules apply for a PlaceHolderVar */
|
|
|
|
if (rcon->target_rte->lateral &&
|
|
|
|
!bms_is_subset(((PlaceHolderVar *) newnode)->phrels,
|
|
|
|
rcon->relids))
|
|
|
|
wrap = true;
|
|
|
|
else
|
|
|
|
wrap = false;
|
2011-08-09 17:33:46 +02:00
|
|
|
}
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* Must wrap, either because we need a place to insert
|
|
|
|
* varnullingrels or because caller told us to wrap
|
|
|
|
* everything.
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
*/
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
wrap = true;
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
if (wrap)
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
{
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
newnode = (Node *)
|
|
|
|
make_placeholder_expr(rcon->root,
|
|
|
|
(Expr *) newnode,
|
|
|
|
bms_make_singleton(rcon->varno));
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/*
|
|
|
|
* Cache it if possible (ie, if the attno is in range, which
|
|
|
|
* it probably always should be).
|
|
|
|
*/
|
|
|
|
if (varattno > InvalidAttrNumber &&
|
|
|
|
varattno <= list_length(rcon->targetlist))
|
|
|
|
rcon->rv_cache[varattno] = copyObject(newnode);
|
|
|
|
}
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* Must adjust varlevelsup if replaced Var is within a subquery */
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
if (var->varlevelsup > 0)
|
|
|
|
IncrementVarSublevelsUp(newnode, var->varlevelsup, 0);
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* Propagate any varnullingrels into the replacement Var or PHV */
|
|
|
|
if (var->varnullingrels != NULL)
|
|
|
|
{
|
|
|
|
if (IsA(newnode, Var))
|
|
|
|
{
|
|
|
|
Var *newvar = (Var *) newnode;
|
|
|
|
|
|
|
|
Assert(newvar->varlevelsup == var->varlevelsup);
|
|
|
|
newvar->varnullingrels = bms_add_members(newvar->varnullingrels,
|
|
|
|
var->varnullingrels);
|
|
|
|
}
|
|
|
|
else if (IsA(newnode, PlaceHolderVar))
|
|
|
|
{
|
|
|
|
PlaceHolderVar *newphv = (PlaceHolderVar *) newnode;
|
|
|
|
|
|
|
|
Assert(newphv->phlevelsup == var->varlevelsup);
|
|
|
|
newphv->phnullingrels = bms_add_members(newphv->phnullingrels,
|
|
|
|
var->varnullingrels);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
elog(ERROR, "failed to wrap a non-Var");
|
|
|
|
}
|
|
|
|
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
return newnode;
|
|
|
|
}
|
|
|
|
|
2012-08-18 20:10:17 +02:00
|
|
|
/*
|
|
|
|
* Apply pullup variable replacement to a subquery
|
|
|
|
*
|
|
|
|
* This needs to be different from pullup_replace_vars() because
|
|
|
|
* replace_rte_variables will think that it shouldn't increment sublevels_up
|
|
|
|
* before entering the Query; so we need to call it with sublevels_up == 1.
|
|
|
|
*/
|
|
|
|
static Query *
|
|
|
|
pullup_replace_vars_subquery(Query *query,
|
|
|
|
pullup_replace_vars_context *context)
|
|
|
|
{
|
|
|
|
Assert(IsA(query, Query));
|
|
|
|
return (Query *) replace_rte_variables((Node *) query,
|
|
|
|
context->varno, 1,
|
|
|
|
pullup_replace_vars_callback,
|
|
|
|
(void *) context,
|
|
|
|
NULL);
|
|
|
|
}
|
|
|
|
|
2010-11-08 21:15:02 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* flatten_simple_union_all
|
|
|
|
* Try to optimize top-level UNION ALL structure into an appendrel
|
|
|
|
*
|
|
|
|
* If a query's setOperations tree consists entirely of simple UNION ALL
|
|
|
|
* operations, flatten it into an append relation, which we can process more
|
|
|
|
* intelligently than the general setops case. Otherwise, do nothing.
|
|
|
|
*
|
|
|
|
* In most cases, this can succeed only for a top-level query, because for a
|
|
|
|
* subquery in FROM, the parent query's invocation of pull_up_subqueries would
|
|
|
|
* already have flattened the UNION via pull_up_simple_union_all. But there
|
|
|
|
* are a few cases we can support here but not in that code path, for example
|
|
|
|
* when the subquery also contains ORDER BY.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
flatten_simple_union_all(PlannerInfo *root)
|
|
|
|
{
|
|
|
|
Query *parse = root->parse;
|
|
|
|
SetOperationStmt *topop;
|
|
|
|
Node *leftmostjtnode;
|
|
|
|
int leftmostRTI;
|
|
|
|
RangeTblEntry *leftmostRTE;
|
|
|
|
int childRTI;
|
|
|
|
RangeTblEntry *childRTE;
|
|
|
|
RangeTblRef *rtr;
|
|
|
|
|
|
|
|
/* Shouldn't be called unless query has setops */
|
2017-02-21 17:33:07 +01:00
|
|
|
topop = castNode(SetOperationStmt, parse->setOperations);
|
|
|
|
Assert(topop);
|
2010-11-08 21:15:02 +01:00
|
|
|
|
|
|
|
/* Can't optimize away a recursive UNION */
|
|
|
|
if (root->hasRecursion)
|
|
|
|
return;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Recursively check the tree of set operations. If not all UNION ALL
|
|
|
|
* with identical column types, punt.
|
|
|
|
*/
|
|
|
|
if (!is_simple_union_all_recurse((Node *) topop, parse, topop->colTypes))
|
|
|
|
return;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Locate the leftmost leaf query in the setops tree. The upper query's
|
|
|
|
* Vars all refer to this RTE (see transformSetOperationStmt).
|
|
|
|
*/
|
|
|
|
leftmostjtnode = topop->larg;
|
|
|
|
while (leftmostjtnode && IsA(leftmostjtnode, SetOperationStmt))
|
|
|
|
leftmostjtnode = ((SetOperationStmt *) leftmostjtnode)->larg;
|
|
|
|
Assert(leftmostjtnode && IsA(leftmostjtnode, RangeTblRef));
|
|
|
|
leftmostRTI = ((RangeTblRef *) leftmostjtnode)->rtindex;
|
|
|
|
leftmostRTE = rt_fetch(leftmostRTI, parse->rtable);
|
|
|
|
Assert(leftmostRTE->rtekind == RTE_SUBQUERY);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Make a copy of the leftmost RTE and add it to the rtable. This copy
|
|
|
|
* will represent the leftmost leaf query in its capacity as a member of
|
|
|
|
* the appendrel. The original will represent the appendrel as a whole.
|
|
|
|
* (We must do things this way because the upper query's Vars have to be
|
|
|
|
* seen as referring to the whole appendrel.)
|
|
|
|
*/
|
|
|
|
childRTE = copyObject(leftmostRTE);
|
|
|
|
parse->rtable = lappend(parse->rtable, childRTE);
|
|
|
|
childRTI = list_length(parse->rtable);
|
|
|
|
|
|
|
|
/* Modify the setops tree to reference the child copy */
|
|
|
|
((RangeTblRef *) leftmostjtnode)->rtindex = childRTI;
|
|
|
|
|
|
|
|
/* Modify the formerly-leftmost RTE to mark it as an appendrel parent */
|
|
|
|
leftmostRTE->inh = true;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Form a RangeTblRef for the appendrel, and insert it into FROM. The top
|
|
|
|
* Query of a setops tree should have had an empty FromClause initially.
|
|
|
|
*/
|
|
|
|
rtr = makeNode(RangeTblRef);
|
|
|
|
rtr->rtindex = leftmostRTI;
|
|
|
|
Assert(parse->jointree->fromlist == NIL);
|
|
|
|
parse->jointree->fromlist = list_make1(rtr);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Now pretend the query has no setops. We must do this before trying to
|
|
|
|
* do subquery pullup, because of Assert in pull_up_simple_subquery.
|
|
|
|
*/
|
|
|
|
parse->setOperations = NULL;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Build AppendRelInfo information, and apply pull_up_subqueries to the
|
|
|
|
* leaf queries of the UNION ALL. (We must do that now because they
|
|
|
|
* weren't previously referenced by the jointree, and so were missed by
|
|
|
|
* the main invocation of pull_up_subqueries.)
|
|
|
|
*/
|
|
|
|
pull_up_union_leaf_queries((Node *) topop, root, leftmostRTI, parse, 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
|
2003-01-20 19:55:07 +01:00
|
|
|
/*
|
2003-02-10 00:57:19 +01:00
|
|
|
* reduce_outer_joins
|
|
|
|
* Attempt to reduce outer joins to plain inner joins.
|
|
|
|
*
|
|
|
|
* The idea here is that given a query like
|
|
|
|
* SELECT ... FROM a LEFT JOIN b ON (...) WHERE b.y = 42;
|
|
|
|
* we can reduce the LEFT JOIN to a plain JOIN if the "=" operator in WHERE
|
|
|
|
* is strict. The strict operator will always return NULL, causing the outer
|
|
|
|
* WHERE to fail, on any row where the LEFT JOIN filled in NULLs for b's
|
|
|
|
* columns. Therefore, there's no need for the join to produce null-extended
|
|
|
|
* rows in the first place --- which makes it a plain join not an outer join.
|
|
|
|
* (This scenario may not be very likely in a query written out by hand, but
|
|
|
|
* it's reasonably likely when pushing quals down into complex views.)
|
|
|
|
*
|
|
|
|
* More generally, an outer join can be reduced in strength if there is a
|
|
|
|
* strict qual above it in the qual tree that constrains a Var from the
|
|
|
|
* nullable side of the join to be non-null. (For FULL joins this applies
|
|
|
|
* to each side separately.)
|
|
|
|
*
|
2008-08-14 20:48:00 +02:00
|
|
|
* Another transformation we apply here is to recognize cases like
|
|
|
|
* SELECT ... FROM a LEFT JOIN b ON (a.x = b.y) WHERE b.y IS NULL;
|
|
|
|
* If the join clause is strict for b.y, then only null-extended rows could
|
|
|
|
* pass the upper WHERE, and we can conclude that what the query is really
|
|
|
|
* specifying is an anti-semijoin. We change the join type from JOIN_LEFT
|
|
|
|
* to JOIN_ANTI. The IS NULL clause then becomes redundant, and must be
|
|
|
|
* removed to prevent bogus selectivity calculations, but we leave it to
|
|
|
|
* distribute_qual_to_rels to get rid of such clauses.
|
|
|
|
*
|
|
|
|
* Also, we get rid of JOIN_RIGHT cases by flipping them around to become
|
2023-04-05 22:59:00 +02:00
|
|
|
* JOIN_LEFT. This saves some code here and in some later planner routines;
|
|
|
|
* the main benefit is to reduce the number of jointypes that can appear in
|
|
|
|
* SpecialJoinInfo nodes. Note that we can still generate Paths and Plans
|
|
|
|
* that use JOIN_RIGHT (or JOIN_RIGHT_ANTI) by switching the inputs again.
|
2008-08-14 20:48:00 +02:00
|
|
|
*
|
2003-02-10 00:57:19 +01:00
|
|
|
* To ease recognition of strict qual clauses, we require this routine to be
|
|
|
|
* run after expression preprocessing (i.e., qual canonicalization and JOIN
|
|
|
|
* alias-var expansion).
|
|
|
|
*/
|
|
|
|
void
|
2005-06-06 00:32:58 +02:00
|
|
|
reduce_outer_joins(PlannerInfo *root)
|
2003-02-10 00:57:19 +01:00
|
|
|
{
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass1_state *state1;
|
|
|
|
reduce_outer_joins_pass2_state state2;
|
|
|
|
ListCell *lc;
|
2003-02-10 00:57:19 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* To avoid doing strictness checks on more quals than necessary, we want
|
|
|
|
* to stop descending the jointree as soon as there are no outer joins
|
|
|
|
* below our current point. This consideration forces a two-pass process.
|
|
|
|
* The first pass gathers information about which base rels appear below
|
|
|
|
* each side of each join clause, and about whether there are outer
|
|
|
|
* join(s) below each side of each join clause. The second pass examines
|
|
|
|
* qual clauses and changes join types as it descends the tree.
|
|
|
|
*/
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
state1 = reduce_outer_joins_pass1((Node *) root->parse->jointree);
|
2003-02-10 00:57:19 +01:00
|
|
|
|
|
|
|
/* planner.c shouldn't have called me if no outer joins */
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
if (state1 == NULL || !state1->contains_outer)
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "so where are the outer joins?");
|
2003-02-10 00:57:19 +01:00
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
state2.inner_reduced = NULL;
|
|
|
|
state2.partial_reduced = NIL;
|
|
|
|
|
2005-06-06 00:32:58 +02:00
|
|
|
reduce_outer_joins_pass2((Node *) root->parse->jointree,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
state1, &state2,
|
|
|
|
root, NULL, NIL);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we successfully reduced the strength of any outer joins, we must
|
|
|
|
* remove references to those joins as nulling rels. This is handled as
|
|
|
|
* an additional pass, for simplicity and because we can handle all
|
|
|
|
* fully-reduced joins in a single pass over the parse tree.
|
|
|
|
*/
|
|
|
|
if (!bms_is_empty(state2.inner_reduced))
|
|
|
|
{
|
|
|
|
root->parse = (Query *)
|
|
|
|
remove_nulling_relids((Node *) root->parse,
|
|
|
|
state2.inner_reduced,
|
|
|
|
NULL);
|
|
|
|
/* There could be references in the append_rel_list, too */
|
|
|
|
root->append_rel_list = (List *)
|
|
|
|
remove_nulling_relids((Node *) root->append_rel_list,
|
|
|
|
state2.inner_reduced,
|
|
|
|
NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Partially-reduced full joins have to be done one at a time, since
|
|
|
|
* they'll each need a different setting of except_relids.
|
|
|
|
*/
|
|
|
|
foreach(lc, state2.partial_reduced)
|
|
|
|
{
|
|
|
|
reduce_outer_joins_partial_state *statep = lfirst(lc);
|
|
|
|
Relids full_join_relids = bms_make_singleton(statep->full_join_rti);
|
|
|
|
|
|
|
|
root->parse = (Query *)
|
|
|
|
remove_nulling_relids((Node *) root->parse,
|
|
|
|
full_join_relids,
|
|
|
|
statep->unreduced_side);
|
|
|
|
root->append_rel_list = (List *)
|
|
|
|
remove_nulling_relids((Node *) root->append_rel_list,
|
|
|
|
full_join_relids,
|
|
|
|
statep->unreduced_side);
|
|
|
|
}
|
2003-02-10 00:57:19 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* reduce_outer_joins_pass1 - phase 1 data collection
|
|
|
|
*
|
|
|
|
* Returns a state node describing the given jointree node.
|
|
|
|
*/
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
static reduce_outer_joins_pass1_state *
|
2003-02-10 00:57:19 +01:00
|
|
|
reduce_outer_joins_pass1(Node *jtnode)
|
|
|
|
{
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass1_state *result;
|
2003-02-10 00:57:19 +01:00
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
result = (reduce_outer_joins_pass1_state *)
|
|
|
|
palloc(sizeof(reduce_outer_joins_pass1_state));
|
2003-02-10 00:57:19 +01:00
|
|
|
result->relids = NULL;
|
|
|
|
result->contains_outer = false;
|
|
|
|
result->sub_states = NIL;
|
|
|
|
|
|
|
|
if (jtnode == NULL)
|
|
|
|
return result;
|
|
|
|
if (IsA(jtnode, RangeTblRef))
|
|
|
|
{
|
|
|
|
int varno = ((RangeTblRef *) jtnode)->rtindex;
|
|
|
|
|
|
|
|
result->relids = bms_make_singleton(varno);
|
|
|
|
}
|
|
|
|
else if (IsA(jtnode, FromExpr))
|
|
|
|
{
|
|
|
|
FromExpr *f = (FromExpr *) jtnode;
|
2004-05-26 06:41:50 +02:00
|
|
|
ListCell *l;
|
2003-02-10 00:57:19 +01:00
|
|
|
|
|
|
|
foreach(l, f->fromlist)
|
|
|
|
{
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass1_state *sub_state;
|
2003-02-10 00:57:19 +01:00
|
|
|
|
|
|
|
sub_state = reduce_outer_joins_pass1(lfirst(l));
|
|
|
|
result->relids = bms_add_members(result->relids,
|
|
|
|
sub_state->relids);
|
|
|
|
result->contains_outer |= sub_state->contains_outer;
|
|
|
|
result->sub_states = lappend(result->sub_states, sub_state);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else if (IsA(jtnode, JoinExpr))
|
|
|
|
{
|
|
|
|
JoinExpr *j = (JoinExpr *) jtnode;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass1_state *sub_state;
|
2003-02-10 00:57:19 +01:00
|
|
|
|
|
|
|
/* join's own RT index is not wanted in result->relids */
|
|
|
|
if (IS_OUTER_JOIN(j->jointype))
|
|
|
|
result->contains_outer = true;
|
|
|
|
|
|
|
|
sub_state = reduce_outer_joins_pass1(j->larg);
|
|
|
|
result->relids = bms_add_members(result->relids,
|
|
|
|
sub_state->relids);
|
|
|
|
result->contains_outer |= sub_state->contains_outer;
|
|
|
|
result->sub_states = lappend(result->sub_states, sub_state);
|
|
|
|
|
|
|
|
sub_state = reduce_outer_joins_pass1(j->rarg);
|
|
|
|
result->relids = bms_add_members(result->relids,
|
|
|
|
sub_state->relids);
|
|
|
|
result->contains_outer |= sub_state->contains_outer;
|
|
|
|
result->sub_states = lappend(result->sub_states, sub_state);
|
|
|
|
}
|
|
|
|
else
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(jtnode));
|
2003-02-10 00:57:19 +01:00
|
|
|
return result;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* reduce_outer_joins_pass2 - phase 2 processing
|
|
|
|
*
|
|
|
|
* jtnode: current jointree node
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* state1: state data collected by phase 1 for this node
|
|
|
|
* state2: where to accumulate info about successfully-reduced joins
|
2005-06-06 00:32:58 +02:00
|
|
|
* root: toplevel planner state
|
2003-02-10 00:57:19 +01:00
|
|
|
* nonnullable_rels: set of base relids forced non-null by upper quals
|
Invent "multibitmapsets", and use them to speed up antijoin detection.
Implement a data structure that is a List of Bitmapsets, which is
essentially a 2-D boolean array except that the rows need not all
be the same width. Operations such as union and intersection are
meaningful for these, just as they are for Bitmapsets. Eventually
we might build many of the same operations that we have written for
Bitmapsets, but for the first use-case we just need a few.
That first use-case is for antijoin detection: reduce_outer_joins
needs to find the set of Vars that are certain to be non-null in a
successfully joined (not null-extended) left join row, and also
find the set of Vars subject to higher-level IS NULL constraints,
and intersect them. We had been doing this by making Lists of
the Var nodes and then using list_intersect, which works but is
pretty inefficient compared to a bitmapset-like intersection.
Potentially it's O(N^2) if there are a lot of Vars involved,
which fortunately there generally aren't; still it's not great.
Moreover, that method requires the Vars of interest to be exactly
equal() in the join condition and the upper IS NULL condition,
which is problematic for my WIP patch that labels Vars according
to which outer joins have possibly nulled them.
Discussion: https://postgr.es/m/892228.1668437838@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAMbWs4-mvPPCJ1W6iK6dD5HiNwoJdi6mZp=-7mE8N9Sh+cd0tQ@mail.gmail.com
2022-11-16 19:58:42 +01:00
|
|
|
* forced_null_vars: multibitmapset of Vars forced null by upper quals
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
*
|
|
|
|
* Returns info in state2 about outer joins that were successfully simplified.
|
|
|
|
* Joins that were fully reduced to inner joins are all added to
|
|
|
|
* state2->inner_reduced. If a full join is reduced to a left join,
|
|
|
|
* it needs its own entry in state2->partial_reduced, since that will
|
|
|
|
* require custom processing to remove only the correct nullingrel markers.
|
2003-02-10 00:57:19 +01:00
|
|
|
*/
|
|
|
|
static void
|
|
|
|
reduce_outer_joins_pass2(Node *jtnode,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass1_state *state1,
|
|
|
|
reduce_outer_joins_pass2_state *state2,
|
2005-06-06 00:32:58 +02:00
|
|
|
PlannerInfo *root,
|
2008-08-14 20:48:00 +02:00
|
|
|
Relids nonnullable_rels,
|
|
|
|
List *forced_null_vars)
|
2003-02-10 00:57:19 +01:00
|
|
|
{
|
|
|
|
/*
|
|
|
|
* pass 2 should never descend as far as an empty subnode or base rel,
|
|
|
|
* because it's only called on subtrees marked as contains_outer.
|
|
|
|
*/
|
|
|
|
if (jtnode == NULL)
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "reached empty jointree");
|
2003-02-10 00:57:19 +01:00
|
|
|
if (IsA(jtnode, RangeTblRef))
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "reached base rel");
|
2003-02-10 00:57:19 +01:00
|
|
|
else if (IsA(jtnode, FromExpr))
|
|
|
|
{
|
|
|
|
FromExpr *f = (FromExpr *) jtnode;
|
2004-05-26 06:41:50 +02:00
|
|
|
ListCell *l;
|
|
|
|
ListCell *s;
|
2008-08-14 20:48:00 +02:00
|
|
|
Relids pass_nonnullable_rels;
|
|
|
|
List *pass_forced_null_vars;
|
|
|
|
|
|
|
|
/* Scan quals to see if we can add any constraints */
|
|
|
|
pass_nonnullable_rels = find_nonnullable_rels(f->quals);
|
|
|
|
pass_nonnullable_rels = bms_add_members(pass_nonnullable_rels,
|
|
|
|
nonnullable_rels);
|
|
|
|
pass_forced_null_vars = find_forced_null_vars(f->quals);
|
Invent "multibitmapsets", and use them to speed up antijoin detection.
Implement a data structure that is a List of Bitmapsets, which is
essentially a 2-D boolean array except that the rows need not all
be the same width. Operations such as union and intersection are
meaningful for these, just as they are for Bitmapsets. Eventually
we might build many of the same operations that we have written for
Bitmapsets, but for the first use-case we just need a few.
That first use-case is for antijoin detection: reduce_outer_joins
needs to find the set of Vars that are certain to be non-null in a
successfully joined (not null-extended) left join row, and also
find the set of Vars subject to higher-level IS NULL constraints,
and intersect them. We had been doing this by making Lists of
the Var nodes and then using list_intersect, which works but is
pretty inefficient compared to a bitmapset-like intersection.
Potentially it's O(N^2) if there are a lot of Vars involved,
which fortunately there generally aren't; still it's not great.
Moreover, that method requires the Vars of interest to be exactly
equal() in the join condition and the upper IS NULL condition,
which is problematic for my WIP patch that labels Vars according
to which outer joins have possibly nulled them.
Discussion: https://postgr.es/m/892228.1668437838@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAMbWs4-mvPPCJ1W6iK6dD5HiNwoJdi6mZp=-7mE8N9Sh+cd0tQ@mail.gmail.com
2022-11-16 19:58:42 +01:00
|
|
|
pass_forced_null_vars = mbms_add_members(pass_forced_null_vars,
|
|
|
|
forced_null_vars);
|
2003-02-10 00:57:19 +01:00
|
|
|
/* And recurse --- but only into interesting subtrees */
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
Assert(list_length(f->fromlist) == list_length(state1->sub_states));
|
|
|
|
forboth(l, f->fromlist, s, state1->sub_states)
|
2003-02-10 00:57:19 +01:00
|
|
|
{
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass1_state *sub_state = lfirst(s);
|
2003-02-10 00:57:19 +01:00
|
|
|
|
|
|
|
if (sub_state->contains_outer)
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass2(lfirst(l), sub_state,
|
|
|
|
state2, root,
|
2008-08-14 20:48:00 +02:00
|
|
|
pass_nonnullable_rels,
|
|
|
|
pass_forced_null_vars);
|
2003-02-10 00:57:19 +01:00
|
|
|
}
|
2008-08-14 20:48:00 +02:00
|
|
|
bms_free(pass_nonnullable_rels);
|
|
|
|
/* can't so easily clean up var lists, unfortunately */
|
2003-02-10 00:57:19 +01:00
|
|
|
}
|
|
|
|
else if (IsA(jtnode, JoinExpr))
|
|
|
|
{
|
|
|
|
JoinExpr *j = (JoinExpr *) jtnode;
|
|
|
|
int rtindex = j->rtindex;
|
|
|
|
JoinType jointype = j->jointype;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass1_state *left_state = linitial(state1->sub_states);
|
|
|
|
reduce_outer_joins_pass1_state *right_state = lsecond(state1->sub_states);
|
2003-02-10 00:57:19 +01:00
|
|
|
|
|
|
|
/* Can we simplify this join? */
|
|
|
|
switch (jointype)
|
|
|
|
{
|
2008-08-14 20:48:00 +02:00
|
|
|
case JOIN_INNER:
|
|
|
|
break;
|
2003-02-10 00:57:19 +01:00
|
|
|
case JOIN_LEFT:
|
|
|
|
if (bms_overlap(nonnullable_rels, right_state->relids))
|
|
|
|
jointype = JOIN_INNER;
|
|
|
|
break;
|
|
|
|
case JOIN_RIGHT:
|
|
|
|
if (bms_overlap(nonnullable_rels, left_state->relids))
|
|
|
|
jointype = JOIN_INNER;
|
|
|
|
break;
|
|
|
|
case JOIN_FULL:
|
|
|
|
if (bms_overlap(nonnullable_rels, left_state->relids))
|
|
|
|
{
|
|
|
|
if (bms_overlap(nonnullable_rels, right_state->relids))
|
|
|
|
jointype = JOIN_INNER;
|
|
|
|
else
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
{
|
2003-02-10 00:57:19 +01:00
|
|
|
jointype = JOIN_LEFT;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* Also report partial reduction in state2 */
|
|
|
|
report_reduced_full_join(state2, rtindex,
|
|
|
|
right_state->relids);
|
|
|
|
}
|
2003-02-10 00:57:19 +01:00
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
if (bms_overlap(nonnullable_rels, right_state->relids))
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
{
|
2003-02-10 00:57:19 +01:00
|
|
|
jointype = JOIN_RIGHT;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* Also report partial reduction in state2 */
|
|
|
|
report_reduced_full_join(state2, rtindex,
|
|
|
|
left_state->relids);
|
|
|
|
}
|
2003-02-10 00:57:19 +01:00
|
|
|
}
|
|
|
|
break;
|
2009-02-25 04:30:38 +01:00
|
|
|
case JOIN_SEMI:
|
|
|
|
case JOIN_ANTI:
|
2009-06-11 16:49:15 +02:00
|
|
|
|
2009-02-25 04:30:38 +01:00
|
|
|
/*
|
|
|
|
* These could only have been introduced by pull_up_sublinks,
|
|
|
|
* so there's no way that upper quals could refer to their
|
2023-04-05 22:59:00 +02:00
|
|
|
* righthand sides, and no point in checking. We don't expect
|
|
|
|
* to see JOIN_RIGHT_ANTI yet.
|
2009-02-25 04:30:38 +01:00
|
|
|
*/
|
|
|
|
break;
|
2003-02-10 00:57:19 +01:00
|
|
|
default:
|
2008-08-14 20:48:00 +02:00
|
|
|
elog(ERROR, "unrecognized join type: %d",
|
|
|
|
(int) jointype);
|
2003-02-10 00:57:19 +01:00
|
|
|
break;
|
|
|
|
}
|
2008-08-14 20:48:00 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Convert JOIN_RIGHT to JOIN_LEFT. Note that in the case where we
|
|
|
|
* reduced JOIN_FULL to JOIN_RIGHT, this will mean the JoinExpr no
|
|
|
|
* longer matches the internal ordering of any CoalesceExpr's built to
|
|
|
|
* represent merged join variables. We don't care about that at
|
|
|
|
* present, but be wary of it ...
|
|
|
|
*/
|
|
|
|
if (jointype == JOIN_RIGHT)
|
|
|
|
{
|
|
|
|
Node *tmparg;
|
|
|
|
|
|
|
|
tmparg = j->larg;
|
|
|
|
j->larg = j->rarg;
|
|
|
|
j->rarg = tmparg;
|
|
|
|
jointype = JOIN_LEFT;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
right_state = linitial(state1->sub_states);
|
|
|
|
left_state = lsecond(state1->sub_states);
|
2008-08-14 20:48:00 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* See if we can reduce JOIN_LEFT to JOIN_ANTI. This is the case if
|
|
|
|
* the join's own quals are strict for any var that was forced null by
|
|
|
|
* higher qual levels. NOTE: there are other ways that we could
|
|
|
|
* detect an anti-join, in particular if we were to check whether Vars
|
|
|
|
* coming from the RHS must be non-null because of table constraints.
|
|
|
|
* That seems complicated and expensive though (in particular, one
|
|
|
|
* would have to be wary of lower outer joins). For the moment this
|
|
|
|
* seems sufficient.
|
|
|
|
*/
|
|
|
|
if (jointype == JOIN_LEFT)
|
|
|
|
{
|
2022-11-05 20:58:51 +01:00
|
|
|
List *nonnullable_vars;
|
Invent "multibitmapsets", and use them to speed up antijoin detection.
Implement a data structure that is a List of Bitmapsets, which is
essentially a 2-D boolean array except that the rows need not all
be the same width. Operations such as union and intersection are
meaningful for these, just as they are for Bitmapsets. Eventually
we might build many of the same operations that we have written for
Bitmapsets, but for the first use-case we just need a few.
That first use-case is for antijoin detection: reduce_outer_joins
needs to find the set of Vars that are certain to be non-null in a
successfully joined (not null-extended) left join row, and also
find the set of Vars subject to higher-level IS NULL constraints,
and intersect them. We had been doing this by making Lists of
the Var nodes and then using list_intersect, which works but is
pretty inefficient compared to a bitmapset-like intersection.
Potentially it's O(N^2) if there are a lot of Vars involved,
which fortunately there generally aren't; still it's not great.
Moreover, that method requires the Vars of interest to be exactly
equal() in the join condition and the upper IS NULL condition,
which is problematic for my WIP patch that labels Vars according
to which outer joins have possibly nulled them.
Discussion: https://postgr.es/m/892228.1668437838@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAMbWs4-mvPPCJ1W6iK6dD5HiNwoJdi6mZp=-7mE8N9Sh+cd0tQ@mail.gmail.com
2022-11-16 19:58:42 +01:00
|
|
|
Bitmapset *overlap;
|
2008-08-14 20:48:00 +02:00
|
|
|
|
2022-11-05 20:58:51 +01:00
|
|
|
/* Find Vars in j->quals that must be non-null in joined rows */
|
|
|
|
nonnullable_vars = find_nonnullable_vars(j->quals);
|
2008-08-14 20:48:00 +02:00
|
|
|
|
|
|
|
/*
|
2022-11-05 20:58:51 +01:00
|
|
|
* It's not sufficient to check whether nonnullable_vars and
|
2008-08-14 20:48:00 +02:00
|
|
|
* forced_null_vars overlap: we need to know if the overlap
|
|
|
|
* includes any RHS variables.
|
|
|
|
*/
|
Invent "multibitmapsets", and use them to speed up antijoin detection.
Implement a data structure that is a List of Bitmapsets, which is
essentially a 2-D boolean array except that the rows need not all
be the same width. Operations such as union and intersection are
meaningful for these, just as they are for Bitmapsets. Eventually
we might build many of the same operations that we have written for
Bitmapsets, but for the first use-case we just need a few.
That first use-case is for antijoin detection: reduce_outer_joins
needs to find the set of Vars that are certain to be non-null in a
successfully joined (not null-extended) left join row, and also
find the set of Vars subject to higher-level IS NULL constraints,
and intersect them. We had been doing this by making Lists of
the Var nodes and then using list_intersect, which works but is
pretty inefficient compared to a bitmapset-like intersection.
Potentially it's O(N^2) if there are a lot of Vars involved,
which fortunately there generally aren't; still it's not great.
Moreover, that method requires the Vars of interest to be exactly
equal() in the join condition and the upper IS NULL condition,
which is problematic for my WIP patch that labels Vars according
to which outer joins have possibly nulled them.
Discussion: https://postgr.es/m/892228.1668437838@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAMbWs4-mvPPCJ1W6iK6dD5HiNwoJdi6mZp=-7mE8N9Sh+cd0tQ@mail.gmail.com
2022-11-16 19:58:42 +01:00
|
|
|
overlap = mbms_overlap_sets(nonnullable_vars, forced_null_vars);
|
|
|
|
if (bms_overlap(overlap, right_state->relids))
|
2008-08-14 20:48:00 +02:00
|
|
|
jointype = JOIN_ANTI;
|
|
|
|
}
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/*
|
|
|
|
* Apply the jointype change, if any, to both jointree node and RTE.
|
|
|
|
* Also, if we changed an RTE to INNER, add its RTI to inner_reduced.
|
|
|
|
*/
|
2009-02-25 04:30:38 +01:00
|
|
|
if (rtindex && jointype != j->jointype)
|
2003-02-10 00:57:19 +01:00
|
|
|
{
|
2005-06-06 00:32:58 +02:00
|
|
|
RangeTblEntry *rte = rt_fetch(rtindex, root->parse->rtable);
|
2003-02-10 00:57:19 +01:00
|
|
|
|
|
|
|
Assert(rte->rtekind == RTE_JOIN);
|
|
|
|
Assert(rte->jointype == j->jointype);
|
2009-02-25 04:30:38 +01:00
|
|
|
rte->jointype = jointype;
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
if (jointype == JOIN_INNER)
|
|
|
|
state2->inner_reduced = bms_add_member(state2->inner_reduced,
|
|
|
|
rtindex);
|
2003-02-10 00:57:19 +01:00
|
|
|
}
|
2009-02-25 04:30:38 +01:00
|
|
|
j->jointype = jointype;
|
2003-02-10 00:57:19 +01:00
|
|
|
|
|
|
|
/* Only recurse if there's more to do below here */
|
|
|
|
if (left_state->contains_outer || right_state->contains_outer)
|
|
|
|
{
|
2008-08-14 20:48:00 +02:00
|
|
|
Relids local_nonnullable_rels;
|
|
|
|
List *local_forced_null_vars;
|
|
|
|
Relids pass_nonnullable_rels;
|
|
|
|
List *pass_forced_null_vars;
|
2003-02-10 00:57:19 +01:00
|
|
|
|
|
|
|
/*
|
2008-08-14 20:48:00 +02:00
|
|
|
* If this join is (now) inner, we can add any constraints its
|
|
|
|
* quals provide to those we got from above. But if it is outer,
|
|
|
|
* we can pass down the local constraints only into the nullable
|
|
|
|
* side, because an outer join never eliminates any rows from its
|
|
|
|
* non-nullable side. Also, there is no point in passing upper
|
|
|
|
* constraints into the nullable side, since if there were any
|
|
|
|
* we'd have been able to reduce the join. (In the case of upper
|
|
|
|
* forced-null constraints, we *must not* pass them into the
|
|
|
|
* nullable side --- they either applied here, or not.) The upshot
|
|
|
|
* is that we pass either the local or the upper constraints,
|
|
|
|
* never both, to the children of an outer join.
|
|
|
|
*
|
2011-01-30 23:04:31 +01:00
|
|
|
* Note that a SEMI join works like an inner join here: it's okay
|
|
|
|
* to pass down both local and upper constraints. (There can't be
|
|
|
|
* any upper constraints affecting its inner side, but it's not
|
|
|
|
* worth having a separate code path to avoid passing them.)
|
|
|
|
*
|
2008-08-14 20:48:00 +02:00
|
|
|
* At a FULL join we just punt and pass nothing down --- is it
|
|
|
|
* possible to be smarter?
|
2003-02-10 00:57:19 +01:00
|
|
|
*/
|
2003-02-10 18:08:50 +01:00
|
|
|
if (jointype != JOIN_FULL)
|
|
|
|
{
|
2008-08-14 20:48:00 +02:00
|
|
|
local_nonnullable_rels = find_nonnullable_rels(j->quals);
|
|
|
|
local_forced_null_vars = find_forced_null_vars(j->quals);
|
2011-01-30 23:04:31 +01:00
|
|
|
if (jointype == JOIN_INNER || jointype == JOIN_SEMI)
|
2008-08-14 20:48:00 +02:00
|
|
|
{
|
|
|
|
/* OK to merge upper and local constraints */
|
|
|
|
local_nonnullable_rels = bms_add_members(local_nonnullable_rels,
|
|
|
|
nonnullable_rels);
|
Invent "multibitmapsets", and use them to speed up antijoin detection.
Implement a data structure that is a List of Bitmapsets, which is
essentially a 2-D boolean array except that the rows need not all
be the same width. Operations such as union and intersection are
meaningful for these, just as they are for Bitmapsets. Eventually
we might build many of the same operations that we have written for
Bitmapsets, but for the first use-case we just need a few.
That first use-case is for antijoin detection: reduce_outer_joins
needs to find the set of Vars that are certain to be non-null in a
successfully joined (not null-extended) left join row, and also
find the set of Vars subject to higher-level IS NULL constraints,
and intersect them. We had been doing this by making Lists of
the Var nodes and then using list_intersect, which works but is
pretty inefficient compared to a bitmapset-like intersection.
Potentially it's O(N^2) if there are a lot of Vars involved,
which fortunately there generally aren't; still it's not great.
Moreover, that method requires the Vars of interest to be exactly
equal() in the join condition and the upper IS NULL condition,
which is problematic for my WIP patch that labels Vars according
to which outer joins have possibly nulled them.
Discussion: https://postgr.es/m/892228.1668437838@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAMbWs4-mvPPCJ1W6iK6dD5HiNwoJdi6mZp=-7mE8N9Sh+cd0tQ@mail.gmail.com
2022-11-16 19:58:42 +01:00
|
|
|
local_forced_null_vars = mbms_add_members(local_forced_null_vars,
|
|
|
|
forced_null_vars);
|
2008-08-14 20:48:00 +02:00
|
|
|
}
|
2003-02-10 18:08:50 +01:00
|
|
|
}
|
|
|
|
else
|
2008-08-14 20:48:00 +02:00
|
|
|
{
|
|
|
|
/* no use in calculating these */
|
|
|
|
local_nonnullable_rels = NULL;
|
|
|
|
local_forced_null_vars = NIL;
|
|
|
|
}
|
2003-02-10 18:08:50 +01:00
|
|
|
|
2003-02-10 00:57:19 +01:00
|
|
|
if (left_state->contains_outer)
|
2003-02-10 18:08:50 +01:00
|
|
|
{
|
2011-01-30 23:04:31 +01:00
|
|
|
if (jointype == JOIN_INNER || jointype == JOIN_SEMI)
|
2008-08-14 20:48:00 +02:00
|
|
|
{
|
|
|
|
/* pass union of local and upper constraints */
|
|
|
|
pass_nonnullable_rels = local_nonnullable_rels;
|
|
|
|
pass_forced_null_vars = local_forced_null_vars;
|
|
|
|
}
|
2011-01-30 23:04:31 +01:00
|
|
|
else if (jointype != JOIN_FULL) /* ie, LEFT or ANTI */
|
2008-08-14 20:48:00 +02:00
|
|
|
{
|
|
|
|
/* can't pass local constraints to non-nullable side */
|
|
|
|
pass_nonnullable_rels = nonnullable_rels;
|
|
|
|
pass_forced_null_vars = forced_null_vars;
|
|
|
|
}
|
2003-02-10 18:08:50 +01:00
|
|
|
else
|
2008-08-14 20:48:00 +02:00
|
|
|
{
|
|
|
|
/* no constraints pass through JOIN_FULL */
|
|
|
|
pass_nonnullable_rels = NULL;
|
|
|
|
pass_forced_null_vars = NIL;
|
|
|
|
}
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass2(j->larg, left_state,
|
|
|
|
state2, root,
|
2008-08-14 20:48:00 +02:00
|
|
|
pass_nonnullable_rels,
|
|
|
|
pass_forced_null_vars);
|
2003-02-10 18:08:50 +01:00
|
|
|
}
|
2008-08-14 20:48:00 +02:00
|
|
|
|
2003-02-10 00:57:19 +01:00
|
|
|
if (right_state->contains_outer)
|
2003-02-10 18:08:50 +01:00
|
|
|
{
|
2009-02-25 04:30:38 +01:00
|
|
|
if (jointype != JOIN_FULL) /* ie, INNER/LEFT/SEMI/ANTI */
|
2008-08-14 20:48:00 +02:00
|
|
|
{
|
|
|
|
/* pass appropriate constraints, per comment above */
|
|
|
|
pass_nonnullable_rels = local_nonnullable_rels;
|
|
|
|
pass_forced_null_vars = local_forced_null_vars;
|
|
|
|
}
|
2003-02-10 18:08:50 +01:00
|
|
|
else
|
2008-08-14 20:48:00 +02:00
|
|
|
{
|
|
|
|
/* no constraints pass through JOIN_FULL */
|
|
|
|
pass_nonnullable_rels = NULL;
|
|
|
|
pass_forced_null_vars = NIL;
|
|
|
|
}
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
reduce_outer_joins_pass2(j->rarg, right_state,
|
|
|
|
state2, root,
|
2008-08-14 20:48:00 +02:00
|
|
|
pass_nonnullable_rels,
|
|
|
|
pass_forced_null_vars);
|
2003-02-10 18:08:50 +01:00
|
|
|
}
|
2008-08-14 20:48:00 +02:00
|
|
|
bms_free(local_nonnullable_rels);
|
2003-02-10 00:57:19 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(jtnode));
|
2003-02-10 00:57:19 +01:00
|
|
|
}
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* Helper for reduce_outer_joins_pass2 */
|
|
|
|
static void
|
|
|
|
report_reduced_full_join(reduce_outer_joins_pass2_state *state2,
|
|
|
|
int rtindex, Relids relids)
|
|
|
|
{
|
|
|
|
reduce_outer_joins_partial_state *statep;
|
|
|
|
|
|
|
|
statep = palloc(sizeof(reduce_outer_joins_partial_state));
|
|
|
|
statep->full_join_rti = rtindex;
|
|
|
|
statep->unreduced_side = relids;
|
|
|
|
state2->partial_reduced = lappend(state2->partial_reduced, statep);
|
|
|
|
}
|
|
|
|
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* remove_useless_result_rtes
|
|
|
|
* Attempt to remove RTE_RESULT RTEs from the join tree.
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
* Also, elide single-child FromExprs where possible.
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*
|
|
|
|
* We can remove RTE_RESULT entries from the join tree using the knowledge
|
|
|
|
* that RTE_RESULT returns exactly one row and has no output columns. Hence,
|
|
|
|
* if one is inner-joined to anything else, we can delete it. Optimizations
|
|
|
|
* are also possible for some outer-join cases, as detailed below.
|
|
|
|
*
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
* This pass also replaces single-child FromExprs with their child node
|
|
|
|
* where possible. It's appropriate to do that here and not earlier because
|
|
|
|
* RTE_RESULT removal might reduce a multiple-child FromExpr to have only one
|
|
|
|
* child. We can remove such a FromExpr if its quals are empty, or if it's
|
|
|
|
* semantically valid to merge the quals into those of the parent node.
|
|
|
|
* While removing unnecessary join tree nodes has some micro-efficiency value,
|
|
|
|
* the real reason to do this is to eliminate cases where the nullable side of
|
|
|
|
* an outer join node is a FromExpr whose single child is another outer join.
|
|
|
|
* To correctly determine whether the two outer joins can commute,
|
|
|
|
* deconstruct_jointree() must treat any quals of such a FromExpr as being
|
|
|
|
* degenerate quals of the upper outer join. The best way to do that is to
|
|
|
|
* make them actually *be* quals of the upper join, by dropping the FromExpr
|
|
|
|
* and hoisting the quals up into the upper join's quals. (Note that there is
|
|
|
|
* no hazard when the intermediate FromExpr has multiple children, since then
|
|
|
|
* it represents an inner join that cannot commute with the upper outer join.)
|
|
|
|
* As long as we have to do that, we might as well elide such FromExprs
|
|
|
|
* everywhere.
|
|
|
|
*
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* Some of these optimizations depend on recognizing empty (constant-true)
|
|
|
|
* quals for FromExprs and JoinExprs. That makes it useful to apply this
|
|
|
|
* optimization pass after expression preprocessing, since that will have
|
|
|
|
* eliminated constant-true quals, allowing more cases to be recognized as
|
|
|
|
* optimizable. What's more, the usual reason for an RTE_RESULT to be present
|
|
|
|
* is that we pulled up a subquery or VALUES clause, thus very possibly
|
|
|
|
* replacing Vars with constants, making it more likely that a qual can be
|
|
|
|
* reduced to constant true. Also, because some optimizations depend on
|
|
|
|
* the outer-join type, it's best to have done reduce_outer_joins() first.
|
|
|
|
*
|
|
|
|
* A PlaceHolderVar referencing an RTE_RESULT RTE poses an obstacle to this
|
|
|
|
* process: we must remove the RTE_RESULT's relid from the PHV's phrels, but
|
|
|
|
* we must not reduce the phrels set to empty. If that would happen, and
|
|
|
|
* the RTE_RESULT is an immediate child of an outer join, we have to give up
|
|
|
|
* and not remove the RTE_RESULT: there is noplace else to evaluate the
|
|
|
|
* PlaceHolderVar. (That is, in such cases the RTE_RESULT *does* have output
|
|
|
|
* columns.) But if the RTE_RESULT is an immediate child of an inner join,
|
2019-12-14 19:49:15 +01:00
|
|
|
* we can usually change the PlaceHolderVar's phrels so as to evaluate it at
|
|
|
|
* the inner join instead. This is OK because we really only care that PHVs
|
|
|
|
* are evaluated above or below the correct outer joins. We can't, however,
|
|
|
|
* postpone the evaluation of a PHV to above where it is used; so there are
|
|
|
|
* some checks below on whether output PHVs are laterally referenced in the
|
|
|
|
* other join input rel(s).
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*
|
|
|
|
* We used to try to do this work as part of pull_up_subqueries() where the
|
|
|
|
* potentially-optimizable cases get introduced; but it's way simpler, and
|
|
|
|
* more effective, to do it separately.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
remove_useless_result_rtes(PlannerInfo *root)
|
|
|
|
{
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
Relids dropped_outer_joins = NULL;
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
ListCell *cell;
|
|
|
|
|
|
|
|
/* Top level of jointree must always be a FromExpr */
|
|
|
|
Assert(IsA(root->parse->jointree, FromExpr));
|
|
|
|
/* Recurse ... */
|
|
|
|
root->parse->jointree = (FromExpr *)
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
remove_useless_results_recurse(root,
|
|
|
|
(Node *) root->parse->jointree,
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
NULL,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
&dropped_outer_joins);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
/* We should still have a FromExpr */
|
|
|
|
Assert(IsA(root->parse->jointree, FromExpr));
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/*
|
|
|
|
* If we removed any outer-join nodes from the jointree, run around and
|
|
|
|
* remove references to those joins as nulling rels. (There could be such
|
|
|
|
* references in PHVs that we pulled up out of the original subquery that
|
|
|
|
* the RESULT rel replaced. This is kosher on the grounds that we now
|
|
|
|
* know that such an outer join wouldn't really have nulled anything.) We
|
|
|
|
* don't do this during the main recursion, for simplicity and because we
|
|
|
|
* can handle all such joins in a single pass over the parse tree.
|
|
|
|
*/
|
|
|
|
if (!bms_is_empty(dropped_outer_joins))
|
|
|
|
{
|
|
|
|
root->parse = (Query *)
|
|
|
|
remove_nulling_relids((Node *) root->parse,
|
|
|
|
dropped_outer_joins,
|
|
|
|
NULL);
|
|
|
|
/* There could be references in the append_rel_list, too */
|
|
|
|
root->append_rel_list = (List *)
|
|
|
|
remove_nulling_relids((Node *) root->append_rel_list,
|
|
|
|
dropped_outer_joins,
|
|
|
|
NULL);
|
|
|
|
}
|
|
|
|
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
/*
|
|
|
|
* Remove any PlanRowMark referencing an RTE_RESULT RTE. We obviously
|
|
|
|
* must do that for any RTE_RESULT that we just removed. But one for a
|
|
|
|
* RTE that we did not remove can be dropped anyway: since the RTE has
|
|
|
|
* only one possible output row, there is no need for EPQ to mark and
|
|
|
|
* restore that row.
|
|
|
|
*
|
|
|
|
* It's necessary, not optional, to remove the PlanRowMark for a surviving
|
|
|
|
* RTE_RESULT RTE; otherwise we'll generate a whole-row Var for the
|
|
|
|
* RTE_RESULT, which the executor has no support for.
|
|
|
|
*/
|
Represent Lists as expansible arrays, not chains of cons-cells.
Originally, Postgres Lists were a more or less exact reimplementation of
Lisp lists, which consist of chains of separately-allocated cons cells,
each having a value and a next-cell link. We'd hacked that once before
(commit d0b4399d8) to add a separate List header, but the data was still
in cons cells. That makes some operations -- notably list_nth() -- O(N),
and it's bulky because of the next-cell pointers and per-cell palloc
overhead, and it's very cache-unfriendly if the cons cells end up
scattered around rather than being adjacent.
In this rewrite, we still have List headers, but the data is in a
resizable array of values, with no next-cell links. Now we need at
most two palloc's per List, and often only one, since we can allocate
some values in the same palloc call as the List header. (Of course,
extending an existing List may require repalloc's to enlarge the array.
But this involves just O(log N) allocations not O(N).)
Of course this is not without downsides. The key difficulty is that
addition or deletion of a list entry may now cause other entries to
move, which it did not before.
For example, that breaks foreach() and sister macros, which historically
used a pointer to the current cons-cell as loop state. We can repair
those macros transparently by making their actual loop state be an
integer list index; the exposed "ListCell *" pointer is no longer state
carried across loop iterations, but is just a derived value. (In
practice, modern compilers can optimize things back to having just one
loop state value, at least for simple cases with inline loop bodies.)
In principle, this is a semantics change for cases where the loop body
inserts or deletes list entries ahead of the current loop index; but
I found no such cases in the Postgres code.
The change is not at all transparent for code that doesn't use foreach()
but chases lists "by hand" using lnext(). The largest share of such
code in the backend is in loops that were maintaining "prev" and "next"
variables in addition to the current-cell pointer, in order to delete
list cells efficiently using list_delete_cell(). However, we no longer
need a previous-cell pointer to delete a list cell efficiently. Keeping
a next-cell pointer doesn't work, as explained above, but we can improve
matters by changing such code to use a regular foreach() loop and then
using the new macro foreach_delete_current() to delete the current cell.
(This macro knows how to update the associated foreach loop's state so
that no cells will be missed in the traversal.)
There remains a nontrivial risk of code assuming that a ListCell *
pointer will remain good over an operation that could now move the list
contents. To help catch such errors, list.c can be compiled with a new
define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents
whenever that could possibly happen. This makes list operations
significantly more expensive so it's not normally turned on (though it
is on by default if USE_VALGRIND is on).
There are two notable API differences from the previous code:
* lnext() now requires the List's header pointer in addition to the
current cell's address.
* list_delete_cell() no longer requires a previous-cell argument.
These changes are somewhat unfortunate, but on the other hand code using
either function needs inspection to see if it is assuming anything
it shouldn't, so it's not all bad.
Programmers should be aware of these significant performance changes:
* list_nth() and related functions are now O(1); so there's no
major access-speed difference between a list and an array.
* Inserting or deleting a list element now takes time proportional to
the distance to the end of the list, due to moving the array elements.
(However, it typically *doesn't* require palloc or pfree, so except in
long lists it's probably still faster than before.) Notably, lcons()
used to be about the same cost as lappend(), but that's no longer true
if the list is long. Code that uses lcons() and list_delete_first()
to maintain a stack might usefully be rewritten to push and pop at the
end of the list rather than the beginning.
* There are now list_insert_nth...() and list_delete_nth...() functions
that add or remove a list cell identified by index. These have the
data-movement penalty explained above, but there's no search penalty.
* list_concat() and variants now copy the second list's data into
storage belonging to the first list, so there is no longer any
sharing of cells between the input lists. The second argument is
now declared "const List *" to reflect that it isn't changed.
This patch just does the minimum needed to get the new implementation
in place and fix bugs exposed by the regression tests. As suggested
by the foregoing, there's a fair amount of followup work remaining to
do.
Also, the ENABLE_LIST_COMPAT macros are finally removed in this
commit. Code using those should have been gone a dozen years ago.
Patch by me; thanks to David Rowley, Jesper Pedersen, and others
for review.
Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
|
|
|
foreach(cell, root->rowMarks)
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
{
|
|
|
|
PlanRowMark *rc = (PlanRowMark *) lfirst(cell);
|
|
|
|
|
|
|
|
if (rt_fetch(rc->rti, root->parse->rtable)->rtekind == RTE_RESULT)
|
Represent Lists as expansible arrays, not chains of cons-cells.
Originally, Postgres Lists were a more or less exact reimplementation of
Lisp lists, which consist of chains of separately-allocated cons cells,
each having a value and a next-cell link. We'd hacked that once before
(commit d0b4399d8) to add a separate List header, but the data was still
in cons cells. That makes some operations -- notably list_nth() -- O(N),
and it's bulky because of the next-cell pointers and per-cell palloc
overhead, and it's very cache-unfriendly if the cons cells end up
scattered around rather than being adjacent.
In this rewrite, we still have List headers, but the data is in a
resizable array of values, with no next-cell links. Now we need at
most two palloc's per List, and often only one, since we can allocate
some values in the same palloc call as the List header. (Of course,
extending an existing List may require repalloc's to enlarge the array.
But this involves just O(log N) allocations not O(N).)
Of course this is not without downsides. The key difficulty is that
addition or deletion of a list entry may now cause other entries to
move, which it did not before.
For example, that breaks foreach() and sister macros, which historically
used a pointer to the current cons-cell as loop state. We can repair
those macros transparently by making their actual loop state be an
integer list index; the exposed "ListCell *" pointer is no longer state
carried across loop iterations, but is just a derived value. (In
practice, modern compilers can optimize things back to having just one
loop state value, at least for simple cases with inline loop bodies.)
In principle, this is a semantics change for cases where the loop body
inserts or deletes list entries ahead of the current loop index; but
I found no such cases in the Postgres code.
The change is not at all transparent for code that doesn't use foreach()
but chases lists "by hand" using lnext(). The largest share of such
code in the backend is in loops that were maintaining "prev" and "next"
variables in addition to the current-cell pointer, in order to delete
list cells efficiently using list_delete_cell(). However, we no longer
need a previous-cell pointer to delete a list cell efficiently. Keeping
a next-cell pointer doesn't work, as explained above, but we can improve
matters by changing such code to use a regular foreach() loop and then
using the new macro foreach_delete_current() to delete the current cell.
(This macro knows how to update the associated foreach loop's state so
that no cells will be missed in the traversal.)
There remains a nontrivial risk of code assuming that a ListCell *
pointer will remain good over an operation that could now move the list
contents. To help catch such errors, list.c can be compiled with a new
define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents
whenever that could possibly happen. This makes list operations
significantly more expensive so it's not normally turned on (though it
is on by default if USE_VALGRIND is on).
There are two notable API differences from the previous code:
* lnext() now requires the List's header pointer in addition to the
current cell's address.
* list_delete_cell() no longer requires a previous-cell argument.
These changes are somewhat unfortunate, but on the other hand code using
either function needs inspection to see if it is assuming anything
it shouldn't, so it's not all bad.
Programmers should be aware of these significant performance changes:
* list_nth() and related functions are now O(1); so there's no
major access-speed difference between a list and an array.
* Inserting or deleting a list element now takes time proportional to
the distance to the end of the list, due to moving the array elements.
(However, it typically *doesn't* require palloc or pfree, so except in
long lists it's probably still faster than before.) Notably, lcons()
used to be about the same cost as lappend(), but that's no longer true
if the list is long. Code that uses lcons() and list_delete_first()
to maintain a stack might usefully be rewritten to push and pop at the
end of the list rather than the beginning.
* There are now list_insert_nth...() and list_delete_nth...() functions
that add or remove a list cell identified by index. These have the
data-movement penalty explained above, but there's no search penalty.
* list_concat() and variants now copy the second list's data into
storage belonging to the first list, so there is no longer any
sharing of cells between the input lists. The second argument is
now declared "const List *" to reflect that it isn't changed.
This patch just does the minimum needed to get the new implementation
in place and fix bugs exposed by the regression tests. As suggested
by the foregoing, there's a fair amount of followup work remaining to
do.
Also, the ENABLE_LIST_COMPAT macros are finally removed in this
commit. Code using those should have been gone a dozen years ago.
Patch by me; thanks to David Rowley, Jesper Pedersen, and others
for review.
Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
|
|
|
root->rowMarks = foreach_delete_current(root->rowMarks, cell);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* remove_useless_results_recurse
|
|
|
|
* Recursive guts of remove_useless_result_rtes.
|
|
|
|
*
|
|
|
|
* This recursively processes the jointree and returns a modified jointree.
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* In addition, the RT indexes of any removed outer-join nodes are added to
|
|
|
|
* *dropped_outer_joins.
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
*
|
|
|
|
* jtnode is the current jointree node. If it could be valid to merge
|
|
|
|
* its quals into those of the parent node, parent_quals should point to
|
|
|
|
* the parent's quals list; otherwise, pass NULL for parent_quals.
|
2023-06-20 17:09:56 +02:00
|
|
|
* (Note that in some cases, parent_quals points to the quals of a parent
|
|
|
|
* more than one level up in the tree.)
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*/
|
|
|
|
static Node *
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
remove_useless_results_recurse(PlannerInfo *root, Node *jtnode,
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
Node **parent_quals,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
Relids *dropped_outer_joins)
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
{
|
|
|
|
Assert(jtnode != NULL);
|
|
|
|
if (IsA(jtnode, RangeTblRef))
|
|
|
|
{
|
|
|
|
/* Can't immediately do anything with a RangeTblRef */
|
|
|
|
}
|
|
|
|
else if (IsA(jtnode, FromExpr))
|
|
|
|
{
|
|
|
|
FromExpr *f = (FromExpr *) jtnode;
|
|
|
|
Relids result_relids = NULL;
|
|
|
|
ListCell *cell;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We can drop RTE_RESULT rels from the fromlist so long as at least
|
|
|
|
* one child remains, since joining to a one-row table changes
|
2019-12-14 19:49:15 +01:00
|
|
|
* nothing. (But we can't drop a RTE_RESULT that computes PHV(s) that
|
|
|
|
* are needed by some sibling. The cleanup transformation below would
|
|
|
|
* reassign the PHVs to be computed at the join, which is too late for
|
|
|
|
* the sibling's use.) The easiest way to mechanize this rule is to
|
|
|
|
* modify the list in-place.
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*/
|
Represent Lists as expansible arrays, not chains of cons-cells.
Originally, Postgres Lists were a more or less exact reimplementation of
Lisp lists, which consist of chains of separately-allocated cons cells,
each having a value and a next-cell link. We'd hacked that once before
(commit d0b4399d8) to add a separate List header, but the data was still
in cons cells. That makes some operations -- notably list_nth() -- O(N),
and it's bulky because of the next-cell pointers and per-cell palloc
overhead, and it's very cache-unfriendly if the cons cells end up
scattered around rather than being adjacent.
In this rewrite, we still have List headers, but the data is in a
resizable array of values, with no next-cell links. Now we need at
most two palloc's per List, and often only one, since we can allocate
some values in the same palloc call as the List header. (Of course,
extending an existing List may require repalloc's to enlarge the array.
But this involves just O(log N) allocations not O(N).)
Of course this is not without downsides. The key difficulty is that
addition or deletion of a list entry may now cause other entries to
move, which it did not before.
For example, that breaks foreach() and sister macros, which historically
used a pointer to the current cons-cell as loop state. We can repair
those macros transparently by making their actual loop state be an
integer list index; the exposed "ListCell *" pointer is no longer state
carried across loop iterations, but is just a derived value. (In
practice, modern compilers can optimize things back to having just one
loop state value, at least for simple cases with inline loop bodies.)
In principle, this is a semantics change for cases where the loop body
inserts or deletes list entries ahead of the current loop index; but
I found no such cases in the Postgres code.
The change is not at all transparent for code that doesn't use foreach()
but chases lists "by hand" using lnext(). The largest share of such
code in the backend is in loops that were maintaining "prev" and "next"
variables in addition to the current-cell pointer, in order to delete
list cells efficiently using list_delete_cell(). However, we no longer
need a previous-cell pointer to delete a list cell efficiently. Keeping
a next-cell pointer doesn't work, as explained above, but we can improve
matters by changing such code to use a regular foreach() loop and then
using the new macro foreach_delete_current() to delete the current cell.
(This macro knows how to update the associated foreach loop's state so
that no cells will be missed in the traversal.)
There remains a nontrivial risk of code assuming that a ListCell *
pointer will remain good over an operation that could now move the list
contents. To help catch such errors, list.c can be compiled with a new
define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents
whenever that could possibly happen. This makes list operations
significantly more expensive so it's not normally turned on (though it
is on by default if USE_VALGRIND is on).
There are two notable API differences from the previous code:
* lnext() now requires the List's header pointer in addition to the
current cell's address.
* list_delete_cell() no longer requires a previous-cell argument.
These changes are somewhat unfortunate, but on the other hand code using
either function needs inspection to see if it is assuming anything
it shouldn't, so it's not all bad.
Programmers should be aware of these significant performance changes:
* list_nth() and related functions are now O(1); so there's no
major access-speed difference between a list and an array.
* Inserting or deleting a list element now takes time proportional to
the distance to the end of the list, due to moving the array elements.
(However, it typically *doesn't* require palloc or pfree, so except in
long lists it's probably still faster than before.) Notably, lcons()
used to be about the same cost as lappend(), but that's no longer true
if the list is long. Code that uses lcons() and list_delete_first()
to maintain a stack might usefully be rewritten to push and pop at the
end of the list rather than the beginning.
* There are now list_insert_nth...() and list_delete_nth...() functions
that add or remove a list cell identified by index. These have the
data-movement penalty explained above, but there's no search penalty.
* list_concat() and variants now copy the second list's data into
storage belonging to the first list, so there is no longer any
sharing of cells between the input lists. The second argument is
now declared "const List *" to reflect that it isn't changed.
This patch just does the minimum needed to get the new implementation
in place and fix bugs exposed by the regression tests. As suggested
by the foregoing, there's a fair amount of followup work remaining to
do.
Also, the ENABLE_LIST_COMPAT macros are finally removed in this
commit. Code using those should have been gone a dozen years ago.
Patch by me; thanks to David Rowley, Jesper Pedersen, and others
for review.
Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
|
|
|
foreach(cell, f->fromlist)
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
{
|
|
|
|
Node *child = (Node *) lfirst(cell);
|
|
|
|
int varno;
|
|
|
|
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
/* Recursively transform child, allowing it to push up quals ... */
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
child = remove_useless_results_recurse(root, child,
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
&f->quals,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
dropped_outer_joins);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
/* ... and stick it back into the tree */
|
|
|
|
lfirst(cell) = child;
|
|
|
|
|
|
|
|
/*
|
2019-12-14 19:49:15 +01:00
|
|
|
* If it's an RTE_RESULT with at least one sibling, and no sibling
|
|
|
|
* references dependent PHVs, we can drop it. We don't yet know
|
|
|
|
* what the inner join's final relid set will be, so postpone
|
|
|
|
* cleanup of PHVs etc till after this loop.
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*/
|
|
|
|
if (list_length(f->fromlist) > 1 &&
|
2019-12-14 19:49:15 +01:00
|
|
|
(varno = get_result_relid(root, child)) != 0 &&
|
|
|
|
!find_dependent_phvs_in_jointree(root, (Node *) f, varno))
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
{
|
Represent Lists as expansible arrays, not chains of cons-cells.
Originally, Postgres Lists were a more or less exact reimplementation of
Lisp lists, which consist of chains of separately-allocated cons cells,
each having a value and a next-cell link. We'd hacked that once before
(commit d0b4399d8) to add a separate List header, but the data was still
in cons cells. That makes some operations -- notably list_nth() -- O(N),
and it's bulky because of the next-cell pointers and per-cell palloc
overhead, and it's very cache-unfriendly if the cons cells end up
scattered around rather than being adjacent.
In this rewrite, we still have List headers, but the data is in a
resizable array of values, with no next-cell links. Now we need at
most two palloc's per List, and often only one, since we can allocate
some values in the same palloc call as the List header. (Of course,
extending an existing List may require repalloc's to enlarge the array.
But this involves just O(log N) allocations not O(N).)
Of course this is not without downsides. The key difficulty is that
addition or deletion of a list entry may now cause other entries to
move, which it did not before.
For example, that breaks foreach() and sister macros, which historically
used a pointer to the current cons-cell as loop state. We can repair
those macros transparently by making their actual loop state be an
integer list index; the exposed "ListCell *" pointer is no longer state
carried across loop iterations, but is just a derived value. (In
practice, modern compilers can optimize things back to having just one
loop state value, at least for simple cases with inline loop bodies.)
In principle, this is a semantics change for cases where the loop body
inserts or deletes list entries ahead of the current loop index; but
I found no such cases in the Postgres code.
The change is not at all transparent for code that doesn't use foreach()
but chases lists "by hand" using lnext(). The largest share of such
code in the backend is in loops that were maintaining "prev" and "next"
variables in addition to the current-cell pointer, in order to delete
list cells efficiently using list_delete_cell(). However, we no longer
need a previous-cell pointer to delete a list cell efficiently. Keeping
a next-cell pointer doesn't work, as explained above, but we can improve
matters by changing such code to use a regular foreach() loop and then
using the new macro foreach_delete_current() to delete the current cell.
(This macro knows how to update the associated foreach loop's state so
that no cells will be missed in the traversal.)
There remains a nontrivial risk of code assuming that a ListCell *
pointer will remain good over an operation that could now move the list
contents. To help catch such errors, list.c can be compiled with a new
define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents
whenever that could possibly happen. This makes list operations
significantly more expensive so it's not normally turned on (though it
is on by default if USE_VALGRIND is on).
There are two notable API differences from the previous code:
* lnext() now requires the List's header pointer in addition to the
current cell's address.
* list_delete_cell() no longer requires a previous-cell argument.
These changes are somewhat unfortunate, but on the other hand code using
either function needs inspection to see if it is assuming anything
it shouldn't, so it's not all bad.
Programmers should be aware of these significant performance changes:
* list_nth() and related functions are now O(1); so there's no
major access-speed difference between a list and an array.
* Inserting or deleting a list element now takes time proportional to
the distance to the end of the list, due to moving the array elements.
(However, it typically *doesn't* require palloc or pfree, so except in
long lists it's probably still faster than before.) Notably, lcons()
used to be about the same cost as lappend(), but that's no longer true
if the list is long. Code that uses lcons() and list_delete_first()
to maintain a stack might usefully be rewritten to push and pop at the
end of the list rather than the beginning.
* There are now list_insert_nth...() and list_delete_nth...() functions
that add or remove a list cell identified by index. These have the
data-movement penalty explained above, but there's no search penalty.
* list_concat() and variants now copy the second list's data into
storage belonging to the first list, so there is no longer any
sharing of cells between the input lists. The second argument is
now declared "const List *" to reflect that it isn't changed.
This patch just does the minimum needed to get the new implementation
in place and fix bugs exposed by the regression tests. As suggested
by the foregoing, there's a fair amount of followup work remaining to
do.
Also, the ENABLE_LIST_COMPAT macros are finally removed in this
commit. Code using those should have been gone a dozen years ago.
Patch by me; thanks to David Rowley, Jesper Pedersen, and others
for review.
Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
|
|
|
f->fromlist = foreach_delete_current(f->fromlist, cell);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
result_relids = bms_add_member(result_relids, varno);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Clean up if we dropped any RTE_RESULT RTEs. This is a bit
|
|
|
|
* inefficient if there's more than one, but it seems better to
|
|
|
|
* optimize the support code for the single-relid case.
|
|
|
|
*/
|
|
|
|
if (result_relids)
|
|
|
|
{
|
|
|
|
int varno = -1;
|
|
|
|
|
|
|
|
while ((varno = bms_next_member(result_relids, varno)) >= 0)
|
|
|
|
remove_result_refs(root, varno, (Node *) f);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
* If the FromExpr now has only one child, see if we can elide it.
|
|
|
|
* This is always valid if there are no quals, except at the top of
|
|
|
|
* the jointree (since Query.jointree is required to point to a
|
|
|
|
* FromExpr). Otherwise, we can do it if we can push the quals up to
|
|
|
|
* the parent node.
|
|
|
|
*
|
|
|
|
* Note: while it would not be terribly hard to generalize this
|
|
|
|
* transformation to merge multi-child FromExprs into their parent
|
|
|
|
* FromExpr, that risks making the parent join too expensive to plan.
|
|
|
|
* We leave it to later processing to decide heuristically whether
|
|
|
|
* that's a good idea. Pulling up a single child is always OK,
|
|
|
|
* however.
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*/
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
if (list_length(f->fromlist) == 1 &&
|
|
|
|
f != root->parse->jointree &&
|
|
|
|
(f->quals == NULL || parent_quals != NULL))
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Merge any quals up to parent. They should be in implicit-AND
|
|
|
|
* format by now, so we just need to concatenate lists. Put the
|
|
|
|
* child quals at the front, on the grounds that they should
|
|
|
|
* nominally be evaluated earlier.
|
|
|
|
*/
|
|
|
|
if (f->quals != NULL)
|
|
|
|
*parent_quals = (Node *)
|
|
|
|
list_concat(castNode(List, f->quals),
|
|
|
|
castNode(List, *parent_quals));
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
return (Node *) linitial(f->fromlist);
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
}
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
}
|
|
|
|
else if (IsA(jtnode, JoinExpr))
|
|
|
|
{
|
|
|
|
JoinExpr *j = (JoinExpr *) jtnode;
|
|
|
|
int varno;
|
|
|
|
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
/*
|
2023-06-20 17:09:56 +02:00
|
|
|
* First, recurse. We can absorb pushed-up FromExpr quals from either
|
|
|
|
* child into this node if the jointype is INNER, since then this is
|
|
|
|
* equivalent to a FromExpr. When the jointype is LEFT, we can absorb
|
|
|
|
* quals from the RHS child into the current node, as they're
|
|
|
|
* essentially degenerate quals of the outer join. Moreover, if we've
|
|
|
|
* been passed down a parent_quals pointer then we can allow quals of
|
|
|
|
* the LHS child to be absorbed into the parent. (This is important
|
|
|
|
* to ensure we remove single-child FromExprs immediately below
|
|
|
|
* commutable left joins.) For other jointypes, we can't move child
|
|
|
|
* quals up, or at least there's no particular reason to.
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
*/
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
j->larg = remove_useless_results_recurse(root, j->larg,
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
(j->jointype == JOIN_INNER) ?
|
2023-06-20 17:09:56 +02:00
|
|
|
&j->quals :
|
|
|
|
(j->jointype == JOIN_LEFT) ?
|
|
|
|
parent_quals : NULL,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
dropped_outer_joins);
|
|
|
|
j->rarg = remove_useless_results_recurse(root, j->rarg,
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
(j->jointype == JOIN_INNER ||
|
|
|
|
j->jointype == JOIN_LEFT) ?
|
|
|
|
&j->quals : NULL,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
dropped_outer_joins);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
|
|
|
|
/* Apply join-type-specific optimization rules */
|
|
|
|
switch (j->jointype)
|
|
|
|
{
|
|
|
|
case JOIN_INNER:
|
|
|
|
|
|
|
|
/*
|
|
|
|
* An inner join is equivalent to a FromExpr, so if either
|
|
|
|
* side was simplified to an RTE_RESULT rel, we can replace
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
* the join with a FromExpr with just the other side.
|
|
|
|
* Furthermore, we can elide that FromExpr according to the
|
|
|
|
* same rules as above.
|
2019-12-14 19:49:15 +01:00
|
|
|
*
|
|
|
|
* Just as in the FromExpr case, we can't simplify if the
|
|
|
|
* other input rel references any PHVs that are marked as to
|
|
|
|
* be evaluated at the RTE_RESULT rel, because we can't
|
|
|
|
* postpone their evaluation in that case. But we only have
|
|
|
|
* to check this in cases where it's syntactically legal for
|
|
|
|
* the other input to have a LATERAL reference to the
|
|
|
|
* RTE_RESULT rel. Only RHSes of inner and left joins are
|
|
|
|
* allowed to have such refs.
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*/
|
2019-12-14 19:49:15 +01:00
|
|
|
if ((varno = get_result_relid(root, j->larg)) != 0 &&
|
|
|
|
!find_dependent_phvs_in_jointree(root, j->rarg, varno))
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
{
|
|
|
|
remove_result_refs(root, varno, j->rarg);
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
if (j->quals != NULL && parent_quals == NULL)
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
jtnode = (Node *)
|
|
|
|
makeFromExpr(list_make1(j->rarg), j->quals);
|
|
|
|
else
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
{
|
|
|
|
/* Merge any quals up to parent */
|
|
|
|
if (j->quals != NULL)
|
|
|
|
*parent_quals = (Node *)
|
|
|
|
list_concat(castNode(List, j->quals),
|
|
|
|
castNode(List, *parent_quals));
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
jtnode = j->rarg;
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
}
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
}
|
|
|
|
else if ((varno = get_result_relid(root, j->rarg)) != 0)
|
|
|
|
{
|
|
|
|
remove_result_refs(root, varno, j->larg);
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
if (j->quals != NULL && parent_quals == NULL)
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
jtnode = (Node *)
|
|
|
|
makeFromExpr(list_make1(j->larg), j->quals);
|
|
|
|
else
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
{
|
|
|
|
/* Merge any quals up to parent */
|
|
|
|
if (j->quals != NULL)
|
|
|
|
*parent_quals = (Node *)
|
|
|
|
list_concat(castNode(List, j->quals),
|
|
|
|
castNode(List, *parent_quals));
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
jtnode = j->larg;
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
}
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
case JOIN_LEFT:
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We can simplify this case if the RHS is an RTE_RESULT, with
|
|
|
|
* two different possibilities:
|
|
|
|
*
|
|
|
|
* If the qual is empty (JOIN ON TRUE), then the join can be
|
|
|
|
* strength-reduced to a plain inner join, since each LHS row
|
|
|
|
* necessarily has exactly one join partner. So we can always
|
|
|
|
* discard the RHS, much as in the JOIN_INNER case above.
|
2019-12-14 19:49:15 +01:00
|
|
|
* (Again, the LHS could not contain a lateral reference to
|
|
|
|
* the RHS.)
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*
|
|
|
|
* Otherwise, it's still true that each LHS row should be
|
|
|
|
* returned exactly once, and since the RHS returns no columns
|
|
|
|
* (unless there are PHVs that have to be evaluated there), we
|
|
|
|
* don't much care if it's null-extended or not. So in this
|
|
|
|
* case also, we can just ignore the qual and discard the left
|
|
|
|
* join.
|
|
|
|
*/
|
|
|
|
if ((varno = get_result_relid(root, j->rarg)) != 0 &&
|
|
|
|
(j->quals == NULL ||
|
2019-12-14 19:49:15 +01:00
|
|
|
!find_dependent_phvs(root, varno)))
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
{
|
|
|
|
remove_result_refs(root, varno, j->larg);
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
*dropped_outer_joins = bms_add_member(*dropped_outer_joins,
|
|
|
|
j->rtindex);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
jtnode = j->larg;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
case JOIN_SEMI:
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We may simplify this case if the RHS is an RTE_RESULT; the
|
|
|
|
* join qual becomes effectively just a filter qual for the
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
* LHS, since we should either return the LHS row or not. The
|
|
|
|
* filter clause must go into a new FromExpr if we can't push
|
|
|
|
* it up to the parent.
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*
|
2022-11-29 16:52:44 +01:00
|
|
|
* There is a fine point about PHVs that are supposed to be
|
|
|
|
* evaluated at the RHS. Such PHVs could only appear in the
|
|
|
|
* semijoin's qual, since the rest of the query cannot
|
|
|
|
* reference any outputs of the semijoin's RHS. Therefore,
|
|
|
|
* they can't actually go to null before being examined, and
|
|
|
|
* it'd be OK to just remove the PHV wrapping. We don't have
|
|
|
|
* infrastructure for that, but remove_result_refs() will
|
|
|
|
* relabel them as to be evaluated at the LHS, which is fine.
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
*
|
|
|
|
* Also, we don't need to worry about removing traces of the
|
|
|
|
* join's rtindex, since it hasn't got one.
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*/
|
|
|
|
if ((varno = get_result_relid(root, j->rarg)) != 0)
|
|
|
|
{
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
Assert(j->rtindex == 0);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
remove_result_refs(root, varno, j->larg);
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
if (j->quals != NULL && parent_quals == NULL)
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
jtnode = (Node *)
|
|
|
|
makeFromExpr(list_make1(j->larg), j->quals);
|
|
|
|
else
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
{
|
|
|
|
/* Merge any quals up to parent */
|
|
|
|
if (j->quals != NULL)
|
|
|
|
*parent_quals = (Node *)
|
|
|
|
list_concat(castNode(List, j->quals),
|
|
|
|
castNode(List, *parent_quals));
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
jtnode = j->larg;
|
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of
infrastructure that calculated it. One use-case for it was in
join_clause_is_movable_to, but we can now replace that usage with
a check to see if the clause's relids include any outer join
that can null the target relation. The other use-case was in
join_clause_is_movable_into, but that test can just be dropped
entirely now that the clause's relids include outer joins.
Furthermore, join_clause_is_movable_into should now be
accurate enough that it will accept anything returned by
generate_join_implied_equalities, so we can restore the Assert
that was diked out in commit 95f4e59c3.
Remove the outerjoin_delayed mechanism. We needed this before to
prevent quals from getting evaluated below outer joins that should
null some of their vars. Now that we consider varnullingrels while
placing quals, that's taken care of automatically, so throw the
whole thing away.
Teach remove_useless_result_rtes to also remove useless FromExprs.
Having done that, the delay_upper_joins flag serves no purpose any
more and we can remove it, largely reverting 11086f2f2.
Use constant TRUE for "dummy" clauses when throwing back outer joins.
This improves on a hack I introduced in commit 6a6522529. If we
have a left-join clause l.x = r.y, and a WHERE clause l.x = constant,
we generate r.y = constant and then don't really have a need for the
join clause. But we must throw the join clause back anyway after
marking it redundant, so that the join search heuristics won't think
this is a clauseless join and avoid it. That was a kluge introduced
under time pressure, and after looking at it I thought of a better
way: let's just introduce constant-TRUE "join clauses" instead,
and get rid of them at the end. This improves the generated plans for
such cases by not having to test a redundant join clause. We can also
get rid of the ugly hack used to mark such clauses as redundant for
selectivity estimation.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:44:36 +01:00
|
|
|
}
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
case JOIN_FULL:
|
|
|
|
case JOIN_ANTI:
|
|
|
|
/* We have no special smarts for these cases */
|
|
|
|
break;
|
|
|
|
default:
|
2022-11-29 16:52:44 +01:00
|
|
|
/* Note: JOIN_RIGHT should be gone at this point */
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
elog(ERROR, "unrecognized join type: %d",
|
|
|
|
(int) j->jointype);
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(jtnode));
|
|
|
|
return jtnode;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* get_result_relid
|
|
|
|
* If jtnode is a RangeTblRef for an RTE_RESULT RTE, return its relid;
|
|
|
|
* otherwise return 0.
|
|
|
|
*/
|
2019-02-05 03:45:39 +01:00
|
|
|
static int
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
get_result_relid(PlannerInfo *root, Node *jtnode)
|
|
|
|
{
|
|
|
|
int varno;
|
|
|
|
|
|
|
|
if (!IsA(jtnode, RangeTblRef))
|
|
|
|
return 0;
|
|
|
|
varno = ((RangeTblRef *) jtnode)->rtindex;
|
|
|
|
if (rt_fetch(varno, root->parse->rtable)->rtekind != RTE_RESULT)
|
|
|
|
return 0;
|
|
|
|
return varno;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* remove_result_refs
|
|
|
|
* Helper routine for dropping an unneeded RTE_RESULT RTE.
|
|
|
|
*
|
|
|
|
* This doesn't physically remove the RTE from the jointree, because that's
|
|
|
|
* more easily handled in remove_useless_results_recurse. What it does do
|
|
|
|
* is the necessary cleanup in the rest of the tree: we must adjust any PHVs
|
|
|
|
* that may reference the RTE. Be sure to call this at a point where the
|
|
|
|
* jointree is valid (no disconnected nodes).
|
|
|
|
*
|
|
|
|
* Note that we don't need to process the append_rel_list, since RTEs
|
|
|
|
* referenced directly in the jointree won't be appendrel members.
|
|
|
|
*
|
|
|
|
* varno is the RTE_RESULT's relid.
|
|
|
|
* newjtloc is the jointree location at which any PHVs referencing the
|
|
|
|
* RTE_RESULT should be evaluated instead.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
remove_result_refs(PlannerInfo *root, int varno, Node *newjtloc)
|
|
|
|
{
|
|
|
|
/* Fix up PlaceHolderVars as needed */
|
|
|
|
/* If there are no PHVs anywhere, we can skip this bit */
|
|
|
|
if (root->glob->lastPHId != 0)
|
|
|
|
{
|
|
|
|
Relids subrelids;
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
subrelids = get_relids_in_jointree(newjtloc, true, false);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
Assert(!bms_is_empty(subrelids));
|
|
|
|
substitute_phv_relids((Node *) root->parse, varno, subrelids);
|
2022-12-22 17:02:03 +01:00
|
|
|
fix_append_rel_relids(root, varno, subrelids);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We also need to remove any PlanRowMark referencing the RTE, but we
|
|
|
|
* postpone that work until we return to remove_useless_result_rtes.
|
|
|
|
*/
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
* find_dependent_phvs - are there any PlaceHolderVars whose relids are
|
|
|
|
* exactly the given varno?
|
2019-12-14 19:49:15 +01:00
|
|
|
*
|
|
|
|
* find_dependent_phvs should be used when we want to see if there are
|
|
|
|
* any such PHVs anywhere in the Query. Another use-case is to see if
|
|
|
|
* a subtree of the join tree contains such PHVs; but for that, we have
|
|
|
|
* to look not only at the join tree nodes themselves but at the
|
|
|
|
* referenced RTEs. For that, use find_dependent_phvs_in_jointree.
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*/
|
|
|
|
|
|
|
|
typedef struct
|
|
|
|
{
|
|
|
|
Relids relids;
|
|
|
|
int sublevels_up;
|
|
|
|
} find_dependent_phvs_context;
|
|
|
|
|
|
|
|
static bool
|
|
|
|
find_dependent_phvs_walker(Node *node,
|
|
|
|
find_dependent_phvs_context *context)
|
|
|
|
{
|
|
|
|
if (node == NULL)
|
|
|
|
return false;
|
|
|
|
if (IsA(node, PlaceHolderVar))
|
|
|
|
{
|
|
|
|
PlaceHolderVar *phv = (PlaceHolderVar *) node;
|
|
|
|
|
|
|
|
if (phv->phlevelsup == context->sublevels_up &&
|
|
|
|
bms_equal(context->relids, phv->phrels))
|
|
|
|
return true;
|
|
|
|
/* fall through to examine children */
|
|
|
|
}
|
|
|
|
if (IsA(node, Query))
|
|
|
|
{
|
|
|
|
/* Recurse into subselects */
|
|
|
|
bool result;
|
|
|
|
|
|
|
|
context->sublevels_up++;
|
|
|
|
result = query_tree_walker((Query *) node,
|
|
|
|
find_dependent_phvs_walker,
|
|
|
|
(void *) context, 0);
|
|
|
|
context->sublevels_up--;
|
|
|
|
return result;
|
|
|
|
}
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
/* Shouldn't need to handle most planner auxiliary nodes here */
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
Assert(!IsA(node, SpecialJoinInfo));
|
|
|
|
Assert(!IsA(node, PlaceHolderInfo));
|
|
|
|
Assert(!IsA(node, MinMaxAggInfo));
|
|
|
|
|
|
|
|
return expression_tree_walker(node, find_dependent_phvs_walker,
|
|
|
|
(void *) context);
|
|
|
|
}
|
|
|
|
|
|
|
|
static bool
|
2019-12-14 19:49:15 +01:00
|
|
|
find_dependent_phvs(PlannerInfo *root, int varno)
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
{
|
|
|
|
find_dependent_phvs_context context;
|
|
|
|
|
2019-12-14 19:49:15 +01:00
|
|
|
/* If there are no PHVs anywhere, we needn't work hard */
|
|
|
|
if (root->glob->lastPHId == 0)
|
|
|
|
return false;
|
|
|
|
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
context.relids = bms_make_singleton(varno);
|
|
|
|
context.sublevels_up = 0;
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
if (query_tree_walker(root->parse,
|
|
|
|
find_dependent_phvs_walker,
|
|
|
|
(void *) &context,
|
|
|
|
0))
|
|
|
|
return true;
|
|
|
|
/* The append_rel_list could be populated already, so check it too */
|
|
|
|
if (expression_tree_walker((Node *) root->append_rel_list,
|
|
|
|
find_dependent_phvs_walker,
|
|
|
|
(void *) &context))
|
|
|
|
return true;
|
|
|
|
return false;
|
2019-12-14 19:49:15 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
static bool
|
|
|
|
find_dependent_phvs_in_jointree(PlannerInfo *root, Node *node, int varno)
|
|
|
|
{
|
|
|
|
find_dependent_phvs_context context;
|
|
|
|
Relids subrelids;
|
|
|
|
int relid;
|
|
|
|
|
|
|
|
/* If there are no PHVs anywhere, we needn't work hard */
|
|
|
|
if (root->glob->lastPHId == 0)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
context.relids = bms_make_singleton(varno);
|
|
|
|
context.sublevels_up = 0;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* See if the jointree fragment itself contains references (in join quals)
|
|
|
|
*/
|
|
|
|
if (find_dependent_phvs_walker(node, &context))
|
|
|
|
return true;
|
|
|
|
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
/*
|
2019-12-14 19:49:15 +01:00
|
|
|
* Otherwise, identify the set of referenced RTEs (we can ignore joins,
|
|
|
|
* since they should be flattened already, so their join alias lists no
|
|
|
|
* longer matter), and tediously check each RTE. We can ignore RTEs that
|
|
|
|
* are not marked LATERAL, though, since they couldn't possibly contain
|
|
|
|
* any cross-references to other RTEs.
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
*/
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
subrelids = get_relids_in_jointree(node, false, false);
|
2019-12-14 19:49:15 +01:00
|
|
|
relid = -1;
|
|
|
|
while ((relid = bms_next_member(subrelids, relid)) >= 0)
|
|
|
|
{
|
|
|
|
RangeTblEntry *rte = rt_fetch(relid, root->parse->rtable);
|
|
|
|
|
|
|
|
if (rte->lateral &&
|
|
|
|
range_table_entry_walker(rte,
|
|
|
|
find_dependent_phvs_walker,
|
|
|
|
(void *) &context,
|
|
|
|
0))
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
|
|
|
return false;
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
}
|
|
|
|
|
2003-01-20 19:55:07 +01:00
|
|
|
/*
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* substitute_phv_relids - adjust PlaceHolderVar relid sets after pulling up
|
|
|
|
* a subquery or removing an RTE_RESULT jointree item
|
2003-01-20 19:55:07 +01:00
|
|
|
*
|
2009-02-25 04:30:38 +01:00
|
|
|
* Find any PlaceHolderVar nodes in the given tree that reference the
|
|
|
|
* pulled-up relid, and change them to reference the replacement relid(s).
|
2003-01-20 19:55:07 +01:00
|
|
|
*
|
2008-08-14 20:48:00 +02:00
|
|
|
* NOTE: although this has the form of a walker, we cheat and modify the
|
Fix subquery pullup to wrap a PlaceHolderVar around the entire RowExpr
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
2009-09-02 19:52:24 +02:00
|
|
|
* nodes in-place. This should be OK since the tree was copied by
|
|
|
|
* pullup_replace_vars earlier. Avoid scribbling on the original values of
|
|
|
|
* the bitmapsets, though, because expression_tree_mutator doesn't copy those.
|
2003-01-20 19:55:07 +01:00
|
|
|
*/
|
2008-08-14 20:48:00 +02:00
|
|
|
|
|
|
|
typedef struct
|
2003-01-20 19:55:07 +01:00
|
|
|
{
|
2008-08-14 20:48:00 +02:00
|
|
|
int varno;
|
2012-03-24 21:21:39 +01:00
|
|
|
int sublevels_up;
|
2008-08-14 20:48:00 +02:00
|
|
|
Relids subrelids;
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
} substitute_phv_relids_context;
|
2003-01-20 19:55:07 +01:00
|
|
|
|
2008-08-14 20:48:00 +02:00
|
|
|
static bool
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
substitute_phv_relids_walker(Node *node,
|
|
|
|
substitute_phv_relids_context *context)
|
2008-08-14 20:48:00 +02:00
|
|
|
{
|
|
|
|
if (node == NULL)
|
|
|
|
return false;
|
2008-10-21 22:42:53 +02:00
|
|
|
if (IsA(node, PlaceHolderVar))
|
|
|
|
{
|
|
|
|
PlaceHolderVar *phv = (PlaceHolderVar *) node;
|
|
|
|
|
2012-03-24 21:21:39 +01:00
|
|
|
if (phv->phlevelsup == context->sublevels_up &&
|
|
|
|
bms_is_member(context->varno, phv->phrels))
|
2008-10-21 22:42:53 +02:00
|
|
|
{
|
|
|
|
phv->phrels = bms_union(phv->phrels,
|
|
|
|
context->subrelids);
|
|
|
|
phv->phrels = bms_del_member(phv->phrels,
|
|
|
|
context->varno);
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
/* Assert we haven't broken the PHV */
|
|
|
|
Assert(!bms_is_empty(phv->phrels));
|
2008-10-21 22:42:53 +02:00
|
|
|
}
|
|
|
|
/* fall through to examine children */
|
|
|
|
}
|
2012-03-24 21:21:39 +01:00
|
|
|
if (IsA(node, Query))
|
|
|
|
{
|
|
|
|
/* Recurse into subselects */
|
|
|
|
bool result;
|
|
|
|
|
|
|
|
context->sublevels_up++;
|
|
|
|
result = query_tree_walker((Query *) node,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
substitute_phv_relids_walker,
|
2012-03-24 21:21:39 +01:00
|
|
|
(void *) context, 0);
|
|
|
|
context->sublevels_up--;
|
|
|
|
return result;
|
|
|
|
}
|
2008-10-22 22:17:52 +02:00
|
|
|
/* Shouldn't need to handle planner auxiliary nodes here */
|
|
|
|
Assert(!IsA(node, SpecialJoinInfo));
|
|
|
|
Assert(!IsA(node, AppendRelInfo));
|
|
|
|
Assert(!IsA(node, PlaceHolderInfo));
|
2010-11-04 17:01:17 +01:00
|
|
|
Assert(!IsA(node, MinMaxAggInfo));
|
2008-10-21 22:42:53 +02:00
|
|
|
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
return expression_tree_walker(node, substitute_phv_relids_walker,
|
2008-08-14 20:48:00 +02:00
|
|
|
(void *) context);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
substitute_phv_relids(Node *node, int varno, Relids subrelids)
|
2008-08-14 20:48:00 +02:00
|
|
|
{
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
substitute_phv_relids_context context;
|
2008-08-14 20:48:00 +02:00
|
|
|
|
|
|
|
context.varno = varno;
|
2012-03-24 21:21:39 +01:00
|
|
|
context.sublevels_up = 0;
|
2008-08-14 20:48:00 +02:00
|
|
|
context.subrelids = subrelids;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Must be prepared to start with a Query or a bare expression tree.
|
|
|
|
*/
|
|
|
|
query_or_expression_tree_walker(node,
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
substitute_phv_relids_walker,
|
2008-08-14 20:48:00 +02:00
|
|
|
(void *) &context,
|
|
|
|
0);
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
|
2006-02-03 22:08:49 +01:00
|
|
|
/*
|
|
|
|
* fix_append_rel_relids: update RT-index fields of AppendRelInfo nodes
|
|
|
|
*
|
|
|
|
* When we pull up a subquery, any AppendRelInfo references to the subquery's
|
|
|
|
* RT index have to be replaced by the substituted relid (and there had better
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
* be only one). We also need to apply substitute_phv_relids to their
|
2008-11-11 20:05:21 +01:00
|
|
|
* translated_vars lists, since those might contain PlaceHolderVars.
|
2006-02-03 22:08:49 +01:00
|
|
|
*
|
|
|
|
* We assume we may modify the AppendRelInfo nodes in-place.
|
|
|
|
*/
|
|
|
|
static void
|
2022-12-22 17:02:03 +01:00
|
|
|
fix_append_rel_relids(PlannerInfo *root, int varno, Relids subrelids)
|
2006-02-03 22:08:49 +01:00
|
|
|
{
|
|
|
|
ListCell *l;
|
|
|
|
int subvarno = -1;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We only want to extract the member relid once, but we mustn't fail
|
|
|
|
* immediately if there are multiple members; it could be that none of the
|
|
|
|
* AppendRelInfo nodes refer to it. So compute it on first use. Note that
|
|
|
|
* bms_singleton_member will complain if set is not singleton.
|
|
|
|
*/
|
2022-12-22 17:02:03 +01:00
|
|
|
foreach(l, root->append_rel_list)
|
2006-02-03 22:08:49 +01:00
|
|
|
{
|
|
|
|
AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);
|
|
|
|
|
|
|
|
/* The parent_relid shouldn't ever be a pullup target */
|
|
|
|
Assert(appinfo->parent_relid != varno);
|
|
|
|
|
|
|
|
if (appinfo->child_relid == varno)
|
|
|
|
{
|
|
|
|
if (subvarno < 0)
|
|
|
|
subvarno = bms_singleton_member(subrelids);
|
|
|
|
appinfo->child_relid = subvarno;
|
|
|
|
}
|
2008-11-11 20:05:21 +01:00
|
|
|
|
In the planner, replace an empty FROM clause with a dummy RTE.
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
2019-01-28 23:54:10 +01:00
|
|
|
/* Also fix up any PHVs in its translated vars */
|
2022-12-22 17:02:03 +01:00
|
|
|
if (root->glob->lastPHId != 0)
|
|
|
|
substitute_phv_relids((Node *) appinfo->translated_vars,
|
|
|
|
varno, subrelids);
|
2006-02-03 22:08:49 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2003-01-20 19:55:07 +01:00
|
|
|
/*
|
2008-08-17 03:20:00 +02:00
|
|
|
* get_relids_in_jointree: get set of RT indexes present in a jointree
|
|
|
|
*
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* Base-relation relids are always included in the result.
|
|
|
|
* If include_outer_joins is true, outer-join RT indexes are included.
|
|
|
|
* If include_inner_joins is true, inner-join RT indexes are included.
|
|
|
|
*
|
|
|
|
* Note that for most purposes in the planner, outer joins are included
|
|
|
|
* in standard relid sets. Setting include_inner_joins true is only
|
|
|
|
* appropriate for special purposes during subquery flattening.
|
2003-01-20 19:55:07 +01:00
|
|
|
*/
|
2003-02-08 21:20:55 +01:00
|
|
|
Relids
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
get_relids_in_jointree(Node *jtnode, bool include_outer_joins,
|
|
|
|
bool include_inner_joins)
|
2003-01-20 19:55:07 +01:00
|
|
|
{
|
2003-02-08 21:20:55 +01:00
|
|
|
Relids result = NULL;
|
2003-01-20 19:55:07 +01:00
|
|
|
|
|
|
|
if (jtnode == NULL)
|
|
|
|
return result;
|
|
|
|
if (IsA(jtnode, RangeTblRef))
|
|
|
|
{
|
|
|
|
int varno = ((RangeTblRef *) jtnode)->rtindex;
|
|
|
|
|
2003-02-08 21:20:55 +01:00
|
|
|
result = bms_make_singleton(varno);
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
else if (IsA(jtnode, FromExpr))
|
|
|
|
{
|
|
|
|
FromExpr *f = (FromExpr *) jtnode;
|
2004-05-26 06:41:50 +02:00
|
|
|
ListCell *l;
|
2003-01-20 19:55:07 +01:00
|
|
|
|
|
|
|
foreach(l, f->fromlist)
|
|
|
|
{
|
2003-02-08 21:20:55 +01:00
|
|
|
result = bms_join(result,
|
2008-08-17 03:20:00 +02:00
|
|
|
get_relids_in_jointree(lfirst(l),
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
include_outer_joins,
|
|
|
|
include_inner_joins));
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
}
|
|
|
|
else if (IsA(jtnode, JoinExpr))
|
|
|
|
{
|
|
|
|
JoinExpr *j = (JoinExpr *) jtnode;
|
|
|
|
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
result = get_relids_in_jointree(j->larg,
|
|
|
|
include_outer_joins,
|
|
|
|
include_inner_joins);
|
2008-08-17 03:20:00 +02:00
|
|
|
result = bms_join(result,
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
get_relids_in_jointree(j->rarg,
|
|
|
|
include_outer_joins,
|
|
|
|
include_inner_joins));
|
|
|
|
if (j->rtindex)
|
|
|
|
{
|
|
|
|
if (j->jointype == JOIN_INNER)
|
|
|
|
{
|
|
|
|
if (include_inner_joins)
|
|
|
|
result = bms_add_member(result, j->rtindex);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
if (include_outer_joins)
|
|
|
|
result = bms_add_member(result, j->rtindex);
|
|
|
|
}
|
|
|
|
}
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
else
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(jtnode));
|
2003-01-20 19:55:07 +01:00
|
|
|
return result;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
* get_relids_for_join: get set of base+OJ RT indexes making up a join
|
2003-01-20 19:55:07 +01:00
|
|
|
*/
|
2003-02-08 21:20:55 +01:00
|
|
|
Relids
|
2019-01-29 21:26:44 +01:00
|
|
|
get_relids_for_join(Query *query, int joinrelid)
|
2003-01-20 19:55:07 +01:00
|
|
|
{
|
|
|
|
Node *jtnode;
|
|
|
|
|
2019-01-29 21:26:44 +01:00
|
|
|
jtnode = find_jointree_node_for_rel((Node *) query->jointree,
|
2005-06-06 00:32:58 +02:00
|
|
|
joinrelid);
|
2003-01-20 19:55:07 +01:00
|
|
|
if (!jtnode)
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "could not find join node %d", joinrelid);
|
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
2023-01-30 19:16:20 +01:00
|
|
|
return get_relids_in_jointree(jtnode, true, false);
|
2003-01-20 19:55:07 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* find_jointree_node_for_rel: locate jointree node for a base or join RT index
|
|
|
|
*
|
|
|
|
* Returns NULL if not found
|
|
|
|
*/
|
|
|
|
static Node *
|
|
|
|
find_jointree_node_for_rel(Node *jtnode, int relid)
|
|
|
|
{
|
|
|
|
if (jtnode == NULL)
|
|
|
|
return NULL;
|
|
|
|
if (IsA(jtnode, RangeTblRef))
|
|
|
|
{
|
|
|
|
int varno = ((RangeTblRef *) jtnode)->rtindex;
|
|
|
|
|
|
|
|
if (relid == varno)
|
|
|
|
return jtnode;
|
|
|
|
}
|
|
|
|
else if (IsA(jtnode, FromExpr))
|
|
|
|
{
|
|
|
|
FromExpr *f = (FromExpr *) jtnode;
|
2004-05-26 06:41:50 +02:00
|
|
|
ListCell *l;
|
2003-01-20 19:55:07 +01:00
|
|
|
|
|
|
|
foreach(l, f->fromlist)
|
|
|
|
{
|
|
|
|
jtnode = find_jointree_node_for_rel(lfirst(l), relid);
|
|
|
|
if (jtnode)
|
|
|
|
return jtnode;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else if (IsA(jtnode, JoinExpr))
|
|
|
|
{
|
|
|
|
JoinExpr *j = (JoinExpr *) jtnode;
|
|
|
|
|
|
|
|
if (relid == j->rtindex)
|
|
|
|
return jtnode;
|
|
|
|
jtnode = find_jointree_node_for_rel(j->larg, relid);
|
|
|
|
if (jtnode)
|
|
|
|
return jtnode;
|
|
|
|
jtnode = find_jointree_node_for_rel(j->rarg, relid);
|
|
|
|
if (jtnode)
|
|
|
|
return jtnode;
|
|
|
|
}
|
|
|
|
else
|
2003-07-25 02:01:09 +02:00
|
|
|
elog(ERROR, "unrecognized node type: %d",
|
|
|
|
(int) nodeTag(jtnode));
|
2003-01-20 19:55:07 +01:00
|
|
|
return NULL;
|
|
|
|
}
|