Centralize fixups for mismatched nullingrels in nestloop params.

It turns out that the fixes we applied in commits bfd332b3f
and 63e4f13d2 were not nearly enough to solve the problem.
We'd focused narrowly on subquery RTEs with lateral references,
but lateral references can occur in several other RTE kinds
such as function RTEs.  Putting the same hack into half a dozen
code paths seems quite unattractive.  Hence, revert the code changes
(but not the test cases) from those commits and instead solve it
centrally in identify_current_nestloop_params(), as Richard proposed
originally.  This is a bit annoying because it could mask erroneous
nullingrels in nestloop params that are generated from non-LATERAL
parameterized paths; but on balance I don't see a better way.
Maybe at some future time we'll be motivated to find a more rigorous
approach to nestloop params, but that's not happening for beta2.

Richard Guo and Tom Lane

Discussion: https://postgr.es/m/CAMbWs48Jcw-NvnxT23WiHP324wG44DvzcH1j4hc0Zn+3sR9cfg@mail.gmail.com
This commit is contained in:
Tom Lane 2023-06-20 10:22:52 -04:00
parent 5961141646
commit 0655c03ef9
5 changed files with 69 additions and 72 deletions

View File

@ -430,24 +430,6 @@ have_unsafe_outer_join_ref(PlannerInfo *root,
* These are returned in parallel lists in *param_exprs and *operators.
* We also set *binary_mode to indicate whether strict binary matching is
* required.
*
* A complication is that innerrel's lateral_vars may contain nullingrel
* markers that need adjustment. This occurs if we have applied outer join
* identity 3,
* (A leftjoin B on (Pab)) leftjoin C on (Pb*c)
* = A leftjoin (B leftjoin C on (Pbc)) on (Pab)
* and C contains lateral references to B. It's still safe to apply the
* identity, but the parser will have created those references in the form
* "b*" (i.e., with varnullingrels listing the A/B join), while what we will
* have available from the nestloop's outer side is just "b". We deal with
* that here by stripping the nullingrels down to what is available from the
* outer side according to outerrel->relids.
* That fixes matters for the case of forward application of identity 3.
* If the identity was applied in the reverse direction, we will have
* innerrel's lateral_vars containing too few nullingrel bits rather than
* too many. Currently, that causes no problems because setrefs.c applies
* only a subset check to nullingrels in NestLoopParams, but we'd have to
* work harder if we ever want to tighten that check.
*/
static bool
paraminfo_get_equal_hashops(PlannerInfo *root, ParamPathInfo *param_info,
@ -551,25 +533,6 @@ paraminfo_get_equal_hashops(PlannerInfo *root, ParamPathInfo *param_info,
return false;
}
/* OK, but adjust its nullingrels before adding it to result */
expr = copyObject(expr);
if (IsA(expr, Var))
{
Var *var = (Var *) expr;
var->varnullingrels = bms_intersect(var->varnullingrels,
outerrel->relids);
}
else if (IsA(expr, PlaceHolderVar))
{
PlaceHolderVar *phv = (PlaceHolderVar *) expr;
phv->phnullingrels = bms_intersect(phv->phnullingrels,
outerrel->relids);
}
else
Assert(false);
*operators = lappend_oid(*operators, typentry->eq_opr);
*param_exprs = lappend(*param_exprs, expr);

View File

@ -2289,11 +2289,11 @@ set_join_references(PlannerInfo *root, Join *join, int rtoffset)
* the outer-join level at which they are used, Vars seen in the
* NestLoopParam expression may have nullingrels that are just a
* subset of those in the Vars actually available from the outer
* side. Lateral references can create the same situation, as
* explained in the comments for process_subquery_nestloop_params
* and paraminfo_get_equal_hashops. Not checking this exactly is
* a bit grotty, but the work needed to make things match up
* perfectly seems well out of proportion to the value.
* side. (Lateral references can also cause this, as explained in
* the comments for identify_current_nestloop_params.) Not
* checking this exactly is a bit grotty, but the work needed to
* make things match up perfectly seems well out of proportion to
* the value.
*/
nlp->paramval = (Var *) fix_upper_expr(root,
(Node *) nlp->paramval,

View File

@ -421,26 +421,8 @@ replace_nestloop_param_placeholdervar(PlannerInfo *root, PlaceHolderVar *phv)
* provide these values. This differs from replace_nestloop_param_var in
* that the PARAM_EXEC slots to use have already been determined.
*
* An additional complication is that the subplan_params may contain
* nullingrel markers that need adjustment. This occurs if we have applied
* outer join identity 3,
* (A leftjoin B on (Pab)) leftjoin C on (Pb*c)
* = A leftjoin (B leftjoin C on (Pbc)) on (Pab)
* and C is a subquery containing lateral references to B. It's still safe
* to apply the identity, but the parser will have created those references
* in the form "b*" (i.e., with varnullingrels listing the A/B join), while
* what we will have available from the nestloop's outer side is just "b".
* We deal with that here by stripping the nullingrels down to what is
* available from the outer side according to root->curOuterRels.
* That fixes matters for the case of forward application of identity 3.
* If the identity was applied in the reverse direction, we will have
* subplan_params containing too few nullingrel bits rather than too many.
* Currently, that causes no problems because setrefs.c applies only a
* subset check to nullingrels in NestLoopParams, but we'd have to work
* harder if we ever want to tighten that check.
*
* Note that we also use root->curOuterRels as an implicit parameter for
* sanity checks and nullingrel adjustments.
* sanity checks.
*/
void
process_subquery_nestloop_params(PlannerInfo *root, List *subplan_params)
@ -467,19 +449,17 @@ process_subquery_nestloop_params(PlannerInfo *root, List *subplan_params)
nlp = (NestLoopParam *) lfirst(lc2);
if (nlp->paramno == pitem->paramId)
{
Assert(equal(var, nlp->paramval));
/* Present, so nothing to do */
break;
}
}
if (lc2 == NULL)
{
/* No, so add it after adjusting its nullingrels */
var = copyObject(var);
var->varnullingrels = bms_intersect(var->varnullingrels,
root->curOuterRels);
/* No, so add it */
nlp = makeNode(NestLoopParam);
nlp->paramno = pitem->paramId;
nlp->paramval = var;
nlp->paramval = copyObject(var);
root->curOuterParams = lappend(root->curOuterParams, nlp);
}
}
@ -500,19 +480,17 @@ process_subquery_nestloop_params(PlannerInfo *root, List *subplan_params)
nlp = (NestLoopParam *) lfirst(lc2);
if (nlp->paramno == pitem->paramId)
{
Assert(equal(phv, nlp->paramval));
/* Present, so nothing to do */
break;
}
}
if (lc2 == NULL)
{
/* No, so add it after adjusting its nullingrels */
phv = copyObject(phv);
phv->phnullingrels = bms_intersect(phv->phnullingrels,
root->curOuterRels);
/* No, so add it */
nlp = makeNode(NestLoopParam);
nlp->paramno = pitem->paramId;
nlp->paramval = (Var *) phv;
nlp->paramval = (Var *) copyObject(phv);
root->curOuterParams = lappend(root->curOuterParams, nlp);
}
}
@ -525,6 +503,28 @@ process_subquery_nestloop_params(PlannerInfo *root, List *subplan_params)
* Identify any NestLoopParams that should be supplied by a NestLoop plan
* node with the specified lefthand rels. Remove them from the active
* root->curOuterParams list and return them as the result list.
*
* XXX Here we also hack up the returned Vars and PHVs so that they do not
* contain nullingrel sets exceeding what is available from the outer side.
* This is needed if we have applied outer join identity 3,
* (A leftjoin B on (Pab)) leftjoin C on (Pb*c)
* = A leftjoin (B leftjoin C on (Pbc)) on (Pab)
* and C contains lateral references to B. It's still safe to apply the
* identity, but the parser will have created those references in the form
* "b*" (i.e., with varnullingrels listing the A/B join), while what we will
* have available from the nestloop's outer side is just "b". We deal with
* that here by stripping the nullingrels down to what is available from the
* outer side according to leftrelids.
*
* That fixes matters for the case of forward application of identity 3.
* If the identity was applied in the reverse direction, we will have
* parameter Vars containing too few nullingrel bits rather than too many.
* Currently, that causes no problems because setrefs.c applies only a
* subset check to nullingrels in NestLoopParams, but we'd have to work
* harder if we ever want to tighten that check. This is all pretty annoying
* because it greatly weakens setrefs.c's cross-check, but the alternative
* seems to be to generate multiple versions of each laterally-parameterized
* subquery, which'd be unduly expensive.
*/
List *
identify_current_nestloop_params(PlannerInfo *root, Relids leftrelids)
@ -539,13 +539,19 @@ identify_current_nestloop_params(PlannerInfo *root, Relids leftrelids)
/*
* We are looking for Vars and PHVs that can be supplied by the
* lefthand rels.
* lefthand rels. When we find one, it's okay to modify it in-place
* because all the routines above make a fresh copy to put into
* curOuterParams.
*/
if (IsA(nlp->paramval, Var) &&
bms_is_member(nlp->paramval->varno, leftrelids))
{
Var *var = (Var *) nlp->paramval;
root->curOuterParams = foreach_delete_current(root->curOuterParams,
cell);
var->varnullingrels = bms_intersect(var->varnullingrels,
leftrelids);
result = lappend(result, nlp);
}
else if (IsA(nlp->paramval, PlaceHolderVar) &&
@ -553,8 +559,12 @@ identify_current_nestloop_params(PlannerInfo *root, Relids leftrelids)
(PlaceHolderVar *) nlp->paramval)->ph_eval_at,
leftrelids))
{
PlaceHolderVar *phv = (PlaceHolderVar *) nlp->paramval;
root->curOuterParams = foreach_delete_current(root->curOuterParams,
cell);
phv->phnullingrels = bms_intersect(phv->phnullingrels,
leftrelids);
result = lappend(result, nlp);
}
}

View File

@ -2607,6 +2607,23 @@ select * from int8_tbl t1
Filter: (q1 = t2.q1)
(8 rows)
explain (costs off)
select * from int8_tbl t1
left join int8_tbl t2 on true
left join lateral
(select * from generate_series(t2.q1, 100)) s
on t2.q1 = 1;
QUERY PLAN
----------------------------------------------------
Nested Loop Left Join
-> Seq Scan on int8_tbl t1
-> Materialize
-> Nested Loop Left Join
Join Filter: (t2.q1 = 1)
-> Seq Scan on int8_tbl t2
-> Function Scan on generate_series
(7 rows)
explain (costs off)
select * from onek t1
left join onek t2 on true

View File

@ -521,6 +521,13 @@ select * from int8_tbl t1
(select * from int8_tbl t3 where t3.q1 = t2.q1 offset 0) s
on t2.q1 = 1;
explain (costs off)
select * from int8_tbl t1
left join int8_tbl t2 on true
left join lateral
(select * from generate_series(t2.q1, 100)) s
on t2.q1 = 1;
explain (costs off)
select * from onek t1
left join onek t2 on true