Fix some issues with wrong placement of pseudo-constant quals.

initsplan.c figured that it could push Var-free qual clauses to
the top of the current JoinDomain, which is okay in the abstract.
But if the current domain is inside some outer join, and we later
commute an inside-the-domain outer join with one outside it,
we end up placing the pushed-up qual clause incorrectly.

In distribute_qual_to_rels, avoid this by using the syntactic scope
of the qual clause; with the exception that if we're in the top-level
join domain we can still use the full query relid set, ensuring the
resulting gating Result node goes to the top of the plan.  (This is
approximately as smart as the pre-v16 code was.  Perhaps we can do
better later, but it's not clear that such cases are worth a lot of
sweat.)

In process_implied_equality, we don't have a clear notion of syntactic
scope, but we do have the results of SpecialJoinInfo construction.
Thumb through those and remove any lower outer joins that might get
commuted to above the join domain.  Again, we can make an exception
for the top-level join domain.  It'd be possible to work harder here
(for example, by keeping outer joins that aren't shown as potentially
commutable), but I'm going to stop here for the moment.  This issue
has convinced me that the current representation of join domains
probably needs further refinement, so I'm disinclined to write
inessential dependent logic just yet.

In passing, tighten the qualscope passed to process_implied_equality
by generate_base_implied_equalities_no_const; there's no need for
it to be larger than the rel we are currently considering.

Tom Lane and Richard Guo, per report from Tender Wang.

Discussion: https://postgr.es/m/CAHewXNk9eJ35ru5xATWioTV4+xZPHptjy9etdcNPjUfY9RQ+uQ@mail.gmail.com
This commit is contained in:
Tom Lane 2023-02-22 12:39:07 -05:00
parent 7fe1aa991b
commit a75ff55c83
4 changed files with 155 additions and 9 deletions

View File

@ -1248,11 +1248,11 @@ generate_base_implied_equalities_no_const(PlannerInfo *root,
/*
* The expressions aren't constants, so the passed qualscope will
* never be used to place the generated clause. We just need to
* be sure it covers both expressions, so ec_relids will serve.
* be sure it covers both expressions, which em_relids should do.
*/
rinfo = process_implied_equality(root, eq_op, ec->ec_collation,
prev_em->em_expr, cur_em->em_expr,
ec->ec_relids,
cur_em->em_relids,
ec->ec_min_security,
false);

View File

@ -125,6 +125,7 @@ static void distribute_qual_to_rels(PlannerInfo *root, Node *clause,
bool is_clone,
List **postponed_oj_qual_list);
static bool check_redundant_nullability_qual(PlannerInfo *root, Node *clause);
static Relids get_join_domain_min_rels(PlannerInfo *root, Relids domain_relids);
static void check_mergejoinable(RestrictInfo *restrictinfo);
static void check_hashjoinable(RestrictInfo *restrictinfo);
static void check_memoizable(RestrictInfo *restrictinfo);
@ -2250,8 +2251,7 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
* RestrictInfo lists for the moment, but eventually createplan.c will
* pull it out and make a gating Result node immediately above whatever
* plan node the pseudoconstant clause is assigned to. It's usually best
* to put a gating node as high in the plan tree as possible, which we can
* do by assigning it the full relid set of the current JoinDomain.
* to put a gating node as high in the plan tree as possible.
*/
if (bms_is_empty(relids))
{
@ -2269,8 +2269,19 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
}
else
{
/* eval at join domain level */
relids = bms_copy(jtitem->jdomain->jd_relids);
/*
* If we are in the top-level join domain, we can push the qual to
* the top of the plan tree. Otherwise, be conservative and eval
* it at original syntactic level. (Ideally we'd push it to the
* top of the current join domain in all cases, but that causes
* problems if we later rearrange outer-join evaluation order.
* Pseudoconstant quals below the top level are a pretty odd case,
* so it's not clear that it's worth working hard on.)
*/
if (jtitem->jdomain == (JoinDomain *) linitial(root->join_domains))
relids = bms_copy(jtitem->jdomain->jd_relids);
else
relids = bms_copy(qualscope);
/* mark as gating qual */
pseudoconstant = true;
/* tell createplan.c to check for gating quals */
@ -2734,12 +2745,13 @@ process_implied_equality(PlannerInfo *root,
/*
* If the clause is variable-free, our normal heuristic for pushing it
* down to just the mentioned rels doesn't work, because there are none.
* Apply it as a gating qual at the given qualscope.
* Apply it as a gating qual at the appropriate level (see comments for
* get_join_domain_min_rels).
*/
if (bms_is_empty(relids))
{
/* eval at join domain level */
relids = bms_copy(qualscope);
/* eval at join domain's safe level */
relids = get_join_domain_min_rels(root, qualscope);
/* mark as gating qual */
pseudoconstant = true;
/* tell createplan.c to check for gating quals */
@ -2856,6 +2868,54 @@ build_implied_join_equality(PlannerInfo *root,
return restrictinfo;
}
/*
* get_join_domain_min_rels
* Identify the appropriate join level for derived quals belonging
* to the join domain with the given relids.
*
* When we derive a pseudoconstant (Var-free) clause from an EquivalenceClass,
* we'd ideally apply the clause at the top level of the EC's join domain.
* However, if there are any outer joins inside that domain that get commuted
* with joins outside it, that leads to not finding a correct place to apply
* the clause. Instead, remove any lower outer joins from the relid set,
* and apply the clause to just the remaining rels. This still results in a
* correct answer, since if the clause produces FALSE then the LHS of these
* joins will be empty leading to an empty join result.
*
* However, there's no need to remove outer joins if this is the top-level
* join domain of the query, since then there's nothing else to commute with.
*
* Note: it's tempting to use this in distribute_qual_to_rels where it's
* dealing with pseudoconstant quals; but we can't because the necessary
* SpecialJoinInfos aren't all formed at that point.
*
* The result is always freshly palloc'd; we do not modify domain_relids.
*/
static Relids
get_join_domain_min_rels(PlannerInfo *root, Relids domain_relids)
{
Relids result = bms_copy(domain_relids);
ListCell *lc;
/* Top-level join domain? */
if (bms_equal(result, root->all_query_rels))
return result;
/* Nope, look for lower outer joins that could potentially commute out */
foreach(lc, root->join_info_list)
{
SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc);
if (sjinfo->jointype == JOIN_LEFT &&
bms_is_member(sjinfo->ojrelid, result))
{
result = bms_del_member(result, sjinfo->ojrelid);
result = bms_del_members(result, sjinfo->syn_righthand);
}
}
return result;
}
/*
* match_foreign_keys_to_quals

View File

@ -5119,6 +5119,67 @@ from int8_tbl t1
-> Seq Scan on onek t4
(13 rows)
-- More tests of correct placement of pseudoconstant quals
-- simple constant-false condition
explain (costs off)
select * from int8_tbl t1 left join
(int8_tbl t2 inner join int8_tbl t3 on false
left join int8_tbl t4 on t2.q2 = t4.q2)
on t1.q1 = t2.q1;
QUERY PLAN
--------------------------------------
Hash Left Join
Hash Cond: (t1.q1 = q1)
-> Seq Scan on int8_tbl t1
-> Hash
-> Result
One-Time Filter: false
(6 rows)
-- deduce constant-false from an EquivalenceClass
explain (costs off)
select * from int8_tbl t1 left join
(int8_tbl t2 inner join int8_tbl t3 on (t2.q1-t3.q2) = 0 and (t2.q1-t3.q2) = 1
left join int8_tbl t4 on t2.q2 = t4.q2)
on t1.q1 = t2.q1;
QUERY PLAN
--------------------------------------
Hash Left Join
Hash Cond: (t1.q1 = q1)
-> Seq Scan on int8_tbl t1
-> Hash
-> Result
One-Time Filter: false
(6 rows)
-- pseudoconstant based on an outer-level Param
explain (costs off)
select exists(
select * from int8_tbl t1 left join
(int8_tbl t2 inner join int8_tbl t3 on x0.f1 = 1
left join int8_tbl t4 on t2.q2 = t4.q2)
on t1.q1 = t2.q1
) from int4_tbl x0;
QUERY PLAN
---------------------------------------------------------------------
Seq Scan on int4_tbl x0
SubPlan 1
-> Nested Loop Left Join
Join Filter: (t2.q2 = t4.q2)
-> Nested Loop Left Join
Join Filter: (t1.q1 = t2.q1)
-> Seq Scan on int8_tbl t1
-> Materialize
-> Result
One-Time Filter: (x0.f1 = 1)
-> Nested Loop
-> Seq Scan on int8_tbl t2
-> Materialize
-> Seq Scan on int8_tbl t3
-> Materialize
-> Seq Scan on int8_tbl t4
(16 rows)
-- check that join removal works for a left join when joining a subquery
-- that is guaranteed to be unique by its GROUP BY clause
explain (costs off)

View File

@ -1846,6 +1846,31 @@ from int8_tbl t1
left join onek t4
on t2.q2 < t3.unique2;
-- More tests of correct placement of pseudoconstant quals
-- simple constant-false condition
explain (costs off)
select * from int8_tbl t1 left join
(int8_tbl t2 inner join int8_tbl t3 on false
left join int8_tbl t4 on t2.q2 = t4.q2)
on t1.q1 = t2.q1;
-- deduce constant-false from an EquivalenceClass
explain (costs off)
select * from int8_tbl t1 left join
(int8_tbl t2 inner join int8_tbl t3 on (t2.q1-t3.q2) = 0 and (t2.q1-t3.q2) = 1
left join int8_tbl t4 on t2.q2 = t4.q2)
on t1.q1 = t2.q1;
-- pseudoconstant based on an outer-level Param
explain (costs off)
select exists(
select * from int8_tbl t1 left join
(int8_tbl t2 inner join int8_tbl t3 on x0.f1 = 1
left join int8_tbl t4 on t2.q2 = t4.q2)
on t1.q1 = t2.q1
) from int4_tbl x0;
-- check that join removal works for a left join when joining a subquery
-- that is guaranteed to be unique by its GROUP BY clause
explain (costs off)