tuple, instead of the former cpu_tuple_cost. It is sane to charge less than
cpu_tuple_cost because Materialize never does any qual-checking or projection,
so it's got less overhead than most plan node types. In particular, we want
to have the same charge here as is charged for readout in cost_sort. That
avoids the problem recently exhibited by Teodor wherein the planner prefers
a useless sort over a materialize step in a context where a lot of rescanning
will happen. The rescan costs should be just about the same for both node
types, so make their estimates the same.
Not back-patching because all of the current logic for rescan cost estimates
is new in 9.0. The old handling of rescans is sufficiently not-sane that
changing this in that structure is a bit pointless, and might indeed cause
regressions.
peculiar variant of UNION ALL, and so wouldn't likely get written directly
as-is, it's possible for it to arise as a result of simplification of
less-obviously-silly queries. In particular, now that we can do flattening
of subqueries that have constant outputs and are underneath an outer join,
it's possible for the case to result from simplification of queries of the
type exhibited in bug #5263. Back-patch to 8.4 to avoid a functionality
regression for this type of query.
This patch only supports seq_page_cost and random_page_cost as parameters,
but it provides the infrastructure to scalably support many more.
In particular, we may want to add support for effective_io_concurrency,
but I'm leaving that as future work for now.
Thanks to Tom Lane for design help and Alvaro Herrera for the review.
8.2beta but never carried out. This avoids repetitive tests of whether the
argument is of scalar or composite type. Also, be a bit more paranoid about
composite arguments in some places where we previously weren't checking.
to be just a minor extension of the previous patch that made "x IS NULL"
indexable, because we can treat the IS NOT NULL condition as if it were
"x < NULL" or "x > NULL" (depending on the index's NULLS FIRST/LAST option),
just like IS NULL is treated like "x = NULL". Aside from any possible
usefulness in its own right, this is an important improvement for
index-optimized MAX/MIN aggregates: it is now reliably possible to get
a column's min or max value cheaply, even when there are a lot of nulls
cluttering the interesting end of the index.
by adding a requirement that build_join_rel add new join RelOptInfos to the
appropriate list immediately at creation. Per report from Robert Haas,
the list_concat_unique_ptr() calls that this change eliminates were taking
the lion's share of the runtime in larger join problems. This doesn't do
anything to fix the fundamental combinatorial explosion in large join
problems, but it should push out the threshold of pain a bit further.
Note: because this changes the order in which joinrel lists are built,
it might result in changes in selected plans in cases where different
alternatives have exactly the same costs. There is one example in the
regression tests.
mergejoin to shield it from doing mark/restore and refetches. Put an explicit
flag in MergePath so we can centralize the logic that knows about this,
and add costing logic that considers using Materialize even when it's not
forced by the previously-existing considerations. This is in response to
a discussion back in August that suggested that materializing an inner
indexscan can be helpful when the refetch percentage is high enough.
a lot of strange behaviors that occurred in join cases. We now identify the
"current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the
appropriate row into each scan node in the rechecking plan, forcing it to emit
only that one row. The former behavior could rescan the whole of each joined
relation for each recheck, which was terrible for performance, and what's much
worse could result in duplicated output tuples.
Also, the original implementation of EvalPlanQual could not re-use the recheck
execution tree --- it had to go through a full executor init and shutdown for
every row to be tested. To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param. Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan.
This patch also adds a prohibition on set-returning functions in the
targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the
duplicate-output-tuple problem. It seems fairly reasonable since the
other restrictions on SELECT FOR UPDATE are meant to ensure that there
is a unique correspondence between source tuples and result tuples,
which an output SRF destroys as much as anything else does.
execMain.c and into a new plan node type LockRows. Like the recent change
to put table updating into a ModifyTable plan node, this increases planning
flexibility by allowing the operations to occur below the top level of the
plan tree. It's necessary in any case to restore the previous behavior of
having FOR UPDATE locking occur before ModifyTable does.
This partially refactors EvalPlanQual to allow multiple rows-under-test
to be inserted into the EPQ machinery before starting an EPQ test query.
That isn't sufficient to fix EPQ's general bogosity in the face of plans
that return multiple rows per test row, though. Since this patch is
mostly about getting some plan node infrastructure in place and not about
fixing ten-year-old bugs, I will leave EPQ improvements for another day.
Another behavioral change that we could now think about is doing FOR UPDATE
before LIMIT, but that too seems like it should be treated as a followon
patch.
The original coding correctly noted that these aren't just redundancies
(they're effectively X IS NOT NULL, assuming = is strict). However, they
got treated that way if X happened to be in a single-member EquivalenceClass
already, which could happen if there was an ORDER BY X clause, for instance.
The simplest and most reliable solution seems to be to not try to process
such clauses through the EquivalenceClass machinery; just throw them back
for traditional processing. The amount of work that'd be needed to be
smarter than that seems out of proportion to the benefit.
Per bug #5084 from Bernt Marius Johnsen, and analysis by Andrew Gierth.
tests into a small common subroutine, and eliminate an unnecessary difference
in the order in which conditions are tested. Per a comment from Robert Haas.
is unique and is not referenced above the join. In this case the inner
side doesn't affect the query result and can be thrown away entirely.
Although perhaps nobody would ever write such a thing by hand, it's
a reasonably common case in machine-generated SQL.
The current implementation only recognizes the case where the inner side
is a simple relation with a unique index matching the query conditions.
This is enough for the use-cases that have been shown so far, but we
might want to try to handle other cases later.
Robert Haas, somewhat rewritten by Tom
an explicit model of rescan costs being different from first-time costs.
The costing of Material nodes in particular now has some visible relationship
to the actual runtime behavior, where before it was essentially fantasy.
This also fixes up a couple of places where different materialized plan types
were treated differently for no very good reason (probably just oversights).
A couple of the regression tests are affected, because the planner now chooses
to put the other relation on the inside of a nestloop-with-materialize.
So far as I can see both changes are sane, and the planner is now more
consistently following the expectation that it should prefer to materialize
the smaller of two relations.
Per a recent discussion with Robert Haas.
In this case we generate two PathKey references to the expression (one for
DISTINCT and one for ORDER BY) and they really need to refer to the same
EquivalenceClass. However get_eclass_for_sort_expr was being overly paranoid
and creating two different EC's. Correct behavior is to use the SortGroupRef
index to decide whether two references to volatile expressions that are
equal() (ie textually equivalent) should be considered the same.
Backpatch to 8.4. Possibly this should be changed in 8.3 as well, but
I'll refrain in the absence of evidence of a visible failure in that branch.
Per bug #5049.
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
Both hex format and the traditional "escape" format are automatically
handled on input. The output format is selected by the new GUC variable
bytea_output.
As committed, bytea_output defaults to HEX, which is an *incompatible
change*. We will keep it this way for awhile for testing purposes, but
should consider whether to switch to the more backwards-compatible
default of ESCAPE before 8.5 is released.
Peter Eisentraut
for the case that the semijoin was implemented within either input by
unique-ifying its RHS before we test to see if it appears to match the current
join situation. The previous coding would select semijoin logic in situations
where we'd already unique-ified the RHS and joined it to some unrelated
relation(s), and then came to join it to the semijoin's LHS. That still gave
the right answer as far as the semijoin itself was concerned, but would lead
to incorrectly examining only an arbitrary one of the matchable rows from the
unrelated relation(s). The cause of this thinko was incorrect unification of
the pre-8.4 logic for IN joins and OUTER joins --- the comparable case for
outer joins can be handled after making the match test, but that's because
there is nothing like the unique-ification escape hatch for outer joins.
Per bug #4934 from Benjamin Reed.
by unique-ifying the RHS and then inner-joining to some other relation,
that is not grounds for violating the RHS of some other outer join.
Noticed while regression-testing new GEQO code, which will blindly follow
any path that join_is_legal says is legal, and then complain later if that
leads to a dead end.
I'm not certain that this can result in any visible failure in 8.4: the
mistake may always be masked by the fact that subsequent attempts to join
the rest of the RHS of the other join will fail. But I'm not certain it
can't, either, and it's definitely not operating as intended. So back-patch.
The added regression test depends on the new no-failures-allowed logic
that I'm about to commit in GEQO, so no point back-patching that.
that the sanity checking I added to create_mergejoin_plan() in 8.3 was a
few bricks shy of a load: the mergeclauses could reference pathkeys in a
noncanonical order such as x,y,x, not only cases like x,x,y which is all
that the code had allowed for. The odd cases only turn up when using
redundant clauses in an outer join condition, which is why no one had
noticed before.
RelOptInfo targetlist. It used to be that the only possibility other than
a Var was a RowExpr representing a whole-row child Var, but as of 8.4's
expanded ability to flatten appendrel members, we can get arbitrary expressions
in there. Use the expression's type info and get_typavgwidth() to produce
an at-least-marginally-sane result. Note that get_typavgwidth()'s fallback
estimate (32 bytes) is the same as what was here before, so there will be
no behavioral change for RowExprs. Noted while looking at recent gripe
about constant quals pushed down to FunctionScan appendrel members ...
not only were we failing to recognize the constant qual, we were getting
the width estimate wrong :-(
substituting a child rel's output expressions into the appendrel's restriction
clauses yields a pseudoconstant restriction. We might be able to skip scanning
that child rel entirely (if we get constant FALSE), or generate a one-time
filter. 8.3 more or less accidentally generated plans that weren't completely
stupid in these cases, but that was only because an extra recursive level of
subquery_planner() always occurred and allowed const-simplification to happen.
8.4's ability to pull up appendrel members with non-Var outputs exposes the
fact that we need to work harder here. Per gripe from Sergey Burladyan.
joins a bit better, ie, understand the differing cost functions for matched
and unmatched outer tuples. There is more that could be done in cost_hashjoin
but this already helps a great deal. Per discussions with Robert Haas.
Stefan Kaltenbrunner. The most reasonable behavior (at least for the near
term) seems to be to ignore the PlaceHolderVar and examine its argument
instead. In support of this, change the API of pull_var_clause() to allow
callers to request recursion into PlaceHolderVars. Currently
estimate_num_groups() is the only customer for that behavior, but where
there's one there may be others.
constants through full joins, as in
select * from tenk1 a full join tenk1 b using (unique1)
where unique1 = 42;
which should generate a fairly cheap plan where we apply the constraint
unique1 = 42 in each relation scan. This had been broken by my patch of
2008-06-27, which is now reverted in favor of a more invasive but hopefully
less incorrect approach. That patch was meant to prevent incorrect extraction
of OR'd indexclauses from OR conditions above an outer join. To do that
correctly we need more information than the outerjoin_delay flag can provide,
so add a nullable_relids field to RestrictInfo that records exactly which
relations are nulled by outer joins that are underneath a particular qual
clause. A side benefit is that we can make the test in create_or_index_quals
more specific: it is now smart enough to extract an OR'd indexclause into the
outer side of an outer join, even though it must not do so in the inner side.
The old coding couldn't distinguish these cases so it could not do either.
"physical tlist" optimization on the outer relation (ie, force a projection
step to occur in its scan). This avoids storing useless column values when
the outer relation's tuples are written to temporary batch files.
Modified version of a patch by Michael Henderson and Ramon Lawrence.
distribution, by creating a special fast path for the (first few) most common
values of the outer relation. Tuples having hashvalues matching the MCVs
are effectively forced to be in the first batch, so that we never write
them out to the batch temp files.
Bryce Cutt and Ramon Lawrence, with some editorialization by me.
exact-match pattern (no wildcard) can be index-optimized in some cases where a
prefix-match pattern cannot; specifically, since the required index clause is
simple equality, it works for regular text/varchar indexes even when the
locale is not C. I'm not sure how often this case really comes up, but since
it requires hardly any additional work to handle it, we might as well get it
right. Motivated by a discussion on the JDBC list.
by the planning process. This prevents the "failed to locate grouping columns"
error recently reported by Dickson Guedes. That happens because planning
replaces SubLinks by SubPlans in the subquery's targetlist, and exprTypmod()
is smarter about the former than the latter, causing the apparent type of
the subquery's output columns to change. This seems to be a deficiency we
should fix in exprTypmod(), but that will be a much more invasive patch
with possible side-effects elsewhere, so I'll do that only in HEAD.
Back-patch to 8.3. Arguably the lack of a copying step is broken/dangerous
all the way back, but in the absence of known problems I'll refrain from
making the older branches pay the extra cost. (The reason this particular
symptom didn't appear before is that exprTypmod() wasn't smart about SubLinks
either, until 8.3.)
amgettuple or only implement amgetbitmap, instead of the former assumption
that every AM supports both APIs. Extracted with minor editorialization
from Teodor's fast-GIN-insert patch; whatever becomes of that, this seems
like a simple and reasonable generalization of the index AM interface spec.
input lists before we grovel through the lists. This doesn't save much,
but testing shows that the case of both inputs NIL is common enough that
it saves something. And this is used enough to be a hotspot.
unique for a particular query, if the index predicate is satisfied. This
requires a bit of reordering of operations so that we check the predicates
before doing any selectivity estimates, but shouldn't really cause any
noticeable slowdown. Per a comment from Michal Politowski.
keys when considering a semi or anti join. This requires estimating the
selectivity of the merge qual as though it were a regular inner join condition.
To allow caching both that and the real outer-join-aware selectivity, split
RestrictInfo.this_selec into two fields.
This fixes one of the problems reported by Kevin Grittner.
the cheapest-total inner path as a new candidate while truncating the
sort key list, if it already matched the full sort key list. This is
too much of a corner case to be worth back-patching, since it's unusual
for the cheapest total path to be sorted, and anyway no real harm is
done (except in JOIN_SEMI/ANTI cases where cost_mergejoin is a bit
broken at the moment). But it wasn't behaving as intended, so fix it.
Noted while examining a test case from Kevin Grittner. This error doesn't
explain his issue, but it does explain why "set enable_seqscan = off"
seemed to reproduce it for me.
outer join clauses. Given, say,
... from a left join b on a.a1 = b.b1 where a.a1 = 42;
we'll deduce a clause b.b1 = 42 and then mark the original join clause
redundant (we can't remove it completely for reasons I don't feel like
squeezing into this log entry). However the original implementation of
that wasn't bulletproof, because clause_selectivity() wouldn't honor
this_selec if given nonzero varRelid --- which in practice meant that
it worked as desired *except* when considering index scan quals. Which
resulted in bogus underestimation of the size of the indexscan result for
an inner indexscan in an outer join, and consequently a possibly bad
choice of indexscan vs. bitmap scan. Fix by introducing an explicit test
into clause_selectivity(). Also, to make sure we don't trigger that test
in corner cases, change the convention to be that this_selec > 1, not
this_selec = 1, means it's been marked redundant. Per trouble report from
Scara Maccai.
Back-patch to 8.2, where the problem was introduced.
RHS that can't be unique-ified --- join_is_legal has to check that before
deciding to build a join, else we'll have an unimplementable joinrel.
Per report from Greg Stark.
though it is an inner rather than outer join type. This essentially means
that we don't bother to separate "pushed down" qual conditions from actual
join quals at a semijoin plan node; which is okay because the restrictions of
SQL syntax make it impossible to have a pushed-down qual that references the
inner side of a semijoin. This allows noticeably better optimization of
IN/EXISTS cases than we had before, since the equivalence-class machinery can
now use those quals. Also fix a couple of other mistakes that had essentially
disabled the ability to unique-ify the inner relation and then join it to just
a subset of the left-hand relations. An example case using the regression
database is
select * from tenk1 a, tenk1 b
where (a.unique1,b.unique2) in (select unique1,unique2 from tenk1 c);
which is planned reasonably well by 8.3 and earlier but had been forcing a
cartesian join of a/b in CVS HEAD.
return the tableoid as well as the ctid for any FOR UPDATE targets that
have child tables. All child tables are listed in the ExecRowMark list,
but the executor just skips the ones that didn't produce the current row.
Curiously, this longstanding restriction doesn't seem to have been documented
anywhere; so no doc changes.
we extended the appendrel mechanism to support UNION ALL optimization. The
reason nobody noticed was that we are not actually using attr_needed data for
appendrel children; hence it seems more reasonable to rip it out than fix it.
Back-patch to 8.2 because an Assert failure is possible in corner cases.
Per examination of an example from Jim Nasby.
In HEAD, also get rid of AppendRelInfo.col_mappings, which is quite inadequate
to represent UNION ALL situations; depend entirely on translated_vars instead.
that represent some expression that we desire to compute below the top level
of the plan, and then let that value "bubble up" as though it were a plain
Var (ie, a column value).
The immediate application is to allow sub-selects to be flattened even when
they are below an outer join and have non-nullable output expressions.
Formerly we couldn't flatten because such an expression wouldn't properly
go to NULL when evaluated above the outer join. Now, we wrap it in a
PlaceHolderVar and arrange for the actual evaluation to occur below the outer
join. When the resulting Var bubbles up through the join, it will be set to
NULL if necessary, yielding the correct results. This fixes a planner
limitation that's existed since 7.1.
In future we might want to use this mechanism to re-introduce some form of
Hellerstein's "expensive functions" optimization, ie place the evaluation of
an expensive function at the most suitable point in the plan tree.
set_rel_width(). The code had been catering for the possibility of different
varnos in the relation targetlist, but this is impossible for a base relation
(and if it were possible, putting all the widths in the same RelOptInfo would
be wrong anyway).