Fix cost estimation for indexscan filter conditions.

cost_index's method for estimating per-tuple costs of evaluating filter
conditions (a/k/a qpquals) was completely wrong in the presence of derived
indexable conditions, such as range conditions derived from a LIKE clause.
This was largely masked in common cases as a result of all simple operator
clauses having about the same costs, but it could show up in a big way when
dealing with functional indexes containing expensive functions, as seen for
example in bug #6579 from Istvan Endredy.  Rejigger the calculation to give
sane answers when the indexquals aren't a subset of the baserestrictinfo
list.  As a side benefit, we now do the calculation properly for cases
involving join clauses (i.e., parameterized indexscans), which we always
overestimated before.

There are still cases where this is an oversimplification, such as clauses
that can be dropped because they are implied by a partial index's
predicate.  But we've never accounted for that in cost estimates before,
and I'm not convinced it's worth the cycles to try to do so.
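
A toy model may make the failure mode concrete. The sketch below is not PostgreSQL code: ToyClause, toy_old_cpu_per_tuple, and toy_new_cpu_per_tuple are hypothetical names, plain arrays stand in for List, and any cost numbers fed to it are invented. It contrasts the old approach (subtract the indexquals' total cost from the restriction clauses' total cost) with the new one (charge only for clauses that are not pointer-equal to some indexqual).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a RestrictInfo; only its per-tuple cost is modeled. */
typedef struct ToyClause
{
    const char *label;
    double      per_tuple;
} ToyClause;

/* Sum the per-tuple costs of n clauses. */
static double
toy_sum_cost(ToyClause **clauses, size_t n)
{
    double      total = 0.0;

    assert(clauses != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += clauses[i]->per_tuple;
    return total;
}

/* Return 1 if c appears in list, comparing pointers only. */
static int
toy_member_ptr(ToyClause **list, size_t n, const ToyClause *c)
{
    for (size_t i = 0; i < n; i++)
    {
        if (list[i] == c)
            return 1;
    }
    return 0;
}

/*
 * Old approach (non-parameterized case): take the cost of all restriction
 * clauses and subtract the cost of all indexquals, whether or not those
 * indexquals were ever part of the restriction list.
 */
static double
toy_old_cpu_per_tuple(double cpu_tuple_cost,
                      ToyClause **restrictions, size_t nrestrictions,
                      ToyClause **indexquals, size_t nindexquals)
{
    return cpu_tuple_cost
        + toy_sum_cost(restrictions, nrestrictions)
        - toy_sum_cost(indexquals, nindexquals);
}

/*
 * New approach: charge only for clauses in allclauses that are not
 * pointer-equal to any indexqual.
 */
static double
toy_new_cpu_per_tuple(double cpu_tuple_cost,
                      ToyClause **allclauses, size_t nall,
                      ToyClause **indexquals, size_t nindexquals)
{
    double      total = cpu_tuple_cost;

    for (size_t i = 0; i < nall; i++)
    {
        if (!toy_member_ptr(indexquals, nindexquals, allclauses[i]))
            total += allclauses[i]->per_tuple;
    }
    return total;
}
```

In this model, take one expensive LIKE restriction (per-tuple cost 100) and two derived range indexquals of the same cost, none of which appear in the restriction list. With cpu_tuple_cost at 0.01, the old formula computes 0.01 + 100 - 200, a negative per-tuple cost, while the new one computes 100.01 because the LIKE clause is still charged as a filter.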
Tom Lane 2012-04-11 20:24:17 -04:00
parent 880bfc3287
commit 732bfa2448
1 changed file with 20 additions and 18 deletions

@@ -228,6 +228,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count)
     IndexOptInfo *index = path->indexinfo;
     RelOptInfo *baserel = index->rel;
     bool        indexonly = (path->path.pathtype == T_IndexOnlyScan);
+    List       *allclauses;
     Cost        startup_cost = 0;
     Cost        run_cost = 0;
     Cost        indexStartupCost;
@@ -239,6 +240,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count)
                 spc_random_page_cost;
     Cost        min_IO_cost,
                 max_IO_cost;
+    QualCost    qpqual_cost;
     Cost        cpu_per_tuple;
     double      tuples_fetched;
     double      pages_fetched;
@@ -267,8 +269,6 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count)
          * Note that we force the clauses to be treated as non-join clauses
          * during selectivity estimation.
          */
-        List       *allclauses;
         allclauses = list_union_ptr(baserel->baserestrictinfo,
                                     path->indexclauses);
         path->path.rows = baserel->tuples *
@@ -283,6 +283,9 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count)
     }
     else
     {
+        /* allclauses should just be the rel's restriction clauses */
+        allclauses = baserel->baserestrictinfo;
         /*
          * The number of rows is the same as the parent rel's estimate, since
          * this isn't a parameterized path.
@@ -442,24 +445,23 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count)
     /*
      * Estimate CPU costs per tuple.
      *
-     * Normally the indexquals will be removed from the list of restriction
-     * clauses that we have to evaluate as qpquals, so we should subtract
-     * their costs from baserestrictcost.  But if we are doing a join then
-     * some of the indexquals are join clauses and shouldn't be subtracted.
-     * Rather than work out exactly how much to subtract, we don't subtract
-     * anything.
+     * What we want here is cpu_tuple_cost plus the evaluation costs of any
+     * qual clauses that we have to evaluate as qpquals.  We approximate that
+     * list as allclauses minus any clauses appearing in indexquals (as
+     * before, assuming that pointer equality is enough to recognize duplicate
+     * RestrictInfos).  This method neglects some considerations such as
+     * clauses that needn't be checked because they are implied by a partial
+     * index's predicate.  It does not seem worth the cycles to try to factor
+     * those things in at this stage, even though createplan.c will take pains
+     * to remove such unnecessary clauses from the qpquals list if this path
+     * is selected for use.
      */
-    startup_cost += baserel->baserestrictcost.startup;
-    cpu_per_tuple = cpu_tuple_cost + baserel->baserestrictcost.per_tuple;
-    if (path->path.required_outer == NULL)
-    {
-        QualCost    index_qual_cost;
-        cost_qual_eval(&index_qual_cost, path->indexquals, root);
-        /* any startup cost still has to be paid ... */
-        cpu_per_tuple -= index_qual_cost.per_tuple;
-    }
+    cost_qual_eval(&qpqual_cost,
+                   list_difference_ptr(allclauses, path->indexquals),
+                   root);
+    startup_cost += qpqual_cost.startup;
+    cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
     run_cost += cpu_per_tuple * tuples_fetched;
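
For readers unfamiliar with the pointer-based list operations used in the diff, here is a rough, self-contained sketch of how the qpqual list is approximated in the parameterized case. It is an illustration under stated assumptions, not PostgreSQL's implementation: ToyQual, ToyList, and every toy_* function are hypothetical, fixed-size arrays replace List, and toy_union_ptr and toy_difference_ptr only mimic the set semantics that the names list_union_ptr and list_difference_ptr suggest (membership decided by pointer equality).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 16

/* Hypothetical stand-in for a RestrictInfo; only the per-tuple cost is kept. */
typedef struct ToyQual
{
    double      per_tuple;
} ToyQual;

/* Hypothetical fixed-capacity list of qual pointers. */
typedef struct ToyList
{
    const ToyQual *items[TOY_MAX];
    size_t      len;
} ToyList;

/* Return 1 if q is in list, comparing pointers only. */
static int
toy_has_ptr(const ToyList *list, const ToyQual *q)
{
    for (size_t i = 0; i < list->len; i++)
    {
        if (list->items[i] == q)
            return 1;
    }
    return 0;
}

/* All members of a, followed by members of b not already present. */
static ToyList
toy_union_ptr(const ToyList *a, const ToyList *b)
{
    ToyList     out = *a;

    for (size_t i = 0; i < b->len; i++)
    {
        if (!toy_has_ptr(&out, b->items[i]))
        {
            assert(out.len < TOY_MAX);
            out.items[out.len++] = b->items[i];
        }
    }
    return out;
}

/* Members of a that do not appear in b. */
static ToyList
toy_difference_ptr(const ToyList *a, const ToyList *b)
{
    ToyList     out = {{NULL}, 0};

    for (size_t i = 0; i < a->len; i++)
    {
        if (!toy_has_ptr(b, a->items[i]))
            out.items[out.len++] = a->items[i];
    }
    return out;
}

/*
 * Per-tuple filter cost for a parameterized scan in this model:
 * (restrictions UNION indexclauses) MINUS indexquals, then sum the costs.
 */
static double
toy_qpqual_per_tuple(const ToyList *restrictions,
                     const ToyList *indexclauses,
                     const ToyList *indexquals)
{
    ToyList     allclauses = toy_union_ptr(restrictions, indexclauses);
    ToyList     qpquals = toy_difference_ptr(&allclauses, indexquals);
    double      total = 0.0;

    for (size_t i = 0; i < qpquals.len; i++)
        total += qpquals.items[i]->per_tuple;
    return total;
}
```

Suppose the restriction list holds two clauses r and s, and the index clauses and indexquals both hold s plus a join clause j. The union is {r, s, j}, the difference leaves only r, and only r's cost is charged per tuple. The removed code would have charged for both r and s in this situation, since it subtracted nothing for parameterized paths.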