Make gincostestimate() cope with hypothetical GIN indexes.

We tried to fetch statistics data from the index metapage, which does not
work if the index isn't actually present.  If the index is hypothetical,
instead extrapolate some plausible internal statistics based on the index
page count provided by the index-advisor plugin.

There was already some code in gincostestimate() to invent internal stats
in this way, but since it was only meant as a stopgap for pre-9.1 GIN
indexes that hadn't been vacuumed since upgrading, it was pretty crude.
If we want it to support index advisors, we should try a little harder.
A small amount of testing says that it's better to estimate the entry pages
as 90% of the index, not 100%.  Also, estimating the number of entries
(keys) as equal to the heap tuple count could be wildly wrong in either
direction.  Instead, let's estimate 100 entries per entry page.

Perhaps someday somebody will want the index advisor to be able to provide
these numbers more directly, but for the moment this should serve.

Problem report and initial patch by Julien Rouhaud; modified by me to
invent less-bogus internal statistics.  Back-patch to all supported
branches, since we've supported index advisors since 9.0.
Tom Lane 2015-12-01 16:24:34 -05:00
parent 95708e1d8e
commit 7fb008c5ee

@@ -7260,33 +7260,34 @@ gincostestimate(PG_FUNCTION_ARGS)
 	qinfos = deconstruct_indexquals(path);
 
 	/*
-	 * Obtain statistic information from the meta page
+	 * Obtain statistical information from the meta page, if possible. Else
+	 * set ginStats to zeroes, and we'll cope below.
 	 */
-	indexRel = index_open(index->indexoid, AccessShareLock);
-	ginGetStats(indexRel, &ginStats);
-	index_close(indexRel, AccessShareLock);
-
-	numEntryPages = ginStats.nEntryPages;
-	numDataPages = ginStats.nDataPages;
-	numPendingPages = ginStats.nPendingPages;
-	numEntries = ginStats.nEntries;
-
-	/*
-	 * nPendingPages can be trusted, but the other fields are as of the last
-	 * VACUUM. Scale them by the ratio numPages / nTotalPages to account for
-	 * growth since then. If the fields are zero (implying no VACUUM at all,
-	 * and an index created pre-9.1), assume all pages are entry pages.
-	 */
-	if (ginStats.nTotalPages == 0 || ginStats.nEntryPages == 0)
+	if (!index->hypothetical)
 	{
-		numEntryPages = numPages;
-		numDataPages = 0;
-		numEntries = numTuples; /* bogus, but no other info available */
+		indexRel = index_open(index->indexoid, AccessShareLock);
+		ginGetStats(indexRel, &ginStats);
+		index_close(indexRel, AccessShareLock);
 	}
 	else
 	{
+		memset(&ginStats, 0, sizeof(ginStats));
+	}
+
+	if (ginStats.nTotalPages > 0 && ginStats.nEntryPages > 0 && numPages > 0)
+	{
+		/*
+		 * We got valid stats. nPendingPages can be trusted, but the other
+		 * fields are data as of the last VACUUM. Scale them by the ratio
+		 * numPages / nTotalPages to account for growth since then.
+		 */
 		double scale = numPages / ginStats.nTotalPages;
 
+		numEntryPages = ginStats.nEntryPages;
+		numDataPages = ginStats.nDataPages;
+		numPendingPages = ginStats.nPendingPages;
+		numEntries = ginStats.nEntries;
+
 		numEntryPages = ceil(numEntryPages * scale);
 		numDataPages = ceil(numDataPages * scale);
 		numEntries = ceil(numEntries * scale);
@@ -7294,6 +7295,23 @@ gincostestimate(PG_FUNCTION_ARGS)
 		numEntryPages = Min(numEntryPages, numPages);
 		numDataPages = Min(numDataPages, numPages - numEntryPages);
 	}
+	else
+	{
+		/*
+		 * It's a hypothetical index, or perhaps an index created pre-9.1 and
+		 * never vacuumed since upgrading. Invent some plausible internal
+		 * statistics based on the index page count. We estimate that 90% of
+		 * the index is entry pages, and the rest is data pages. Estimate 100
+		 * entries per entry page; this is rather bogus since it'll depend on
+		 * the size of the keys, but it's more robust than trying to predict
+		 * the number of entries per heap tuple.
+		 */
+		numPages = Max(numPages, 10);
+		numEntryPages = floor(numPages * 0.90);
+		numDataPages = numPages - numEntryPages;
+		numPendingPages = 0;
+		numEntries = floor(numEntryPages * 100);
+	}
 
 	/* In an empty index, numEntries could be zero. Avoid divide-by-zero */
 	if (numEntries < 1)