postgresql/contrib/pgstattuple
Tom Lane d04900de7d When updating reltuples after ANALYZE, just extrapolate from our sample.
The existing logic for updating pg_class.reltuples trusted the sampling
results only for the pages ANALYZE actually visited, preferring to
believe the previous tuple density estimate for all the unvisited pages.
While there's some rationale for doing that for VACUUM (first that
VACUUM is likely to visit a very nonrandom subset of pages, and second
that we know for sure that the unvisited pages did not change), there's
no such rationale for ANALYZE: by assumption, it's looked at an unbiased
random sample of the table's pages.  Furthermore, in a very large table
ANALYZE will have examined only a tiny fraction of the table's pages,
meaning it cannot slew the overall density estimate very far at all.
In a table that is physically growing, this causes reltuples to increase
nearly proportionally to the change in relpages, regardless of what is
actually happening in the table.  This has been observed to cause reltuples
to become so much larger than reality that it effectively shuts off
autovacuum, whose threshold for doing anything is a fraction of reltuples.
(Getting to the point where that would happen seems to require some
additional, not well understood, conditions.  But it's undeniable that if
reltuples is seriously off in a large table, ANALYZE alone will not fix it
in any reasonable number of iterations, especially not if the table is
continuing to grow.)
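
To make the paragraph above concrete, here is a minimal sketch, not PostgreSQL source. It assumes the documented autovacuum trigger (autovacuum_vacuum_threshold plus autovacuum_vacuum_scale_factor times reltuples) and models the old ANALYZE behavior as moving the density estimate toward the sampled density by only the fraction of pages visited. The function names and the static-table, perfect-sample assumptions are hypothetical.

```c
/*
 * Illustrative sketch only; every name here is hypothetical and none of it
 * is PostgreSQL source.
 */

/* Documented autovacuum trigger: base threshold + scale factor * reltuples. */
double
sketch_vacuum_threshold(double base_threshold, double scale_factor,
                        double reltuples)
{
    return base_threshold + scale_factor * reltuples;
}

/*
 * Count the ANALYZE runs needed before a density estimate gets within
 * "tolerance" of the true density, when each run moves the estimate toward
 * the sampled density by only the visited fraction of pages.
 * Assumes a static table and a perfectly representative sample on each run.
 */
int
sketch_runs_to_converge(double old_density, double true_density,
                        double visited_fraction, double tolerance)
{
    double      density = old_density;
    int         runs = 0;

    /* The cap guards against a zero visited_fraction looping forever. */
    while (runs < 100000 &&
           (density - true_density > tolerance ||
            true_density - density > tolerance))
    {
        density += (true_density - density) * visited_fraction;
        runs++;
    }
    return runs;
}
```

With a stale density of 100 tuples per page, a true density of 10, and 3% of the pages visited per run, sketch_runs_to_converge(100.0, 10.0, 0.03, 1.0) needs well over a hundred runs. Meanwhile an inflated reltuples raises sketch_vacuum_threshold() in proportion, which is how autovacuum ends up effectively disabled.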

Hence, restrict the use of vac_estimate_reltuples() to VACUUM alone,
and in ANALYZE, just extrapolate from the sample pages on the assumption
that they provide an accurate model of the whole table.  If, by very bad
luck, they don't, at least another ANALYZE will fix it; in the old logic
a single bad estimate could cause problems indefinitely.
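
The difference between the two approaches can be sketched as follows. This is a simplified model under stated assumptions, not the committed code: both function names are hypothetical, rounding is omitted, and the "blended" variant only mirrors the description above (sample results for visited pages, previous density for unvisited pages).

```c
/*
 * Illustrative sketch only; these helpers are hypothetical and are not the
 * functions in PostgreSQL's analyze.c or vacuum.c.
 */

/*
 * Old-style estimate as described in the commit message: believe the sample
 * for the visited pages and the previous density for all unvisited pages.
 */
double
sketch_blended_reltuples(double total_pages, double visited_pages,
                         double sampled_tuples, double old_density)
{
    double      unvisited_pages = total_pages - visited_pages;

    return old_density * unvisited_pages + sampled_tuples;
}

/*
 * New-style estimate for ANALYZE: treat the sample as an unbiased model of
 * the whole table and scale its density up to all pages.
 */
double
sketch_extrapolated_reltuples(double total_pages, double visited_pages,
                              double sampled_tuples)
{
    if (visited_pages <= 0)
        return 0.0;             /* nothing sampled, nothing to extrapolate */

    return (sampled_tuples / visited_pages) * total_pages;
}
```

For a table of 1,000,000 pages where ANALYZE visits 30,000 pages and finds 300,000 live tuples, with a stale previous density of 100 tuples per page, the blended estimate is 97,300,000 while the extrapolated estimate is 10,000,000. A single run of the extrapolating version discards the stale density entirely, which is why one later ANALYZE is enough to repair an unlucky sample.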

In HEAD, let's remove vac_estimate_reltuples' is_analyze argument
altogether; it was never used for anything and now it's totally pointless.
But keep it in the back branches, in case any third-party code is calling
this function.

Per bug #15005.  Back-patch to all supported branches.

David Gould, reviewed by Alexander Kuzmenkov, cosmetic changes by me

Discussion: https://postgr.es/m/20180117164916.3fdcf2e9@engels
2018-03-13 13:24:27 -04:00
Name | Last commit | Date
.. | |
expected | hash: Increase the number of possible overflow bitmaps by 8x. | 2017-08-04 16:30:32 -04:00
sql | pgstattuple: Fix typo partitiond -> partitioned | 2017-03-09 20:06:11 -05:00
.gitignore | Add a regression test for pgstattuple. | 2011-08-25 00:06:16 -04:00
Makefile | Remove superuser checks in pgstattuple | 2016-09-29 22:13:38 -04:00
pgstatapprox.c | When updating reltuples after ANALYZE, just extrapolate from our sample. | 2018-03-13 13:24:27 -04:00
pgstatindex.c | Remove unnecessary parentheses in return statements | 2017-09-05 14:52:55 -04:00
pgstattuple--1.0--1.1.sql | Add pgstatginindex() function to get the size of the GIN pending list. | 2012-12-05 09:58:03 +02:00
pgstattuple--1.1--1.2.sql | Fix pgstattuple functions to use regclass-type as the argument. | 2013-07-19 03:50:20 +09:00
pgstattuple--1.2--1.3.sql | Add pgstattuple_approx() to the pgstattuple extension. | 2015-05-13 07:35:06 +02:00
pgstattuple--1.3--1.4.sql | Update pgstattuple extension for parallel query. | 2016-06-10 10:42:03 -04:00
pgstattuple--1.4--1.5.sql | Fix pgstattuple's handling of unused hash pages. | 2017-04-12 11:53:00 -04:00
pgstattuple--1.4.sql | Minor fixes in contrib installation scripts. | 2016-06-14 10:47:06 -04:00
pgstattuple--unpackaged--1.0.sql | Fix typos in some error messages thrown by extension scripts when fed to psql. | 2014-08-25 18:30:37 +02:00
pgstattuple.c | Minor code-cleanliness improvements for btree. | 2017-09-18 16:36:28 -04:00
pgstattuple.control | Remove superuser checks in pgstattuple | 2016-09-29 22:13:38 -04:00