Fix nbtree cleanup-only VACUUM stats inaccuracies.

Logic for counting heap TIDs from posting list tuples (added by commit
0d861bbb) was faulty.  It didn't count any TIDs/index tuples in the
event of no callback being set.  This meant that we incorrectly counted
no index tuples in cleanup-only VACUUMs, which could lead to
pg_class.reltuples being spuriously set to 0 in affected indexes.

To fix, go back to counting items from the page in cases where there is
no callback.  This approach isn't very accurate, but it works well
enough in practice while avoiding the expense of accessing every index
tuple during cleanup-only VACUUMs.
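
To make the behavior concrete, here is a minimal standalone sketch (not the actual nbtree code; names such as count_page_tuples and PageCountState are hypothetical, while the real logic lives in btvacuumpage() in src/backend/access/nbtree/nbtree.c) of the per-page counting rule this commit reinstates: with a callback (a regular bulk-delete VACUUM) the accurate live heap TID count is used, while a cleanup-only call falls back to the cheap item count for the page.

/*
 * Sketch only -- not the actual nbtree code.  Names such as
 * count_page_tuples and PageCountState are hypothetical.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct PageCountState
{
	double		num_index_tuples;	/* running estimate, like stats->num_index_tuples */
} PageCountState;

/*
 * Accumulate the tuple count for one leaf page.
 *
 * has_callback  - true for a regular VACUUM (bulkdelete), false for a
 *                 cleanup-only btvacuumscan call
 * nhtidslive    - live heap TIDs found by examining every tuple, including
 *                 TIDs inside posting list tuples (meaningful only when
 *                 has_callback is true)
 * minoff/maxoff - offsets of the first and last items on the page
 */
static void
count_page_tuples(PageCountState *state, bool has_callback,
				  double nhtidslive, int minoff, int maxoff)
{
	if (minoff > maxoff)
		return;					/* no items on the page */

	if (has_callback)
		state->num_index_tuples += nhtidslive;			/* accurate TID count */
	else
		state->num_index_tuples += maxoff - minoff + 1;	/* cheap item count */
}

int
main(void)
{
	PageCountState state = {0};

	/* cleanup-only scan: 100 items on the page, heap TIDs never examined */
	count_page_tuples(&state, false, 0, 1, 100);
	printf("estimate after one cleanup-only page: %.0f\n", state.num_index_tuples);
	return 0;
}

Note that the item-count path undercounts whenever posting list tuples are present, since one item can stand in for many heap TIDs; that is the known inaccuracy documented by the new comment in btvacuumcleanup() below.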

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://postgr.es/m/20201023174451.69e358f1@firost
Backpatch: 13-, where nbtree deduplication was introduced
Peter Geoghegan 2020-11-04 18:42:27 -08:00
parent c732c3f8c1
commit 48e1291342
1 changed file with 14 additions and 1 deletion

@@ -933,6 +933,12 @@ btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	 * double-counting some index tuples, so disbelieve any total that exceeds
 	 * the underlying heap's count ... if we know that accurately. Otherwise
 	 * this might just make matters worse.
+	 *
+	 * Posting list tuples are another source of inaccuracy. Cleanup-only
+	 * btvacuumscan calls assume that the number of index tuples can be used
+	 * as num_index_tuples, even though num_index_tuples is supposed to
+	 * represent the number of TIDs in the index. This naive approach can
+	 * underestimate the number of tuples in the index.
 	 */
 	if (!info->estimated_count)
 	{
@@ -1394,11 +1400,18 @@ backtrack:
 		 * separate live tuples). We don't delete when backtracking, though,
 		 * since that would require teaching _bt_pagedel() about backtracking
 		 * (doesn't seem worth adding more complexity to deal with that).
+		 *
+		 * We don't count the number of live TIDs during cleanup-only calls to
+		 * btvacuumscan (i.e. when callback is not set). We count the number
+		 * of index tuples directly instead. This avoids the expense of
+		 * directly examining all of the tuples on each page.
 		 */
 		if (minoff > maxoff)
 			attempt_pagedel = (blkno == scanblkno);
-		else
+		else if (callback)
 			stats->num_index_tuples += nhtidslive;
+		else
+			stats->num_index_tuples += maxoff - minoff + 1;
 
 		Assert(!attempt_pagedel || nhtidslive == 0);
 	}