From 48e1291342dd7771cf8c67aa1d7ec1f394b95dd8 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan Date: Wed, 4 Nov 2020 18:42:27 -0800 Subject: [PATCH] Fix nbtree cleanup-only VACUUM stats inaccuracies. Logic for counting heap TIDs from posting list tuples (added by commit 0d861bbb) was faulty. It didn't count any TIDs/index tuples in the event of no callback being set. This meant that we incorrectly counted no index tuples in clean-up only VACUUMs, which could lead to pg_class.reltuples being spuriously set to 0 in affected indexes. To fix, go back to counting items from the page in cases where there is no callback. This approach isn't very accurate, but it works well enough in practice while avoiding the expense of accessing every index tuple during cleanup-only VACUUMs. Author: Peter Geoghegan Reported-By: Jehan-Guillaume de Rorthais https://postgr.es/m/20201023174451.69e358f1@firost Backpatch: 13-, where nbtree deduplication was introduced --- src/backend/access/nbtree/nbtree.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c index c822b49a71..c4f22f1c69 100644 --- a/src/backend/access/nbtree/nbtree.c +++ b/src/backend/access/nbtree/nbtree.c @@ -933,6 +933,12 @@ btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats) * double-counting some index tuples, so disbelieve any total that exceeds * the underlying heap's count ... if we know that accurately. Otherwise * this might just make matters worse. + * + * Posting list tuples are another source of inaccuracy. Cleanup-only + * btvacuumscan calls assume that the number of index tuples can be used + * as num_index_tuples, even though num_index_tuples is supposed to + * represent the number of TIDs in the index. This naive approach can + * underestimate the number of tuples in the index. */ if (!info->estimated_count) { @@ -1394,11 +1400,18 @@ backtrack: * separate live tuples). We don't delete when backtracking, though, * since that would require teaching _bt_pagedel() about backtracking * (doesn't seem worth adding more complexity to deal with that). + * + * We don't count the number of live TIDs during cleanup-only calls to + * btvacuumscan (i.e. when callback is not set). We count the number + * of index tuples directly instead. This avoids the expense of + * directly examining all of the tuples on each page. */ if (minoff > maxoff) attempt_pagedel = (blkno == scanblkno); - else + else if (callback) stats->num_index_tuples += nhtidslive; + else + stats->num_index_tuples += maxoff - minoff + 1; Assert(!attempt_pagedel || nhtidslive == 0); }