Document LP_DEAD accounting issues in VACUUM.

Document VACUUM's soft assumption that any LP_DEAD items encountered
during pruning will become LP_UNUSED items before VACUUM finishes up.
This is integral to the accounting used by VACUUM to generate its final
report on the table to the stats collector.  It also affects how VACUUM
determines which heap pages are truncatable.  In both cases VACUUM is
concerned with the likely contents of the page in the near future, not
the current contents of the page.

This state of affairs created the false impression that VACUUM's dead
tuple accounting differed significantly from the similar accounting
used during ANALYZE.  There were and are no substantive differences, at
least when the soft assumption completely works out.  This is far
clearer now.

Also document cases where things don't quite work out for VACUUM's dead
tuple accounting.  It's possible that a significant number of LP_DEAD
items will be left behind by VACUUM, and won't be recorded as remaining
dead tuples in VACUUM's statistics collector report.  This behavior
dates back to commit a96c41fe, which taught VACUUM to run without index
and heap vacuuming at the user's request.  The failsafe mechanism added
to VACUUM more recently by commit 1e55e7d1 takes the same approach to
dead tuple accounting.

Reported-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=Jmtu18PrsYq3EvvZJGOmZqSO2u3bvKpx9xJa5uhNp=Q@mail.gmail.com
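
The accounting this commit documents can be modeled in a few lines of
standalone C.  This is an illustrative sketch only, not PostgreSQL code:
every type and function below is invented, and only the counter names
(live_tuples, new_dead_tuples, lpdead_items) deliberately echo
vacuumlazy.c.

/*
 * Minimal standalone model of VACUUM's final dead tuple accounting.
 * All names here are invented for illustration.
 */
#include <stdio.h>
#include <stdbool.h>

typedef struct VacCountsModel
{
    long    live_tuples;        /* LP_NORMAL items that remain live */
    long    new_dead_tuples;    /* recently dead tuples VACUUM can't remove */
    long    lpdead_items;       /* LP_DEAD stub items seen during pruning */
} VacCountsModel;

/*
 * Model of the final stats collector report.  LP_DEAD items are omitted
 * on the soft assumption that heap vacuuming turns every one of them
 * into an LP_UNUSED item before this point is reached.
 */
static void
report_vacuum_model(const VacCountsModel *counts, bool bypassed_heap_vacuuming)
{
    /*
     * When index and heap vacuuming were bypassed (INDEX_CLEANUP off, or
     * the failsafe), the LP_DEAD items stay behind in the table but are
     * still not counted here -- the deliberate discrepancy documented by
     * this commit.
     */
    if (bypassed_heap_vacuuming)
        printf("%ld LP_DEAD items remain but go unreported\n",
               counts->lpdead_items);

    printf("reporting live=%ld dead=%ld\n",
           counts->live_tuples, counts->new_dead_tuples);
}

int
main(void)
{
    VacCountsModel counts = {.live_tuples = 1000, .new_dead_tuples = 10,
                             .lpdead_items = 200};

    report_vacuum_model(&counts, true);
    return 0;
}

A later ANALYZE will count those same leftover LP_DEAD items as dead
rows in its own report, which is where the (negligible) discrepancy
between the two reports comes from.
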
Author: Peter Geoghegan
Date:   2021-04-19 18:55:31 -07:00
parent 640b91c3ed
commit 7136bf34f2
1 changed file with 42 additions and 13 deletions

src/backend/access/heap/vacuumlazy.c

@@ -686,7 +686,16 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
                          new_min_multi,
                          false);
 
-    /* report results to the stats collector, too */
+    /*
+     * Report results to the stats collector, too.
+     *
+     * Deliberately avoid telling the stats collector about LP_DEAD items that
+     * remain in the table due to VACUUM bypassing index and heap vacuuming.
+     * ANALYZE will consider the remaining LP_DEAD items to be dead tuples.
+     * It seems like a good idea to err on the side of not vacuuming again too
+     * soon in cases where the failsafe prevented significant amounts of heap
+     * vacuuming.
+     */
     pgstat_report_vacuum(RelationGetRelid(rel),
                          rel->rd_rel->relisshared,
                          Max(new_live_tuples, 0),
@@ -1334,6 +1343,9 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
      */
     lazy_scan_prune(vacrel, buf, blkno, page, vistest, &prunestate);
 
+    Assert(!prunestate.all_visible || !prunestate.has_lpdead_items);
+    Assert(!all_visible_according_to_vm || prunestate.all_visible);
+
     /* Remember the location of the last page with nonremovable tuples */
     if (prunestate.hastup)
         vacrel->nonempty_pages = blkno + 1;
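
The two assertions added after the lazy_scan_prune() call encode
page-level invariants: a page that pruning judged all-visible cannot
still carry LP_DEAD items, and a page the visibility map already called
all-visible must still be all-visible after pruning.  A minimal
standalone sketch of the same invariants, with an invented prunestate
struct standing in for PostgreSQL's:

/* Standalone sketch of the invariants asserted above; types invented. */
#include <assert.h>
#include <stdbool.h>

typedef struct PruneStateModel
{
    bool    all_visible;        /* page all-visible after pruning? */
    bool    has_lpdead_items;   /* LP_DEAD stub items left on page? */
} PruneStateModel;

static void
check_prune_invariants(bool all_visible_according_to_vm,
                       const PruneStateModel *prunestate)
{
    /* An all-visible page can have no LP_DEAD items ... */
    assert(!prunestate->all_visible || !prunestate->has_lpdead_items);
    /* ... and pruning must agree with what the VM already claimed */
    assert(!all_visible_according_to_vm || prunestate->all_visible);
}

int
main(void)
{
    PruneStateModel ps = {.all_visible = true, .has_lpdead_items = false};

    check_prune_invariants(true, &ps);
    return 0;
}
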
@@ -1404,7 +1416,6 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
      * Handle setting visibility map bit based on what the VM said about
      * the page before pruning started, and using prunestate
      */
-    Assert(!prunestate.all_visible || !prunestate.has_lpdead_items);
     if (!all_visible_according_to_vm && prunestate.all_visible)
     {
         uint8        flags = VISIBILITYMAP_ALL_VISIBLE;
@@ -1786,6 +1797,14 @@ retry:
      * The logic here is a bit simpler than acquire_sample_rows(), as
      * VACUUM can't run inside a transaction block, which makes some cases
      * impossible (e.g. in-progress insert from the same transaction).
+     *
+     * We treat LP_DEAD items a little differently, too -- we don't count
+     * them as dead_tuples at all (we only consider new_dead_tuples).  The
+     * outcome is no different because we assume that any LP_DEAD items we
+     * encounter here will become LP_UNUSED inside lazy_vacuum_heap_page()
+     * before we report anything to the stats collector.  (Cases where we
+     * bypass index vacuuming will violate our assumption, but the overall
+     * impact of that should be negligible.)
      */
     switch (res)
     {
@@ -1901,9 +1920,6 @@ retry:
      * that will need to be vacuumed in indexes later, or a LP_NORMAL tuple
      * that remains and needs to be considered for freezing now (LP_UNUSED and
      * LP_REDIRECT items also remain, but are of no further interest to us).
-     *
-     * Add page level counters to caller's counts, and then actually process
-     * LP_DEAD and LP_NORMAL items.
      */
     vacrel->offnum = InvalidOffsetNumber;
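
The comment blocks above describe how lazy_scan_prune() classifies each
remaining line pointer and which page-local counters it bumps.  The
standalone sketch below models that classification; the enum and struct
are invented stand-ins, not PostgreSQL's LP_* flags or LVRelState:

/*
 * Standalone model of lazy_scan_prune's per-page counting.  The key
 * point: LP_DEAD items go into lpdead_items (to be handed to index
 * vacuuming), not into any dead tuple count.
 */
#include <stdio.h>

typedef enum ItemStateModel
{
    MODEL_LP_UNUSED,                /* free slot: nothing to do */
    MODEL_LP_REDIRECT,              /* HOT chain redirect: nothing to do */
    MODEL_LP_DEAD,                  /* dead stub: needs index vacuuming */
    MODEL_LP_NORMAL_LIVE,           /* live tuple: count, maybe freeze */
    MODEL_LP_NORMAL_RECENTLY_DEAD   /* dead, but not yet removable */
} ItemStateModel;

typedef struct PageCountsModel
{
    long    lpdead_items;
    long    live_tuples;
    long    new_dead_tuples;
    long    num_tuples;
} PageCountsModel;

static void
count_item(PageCountsModel *counts, ItemStateModel state)
{
    switch (state)
    {
        case MODEL_LP_DEAD:
            /* Not counted as dead: assumed to become LP_UNUSED later */
            counts->lpdead_items++;
            break;
        case MODEL_LP_NORMAL_LIVE:
            counts->live_tuples++;
            counts->num_tuples++;
            break;
        case MODEL_LP_NORMAL_RECENTLY_DEAD:
            counts->new_dead_tuples++;
            counts->num_tuples++;
            break;
        default:
            /* LP_UNUSED and LP_REDIRECT are of no further interest */
            break;
    }
}

int
main(void)
{
    PageCountsModel counts = {0};

    count_item(&counts, MODEL_LP_DEAD);
    count_item(&counts, MODEL_LP_NORMAL_LIVE);
    printf("lpdead=%ld live=%ld\n", counts.lpdead_items, counts.live_tuples);
    return 0;
}
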
@@ -1988,13 +2004,6 @@ retry:
     }
 #endif
 
-    /* Add page-local counts to whole-VACUUM counts */
-    vacrel->tuples_deleted += tuples_deleted;
-    vacrel->lpdead_items += lpdead_items;
-    vacrel->new_dead_tuples += new_dead_tuples;
-    vacrel->num_tuples += num_tuples;
-    vacrel->live_tuples += live_tuples;
-
     /*
      * Now save details of the LP_DEAD items from the page in the dead_tuples
      * array.  Also record that page has dead items in per-page prunestate.
@@ -2021,6 +2030,13 @@ retry:
         pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
                                      dead_tuples->num_tuples);
     }
+
+    /* Finally, add page-local counts to whole-VACUUM counts */
+    vacrel->tuples_deleted += tuples_deleted;
+    vacrel->lpdead_items += lpdead_items;
+    vacrel->new_dead_tuples += new_dead_tuples;
+    vacrel->num_tuples += num_tuples;
+    vacrel->live_tuples += live_tuples;
 }
 
 /*
@@ -2095,6 +2111,14 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
      * not exceed 32MB.  This limits the risk that we will bypass index
      * vacuuming again and again until eventually there is a VACUUM whose
      * dead_tuples space is not CPU cache resident.
+     *
+     * We don't take any special steps to remember the LP_DEAD items (such
+     * as counting them in the new_dead_tuples report to the stats collector)
+     * when the optimization is applied.  Though the accounting used in
+     * analyze.c's acquire_sample_rows() will recognize the same LP_DEAD
+     * items as dead rows in its own stats collector report, that's okay.
+     * The discrepancy should be negligible.  If this optimization is ever
+     * expanded to cover more cases then this may need to be reconsidered.
      */
     threshold = (double) vacrel->rel_pages * BYPASS_THRESHOLD_PAGES;
     do_bypass_optimization =
@@ -2146,7 +2170,8 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
     }
 
     /*
-     * Forget the now-vacuumed tuples -- just press on
+     * Forget the LP_DEAD items that we just vacuumed (or just decided to not
+     * vacuum)
      */
     vacrel->dead_tuples->num_tuples = 0;
 }
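
The bypass heuristic documented in the lazy_vacuum() comment above
reduces to two cheap tests.  The sketch below models it; the 0.02 value
for BYPASS_THRESHOLD_PAGES and the 6-byte TID size are assumptions, and
the helper function is invented, not PostgreSQL's:

/* Standalone model of the index-vacuuming bypass test. */
#include <stdbool.h>
#include <stdio.h>

#define BYPASS_THRESHOLD_PAGES  0.02    /* assumed: 2% of rel_pages */
#define TID_BYTES               6       /* assumed size of one heap TID */
#define BYPASS_CAP_BYTES        (32L * 1024L * 1024L)   /* 32MB cap */

static bool
should_bypass_index_vacuuming(long rel_pages, long lpdead_item_pages,
                              long lpdead_items)
{
    double  threshold = (double) rel_pages * BYPASS_THRESHOLD_PAGES;

    /*
     * Bypass only when few pages have LP_DEAD items, and when the TIDs
     * that get carried over would fit in a cache-resident array.
     */
    return lpdead_item_pages < threshold &&
        lpdead_items * TID_BYTES < BYPASS_CAP_BYTES;
}

int
main(void)
{
    /* A 100,000 page table with LP_DEAD items on only 50 pages */
    printf("bypass: %d\n",
           should_bypass_index_vacuuming(100000, 50, 2000));
    return 0;
}

When the bypass is taken, the dead_tuples array is reset all the same
(num_tuples = 0 in the hunk above), which is exactly how the remaining
LP_DEAD items escape VACUUM's final stats collector report.
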
@@ -3101,6 +3126,10 @@ lazy_cleanup_one_index(Relation indrel, IndexBulkDeleteResult *istat,
  *
  * Also don't attempt it if wraparound failsafe is in effect.  It's hard to
  * predict how long lazy_truncate_heap will take.  Don't take any chances.
+ * There is very little chance of truncation working out when the failsafe is
+ * in effect in any case.  lazy_scan_prune makes the optimistic assumption
+ * that any LP_DEAD items it encounters will always be LP_UNUSED by the time
+ * we're called.
  *
  * Also don't attempt it if we are doing early pruning/vacuuming, because a
  * scan which cannot find a truncated heap page cannot determine that the