Update nbtree LP_DEAD item deletion comments.

Comments about the consequences of clearing the BTP_HAS_GARBAGE page
flag bit that apply only to VACUUM were added to code that deals with
opportunistic deletion of LP_DEAD items by commit a760893d.  The same
comment block was added to both _bt_delitems_vacuum() and
_bt_delitems_delete().  Correct _bt_delitems_delete()'s copy of the
comment block.

_bt_delitems_delete() reliably deletes items that were found by caller
to have their LP_DEAD bit set.  There is no question about whether or
not unsetting the BTP_HAS_GARBAGE bit can miss some LP_DEAD items that
were set recently.

Also tweak a related section of the nbtree README.
Peter Geoghegan 2019-12-22 19:57:35 -08:00
parent b265aa1f39
commit fe97c61c87
2 changed files with 6 additions and 13 deletions

@@ -559,15 +559,15 @@ writer cannot observe the incomplete split flag before the first writer
 finishes the split. If we let concurrent writers on the primary observe
 an incomplete split flag on the same page, each writer would attempt to
 complete the unfinished split, corrupting the parent page. (Similarly,
-replay of page deletion records does not hold a write lock on the leaf
-page throughout; only the primary needs to blocks out concurrent writers
-that insert on to the page being deleted.)
+replay of page deletion records does not hold a write lock on the target
+leaf page throughout; only the primary needs to block out concurrent
+writers that insert on to the page being deleted.)
 During recovery all index scans start with ignore_killed_tuples = false
 and we never set kill_prior_tuple. We do this because the oldest xmin
 on the standby server can be older than the oldest xmin on the master
 server, which means tuples can be marked LP_DEAD even when they are
-still visible on the standby. We don't WAL log tuple LP_DEAD bits, but
+still visible on the standby. We don't WAL log tuple LP_DEAD bits, but
 they can still appear in the standby because of full page writes. So
 we must always ignore them in standby, and that means it's not worth
 setting them either. (When LP_DEAD-marked tuples are eventually deleted

@@ -1074,15 +1074,8 @@ _bt_delitems_delete(Relation rel, Buffer buf,
 	/*
 	 * Unlike _bt_delitems_vacuum, we *must not* clear the vacuum cycle ID,
-	 * because this is not called by VACUUM.
-	 */
-	/*
-	 * Mark the page as not containing any LP_DEAD items. This is not
-	 * certainly true (there might be some that have recently been marked, but
-	 * weren't included in our target-item list), but it will almost always be
-	 * true and it doesn't seem worth an additional page scan to check it.
-	 * Remember that BTP_HAS_GARBAGE is only a hint anyway.
+	 * because this is not called by VACUUM. Just clear the BTP_HAS_GARBAGE
+	 * page flag, since we deleted all items with their LP_DEAD bit set.
 	 */
 	opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 	opaque->btpo_flags &= ~BTP_HAS_GARBAGE;
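
The reasoning in the commit message (the caller finds every item with its LP_DEAD bit set, so clearing the garbage flag afterwards cannot miss any) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not nbtree code: ToyPage, ToyItem, TOY_HAS_GARBAGE, toy_collect_dead() and toy_delitems_delete() are hypothetical names invented here, and the real functions work on buffers, line pointers and WAL records that are not modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ITEMS 8
/* Hypothetical stand-in for the BTP_HAS_GARBAGE page flag bit. */
#define TOY_HAS_GARBAGE (1u << 6)

/* Hypothetical index item: a key plus a "dead" hint bit. */
typedef struct ToyItem
{
    int  key;
    bool lp_dead;
} ToyItem;

/* Hypothetical page: flag bits plus a small item array. */
typedef struct ToyPage
{
    uint16_t flags;
    size_t   nitems;
    ToyItem  items[TOY_MAX_ITEMS];
} ToyPage;

/*
 * Caller's side: scan the whole page and record the offset of every item
 * whose dead bit is set.  Offsets come out in ascending order.
 */
static size_t
toy_collect_dead(const ToyPage *page, size_t *deletable)
{
    size_t n = 0;

    for (size_t off = 0; off < page->nitems; off++)
    {
        if (page->items[off].lp_dead)
            deletable[n++] = off;
    }
    return n;
}

/*
 * Delete the listed items (ascending offsets assumed), then clear the
 * garbage flag.  Because the list covers every dead item on the page,
 * no dead item can remain once the flag is cleared.
 */
static void
toy_delitems_delete(ToyPage *page, const size_t *deletable, size_t ndeletable)
{
    size_t kept = 0;
    size_t next = 0;

    for (size_t off = 0; off < page->nitems; off++)
    {
        if (next < ndeletable && deletable[next] == off)
        {
            next++;             /* skip: this item is being deleted */
            continue;
        }
        page->items[kept++] = page->items[off];
    }
    page->nitems = kept;
    page->flags &= (uint16_t) ~TOY_HAS_GARBAGE;
}
```

In this model, calling toy_collect_dead() and passing its full result to toy_delitems_delete() leaves a page with no dead items and the flag cleared, while any other flag bits are preserved.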