Fix handling of multixacts predating pg_upgrade

After pg_upgrade, it is possible that some tuples' Xmax have multixacts
corresponding to the old installation; such multixacts cannot have
running members anymore.  In many code sites we already know not to read
them and clobber them silently, but at least when VACUUM tries to freeze
a multixact or determine whether one needs freezing, there's an attempt
to resolve it to its member transactions by calling GetMultiXactIdMembers,
and if the multixact value is "in the future" with regards to the
current valid multixact range, an error like this is raised:
    ERROR:  MultiXactId 123 has not been created yet -- apparent wraparound
and vacuuming fails.  Per discussion with Andrew Gierth, it is completely
bogus to try to resolve multixacts coming from before a pg_upgrade,
regardless of where they stand with regards to the current valid
multixact range.

It's possible to get from under this problem by doing SELECT FOR UPDATE
of the problem tuples, but if tables are large, this is slow and
tedious, so a more thorough solution is desirable.

To fix, we realize that multixacts in xmax created in 9.2 and previous
have a specific bit pattern that is never used in 9.3 and later (we
already knew this, per comments and infomask tests sprinkled in various
places, but we weren't leveraging this knowledge appropriately).
Whenever the infomask of the tuple matches that bit pattern, we just
ignore the multixact completely as if Xmax wasn't set; or, in the case
of tuple freezing, we act as if an unwanted value is set and clobber it
without decoding.  This guarantees that no errors will be raised, and
that the values will be progressively removed until all tables are
clean.  Most callers of GetMultiXactIdMembers are patched to recognize
directly that the value is a removable "empty" multixact and avoid
calling GetMultiXactIdMembers altogether.

To avoid changing the signature of GetMultiXactIdMembers() in back
branches, we keep the "allow_old" boolean flag but rename it to
"from_pgupgrade"; if the flag is true, we always return an empty set
instead of looking up the multixact.  (I suppose we could remove the
argument in the master branch, but I chose not to do so in this commit).

This was broken all along, but the error-facing message appeared first
because of commit 8e9a16ab8f and was partially fixed in a25c2b7c4d.
This fix, backpatched all the way back to 9.3, goes approximately in the
same direction as a25c2b7c4d but should cover all cases.

Bug analysis by Andrew Gierth and Álvaro Herrera.

A number of public reports match this bug:
  https://www.postgresql.org/message-id/20140330040029.GY4582@tamriel.snowman.net
  https://www.postgresql.org/message-id/538F3D70.6080902@publicrelay.com
  https://www.postgresql.org/message-id/556439CF.7070109@pscs.co.uk
  https://www.postgresql.org/message-id/SG2PR06MB0760098A111C88E31BD4D96FB3540@SG2PR06MB0760.apcprd06.prod.outlook.com
  https://www.postgresql.org/message-id/20160615203829.5798.4594@wrigleys.postgresql.org
This commit is contained in:
Alvaro Herrera 2016-06-24 18:29:28 -04:00
parent 07f69137b1
commit d372cb173e
5 changed files with 86 additions and 68 deletions

View File

@ -158,8 +158,7 @@ pgrowlocks(PG_FUNCTION_ARGS)
values[Atnum_ismulti] = pstrdup("true"); values[Atnum_ismulti] = pstrdup("true");
allow_old = !(infomask & HEAP_LOCK_MASK) && allow_old = HEAP_LOCKED_UPGRADED(infomask);
(infomask & HEAP_XMAX_LOCK_ONLY);
nmembers = GetMultiXactIdMembers(xmax, &members, allow_old, nmembers = GetMultiXactIdMembers(xmax, &members, allow_old,
false); false);
if (nmembers == -1) if (nmembers == -1)

View File

@ -3621,6 +3621,7 @@ l2:
* HEAP_XMAX_INVALID bit set; that's fine.) * HEAP_XMAX_INVALID bit set; that's fine.)
*/ */
if ((oldtup.t_data->t_infomask & HEAP_XMAX_INVALID) || if ((oldtup.t_data->t_infomask & HEAP_XMAX_INVALID) ||
HEAP_LOCKED_UPGRADED(oldtup.t_data->t_infomask) ||
(checked_lockers && !locker_remains)) (checked_lockers && !locker_remains))
xmax_new_tuple = InvalidTransactionId; xmax_new_tuple = InvalidTransactionId;
else else
@ -5002,8 +5003,7 @@ l5:
* pg_upgrade; both MultiXactIdIsRunning and MultiXactIdExpand assume * pg_upgrade; both MultiXactIdIsRunning and MultiXactIdExpand assume
* that such multis are never passed. * that such multis are never passed.
*/ */
if (!(old_infomask & HEAP_LOCK_MASK) && if (HEAP_LOCKED_UPGRADED(old_infomask))
HEAP_XMAX_IS_LOCKED_ONLY(old_infomask))
{ {
old_infomask &= ~HEAP_XMAX_IS_MULTI; old_infomask &= ~HEAP_XMAX_IS_MULTI;
old_infomask |= HEAP_XMAX_INVALID; old_infomask |= HEAP_XMAX_INVALID;
@ -5363,6 +5363,17 @@ l4:
int i; int i;
MultiXactMember *members; MultiXactMember *members;
/*
* We don't need a test for pg_upgrade'd tuples: this is only
* applied to tuples after the first in an update chain. Said
* first tuple in the chain may well be locked-in-9.2-and-
* pg_upgraded, but that one was already locked by our caller,
* not us; and any subsequent ones cannot be because our
* caller must necessarily have obtained a snapshot later than
* the pg_upgrade itself.
*/
Assert(!HEAP_LOCKED_UPGRADED(mytup.t_data->t_infomask));
nmembers = GetMultiXactIdMembers(rawxmax, &members, false, nmembers = GetMultiXactIdMembers(rawxmax, &members, false,
HEAP_XMAX_IS_LOCKED_ONLY(old_infomask)); HEAP_XMAX_IS_LOCKED_ONLY(old_infomask));
for (i = 0; i < nmembers; i++) for (i = 0; i < nmembers; i++)
@ -5921,14 +5932,14 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
bool has_lockers; bool has_lockers;
TransactionId update_xid; TransactionId update_xid;
bool update_committed; bool update_committed;
bool allow_old;
*flags = 0; *flags = 0;
/* We should only be called in Multis */ /* We should only be called in Multis */
Assert(t_infomask & HEAP_XMAX_IS_MULTI); Assert(t_infomask & HEAP_XMAX_IS_MULTI);
if (!MultiXactIdIsValid(multi)) if (!MultiXactIdIsValid(multi) ||
HEAP_LOCKED_UPGRADED(t_infomask))
{ {
/* Ensure infomask bits are appropriately set/reset */ /* Ensure infomask bits are appropriately set/reset */
*flags |= FRM_INVALIDATE_XMAX; *flags |= FRM_INVALIDATE_XMAX;
@ -5941,14 +5952,8 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
* was a locker only, it can be removed without any further * was a locker only, it can be removed without any further
* consideration; but if it contained an update, we might need to * consideration; but if it contained an update, we might need to
* preserve it. * preserve it.
*
* Don't assert MultiXactIdIsRunning if the multi came from a
* pg_upgrade'd share-locked tuple, though, as doing that causes an
* error to be raised unnecessarily.
*/ */
Assert((!(t_infomask & HEAP_LOCK_MASK) && Assert(!MultiXactIdIsRunning(multi,
HEAP_XMAX_IS_LOCKED_ONLY(t_infomask)) ||
!MultiXactIdIsRunning(multi,
HEAP_XMAX_IS_LOCKED_ONLY(t_infomask))); HEAP_XMAX_IS_LOCKED_ONLY(t_infomask)));
if (HEAP_XMAX_IS_LOCKED_ONLY(t_infomask)) if (HEAP_XMAX_IS_LOCKED_ONLY(t_infomask))
{ {
@ -5990,10 +5995,8 @@ FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
* anything. * anything.
*/ */
allow_old = !(t_infomask & HEAP_LOCK_MASK) &&
HEAP_XMAX_IS_LOCKED_ONLY(t_infomask);
nmembers = nmembers =
GetMultiXactIdMembers(multi, &members, allow_old, GetMultiXactIdMembers(multi, &members, false,
HEAP_XMAX_IS_LOCKED_ONLY(t_infomask)); HEAP_XMAX_IS_LOCKED_ONLY(t_infomask));
if (nmembers <= 0) if (nmembers <= 0)
{ {
@ -6535,14 +6538,15 @@ static bool
DoesMultiXactIdConflict(MultiXactId multi, uint16 infomask, DoesMultiXactIdConflict(MultiXactId multi, uint16 infomask,
LockTupleMode lockmode) LockTupleMode lockmode)
{ {
bool allow_old;
int nmembers; int nmembers;
MultiXactMember *members; MultiXactMember *members;
bool result = false; bool result = false;
LOCKMODE wanted = tupleLockExtraInfo[lockmode].hwlock; LOCKMODE wanted = tupleLockExtraInfo[lockmode].hwlock;
allow_old = !(infomask & HEAP_LOCK_MASK) && HEAP_XMAX_IS_LOCKED_ONLY(infomask); if (HEAP_LOCKED_UPGRADED(infomask))
nmembers = GetMultiXactIdMembers(multi, &members, allow_old, return false;
nmembers = GetMultiXactIdMembers(multi, &members, false,
HEAP_XMAX_IS_LOCKED_ONLY(infomask)); HEAP_XMAX_IS_LOCKED_ONLY(infomask));
if (nmembers >= 0) if (nmembers >= 0)
{ {
@ -6625,15 +6629,15 @@ Do_MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
Relation rel, ItemPointer ctid, XLTW_Oper oper, Relation rel, ItemPointer ctid, XLTW_Oper oper,
int *remaining) int *remaining)
{ {
bool allow_old;
bool result = true; bool result = true;
MultiXactMember *members; MultiXactMember *members;
int nmembers; int nmembers;
int remain = 0; int remain = 0;
allow_old = !(infomask & HEAP_LOCK_MASK) && HEAP_XMAX_IS_LOCKED_ONLY(infomask); /* for pre-pg_upgrade tuples, no need to sleep at all */
nmembers = GetMultiXactIdMembers(multi, &members, allow_old, nmembers = HEAP_LOCKED_UPGRADED(infomask) ? -1 :
HEAP_XMAX_IS_LOCKED_ONLY(infomask)); GetMultiXactIdMembers(multi, &members, false,
HEAP_XMAX_IS_LOCKED_ONLY(infomask));
if (nmembers >= 0) if (nmembers >= 0)
{ {
@ -6765,6 +6769,8 @@ heap_tuple_needs_freeze(HeapTupleHeader tuple, TransactionId cutoff_xid,
/* no xmax set, ignore */ /* no xmax set, ignore */
; ;
} }
else if (HEAP_LOCKED_UPGRADED(tuple->t_infomask))
return true;
else if (MultiXactIdPrecedes(multi, cutoff_multi)) else if (MultiXactIdPrecedes(multi, cutoff_multi))
return true; return true;
else else
@ -6772,13 +6778,10 @@ heap_tuple_needs_freeze(HeapTupleHeader tuple, TransactionId cutoff_xid,
MultiXactMember *members; MultiXactMember *members;
int nmembers; int nmembers;
int i; int i;
bool allow_old;
/* need to check whether any member of the mxact is too old */ /* need to check whether any member of the mxact is too old */
allow_old = !(tuple->t_infomask & HEAP_LOCK_MASK) && nmembers = GetMultiXactIdMembers(multi, &members, false,
HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask);
nmembers = GetMultiXactIdMembers(multi, &members, allow_old,
HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)); HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask));
for (i = 0; i < nmembers; i++) for (i = 0; i < nmembers; i++)

View File

@ -1181,19 +1181,25 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
/* /*
* GetMultiXactIdMembers * GetMultiXactIdMembers
* Returns the set of MultiXactMembers that make up a MultiXactId * Return the set of MultiXactMembers that make up a MultiXactId
* *
* If the given MultiXactId is older than the value we know to be oldest, we * Return value is the number of members found, or -1 if there are none,
* return -1. The caller is expected to allow that only in permissible cases, * and *members is set to a newly palloc'ed array of members. It's the
* i.e. when the infomask lets it presuppose that the tuple had been * caller's responsibility to free it when done with it.
* share-locked before a pg_upgrade; this means that the HEAP_XMAX_LOCK_ONLY
* needs to be set, but HEAP_XMAX_KEYSHR_LOCK and HEAP_XMAX_EXCL_LOCK are not
* set.
* *
* Other border conditions, such as trying to read a value that's larger than * from_pgupgrade must be passed as true if and only if only the multixact
* the value currently known as the next to assign, raise an error. Previously * corresponds to a value from a tuple that was locked in a 9.2-or-older
* these also returned -1, but since this can lead to the wrong visibility * installation and later pg_upgrade'd (that is, the infomask is
* results, it is dangerous to do that. * HEAP_LOCKED_UPGRADED). In this case, we know for certain that no members
* can still be running, so we return -1 just like for an empty multixact
* without any further checking. It would be wrong to try to resolve such a
* multixact: either the multixact is within the current valid multixact
* range, in which case the returned result would be bogus, or outside that
* range, in which case an error would be raised.
*
* In all other cases, the passed multixact must be within the known valid
* range, that is, greater to or equal than oldestMultiXactId, and less than
* nextMXact. Otherwise, an error is raised.
* *
* onlyLock must be set to true if caller is certain that the given multi * onlyLock must be set to true if caller is certain that the given multi
* is used only to lock tuples; can be false without loss of correctness, * is used only to lock tuples; can be false without loss of correctness,
@ -1202,7 +1208,7 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
*/ */
int int
GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **members, GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **members,
bool allow_old, bool onlyLock) bool from_pgupgrade, bool onlyLock)
{ {
int pageno; int pageno;
int prev_pageno; int prev_pageno;
@ -1221,7 +1227,7 @@ GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **members,
debug_elog3(DEBUG2, "GetMembers: asked for %u", multi); debug_elog3(DEBUG2, "GetMembers: asked for %u", multi);
if (!MultiXactIdIsValid(multi)) if (!MultiXactIdIsValid(multi) || from_pgupgrade)
return -1; return -1;
/* See if the MultiXactId is in the local cache */ /* See if the MultiXactId is in the local cache */
@ -1254,18 +1260,11 @@ GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **members,
* *
* An ID older than MultiXactState->oldestMultiXactId cannot possibly be * An ID older than MultiXactState->oldestMultiXactId cannot possibly be
* useful; it has already been removed, or will be removed shortly, by * useful; it has already been removed, or will be removed shortly, by
* truncation. Returning the wrong values could lead to an incorrect * truncation. If one is passed, an error is raised.
* visibility result. However, to support pg_upgrade we need to allow an
* empty set to be returned regardless, if the caller is willing to accept
* it; the caller is expected to check that it's an allowed condition
* (such as ensuring that the infomask bits set on the tuple are
* consistent with the pg_upgrade scenario). If the caller is expecting
* this to be called only on recently created multis, then we raise an
* error.
* *
* Conversely, an ID >= nextMXact shouldn't ever be seen here; if it is * Also, an ID >= nextMXact shouldn't ever be seen here; if it is seen, it
* seen, it implies undetected ID wraparound has occurred. This raises a * implies undetected ID wraparound has occurred. This raises a hard
* hard error. * error.
* *
* Shared lock is enough here since we aren't modifying any global state. * Shared lock is enough here since we aren't modifying any global state.
* Acquire it just long enough to grab the current counter values. We may * Acquire it just long enough to grab the current counter values. We may
@ -1281,7 +1280,7 @@ GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **members,
if (MultiXactIdPrecedes(multi, oldestMXact)) if (MultiXactIdPrecedes(multi, oldestMXact))
{ {
ereport(allow_old ? DEBUG1 : ERROR, ereport(ERROR,
(errcode(ERRCODE_INTERNAL_ERROR), (errcode(ERRCODE_INTERNAL_ERROR),
errmsg("MultiXactId %u does no longer exist -- apparent wraparound", errmsg("MultiXactId %u does no longer exist -- apparent wraparound",
multi))); multi)));

View File

@ -609,15 +609,12 @@ HeapTupleSatisfiesUpdate(HeapTuple htup, CommandId curcid,
{ {
TransactionId xmax; TransactionId xmax;
if (HEAP_LOCKED_UPGRADED(tuple->t_infomask))
return HeapTupleMayBeUpdated;
if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)) if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
{ {
/* if (MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), true))
* If it's only locked but neither EXCL_LOCK nor KEYSHR_LOCK is
* set, it cannot possibly be running. Otherwise need to check.
*/
if ((tuple->t_infomask & (HEAP_XMAX_EXCL_LOCK |
HEAP_XMAX_KEYSHR_LOCK)) &&
MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), true))
return HeapTupleBeingUpdated; return HeapTupleBeingUpdated;
SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, InvalidTransactionId); SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, InvalidTransactionId);
@ -1258,26 +1255,21 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
* "Deleting" xact really only locked it, so the tuple is live in any * "Deleting" xact really only locked it, so the tuple is live in any
* case. However, we should make sure that either XMAX_COMMITTED or * case. However, we should make sure that either XMAX_COMMITTED or
* XMAX_INVALID gets set once the xact is gone, to reduce the costs of * XMAX_INVALID gets set once the xact is gone, to reduce the costs of
* examining the tuple for future xacts. Also, marking dead * examining the tuple for future xacts.
* MultiXacts as invalid here provides defense against MultiXactId
* wraparound (see also comments in heap_freeze_tuple()).
*/ */
if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED)) if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED))
{ {
if (tuple->t_infomask & HEAP_XMAX_IS_MULTI) if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
{ {
/* /*
* If it's only locked but neither EXCL_LOCK nor KEYSHR_LOCK * If it's a pre-pg_upgrade tuple, the multixact cannot
* are set, it cannot possibly be running; otherwise have to * possibly be running; otherwise have to check.
* check.
*/ */
if ((tuple->t_infomask & (HEAP_XMAX_EXCL_LOCK | if (!HEAP_LOCKED_UPGRADED(tuple->t_infomask) &&
HEAP_XMAX_KEYSHR_LOCK)) &&
MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple),
true)) true))
return HEAPTUPLE_LIVE; return HEAPTUPLE_LIVE;
SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, InvalidTransactionId); SetHintBits(tuple, buffer, HEAP_XMAX_INVALID, InvalidTransactionId);
} }
else else
{ {

View File

@ -217,6 +217,31 @@ struct HeapTupleHeaderData
(((infomask) & HEAP_XMAX_LOCK_ONLY) || \ (((infomask) & HEAP_XMAX_LOCK_ONLY) || \
(((infomask) & (HEAP_XMAX_IS_MULTI | HEAP_LOCK_MASK)) == HEAP_XMAX_EXCL_LOCK)) (((infomask) & (HEAP_XMAX_IS_MULTI | HEAP_LOCK_MASK)) == HEAP_XMAX_EXCL_LOCK))
/*
* A tuple that has HEAP_XMAX_IS_MULTI and HEAP_XMAX_LOCK_ONLY but neither of
* XMAX_EXCL_LOCK and XMAX_KEYSHR_LOCK must come from a tuple that was
* share-locked in 9.2 or earlier and then pg_upgrade'd.
*
* In 9.2 and prior, HEAP_XMAX_IS_MULTI was only set when there were multiple
* FOR SHARE lockers of that tuple. That set HEAP_XMAX_LOCK_ONLY (with a
* different name back then) but neither of HEAP_XMAX_EXCL_LOCK and
* HEAP_XMAX_KEYSHR_LOCK. That combination is no longer possible in 9.3 and
* up, so if we see that combination we know for certain that the tuple was
* locked in an earlier release; since all such lockers are gone (they cannot
* survive through pg_upgrade), such tuples can safely be considered not
* locked.
*
* We must not resolve such multixacts locally, because the result would be
* bogus, regardless of where they stand with respect to the current valid
* multixact range.
*/
#define HEAP_LOCKED_UPGRADED(infomask) \
( \
((infomask) & HEAP_XMAX_IS_MULTI) && \
((infomask) & HEAP_XMAX_LOCK_ONLY) && \
(((infomask) & (HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_KEYSHR_LOCK)) == 0) \
)
/* /*
* Use these to test whether a particular lock is applied to a tuple * Use these to test whether a particular lock is applied to a tuple
*/ */