Speed up HeapTupleSatisfiesMVCC() by replacing the XID-in-progress test.

Rather than consulting TransactionIdIsInProgress to see if an in-doubt
transaction is still running, consult XidInMVCCSnapshot.  That requires
the same or fewer cycles as TransactionIdIsInProgress, and what's far
more important, it does not access shared data structures (at least in the
no-subxip-overflow case) so it incurs no contention.  Furthermore, we would
have had to check XidInMVCCSnapshot anyway before deciding that we were
allowed to see the tuple.
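
To illustrate why a snapshot test needs no shared-memory access, here is a minimal sketch of a snapshot membership check that reads only backend-local data. ToyXid, ToySnapshot and toy_xid_in_snapshot are hypothetical names, not the PostgreSQL definitions, and the sketch assumes no XID wraparound and no subtransactions (the real XidInMVCCSnapshot handles both).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the PostgreSQL definitions. */
typedef uint32_t ToyXid;

typedef struct ToySnapshot
{
    ToyXid        xmin;  /* every xid < xmin had finished at snapshot time */
    ToyXid        xmax;  /* every xid >= xmax is treated as still running */
    const ToyXid *xip;   /* xids in [xmin, xmax) running at snapshot time */
    size_t        xcnt;  /* number of entries in xip */
} ToySnapshot;

/*
 * Return true if the snapshot must treat xid as still in progress.
 * Only the snapshot's own local data is read; nothing shared is touched.
 * Assumes plain integer ordering of xids (no wraparound handling).
 */
static bool
toy_xid_in_snapshot(ToyXid xid, const ToySnapshot *snap)
{
    size_t i;

    assert(snap != NULL && snap->xmin <= snap->xmax);

    if (xid < snap->xmin)
        return false;           /* finished before the snapshot was taken */
    if (xid >= snap->xmax)
        return true;            /* started after the snapshot was taken */

    /* In between: it is running only if listed in the snapshot */
    for (i = 0; i < snap->xcnt; i++)
    {
        if (snap->xip[i] == xid)
            return true;
    }
    return false;
}
```

With a snapshot of xmin 100, xmax 110 and running XIDs {102, 105}, the sketch reports 102 and 110 as in progress and 99 and 103 as done.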

There should never be a case where XidInMVCCSnapshot says a transaction is
done while TransactionIdIsInProgress says it's still running.  The other
way around is quite possible though.  The result of that difference is that
HeapTupleSatisfiesMVCC will no longer set hint bits on tuples whose source
transactions recently finished but are still running according to our
snapshot.  The main cost of delaying the hint-bit setting is that repeated
visits to a just-committed tuple, by transactions none of which have
snapshots new enough to see the source transaction as done, will each
execute TransactionIdIsCurrentTransactionId, which they need not have done
before.  However, that's normally just a small overhead, and no contention
costs are involved; so it seems well worth the benefit of removing
TransactionIdIsInProgress calls during the life of the source transaction.
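
The deferred hint-bit behaviour described above can be sketched with a toy model. Everything here is hypothetical (ToyTuple, ToySnapshot, ToyClog, toy_xmin_visible): the snapshot is reduced to a single xmax bound, the commit log to a "committed below this xid" bound with no aborts, and the TransactionIdIsCurrentTransactionId test is left out. It only aims to show that a visitor with an old snapshot leaves the hint unset, while the first visitor with a new enough snapshot sets it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;

/* Toy tuple header: inserting xid plus one "xmin committed" hint flag. */
typedef struct ToyTuple
{
    ToyXid xmin;
    bool   xmin_committed_hint;
} ToyTuple;

/* Toy snapshot: every xid >= xmax counts as in progress (assumption). */
typedef struct ToySnapshot
{
    ToyXid xmax;
} ToySnapshot;

/* Toy commit log: every xid below the bound has committed (assumption). */
typedef struct ToyClog
{
    ToyXid next_uncommitted;
} ToyClog;

/* Decide whether the tuple's inserter is visible to the given snapshot. */
static bool
toy_xmin_visible(ToyTuple *tup, const ToySnapshot *snap, const ToyClog *clog)
{
    assert(tup != NULL && snap != NULL && clog != NULL);

    if (!tup->xmin_committed_hint)
    {
        /* In doubt: ask the snapshot first; no hint is set on this path. */
        if (tup->xmin >= snap->xmax)
            return false;

        /* The snapshot says it is done, so consult the commit log now. */
        if (tup->xmin >= clog->next_uncommitted)
            return false;       /* aborted or crashed */

        tup->xmin_committed_hint = true;
        return true;
    }

    /* Hint says committed, but maybe not according to our snapshot. */
    return tup->xmin < snap->xmax;
}
```

In this model a tuple inserted by xid 100 stays unhinted for as long as it is visited only through snapshots with xmax 100, even after xid 100 commits. A visitor with xmax 101 sets the hint, and later old-snapshot visitors still see the tuple as invisible.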

The core idea for this patch is due to Jeff Janes, who also did the legwork
proving its performance benefits.  His original proposal was to swap the
order of TransactionIdIsInProgress and XidInMVCCSnapshot calls in some
cases within HeapTupleSatisfiesMVCC.  That was a bit messy though.
The idea that we could dispense with calling TransactionIdIsInProgress
altogether was mine, as is the final patch.
Tom Lane 2015-08-26 18:18:57 -04:00
parent 16d4f94e67
commit 8a7d070181

@@ -10,13 +10,14 @@
  * the passed-in buffer. The caller must hold not only a pin, but at least
  * shared buffer content lock on the buffer containing the tuple.
  *
- * NOTE: must check TransactionIdIsInProgress (which looks in PGXACT array)
+ * NOTE: When using a non-MVCC snapshot, we must check
+ * TransactionIdIsInProgress (which looks in the PGXACT array)
  * before TransactionIdDidCommit/TransactionIdDidAbort (which look in
  * pg_clog). Otherwise we have a race condition: we might decide that a
  * just-committed transaction crashed, because none of the tests succeed.
  * xact.c is careful to record commit/abort in pg_clog before it unsets
- * MyPgXact->xid in PGXACT array. That fixes that problem, but it also
- * means there is a window where TransactionIdIsInProgress and
+ * MyPgXact->xid in the PGXACT array. That fixes that problem, but it
+ * also means there is a window where TransactionIdIsInProgress and
  * TransactionIdDidCommit will both return true. If we check only
  * TransactionIdDidCommit, we could consider a tuple committed when a
  * later GetSnapshotData call will still think the originating transaction
@@ -26,6 +27,11 @@
  * subtransactions of our own main transaction and so there can't be any
  * race condition.
  *
+ * When using an MVCC snapshot, we rely on XidInMVCCSnapshot rather than
+ * TransactionIdIsInProgress, but the logic is otherwise the same: do not
+ * check pg_clog until after deciding that the xact is no longer in progress.
+ *
+ *
  * Summary of visibility functions:
  *
  * HeapTupleSatisfiesMVCC()
@@ -936,9 +942,21 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
  * transactions started after the snapshot was taken
  * changes made by the current command
  *
- * (Notice, however, that the tuple status hint bits will be updated on the
- * basis of the true state of the transaction, even if we then pretend we
- * can't see it.)
+ * Notice that here, we will not update the tuple status hint bits if the
+ * inserting/deleting transaction is still running according to our snapshot,
+ * even if in reality it's committed or aborted by now. This is intentional.
+ * Checking the true transaction state would require access to high-traffic
+ * shared data structures, creating contention we'd rather do without, and it
+ * would not change the result of our visibility check anyway. The hint bits
+ * will be updated by the first visitor that has a snapshot new enough to see
+ * the inserting/deleting transaction as done. In the meantime, the cost of
+ * leaving the hint bits unset is basically that each HeapTupleSatisfiesMVCC
+ * call will need to run TransactionIdIsCurrentTransactionId in addition to
+ * XidInMVCCSnapshot (but it would have to do the latter anyway). In the old
+ * coding where we tried to set the hint bits as soon as possible, we instead
+ * did TransactionIdIsInProgress in each call --- to no avail, as long as the
+ * inserting/deleting transaction was still running --- which was more cycles
+ * and more contention on the PGXACT array.
  */
 bool
 HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
@@ -961,7 +979,7 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
             if (TransactionIdIsCurrentTransactionId(xvac))
                 return false;
-            if (!TransactionIdIsInProgress(xvac))
+            if (!XidInMVCCSnapshot(xvac, snapshot))
             {
                 if (TransactionIdDidCommit(xvac))
                 {
@@ -980,7 +998,7 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
             if (!TransactionIdIsCurrentTransactionId(xvac))
             {
-                if (TransactionIdIsInProgress(xvac))
+                if (XidInMVCCSnapshot(xvac, snapshot))
                     return false;
                 if (TransactionIdDidCommit(xvac))
                     SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
@@ -1035,7 +1053,7 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
             else
                 return false; /* deleted before scan started */
         }
-        else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)))
+        else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
             return false;
         else if (TransactionIdDidCommit(HeapTupleHeaderGetRawXmin(tuple)))
             SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
@@ -1048,14 +1066,15 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
             return false;
         }
     }
-
-    /*
-     * By here, the inserting transaction has committed - have to check
-     * when...
-     */
-    if (!HeapTupleHeaderXminFrozen(tuple)
-        && XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
-        return false; /* treat as still in progress */
+    else
+    {
+        /* xmin is committed, but maybe not according to our snapshot */
+        if (!HeapTupleHeaderXminFrozen(tuple) &&
+            XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
+            return false; /* treat as still in progress */
+    }
+
+    /* by here, the inserting transaction has committed */
 
     if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or aborted */
         return true;
@@ -1082,15 +1101,10 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
             else
                 return false; /* deleted before scan started */
         }
-        if (TransactionIdIsInProgress(xmax))
+        if (XidInMVCCSnapshot(xmax, snapshot))
             return true;
         if (TransactionIdDidCommit(xmax))
-        {
-            /* updating transaction committed, but when? */
-            if (XidInMVCCSnapshot(xmax, snapshot))
-                return true; /* treat as still in progress */
-            return false;
-        }
+            return false; /* updating transaction committed */
         /* it must have aborted or crashed */
         return true;
     }
@@ -1105,7 +1119,7 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
                 return false; /* deleted before scan started */
         }
 
-        if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmax(tuple)))
+        if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmax(tuple), snapshot))
             return true;
 
         if (!TransactionIdDidCommit(HeapTupleHeaderGetRawXmax(tuple)))
@@ -1120,12 +1134,14 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
         SetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
                     HeapTupleHeaderGetRawXmax(tuple));
     }
-
-    /*
-     * OK, the deleting transaction committed too ... but when?
-     */
-    if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmax(tuple), snapshot))
-        return true; /* treat as still in progress */
+    else
+    {
+        /* xmax is committed, but maybe not according to our snapshot */
+        if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmax(tuple), snapshot))
+            return true; /* treat as still in progress */
+    }
+
+    /* xmax transaction committed */
 
     return false;
 }
@@ -1383,14 +1399,15 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
 /*
  * HeapTupleIsSurelyDead
  *
- * Determine whether a tuple is surely dead. We sometimes use this
- * in lieu of HeapTupleSatisifesVacuum when the tuple has just been
- * tested by HeapTupleSatisfiesMVCC and, therefore, any hint bits that
- * can be set should already be set. We assume that if no hint bits
- * either for xmin or xmax, the transaction is still running. This is
- * therefore faster than HeapTupleSatisfiesVacuum, because we don't
- * consult CLOG (and also because we don't need to give an exact answer,
- * just whether or not the tuple is surely dead).
+ * Cheaply determine whether a tuple is surely dead to all onlookers.
+ * We sometimes use this in lieu of HeapTupleSatisfiesVacuum when the
+ * tuple has just been tested by another visibility routine (usually
+ * HeapTupleSatisfiesMVCC) and, therefore, any hint bits that can be set
+ * should already be set. We assume that if no hint bits are set, the xmin
+ * or xmax transaction is still running. This is therefore faster than
+ * HeapTupleSatisfiesVacuum, because we don't consult PGXACT nor CLOG.
+ * It's okay to return FALSE when in doubt, but we must return TRUE only
+ * if the tuple is removable.
  */
 bool
 HeapTupleIsSurelyDead(HeapTuple htup, TransactionId OldestXmin)
@@ -1443,8 +1460,9 @@ HeapTupleIsSurelyDead(HeapTuple htup, TransactionId OldestXmin)
  *
  * Note: GetSnapshotData never stores either top xid or subxids of our own
  * backend into a snapshot, so these xids will not be reported as "running"
- * by this function. This is OK for current uses, because we actually only
- * apply this for known-committed XIDs.
+ * by this function. This is OK for current uses, because we always check
+ * TransactionIdIsCurrentTransactionId first, except for known-committed
+ * XIDs which could not be ours anyway.
  */
 static bool
 XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
@@ -1481,7 +1499,7 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
          */
         if (!snapshot->suboverflowed)
         {
-            /* full data, so search subxip */
+            /* we have full data, so search subxip */
             int32 j;
 
             for (j = 0; j < snapshot->subxcnt; j++)
@@ -1494,7 +1512,10 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
         }
         else
         {
-            /* overflowed, so convert xid to top-level */
+            /*
+             * Snapshot overflowed, so convert xid to top-level. This is safe
+             * because we eliminated too-old XIDs above.
+             */
             xid = SubTransGetTopmostTransaction(xid);
 
             /*
@@ -1525,7 +1546,10 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
          */
         if (snapshot->suboverflowed)
         {
-            /* overflowed, so convert xid to top-level */
+            /*
+             * Snapshot overflowed, so convert xid to top-level. This is safe
+             * because we eliminated too-old XIDs above.
+             */
             xid = SubTransGetTopmostTransaction(xid);
 
             /*