diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index 41b794196b..1793783cab 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -37,7 +37,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/storage/ipc/procarray.c,v 1.57 2010/01/16 17:17:26 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/storage/ipc/procarray.c,v 1.58 2010/01/21 00:53:58 sriggs Exp $ * *------------------------------------------------------------------------- */ @@ -1623,17 +1623,57 @@ GetCurrentVirtualXIDs(TransactionId limitXmin, bool excludeXmin0, /* * GetConflictingVirtualXIDs -- returns an array of currently active VXIDs. * - * The array is palloc'd and is terminated with an invalid VXID. - * * Usage is limited to conflict resolution during recovery on standby servers. * limitXmin is supplied as either latestRemovedXid, or InvalidTransactionId * in cases where we cannot accurately determine a value for latestRemovedXid. - * If limitXmin is InvalidTransactionId then we know that the very + * + * If limitXmin is InvalidTransactionId then we are forced to assume that * latest xid that might have caused a cleanup record will be * latestCompletedXid, so we set limitXmin to be latestCompletedXid instead. * We then skip any backends with xmin > limitXmin. This means that * cleanup records don't conflict with some recent snapshots. * + * The reason for using latestCompletedxid is that we aren't certain which + * of the xids in KnownAssignedXids are actually FATAL errors that did + * not write abort records. In almost every case they won't be, but we + * don't know that for certain. So we need to conflict with all current + * snapshots whose xmin is less than latestCompletedXid to be safe. This + * causes false positives in our assessment of which vxids conflict. + * + * By using exclusive lock we prevent new snapshots from being taken while + * we work out which snapshots to conflict with. This protects those new + * snapshots from also being included in our conflict list. + * + * After the lock is released, we allow snapshots again. It is possible + * that we arrive at a snapshot that is identical to one that we just + * decided we should conflict with. This a case of false positives, not an + * actual problem. + * + * There are two cases: (1) if we were correct in using latestCompletedXid + * then that means that all xids in the snapshot lower than that are FATAL + * errors, so not xids that ever commit. We can make no visibility errors + * if we allow such xids into the snapshot. (2) if we erred on the side of + * caution and in fact the latestRemovedXid should have been earlier than + * latestCompletedXid then we conflicted with a snapshot needlessly. Taking + * another identical snapshot is OK, because the earlier conflicted + * snapshot was a false positive. + * + * In either case, a snapshot taken after conflict assessment will still be + * valid and non-conflicting even if an identical snapshot that existed + * before conflict assessment was assessed as conflicting. + * + * If we allowed concurrent snapshots while we were deciding who to + * conflict with we would need to include all concurrent snapshotters in + * the conflict list as well. We'd have difficulty in working out exactly + * who that was, so it is happier for all concerned if we take an exclusive + * lock. Notice that we only hold that lock for as long as it takes to + * make the conflict list, not for the whole duration of the conflict + * resolution. + * + * It also means that users waiting for a snapshot is a good thing, since + * it is more likely that they will live longer after having waited. So it + * is a benefit, not an oversight that we use exclusive lock here. + * * We replace InvalidTransactionId with latestCompletedXid here because * this is the most convenient place to do that, while we hold ProcArrayLock. * The originator of the cleanup record wanted to avoid checking the value of