Fix longstanding bug in HeapTupleSatisfiesVacuum().

HeapTupleSatisfiesVacuum() didn't properly discern between
DELETE_IN_PROGRESS and INSERT_IN_PROGRESS for rows that have been
inserted in the current transaction and deleted in a aborted
subtransaction of the current backend. At the very least that caused
problems for CLUSTER and CREATE INDEX in transactions that had
aborting subtransactions producing rows, leading to warnings like:
WARNING:  concurrent delete in progress within table "..."
possibly in an endless, uninterruptible, loop.

Instead of treating *InProgress xmins the same as *IsCurrent ones,
treat them as being distinct like the other visibility routines. As
implemented this separatation can cause a behaviour change for rows
that have been inserted and deleted in another, still running,
transaction. HTSV will now return INSERT_IN_PROGRESS instead of
DELETE_IN_PROGRESS for those. That's both, more in line with the other
visibility routines and arguably more correct. The latter because a
INSERT_IN_PROGRESS will make callers look at/wait for xmin, instead of
xmax.
The only current caller where that's possibly worse than the old
behaviour is heap_prune_chain() which now won't mark the page as
prunable if a row has concurrently been inserted and deleted. That's
harmless enough.

As a cautionary measure also insert a interrupt check before the gotos
in IndexBuildHeapScan() that lead to the uninterruptible loop. There
are other possible causes, like a row that several sessions try to
update and all fail, for repeated loops and the cost of doing so in
the retry case is low.

As this bug goes back all the way to the introduction of
subtransactions in 573a71a5da backpatch to all supported releases.

Reported-By: Sandro Santilli
This commit is contained in:
Andres Freund 2014-06-04 23:27:38 +02:00
parent ae41bb4bef
commit 80d45ae4e6
2 changed files with 19 additions and 2 deletions

View File

@ -1711,6 +1711,7 @@ IndexBuildHeapScan(Relation heapRelation,
LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
XactLockTableWait(xwait);
CHECK_FOR_INTERRUPTS();
goto recheck;
}
}
@ -1743,6 +1744,7 @@ IndexBuildHeapScan(Relation heapRelation,
LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
XactLockTableWait(xwait);
CHECK_FOR_INTERRUPTS();
goto recheck;
}

View File

@ -1099,14 +1099,29 @@ HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin,
return HEAPTUPLE_DEAD;
}
}
else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple)))
{
if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid */
return HEAPTUPLE_INSERT_IN_PROGRESS;
if (tuple->t_infomask & HEAP_IS_LOCKED)
return HEAPTUPLE_INSERT_IN_PROGRESS;
/* inserted and then deleted by same xact */
return HEAPTUPLE_DELETE_IN_PROGRESS;
if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)))
return HEAPTUPLE_DELETE_IN_PROGRESS;
/* deleting subtransaction must have aborted */
return HEAPTUPLE_INSERT_IN_PROGRESS;
}
else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
{
/*
* It'd be possible to discern between INSERT/DELETE in progress
* here by looking at xmax - but that doesn't seem beneficial for
* the majority of callers and even detrimental for some. We'd
* rather have callers look at/wait for xmin than xmax. It's
* always correct to return INSERT_IN_PROGRESS because that's
* what's happening from the view of other backends.
*/
return HEAPTUPLE_INSERT_IN_PROGRESS;
}
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,