Do not select new object OIDs that match recently-dead entries.

When selecting a new OID, we take care to avoid picking one that's already
in use in the target table, so as not to create duplicates after the OID
counter has wrapped around.  However, up to now we used SnapshotDirty when
scanning for pre-existing entries.  That ignores committed-dead rows, so
that we could select an OID matching a deleted-but-not-yet-vacuumed row.
While that mostly worked, it has two problems:

* If recently deleted, the dead row might still be visible to MVCC
snapshots, creating a risk for duplicate OIDs when examining the catalogs
within our own transaction.  Such duplication couldn't be visible outside
the object-creating transaction, though, and we've heard few if any field
reports corresponding to such a symptom.

* When selecting a TOAST OID, deleted toast rows definitely *are* visible
to SnapshotToast, and will remain so until vacuumed away.  This leads to
a conflict that will manifest in errors like "unexpected chunk number 0
(expected 1) for toast value nnnnn".  We've been seeing reports of such
errors from the field for years, but the cause was unclear before.

The fix is simple: just use SnapshotAny to search for conflicting rows.
This results in a slightly longer window before object OIDs can be
recycled, but that seems unlikely to create any large problems.

Pavan Deolasee

Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com
This commit is contained in:
Tom Lane 2018-04-11 17:41:09 -04:00
parent 811969b218
commit 0408e1ed59
2 changed files with 12 additions and 11 deletions

View File

@ -1794,7 +1794,9 @@ toast_delete_datum(Relation rel, Datum value, bool is_speculative)
/* ---------- /* ----------
* toastrel_valueid_exists - * toastrel_valueid_exists -
* *
* Test whether a toast value with the given ID exists in the toast relation * Test whether a toast value with the given ID exists in the toast relation.
* For safety, we consider a value to exist if there are either live or dead
* toast rows with that ID; see notes for GetNewOid().
* ---------- * ----------
*/ */
static bool static bool
@ -1806,7 +1808,6 @@ toastrel_valueid_exists(Relation toastrel, Oid valueid)
int num_indexes; int num_indexes;
int validIndex; int validIndex;
Relation *toastidxs; Relation *toastidxs;
SnapshotData SnapshotToast;
/* Fetch a valid index relation */ /* Fetch a valid index relation */
validIndex = toast_open_indexes(toastrel, validIndex = toast_open_indexes(toastrel,
@ -1825,10 +1826,9 @@ toastrel_valueid_exists(Relation toastrel, Oid valueid)
/* /*
* Is there any such chunk? * Is there any such chunk?
*/ */
init_toast_snapshot(&SnapshotToast);
toastscan = systable_beginscan(toastrel, toastscan = systable_beginscan(toastrel,
RelationGetRelid(toastidxs[validIndex]), RelationGetRelid(toastidxs[validIndex]),
true, &SnapshotToast, 1, &toastkey); true, SnapshotAny, 1, &toastkey);
if (systable_getnext(toastscan) != NULL) if (systable_getnext(toastscan) != NULL)
result = true; result = true;

View File

@ -282,8 +282,12 @@ IsSharedRelation(Oid relationId)
* managed to cycle through 2^32 OIDs and generate the same OID before we * managed to cycle through 2^32 OIDs and generate the same OID before we
* finish inserting our row. This seems unlikely to be a problem. Note * finish inserting our row. This seems unlikely to be a problem. Note
* that if we had to *commit* the row to end the race condition, the risk * that if we had to *commit* the row to end the race condition, the risk
* would be rather higher; therefore we use SnapshotDirty in the test, * would be rather higher; therefore we use SnapshotAny in the test, so that
* so that we will see uncommitted rows. * we will see uncommitted rows. (We used to use SnapshotDirty, but that has
* the disadvantage that it ignores recently-deleted rows, creating a risk
* of transient conflicts for as long as our own MVCC snapshots think a
* recently-deleted row is live. The risk is far higher when selecting TOAST
* OIDs, because SnapshotToast considers dead rows as active indefinitely.)
*/ */
Oid Oid
GetNewOid(Relation relation) GetNewOid(Relation relation)
@ -336,7 +340,6 @@ Oid
GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn) GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
{ {
Oid newOid; Oid newOid;
SnapshotData SnapshotDirty;
SysScanDesc scan; SysScanDesc scan;
ScanKeyData key; ScanKeyData key;
bool collides; bool collides;
@ -349,8 +352,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
*/ */
Assert(!IsBinaryUpgrade || RelationGetRelid(relation) != TypeRelationId); Assert(!IsBinaryUpgrade || RelationGetRelid(relation) != TypeRelationId);
InitDirtySnapshot(SnapshotDirty);
/* Generate new OIDs until we find one not in the table */ /* Generate new OIDs until we find one not in the table */
do do
{ {
@ -363,9 +364,9 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
BTEqualStrategyNumber, F_OIDEQ, BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(newOid)); ObjectIdGetDatum(newOid));
/* see notes above about using SnapshotDirty */ /* see notes above about using SnapshotAny */
scan = systable_beginscan(relation, indexId, true, scan = systable_beginscan(relation, indexId, true,
&SnapshotDirty, 1, &key); SnapshotAny, 1, &key);
collides = HeapTupleIsValid(systable_getnext(scan)); collides = HeapTupleIsValid(systable_getnext(scan));