Commit Graph

4 Commits

Author SHA1 Message Date
Heikki Linnakangas 928408d9e5 Fix a bug with SSI and prepared transactions:
If there's a dangerous structure T0 ---> T1 ---> T2, and T2 commits first,
we need to abort something. If T2 commits before both conflicts appear,
then it should be caught by OnConflict_CheckForSerializationFailure. If
both conflicts appear before T2 commits, it should be caught by
PreCommit_CheckForSerializationFailure. But that is actually run when
T2 *prepares*. Fix that in OnConflict_CheckForSerializationFailure, by
treating a prepared T2 as if it committed already.

This is mostly a problem for prepared transactions, which are in prepared
state for some time, but also for regular transactions because they also go
through the prepared state in the SSI code for a short moment when they're
committed.

Kevin Grittner and Dan Ports
2011-07-07 18:12:15 +03:00
Heikki Linnakangas 38c0e72180 Adjust the alternative expected output file for prepared_xacts test case,
used when max_prepared_transactions=0, for the recent changes in the test
case.
2011-06-21 17:07:10 +03:00
Heikki Linnakangas dafaa3efb7 Implement genuine serializable isolation level.
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, ie. it sometimes aborts transactions even
though there is no anomaly.

To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.

A predicate lock doesn't conflict with any regular locks or with another
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead for
for other transactions.

Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with it have completed. That means that
we need to remember an unbounded amount of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.

We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.

Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.

Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
2011-02-08 00:09:08 +02:00
Tom Lane 8d4f2ecd41 Change the default value of max_prepared_transactions to zero, and add
documentation warnings against setting it nonzero unless active use of
prepared transactions is intended and a suitable transaction manager has been
installed.  This should help to prevent the type of scenario we've seen
several times now where a prepared transaction is forgotten and eventually
causes severe maintenance problems (or even anti-wraparound shutdown).

The only real reason we had the default be nonzero in the first place was to
support regression testing of the feature.  To still be able to do that,
tweak pg_regress to force a nonzero value during "make check".  Since we
cannot force a nonzero value in "make installcheck", add a variant regression
test "expected" file that shows the results that will be obtained when
max_prepared_transactions is zero.

Also, extend the HINT messages for transaction wraparound warnings to mention
the possibility that old prepared transactions are causing the problem.

All per today's discussion.
2009-04-23 00:23:46 +00:00