postgresql/doc/src/sgml/mvcc.sgml

1017 lines
37 KiB
Plaintext
Raw Normal View History

<!--
$PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.49 2005/03/24 00:03:18 neilc Exp $
-->
<chapter id="mvcc">
<title>Concurrency Control</title>
<indexterm>
<primary>concurrency</primary>
</indexterm>
<para>
2002-11-15 04:11:18 +01:00
This chapter describes the behavior of the
<productname>PostgreSQL</productname> database system when two or
more sessions try to access the same data at the same time. The
goals in that situation are to allow efficient access for all
sessions while maintaining strict data integrity. Every developer
of database applications should be familiar with the topics covered
in this chapter.
</para>
<sect1 id="mvcc-intro">
<title>Introduction</title>
2003-08-31 19:32:24 +02:00
<indexterm>
<primary>MVCC</primary>
</indexterm>
<para>
Unlike traditional database systems which use locks for concurrency control,
<productname>PostgreSQL</productname>
maintains data consistency by using a multiversion model
(Multiversion Concurrency Control, <acronym>MVCC</acronym>).
1999-05-27 17:49:08 +02:00
This means that while querying a database each transaction sees
a snapshot of data (a <firstterm>database version</firstterm>)
as it was some
1999-05-27 17:49:08 +02:00
time ago, regardless of the current state of the underlying data.
This protects the transaction from viewing inconsistent data that
could be caused by (other) concurrent transaction updates on the same
data rows, providing <firstterm>transaction isolation</firstterm>
for each database session.
</para>
<para>
The main advantage to using the <acronym>MVCC</acronym> model of
concurrency control rather than locking is that in
<acronym>MVCC</acronym> locks acquired for querying (reading) data
do not conflict with locks acquired for writing data, and so
2002-11-15 04:11:18 +01:00
reading never blocks writing and writing never blocks reading.
</para>
<para>
Table- and row-level locking facilities are also available in
<productname>PostgreSQL</productname> for applications that cannot
2002-11-15 04:11:18 +01:00
adapt easily to <acronym>MVCC</acronym> behavior. However, proper
use of <acronym>MVCC</acronym> will generally provide better
performance than locks.
</para>
</sect1>
<sect1 id="transaction-iso">
<title>Transaction Isolation</title>
2003-08-31 19:32:24 +02:00
<indexterm>
<primary>transaction isolation</primary>
</indexterm>
<para>
The <acronym>SQL</acronym> standard defines four levels of
transaction isolation in terms of three phenomena that must be
prevented between concurrent transactions. These undesirable
phenomena are:
<variablelist>
<varlistentry>
<term>
dirty read
<indexterm><primary>dirty read</primary></indexterm>
</term>
<listitem>
<para>
A transaction reads data written by a concurrent uncommitted transaction.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
nonrepeatable read
<indexterm><primary>nonrepeatable read</primary></indexterm>
</term>
<listitem>
<para>
A transaction re-reads data it has previously read and finds that data
2000-10-11 19:38:36 +02:00
has been modified by another transaction (that committed since the
initial read).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
phantom read
<indexterm><primary>phantom read</primary></indexterm>
</term>
<listitem>
<para>
A transaction re-executes a query returning a set of rows that satisfy a
2000-10-11 19:38:36 +02:00
search condition and finds that the set of rows satisfying the condition
has changed due to another recently-committed transaction.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<para>
<indexterm>
2003-08-31 19:32:24 +02:00
<primary>transaction isolation level</primary>
</indexterm>
The four transaction isolation levels and the corresponding
behaviors are described in <xref linkend="mvcc-isolevel-table">.
</para>
1999-05-27 17:49:08 +02:00
<table tocentry="1" id="mvcc-isolevel-table">
<title><acronym>SQL</acronym> Transaction Isolation Levels</title>
1999-05-27 17:49:08 +02:00
<tgroup cols="4">
<thead>
<row>
<entry>
Isolation Level
</entry>
1999-05-27 17:49:08 +02:00
<entry>
Dirty Read
</entry>
<entry>
Nonrepeatable Read
1999-05-27 17:49:08 +02:00
</entry>
<entry>
Phantom Read
</entry>
</row>
</thead>
<tbody>
<row>
<entry>
Read uncommitted
</entry>
<entry>
Possible
</entry>
<entry>
Possible
</entry>
<entry>
Possible
</entry>
</row>
<row>
<entry>
Read committed
</entry>
<entry>
Not possible
</entry>
<entry>
Possible
</entry>
<entry>
Possible
</entry>
</row>
<row>
<entry>
Repeatable read
</entry>
<entry>
Not possible
</entry>
<entry>
Not possible
</entry>
<entry>
Possible
</entry>
</row>
<row>
<entry>
Serializable
</entry>
<entry>
Not possible
</entry>
<entry>
Not possible
</entry>
<entry>
Not possible
</entry>
</row>
</tbody>
</tgroup>
</table>
2000-10-11 19:38:36 +02:00
<para>
2004-12-24 00:07:38 +01:00
In <productname>PostgreSQL</productname>, you can request any of the
four standard transaction isolation levels. But internally, there are
only two distinct isolation levels, which correspond to the levels Read
Committed and Serializable. When you select the level Read
Uncommitted you really get Read Committed, and when you select
Repeatable Read you really get Serializable, so the actual
isolation level may be stricter than what you select. This is
permitted by the SQL standard: the four isolation levels only
define which phenomena must not happen, they do not define which
phenomena must happen. The reason that <productname>PostgreSQL</>
only provides two isolation levels is that this is the only
2004-12-24 00:07:38 +01:00
sensible way to map the standard isolation levels to the multiversion
concurrency control architecture. The behavior of the available
isolation levels is detailed in the following subsections.
</para>
<para>
To set the transaction isolation level of a transaction, use the
command <xref linkend="sql-set-transaction">.
</para>
<sect2 id="xact-read-committed">
<title>Read Committed Isolation Level</title>
<indexterm>
2003-08-31 19:32:24 +02:00
<primary>transaction isolation level</primary>
<secondary>read committed</secondary>
</indexterm>
<para>
1999-05-27 17:49:08 +02:00
<firstterm>Read Committed</firstterm>
is the default isolation level in <productname>PostgreSQL</productname>.
2000-10-11 19:38:36 +02:00
When a transaction runs on this isolation level,
a <command>SELECT</command> query sees only data committed before the
query began; it never sees either uncommitted data or changes committed
during query execution by concurrent transactions. (However, the
2000-10-11 19:38:36 +02:00
<command>SELECT</command> does see the effects of previous updates
executed within its own transaction, even though they are not yet
committed.) In effect, a <command>SELECT</command> query
sees a snapshot of the database as of the instant that that query
begins to run. Notice that two successive <command>SELECT</command> commands can
see different data, even though they are within a single transaction, if
other transactions
commit changes during execution of the first <command>SELECT</command>.
</para>
<para>
<command>UPDATE</command>, <command>DELETE</command>, and <command>SELECT
FOR UPDATE</command> commands behave the same as <command>SELECT</command>
in terms of searching for target rows: they will only find target rows
that were committed as of the command start time. However, such a target
row may have already been updated (or deleted or marked for update) by
another concurrent transaction by the time it is found. In this case, the
would-be updater will wait for the first updating transaction to commit or
roll back (if it is still in progress). If the first updater rolls back,
then its effects are negated and the second updater can proceed with
updating the originally found row. If the first updater commits, the
second updater will ignore the row if the first updater deleted it,
otherwise it will attempt to apply its operation to the updated version of
the row. The search condition of the command (the <literal>WHERE</> clause) is
re-evaluated to see if the updated version of the row still matches the
search condition. If so, the second updater proceeds with its operation,
starting from the updated version of the row.
</para>
<para>
Because of the above rule, it is possible for an updating command to see an
inconsistent snapshot: it can see the effects of concurrent updating
commands that affected the same rows it is trying to update, but it
does not see effects of those commands on other rows in the database.
This behavior makes Read Committed mode unsuitable for commands that
involve complex search conditions. However, it is just right for simpler
cases. For example, consider updating bank balances with transactions
like
<screen>
BEGIN;
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 12345;
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 7534;
COMMIT;
</screen>
If two such transactions concurrently try to change the balance of account
12345, we clearly want the second transaction to start from the updated
version of the account's row. Because each command is affecting only a
predetermined row, letting it see the updated version of the row does
not create any troublesome inconsistency.
</para>
<para>
Since in Read Committed mode each new command starts with a new snapshot
that includes all transactions committed up to that instant, subsequent
commands in the same transaction will see the effects of the committed
concurrent transaction in any case. The point at issue here is whether
or not within a <emphasis>single</> command we see an absolutely consistent
view of the database.
</para>
<para>
The partial transaction isolation provided by Read Committed mode is
adequate for many applications, and this mode is fast and simple to use.
2000-10-11 19:38:36 +02:00
However, for applications that do complex queries and updates, it may
be necessary to guarantee a more rigorously consistent view of the
database than the Read Committed mode provides.
</para>
</sect2>
<sect2 id="xact-serializable">
<title>Serializable Isolation Level</title>
<indexterm>
2003-08-31 19:32:24 +02:00
<primary>transaction isolation level</primary>
<secondary>serializable</secondary>
</indexterm>
<para>
The level <firstterm>Serializable</firstterm> provides the strictest transaction
2000-10-11 19:38:36 +02:00
isolation. This level emulates serial transaction execution,
as if transactions had been executed one after another, serially,
rather than concurrently. However, applications using this level must
be prepared to retry transactions due to serialization failures.
</para>
<para>
1999-06-05 04:47:13 +02:00
When a transaction is on the serializable level,
2000-10-11 19:38:36 +02:00
a <command>SELECT</command> query sees only data committed before the
transaction began; it never sees either uncommitted data or changes
committed
during transaction execution by concurrent transactions. (However, the
2000-10-11 19:38:36 +02:00
<command>SELECT</command> does see the effects of previous updates
executed within its own transaction, even though they are not yet
committed.) This is different from Read Committed in that the
<command>SELECT</command>
sees a snapshot as of the start of the transaction, not as of the start
of the current query within the transaction. Thus, successive
<command>SELECT</command> commands within a single transaction always see the same
data.
</para>
<para>
<command>UPDATE</command>, <command>DELETE</command>, and <command>SELECT
FOR UPDATE</command> commands behave the same as <command>SELECT</command>
in terms of searching for target rows: they will only find target rows
that were committed as of the transaction start time. However, such a
target
row may have already been updated (or deleted or marked for update) by
another concurrent transaction by the time it is found. In this case, the
serializable transaction will wait for the first updating transaction to commit or
roll back (if it is still in progress). If the first updater rolls back,
then its effects are negated and the serializable transaction can proceed
with updating the originally found row. But if the first updater commits
(and actually updated or deleted the row, not just selected it for update)
then the serializable transaction will be rolled back with the message
2001-11-28 21:49:10 +01:00
<screen>
ERROR: could not serialize access due to concurrent update
2001-11-28 21:49:10 +01:00
</screen>
because a serializable transaction cannot modify rows changed by
other transactions after the serializable transaction began.
</para>
2000-10-11 19:38:36 +02:00
<para>
When the application receives this error message, it should abort
the current transaction and then retry the whole transaction from
the beginning. The second time through, the transaction sees the
previously-committed change as part of its initial view of the database,
so there is no logical conflict in using the new version of the row
as the starting point for the new transaction's update.
</para>
<para>
Note that only updating transactions may need to be retried; read-only
transactions will never have serialization conflicts.
2000-10-11 19:38:36 +02:00
</para>
<para>
The Serializable mode provides a rigorous guarantee that each
transaction sees a wholly consistent view of the database. However,
the application has to be prepared to retry transactions when concurrent
updates make it impossible to sustain the illusion of serial execution.
Since the cost of redoing complex transactions may be significant,
this mode is recommended only when updating transactions contain logic
sufficiently complex that they may give wrong answers in Read
Committed mode. Most commonly, Serializable mode is necessary when
a transaction executes several successive commands that must see
identical views of the database.
2000-10-11 19:38:36 +02:00
</para>
<sect3 id="mvcc-serializability">
<title>Serializable Isolation versus True Serializability</title>
<indexterm>
<primary>serializability</primary>
</indexterm>
<indexterm>
<primary>predicate locking</primary>
</indexterm>
<para>
The intuitive meaning (and mathematical definition) of
<quote>serializable</> execution is that any two successfully committed
concurrent transactions will appear to have executed strictly serially,
one after the other &mdash; although which one appeared to occur first may
not be predictable in advance. It is important to realize that forbidding
the undesirable behaviors listed in <xref linkend="mvcc-isolevel-table">
is not sufficient to guarantee true serializability, and in fact
<productname>PostgreSQL</productname>'s Serializable mode <emphasis>does
not guarantee serializable execution in this sense</>. As an example,
consider a table <structname>mytab</>, initially containing
<screen>
class | value
-------+-------
1 | 10
1 | 20
2 | 100
2 | 200
</screen>
Suppose that serializable transaction A computes
<screen>
SELECT SUM(value) FROM mytab WHERE class = 1;
</screen>
and then inserts the result (30) as the <structfield>value</> in a
new row with <structfield>class</> = 2. Concurrently, serializable
transaction B computes
<screen>
SELECT SUM(value) FROM mytab WHERE class = 2;
</screen>
and obtains the result 300, which it inserts in a new row with
<structfield>class</> = 1. Then both transactions commit. None of
the listed undesirable behaviors have occurred, yet we have a result
that could not have occurred in either order serially. If A had
executed before B, B would have computed the sum 330, not 300, and
similarly the other order would have resulted in a different sum
computed by A.
</para>
<para>
To guarantee true mathematical serializability, it is necessary for
a database system to enforce <firstterm>predicate locking</>, which
means that a transaction cannot insert or modify a row that would
have matched the <literal>WHERE</> condition of a query in another concurrent
transaction. For example, once transaction A has executed the query
<literal>SELECT ... WHERE class = 1</>, a predicate-locking system
would forbid transaction B from inserting any new row with class 1
until A has committed.
<footnote>
<para>
Essentially, a predicate-locking system prevents phantom reads
by restricting what is written, whereas MVCC prevents them by
restricting what is read.
</para>
</footnote>
Such a locking system is complex to
implement and extremely expensive in execution, since every session must
be aware of the details of every query executed by every concurrent
transaction. And this large expense is mostly wasted, since in
practice most applications do not do the sorts of things that could
result in problems. (Certainly the example above is rather contrived
and unlikely to represent real software.) Accordingly,
<productname>PostgreSQL</productname> does not implement predicate
locking, and so far as we are aware no other production DBMS does either.
</para>
<para>
In those cases where the possibility of nonserializable execution
is a real hazard, problems can be prevented by appropriate use of
explicit locking. Further discussion appears in the following
sections.
</para>
</sect3>
</sect2>
</sect1>
<sect1 id="explicit-locking">
<title>Explicit Locking</title>
<indexterm>
2003-08-31 19:32:24 +02:00
<primary>lock</primary>
</indexterm>
<para>
<productname>PostgreSQL</productname> provides various lock modes
2002-11-15 04:11:18 +01:00
to control concurrent access to data in tables. These modes can
be used for application-controlled locking in situations where
<acronym>MVCC</acronym> does not give the desired behavior. Also,
most <productname>PostgreSQL</productname> commands automatically
acquire locks of appropriate modes to ensure that referenced
tables are not dropped or modified in incompatible ways while the
command executes. (For example, <command>ALTER TABLE</> cannot be
executed concurrently with other operations on the same table.)
</para>
<para>
To examine a list of the currently outstanding locks in a database
server, use the <structname>pg_locks</structname> system view
(<xref linkend="view-pg-locks">). For more
information on monitoring the status of the lock manager
subsystem, refer to <xref linkend="monitoring">.
</para>
<sect2 id="locking-tables">
<title>Table-Level Locks</title>
2003-08-31 19:32:24 +02:00
<indexterm zone="locking-tables">
<primary>LOCK</primary>
</indexterm>
<para>
The list below shows the available lock modes and the contexts in
which they are used automatically by
<productname>PostgreSQL</productname>. You can also acquire any
of these locks explicitly with the command <xref linkend="sql-lock">.
Remember that all of these lock modes are table-level locks,
even if the name contains the word
<quote>row</quote>; the names of the lock modes are historical.
To some extent the names reflect the typical usage of each lock
mode &mdash; but the semantics are all the same. The only real difference
between one lock mode and another is the set of lock modes with
which each conflicts. Two transactions cannot hold locks of conflicting
modes on the same table at the same time. (However, a transaction
never conflicts with itself. For example, it may acquire
<literal>ACCESS EXCLUSIVE</literal> lock and later acquire
<literal>ACCESS SHARE</literal> lock on the same table.) Non-conflicting
lock modes may be held concurrently by many transactions. Notice in
particular that some lock modes are self-conflicting (for example,
an <literal>ACCESS EXCLUSIVE</literal> lock cannot be held by more than one
transaction at a time) while others are not self-conflicting (for example,
an <literal>ACCESS SHARE</literal> lock can be held by multiple transactions).
Once acquired, a lock is held till end of transaction.
</para>
<variablelist>
<title>Table-level lock modes</title>
<varlistentry>
<term>
<literal>ACCESS SHARE</literal>
</term>
<listitem>
<para>
Conflicts with the <literal>ACCESS EXCLUSIVE</literal> lock
mode only.
</para>
<para>
The commands <command>SELECT</command> and
<command>ANALYZE</command> acquire a lock of this mode on
referenced tables. In general, any query that only reads a table
and does not modify it will acquire this lock mode.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>ROW SHARE</literal>
</term>
<listitem>
<para>
Conflicts with the <literal>EXCLUSIVE</literal> and
<literal>ACCESS EXCLUSIVE</literal> lock modes.
</para>
<para>
The <command>SELECT FOR UPDATE</command> command acquires a
lock of this mode on the target table(s) (in addition to
<literal>ACCESS SHARE</literal> locks on any other tables
that are referenced but not selected <option>FOR UPDATE</option>).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>ROW EXCLUSIVE</literal>
</term>
<listitem>
<para>
Conflicts with the <literal>SHARE</literal>, <literal>SHARE ROW
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal> lock modes.
</para>
<para>
The commands <command>UPDATE</command>,
<command>DELETE</command>, and <command>INSERT</command>
acquire this lock mode on the target table (in addition to
<literal>ACCESS SHARE</literal> locks on any other referenced
tables). In general, this lock mode will be acquired by any
command that modifies the data in a table.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>SHARE UPDATE EXCLUSIVE</literal>
</term>
<listitem>
<para>
Conflicts with the <literal>SHARE UPDATE EXCLUSIVE</literal>,
<literal>SHARE</literal>, <literal>SHARE ROW
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal> lock modes.
This mode protects a table against
2002-09-21 20:32:54 +02:00
concurrent schema changes and <command>VACUUM</> runs.
</para>
<para>
Acquired by <command>VACUUM</command> (without <option>FULL</option>).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>SHARE</literal>
</term>
<listitem>
<para>
Conflicts with the <literal>ROW EXCLUSIVE</literal>,
<literal>SHARE UPDATE EXCLUSIVE</literal>, <literal>SHARE ROW
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal> lock modes.
This mode protects a table against concurrent data changes.
</para>
<para>
Acquired by <command>CREATE INDEX</command>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>SHARE ROW EXCLUSIVE</literal>
</term>
<listitem>
<para>
Conflicts with the <literal>ROW EXCLUSIVE</literal>,
<literal>SHARE UPDATE EXCLUSIVE</literal>,
<literal>SHARE</literal>, <literal>SHARE ROW
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal> lock modes.
</para>
<para>
This lock mode is not automatically acquired by any
<productname>PostgreSQL</productname> command.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>EXCLUSIVE</literal>
</term>
<listitem>
<para>
Conflicts with the <literal>ROW SHARE</literal>, <literal>ROW
EXCLUSIVE</literal>, <literal>SHARE UPDATE
EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal> lock modes.
This mode allows only concurrent <literal>ACCESS SHARE</literal> locks,
i.e., only reads from the table can proceed in parallel with a
transaction holding this lock mode.
</para>
<para>
This lock mode is not automatically acquired on user tables by any
<productname>PostgreSQL</productname> command. However it is
acquired on certain system catalogs in some operations.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>ACCESS EXCLUSIVE</literal>
</term>
<listitem>
<para>
Conflicts with locks of all modes (<literal>ACCESS
SHARE</literal>, <literal>ROW SHARE</literal>, <literal>ROW
EXCLUSIVE</literal>, <literal>SHARE UPDATE
EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal>).
This mode guarantees that the
holder is the only transaction accessing the table in any way.
</para>
<para>
Acquired by the <command>ALTER TABLE</command>, <command>DROP
TABLE</command>, <command>REINDEX</command>,
<command>CLUSTER</command>, and <command>VACUUM FULL</command>
commands. This is also the default lock mode for <command>LOCK
TABLE</command> statements that do not specify a mode explicitly.
</para>
</listitem>
</varlistentry>
</variablelist>
<tip>
<para>
Only an <literal>ACCESS EXCLUSIVE</literal> lock blocks a
<command>SELECT</command> (without <option>FOR UPDATE</option>)
statement.
</para>
</tip>
</sect2>
<sect2 id="locking-rows">
<title>Row-Level Locks</title>
<para>
In addition to table-level locks, there are row-level locks.
A row-level lock on a specific row is automatically acquired when the
row is updated (or deleted or marked for update). The lock is held
until the transaction commits or rolls back.
Row-level locks do not affect data
querying; they block <emphasis>writers to the same row</emphasis>
only. To acquire a row-level lock on a row without actually
modifying the row, select the row with <command>SELECT FOR
UPDATE</command>. Note that once a particular row-level lock is
acquired, the transaction may update the row multiple times without
fear of conflicts.
</para>
<para>
<productname>PostgreSQL</productname> doesn't remember any
information about modified rows in memory, so it has no limit to
the number of rows locked at one time. However, locking a row
may cause a disk write; thus, for example, <command>SELECT FOR
UPDATE</command> will modify selected rows to mark them and so
will result in disk writes.
</para>
<para>
In addition to table and row locks, page-level share/exclusive locks are
used to control read/write access to table pages in the shared buffer
pool. These locks are released immediately after a row is fetched or
updated. Application developers normally need not be concerned with
page-level locks, but we mention them for completeness.
</para>
</sect2>
<sect2 id="locking-deadlocks">
<title>Deadlocks</title>
2003-08-31 19:32:24 +02:00
<indexterm zone="locking-deadlocks">
<primary>deadlock</primary>
</indexterm>
<para>
The use of explicit locking can increase the likelihood of
<firstterm>deadlocks</>, wherein two (or more) transactions each
hold locks that the other wants. For example, if transaction 1
acquires an exclusive lock on table A and then tries to acquire
an exclusive lock on table B, while transaction 2 has already
exclusive-locked table B and now wants an exclusive lock on table
A, then neither one can proceed.
<productname>PostgreSQL</productname> automatically detects
deadlock situations and resolves them by aborting one of the
transactions involved, allowing the other(s) to complete.
(Exactly which transaction will be aborted is difficult to
predict and should not be relied on.)
</para>
<para>
Note that deadlocks can also occur as the result of row-level
locks (and thus, they can occur even if explicit locking is not
used). Consider the case in which there are two concurrent
transactions modifying a table. The first transaction executes:
<screen>
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 11111;
</screen>
This acquires a row-level lock on the row with the specified
account number. Then, the second transaction executes:
<screen>
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 22222;
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 11111;
</screen>
The first <command>UPDATE</command> statement successfully
acquires a row-level lock on the specified row, so it succeeds in
updating that row. However, the second <command>UPDATE</command>
statement finds that the row it is attempting to update has
already been locked, so it waits for the transaction that
acquired the lock to complete. Transaction two is now waiting on
transaction one to complete before it continues execution. Now,
transaction one executes:
<screen>
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
</screen>
Transaction one attempts to acquire a row-level lock on the
specified row, but it cannot: transaction two already holds such
a lock. So it waits for transaction two to complete. Thus,
transaction one is blocked on transaction two, and transaction
two is blocked on transaction one: a deadlock
condition. <productname>PostgreSQL</productname> will detect this
situation and abort one of the transactions.
</para>
<para>
The best defense against deadlocks is generally to avoid them by
being certain that all applications using a database acquire
locks on multiple objects in a consistent order. In the example
above, if both transactions
had updated the rows in the same order, no deadlock would have
occurred. One should also ensure that the first lock acquired on
an object in a transaction is the highest mode that will be
needed for that object. If it is not feasible to verify this in
advance, then deadlocks may be handled on-the-fly by retrying
transactions that are aborted due to deadlock.
</para>
<para>
So long as no deadlock situation is detected, a transaction seeking
either a table-level or row-level lock will wait indefinitely for
conflicting locks to be released. This means it is a bad idea for
applications to hold transactions open for long periods of time
(e.g., while waiting for user input).
</para>
</sect2>
</sect1>
<sect1 id="applevel-consistency">
<title>Data Consistency Checks at the Application Level</title>
<para>
Because readers in <productname>PostgreSQL</productname>
do not lock data, regardless of
transaction isolation level, data read by one transaction can be
overwritten by another concurrent transaction. In other words,
if a row is returned by <command>SELECT</command> it doesn't mean that
the row is still current at the instant it is returned (i.e., sometime
after the current query began). The row might have been modified or
deleted by an already-committed transaction that committed after this one
started.
Even if the row is still valid <quote>now</quote>, it could be changed or
deleted
before the current transaction does a commit or rollback.
</para>
<para>
Another way to think about it is that each
transaction sees a snapshot of the database contents, and concurrently
executing transactions may very well see different snapshots. So the
whole concept of <quote>now</quote> is somewhat ill-defined anyway.
This is not normally
a big problem if the client applications are isolated from each other,
but if the clients can communicate via channels outside the database
then serious confusion may ensue.
</para>
<para>
To ensure the current validity of a row and protect it against
2003-11-04 10:55:39 +01:00
concurrent updates one must use <command>SELECT FOR
UPDATE</command> or an appropriate <command>LOCK TABLE</command>
statement. (<command>SELECT FOR UPDATE</command> locks just the
returned rows against concurrent updates, while <command>LOCK
TABLE</command> locks the whole table.) This should be taken into
account when porting applications to
<productname>PostgreSQL</productname> from other environments.
2003-11-04 10:55:39 +01:00
(Before version 6.5 <productname>PostgreSQL</productname> used
read locks, and so this above consideration is also relevant when
upgrading from <productname>PostgreSQL</productname> versions
prior to 6.5.)
</para>
<para>
Global validity checks require extra thought under <acronym>MVCC</acronym>.
For example, a banking application might wish to check that the sum of
all credits in one table equals the sum of debits in another table,
when both tables are being actively updated. Comparing the results of two
successive <literal>SELECT sum(...)</literal> commands will not work reliably under
Read Committed mode, since the second query will likely include the results
of transactions not counted by the first. Doing the two sums in a
single serializable transaction will give an accurate picture of the
effects of transactions that committed before the serializable transaction
started &mdash; but one might legitimately wonder whether the answer is still
relevant by the time it is delivered. If the serializable transaction
itself applied some changes before trying to make the consistency check,
the usefulness of the check becomes even more debatable, since now it
includes some but not all post-transaction-start changes. In such cases
a careful person might wish to lock all tables needed for the check,
in order to get an indisputable picture of current reality. A
<literal>SHARE</> mode (or higher) lock guarantees that there are no
uncommitted changes in the locked table, other than those of the current
transaction.
</para>
<para>
Note also that if one is
relying on explicit locking to prevent concurrent changes, one should use
Read Committed mode, or in Serializable mode be careful to obtain the
lock(s) before performing queries. A lock obtained by a
serializable transaction guarantees that no other transactions modifying
the table are still running, but if the snapshot seen by the
transaction predates obtaining the lock, it may predate some now-committed
changes in the table. A serializable transaction's snapshot is actually
frozen at the start of its first query or data-modification command
(<literal>SELECT</>, <literal>INSERT</>,
<literal>UPDATE</>, or <literal>DELETE</>), so
it's possible to obtain locks explicitly before the snapshot is
frozen.
2000-10-11 19:38:36 +02:00
</para>
</sect1>
<sect1 id="locking-indexes">
<title>Locking and Indexes</title>
2003-08-31 19:32:24 +02:00
<indexterm zone="locking-indexes">
<primary>index</primary>
<secondary>locks</secondary>
</indexterm>
<para>
Though <productname>PostgreSQL</productname>
2000-10-11 19:38:36 +02:00
provides nonblocking read/write access to table
data, nonblocking read/write access is not currently offered for every
index access method implemented
in <productname>PostgreSQL</productname>.
The various index types are handled as follows:
<variablelist>
<varlistentry>
<term>
B-tree indexes
</term>
<listitem>
<para>
Short-term share/exclusive page-level locks are used for
read/write access. Locks are released immediately after each
index row is fetched or inserted. B-tree indexes provide
the highest concurrency without deadlock conditions.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<acronym>GiST</acronym> and R-tree indexes
</term>
<listitem>
<para>
Share/exclusive index-level locks are used for read/write access.
Locks are released after the command is done.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
Hash indexes
</term>
<listitem>
<para>
Share/exclusive hash-bucket-level locks are used for read/write
access. Locks are released after the whole bucket is processed.
Bucket-level locks provide better concurrency than index-level
ones, but deadlock is possible since the locks are held longer
than one index operation.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<para>
In short, B-tree indexes offer the best performance for concurrent
applications; since they also have more features than hash
indexes, they are the recommended index type for concurrent
applications that need to index scalar data. When dealing with
non-scalar data, B-trees obviously cannot be used; in that
situation, application developers should be aware of the
relatively poor concurrent performance of GiST and R-tree
indexes.
2003-01-11 01:00:03 +01:00
</para>
</sect1>
</chapter>
<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->