/*-------------------------------------------------------------------------
 *
 * lockfuncs.c
 *		Functions for SQL access to various lock-manager capabilities.
 *
 * Copyright (c) 2002-2019, PostgreSQL Global Development Group
 *
 * IDENTIFICATION
 *		src/backend/utils/adt/lockfuncs.c
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"

#include "access/htup_details.h"
Create an infrastructure for parallel computation in PostgreSQL.
This does four basic things. First, it provides convenience routines
to coordinate the startup and shutdown of parallel workers. Second,
it synchronizes various pieces of state (e.g. GUCs, combo CID
mappings, transaction snapshot) from the parallel group leader to the
worker processes. Third, it prohibits various operations that would
result in unsafe changes to that state while parallelism is active.
Finally, it propagates events that would result in an ErrorResponse,
NoticeResponse, or NotifyResponse message being sent to the client
from the parallel workers back to the master, from which they can then
be sent on to the client.
Robert Haas, Amit Kapila, Noah Misch, Rushabh Lathia, Jeevan Chalke.
Suggestions and review from Andres Freund, Heikki Linnakangas, Noah
Misch, Simon Riggs, Euler Taveira, and Jim Nasby.
2015-04-30 21:02:14 +02:00
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
Implement genuine serializable isolation level.
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, i.e. it sometimes aborts transactions even
though there is no anomaly.
To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation-level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.
A predicate lock doesn't conflict with any regular locks or with other
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead for
other transactions.
Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with the committing one have completed.
That means that we need to remember an unbounded number of predicate locks,
so we apply a lossy but conservative method of tracking locks for committed
transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.
We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.
Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.
Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
2011-02-07 22:46:51 +01:00
#include "storage/predicate_internals.h"
Create a function to reliably identify which sessions block which others.
This patch introduces "pg_blocking_pids(int) returns int[]", which returns
the PIDs of any sessions that are blocking the session with the given PID.
Historically people have obtained such information using a self-join on
the pg_locks view, but it's unreasonably tedious to do it that way with any
modicum of correctness, and the addition of parallel queries has pretty
much broken that approach altogether. (Given some more columns in the view
than there are today, you could imagine handling parallel-query cases with
a 4-way join; but ugh.)
The new function has the following behaviors that are painful or impossible
to get right via pg_locks:
1. Correctly understands which lock modes block which other ones.
2. In soft-block situations (two processes both waiting for conflicting lock
modes), only the one that's in front in the wait queue is reported to
block the other.
3. In parallel-query cases, reports all sessions blocking any member of
the given PID's lock group, and reports a session by naming its leader
process's PID, which will be the pg_backend_pid() value visible to
clients.
The motivation for doing this right now is mostly to fix the isolation
tests. Commit 38f8bdcac4982215beb9f65a19debecaf22fd470 lobotomized
isolationtester's is-it-waiting query by removing its ability to recognize
nonconflicting lock modes, as a crude workaround for the inability to
handle soft-block situations properly. But even without the lock mode
tests, the old query was excessively slow, particularly in
CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new
deadlock-hard test because the deadlock timeout elapses before they can
probe the waiting status of all eight sessions. Replacing the pg_locks
self-join with use of pg_blocking_pids() is not only much more correct, but
a lot faster: I measure it at about 9X faster in a typical dev build with
Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds. That should provide
enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the
test, without having to lengthen deadlock_timeout yet more and thus slow
down the test for everyone else.
2016-02-22 20:31:43 +01:00
#include "utils/array.h"
#include "utils/builtins.h"


/* This must match enum LockTagType! */
const char *const LockTagTypeNames[] = {
	"relation",
	"extend",
	"page",
	"tuple",
	"transactionid",
	"virtualxid",
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows specifying an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using an
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint. DO NOTHING avoids the
constraint violation, without touching the pre-existing row. DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed. The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
This feature is often referred to as upsert.
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert. If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made. If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
Bumps catversion as stored rules change.
Author: Peter Geoghegan, with significant contributions from Heikki
Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:31:36 +02:00
	"speculative token",
	"object",
	"userlock",
	"advisory"
};
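The "must match enum" comment on the table above describes an invariant that is easy to break when a new tag type is added. The following is a minimal sketch of one way to guard such a table, assuming a C11 compiler; `DemoLockTagType`, `DemoLockTagTypeNames`, and `demo_locktag_name` are hypothetical stand-ins and not part of lockfuncs.c. The range check mirrors the `locktag_type <= LOCKTAG_LAST_TYPE` test that pg_lock_status performs before indexing the real table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for enum LockTagType; only three members shown. */
typedef enum DemoLockTagType
{
	DEMO_LOCKTAG_RELATION,
	DEMO_LOCKTAG_PAGE,
	DEMO_LOCKTAG_TUPLE
} DemoLockTagType;

#define DEMO_LOCKTAG_LAST_TYPE DEMO_LOCKTAG_TUPLE

/* Must stay in the same order as DemoLockTagType. */
static const char *const DemoLockTagTypeNames[] = {
	"relation",
	"page",
	"tuple"
};

/* Compile-time check that the table has one entry per enum member. */
static_assert(sizeof(DemoLockTagTypeNames) / sizeof(DemoLockTagTypeNames[0])
			  == (size_t) DEMO_LOCKTAG_LAST_TYPE + 1,
			  "name table out of sync with enum");

/* Return the name for a tag type, or NULL if the value is out of range. */
static const char *
demo_locktag_name(unsigned int tagtype)
{
	if (tagtype <= (unsigned int) DEMO_LOCKTAG_LAST_TYPE)
		return DemoLockTagTypeNames[tagtype];
	return NULL;
}
```

A caller would pass the raw tag value and fall back to some generic label when NULL comes back, so an unknown tag type never indexes past the end of the table.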

/* This must match enum PredicateLockTargetType (predicate_internals.h) */
static const char *const PredicateLockTagTypeNames[] = {
	"relation",
	"page",
	"tuple"
};

/* Working status for pg_lock_status */
typedef struct
{
	LockData *lockData;		/* state data from lmgr */
	int currIdx;			/* current PROCLOCK index */
	PredicateLockData *predLockData;	/* state data for pred locks */
	int predLockIdx;		/* current index for pred lock */
} PG_Lock_Status;

/* Number of columns in pg_locks output */
#define NUM_LOCK_STATUS_COLUMNS 15

/*
 * VXIDGetDatum - Construct a text representation of a VXID
 *
 * This is currently only used in pg_lock_status, so we put it here.
 */
static Datum
VXIDGetDatum(BackendId bid, LocalTransactionId lxid)
{
	/*
	 * The representation is "<bid>/<lxid>", decimal and unsigned decimal
	 * respectively. Note that elog.c also knows how to format a vxid.
	 */
	char vxidstr[32];

	snprintf(vxidstr, sizeof(vxidstr), "%d/%u", bid, lxid);

	return CStringGetTextDatum(vxidstr);
}
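The "<bid>/<lxid>" format used by VXIDGetDatum can be tried outside the backend. The sketch below assumes a 32-bit `int` and `unsigned int` as plain C stand-ins for BackendId and LocalTransactionId; `demo_format_vxid` is a hypothetical helper, not a PostgreSQL function. Under that assumption the longest possible output is 11 characters for the signed part, one slash, and 10 characters for the unsigned part, so the 32-byte buffer has room to spare.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a virtual transaction ID as "<bid>/<lxid>" into buf.
 * The parameter types are assumptions standing in for BackendId and
 * LocalTransactionId. Returns the number of characters written,
 * not counting the terminating NUL.
 */
static int
demo_format_vxid(char *buf, size_t buflen, int bid, unsigned int lxid)
{
	assert(buf != NULL && buflen > 0);

	/* snprintf always NUL-terminates when buflen is greater than zero */
	return snprintf(buf, buflen, "%d/%u", bid, lxid);
}
```

Passing `sizeof(buf)` rather than a literal length keeps the call correct if the buffer size is ever changed, which is the same choice the original function makes.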


/*
 * pg_lock_status - produce a view with one row per held or awaited lock mode
 */
Datum
pg_lock_status(PG_FUNCTION_ARGS)
{
	FuncCallContext *funcctx;
	PG_Lock_Status *mystatus;
	LockData *lockData;
	PredicateLockData *predLockData;

	if (SRF_IS_FIRSTCALL())
	{
		TupleDesc tupdesc;
		MemoryContext oldcontext;

		/* create a function context for cross-call persistence */
		funcctx = SRF_FIRSTCALL_INIT();

		/*
		 * switch to memory context appropriate for multiple function calls
		 */
		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

		/* build tupdesc for result tuples */
		/* this had better match function's declaration in pg_proc.h */
Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special-case code to support oid columns. That
was already painful for the existing code, but upcoming work aiming to make
table storage pluggable would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring a pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot of code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get this merged
now. It's painful to maintain externally, too complicated to commit
after the code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-21 00:36:57 +01:00
		tupdesc = CreateTemplateTupleDesc(NUM_LOCK_STATUS_COLUMNS);
		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "locktype",
						   TEXTOID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "database",
						   OIDOID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relation",
						   OIDOID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "page",
						   INT4OID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "tuple",
						   INT2OID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 6, "virtualxid",
						   TEXTOID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 7, "transactionid",
						   XIDOID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 8, "classid",
						   OIDOID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 9, "objid",
						   OIDOID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 10, "objsubid",
						   INT2OID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 11, "virtualtransaction",
						   TEXTOID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 12, "pid",
						   INT4OID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 13, "mode",
						   TEXTOID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 14, "granted",
						   BOOLOID, -1, 0);
		TupleDescInitEntry(tupdesc, (AttrNumber) 15, "fastpath",
						   BOOLOID, -1, 0);

		funcctx->tuple_desc = BlessTupleDesc(tupdesc);

		/*
		 * Collect all the locking information that we will format and send
		 * out as a result set.
		 */
		mystatus = (PG_Lock_Status *) palloc(sizeof(PG_Lock_Status));
		funcctx->user_fctx = (void *) mystatus;

		mystatus->lockData = GetLockStatusData();
		mystatus->currIdx = 0;
		mystatus->predLockData = GetPredicateLockStatusData();
		mystatus->predLockIdx = 0;

		MemoryContextSwitchTo(oldcontext);
	}

	funcctx = SRF_PERCALL_SETUP();
	mystatus = (PG_Lock_Status *) funcctx->user_fctx;
	lockData = mystatus->lockData;

	while (mystatus->currIdx < lockData->nelements)
	{
		bool granted;
		LOCKMODE mode = 0;
		const char *locktypename;
		char tnbuf[32];
		Datum values[NUM_LOCK_STATUS_COLUMNS];
		bool nulls[NUM_LOCK_STATUS_COLUMNS];
		HeapTuple tuple;
		Datum result;
		LockInstanceData *instance;

		instance = &(lockData->locks[mystatus->currIdx]);

		/*
		 * Look to see if there are any held lock modes in this PROCLOCK. If
		 * so, report, and destructively modify lockData so we don't report
		 * again.
		 */
		granted = false;
		if (instance->holdMask)
		{
			for (mode = 0; mode < MAX_LOCKMODES; mode++)
			{
				if (instance->holdMask & LOCKBIT_ON(mode))
				{
					granted = true;
					instance->holdMask &= LOCKBIT_OFF(mode);
					break;
				}
			}
		}
|
2002-08-31 19:14:28 +02:00
|
|
|
|
|
|
|
/*
|
2002-09-04 22:31:48 +02:00
|
|
|
* If no (more) held modes to report, see if PROC is waiting for a
|
|
|
|
* lock on this lock.
|
2002-08-31 19:14:28 +02:00
|
|
|
*/
|
|
|
|
if (!granted)
|
2002-08-17 15:11:43 +02:00
|
|
|
{
|
2011-05-29 01:52:00 +02:00
|
|
|
if (instance->waitLockMode != NoLock)
|
2002-08-31 19:14:28 +02:00
|
|
|
{
|
|
|
|
/* Yes, so report it with proper mode */
|
2011-05-29 01:52:00 +02:00
|
|
|
mode = instance->waitLockMode;
|
2002-09-04 22:31:48 +02:00
|
|
|
|
2002-08-31 19:14:28 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* We are now done with this PROCLOCK, so advance pointer to
|
|
|
|
* continue with next one on next call.
|
2002-08-31 19:14:28 +02:00
|
|
|
*/
|
|
|
|
mystatus->currIdx++;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* Okay, we've displayed all the locks associated with this
|
|
|
|
* PROCLOCK, proceed to the next one.
|
2002-08-31 19:14:28 +02:00
|
|
|
*/
|
|
|
|
mystatus->currIdx++;
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
}
|
2002-08-17 15:11:43 +02:00
|
|
|
|
2002-08-31 19:14:28 +02:00
|
|
|
/*
|
|
|
|
* Form tuple with appropriate data.
|
|
|
|
*/
|
|
|
|
MemSet(values, 0, sizeof(values));
|
2008-11-02 02:45:28 +01:00
|
|
|
MemSet(nulls, false, sizeof(nulls));
|
2002-08-31 19:14:28 +02:00
|
|
|
|
2011-05-29 01:52:00 +02:00
|
|
|
if (instance->locktag.locktag_type <= LOCKTAG_LAST_TYPE)
|
|
|
|
locktypename = LockTagTypeNames[instance->locktag.locktag_type];
|
2005-05-17 23:46:11 +02:00
|
|
|
else
|
|
|
|
{
|
|
|
|
snprintf(tnbuf, sizeof(tnbuf), "unknown %d",
|
2011-05-29 01:52:00 +02:00
|
|
|
(int) instance->locktag.locktag_type);
|
2005-05-17 23:46:11 +02:00
|
|
|
locktypename = tnbuf;
|
|
|
|
}
|
2008-03-25 23:42:46 +01:00
|
|
|
values[0] = CStringGetTextDatum(locktypename);
|
2005-05-17 23:46:11 +02:00
|
|
|
|
2011-05-29 01:52:00 +02:00
|
|
|
switch ((LockTagType) instance->locktag.locktag_type)
|
2002-08-31 19:14:28 +02:00
|
|
|
{
|
2005-04-30 00:28:24 +02:00
|
|
|
case LOCKTAG_RELATION:
|
|
|
|
case LOCKTAG_RELATION_EXTEND:
|
2011-05-29 01:52:00 +02:00
|
|
|
values[1] = ObjectIdGetDatum(instance->locktag.locktag_field1);
|
|
|
|
values[2] = ObjectIdGetDatum(instance->locktag.locktag_field2);
|
2008-11-02 02:45:28 +01:00
|
|
|
nulls[3] = true;
|
|
|
|
nulls[4] = true;
|
|
|
|
nulls[5] = true;
|
|
|
|
nulls[6] = true;
|
|
|
|
nulls[7] = true;
|
|
|
|
nulls[8] = true;
|
|
|
|
nulls[9] = true;
|
2005-05-17 23:46:11 +02:00
|
|
|
break;
|
2005-04-30 00:28:24 +02:00
|
|
|
case LOCKTAG_PAGE:
|
2011-05-29 01:52:00 +02:00
|
|
|
values[1] = ObjectIdGetDatum(instance->locktag.locktag_field1);
|
|
|
|
values[2] = ObjectIdGetDatum(instance->locktag.locktag_field2);
|
|
|
|
values[3] = UInt32GetDatum(instance->locktag.locktag_field3);
|
2008-11-02 02:45:28 +01:00
|
|
|
nulls[4] = true;
|
|
|
|
nulls[5] = true;
|
|
|
|
nulls[6] = true;
|
|
|
|
nulls[7] = true;
|
|
|
|
nulls[8] = true;
|
|
|
|
nulls[9] = true;
|
2005-05-17 23:46:11 +02:00
|
|
|
break;
|
2005-04-30 00:28:24 +02:00
|
|
|
case LOCKTAG_TUPLE:
|
2011-05-29 01:52:00 +02:00
|
|
|
values[1] = ObjectIdGetDatum(instance->locktag.locktag_field1);
|
|
|
|
values[2] = ObjectIdGetDatum(instance->locktag.locktag_field2);
|
|
|
|
values[3] = UInt32GetDatum(instance->locktag.locktag_field3);
|
|
|
|
values[4] = UInt16GetDatum(instance->locktag.locktag_field4);
|
2008-11-02 02:45:28 +01:00
|
|
|
nulls[5] = true;
|
|
|
|
nulls[6] = true;
|
|
|
|
nulls[7] = true;
|
|
|
|
nulls[8] = true;
|
|
|
|
nulls[9] = true;
|
2005-04-30 00:28:24 +02:00
|
|
|
break;
|
|
|
|
case LOCKTAG_TRANSACTION:
|
2011-05-29 01:52:00 +02:00
|
|
|
values[6] =
|
|
|
|
TransactionIdGetDatum(instance->locktag.locktag_field1);
|
2008-11-02 02:45:28 +01:00
|
|
|
nulls[1] = true;
|
|
|
|
nulls[2] = true;
|
|
|
|
nulls[3] = true;
|
|
|
|
nulls[4] = true;
|
|
|
|
nulls[5] = true;
|
|
|
|
nulls[7] = true;
|
|
|
|
nulls[8] = true;
|
|
|
|
nulls[9] = true;
|
2007-09-05 20:10:48 +02:00
|
|
|
break;
|
|
|
|
case LOCKTAG_VIRTUALTRANSACTION:
|
2011-05-29 01:52:00 +02:00
|
|
|
values[5] = VXIDGetDatum(instance->locktag.locktag_field1,
|
|
|
|
instance->locktag.locktag_field2);
|
2008-11-02 02:45:28 +01:00
|
|
|
nulls[1] = true;
|
|
|
|
nulls[2] = true;
|
|
|
|
nulls[3] = true;
|
|
|
|
nulls[4] = true;
|
|
|
|
nulls[6] = true;
|
|
|
|
nulls[7] = true;
|
|
|
|
nulls[8] = true;
|
|
|
|
nulls[9] = true;
|
2005-05-17 23:46:11 +02:00
|
|
|
break;
|
|
|
|
case LOCKTAG_OBJECT:
|
|
|
|
case LOCKTAG_USERLOCK:
|
2006-09-23 01:20:14 +02:00
|
|
|
case LOCKTAG_ADVISORY:
|
2005-05-17 23:46:11 +02:00
|
|
|
default: /* treat unknown locktags like OBJECT */
|
2011-05-29 01:52:00 +02:00
|
|
|
values[1] = ObjectIdGetDatum(instance->locktag.locktag_field1);
|
|
|
|
values[7] = ObjectIdGetDatum(instance->locktag.locktag_field2);
|
|
|
|
values[8] = ObjectIdGetDatum(instance->locktag.locktag_field3);
|
|
|
|
values[9] = Int16GetDatum(instance->locktag.locktag_field4);
|
2008-11-02 02:45:28 +01:00
|
|
|
nulls[2] = true;
|
|
|
|
nulls[3] = true;
|
|
|
|
nulls[4] = true;
|
|
|
|
nulls[5] = true;
|
|
|
|
nulls[6] = true;
|
2005-04-30 00:28:24 +02:00
|
|
|
break;
|
2002-08-17 15:11:43 +02:00
|
|
|
}
|
|
|
|
|
2011-05-29 01:52:00 +02:00
|
|
|
values[10] = VXIDGetDatum(instance->backend, instance->lxid);
|
|
|
|
if (instance->pid != 0)
|
|
|
|
values[11] = Int32GetDatum(instance->pid);
|
2005-06-18 21:33:42 +02:00
|
|
|
else
|
2008-11-02 02:45:28 +01:00
|
|
|
nulls[11] = true;
|
2011-05-29 01:52:00 +02:00
|
|
|
values[12] = CStringGetTextDatum(GetLockmodeName(instance->locktag.locktag_lockmethodid, mode));
|
2007-09-05 20:10:48 +02:00
|
|
|
values[13] = BoolGetDatum(granted);
|
2011-05-29 01:52:00 +02:00
|
|
|
values[14] = BoolGetDatum(instance->fastpath);
|
2002-08-17 15:11:43 +02:00
|
|
|
|
2008-11-02 02:45:28 +01:00
|
|
|
tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
|
2004-04-01 23:28:47 +02:00
|
|
|
result = HeapTupleGetDatum(tuple);
|
2002-08-29 19:14:33 +02:00
|
|
|
SRF_RETURN_NEXT(funcctx, result);
|
2002-08-17 15:11:43 +02:00
|
|
|
}
|
|
|
|
|
2011-02-07 22:46:51 +01:00
|
|
|
/*
|
|
|
|
* Have returned all regular locks. Now start on the SIREAD predicate
|
|
|
|
* locks.
|
|
|
|
*/
|
|
|
|
predLockData = mystatus->predLockData;
|
|
|
|
if (mystatus->predLockIdx < predLockData->nelements)
|
|
|
|
{
|
|
|
|
PredicateLockTargetType lockType;
|
|
|
|
|
|
|
|
PREDICATELOCKTARGETTAG *predTag = &(predLockData->locktags[mystatus->predLockIdx]);
|
|
|
|
SERIALIZABLEXACT *xact = &(predLockData->xacts[mystatus->predLockIdx]);
|
2011-05-29 01:52:00 +02:00
|
|
|
Datum values[NUM_LOCK_STATUS_COLUMNS];
|
|
|
|
bool nulls[NUM_LOCK_STATUS_COLUMNS];
|
2011-02-07 22:46:51 +01:00
|
|
|
HeapTuple tuple;
|
|
|
|
Datum result;
|
|
|
|
|
|
|
|
mystatus->predLockIdx++;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Form tuple with appropriate data.
|
|
|
|
*/
|
|
|
|
MemSet(values, 0, sizeof(values));
|
|
|
|
MemSet(nulls, false, sizeof(nulls));
|
|
|
|
|
|
|
|
/* lock type */
|
|
|
|
lockType = GET_PREDICATELOCKTARGETTAG_TYPE(*predTag);
|
|
|
|
|
|
|
|
values[0] = CStringGetTextDatum(PredicateLockTagTypeNames[lockType]);
|
|
|
|
|
|
|
|
/* lock target */
|
|
|
|
values[1] = GET_PREDICATELOCKTARGETTAG_DB(*predTag);
|
|
|
|
values[2] = GET_PREDICATELOCKTARGETTAG_RELATION(*predTag);
|
|
|
|
if (lockType == PREDLOCKTAG_TUPLE)
|
|
|
|
values[4] = GET_PREDICATELOCKTARGETTAG_OFFSET(*predTag);
|
|
|
|
else
|
|
|
|
nulls[4] = true;
|
|
|
|
if ((lockType == PREDLOCKTAG_TUPLE) ||
|
|
|
|
(lockType == PREDLOCKTAG_PAGE))
|
|
|
|
values[3] = GET_PREDICATELOCKTARGETTAG_PAGE(*predTag);
|
|
|
|
else
|
|
|
|
nulls[3] = true;
|
|
|
|
|
|
|
|
/* these fields are targets for other types of locks */
|
|
|
|
nulls[5] = true; /* virtualxid */
|
|
|
|
nulls[6] = true; /* transactionid */
|
|
|
|
nulls[7] = true; /* classid */
|
|
|
|
nulls[8] = true; /* objid */
|
|
|
|
nulls[9] = true; /* objsubid */
|
|
|
|
|
|
|
|
/* lock holder */
|
|
|
|
values[10] = VXIDGetDatum(xact->vxid.backendId,
|
|
|
|
xact->vxid.localTransactionId);
|
2011-04-04 19:20:18 +02:00
|
|
|
if (xact->pid != 0)
|
|
|
|
values[11] = Int32GetDatum(xact->pid);
|
|
|
|
else
|
|
|
|
nulls[11] = true;
|
2011-02-07 22:46:51 +01:00
|
|
|
|
|
|
|
/*
|
2012-06-10 21:20:04 +02:00
|
|
|
* Lock mode. Currently all predicate locks are SIReadLocks, which are
|
|
|
|
* always held (never waiting) and have no fast path
|
2011-02-07 22:46:51 +01:00
|
|
|
*/
|
|
|
|
values[12] = CStringGetTextDatum("SIReadLock");
|
|
|
|
values[13] = BoolGetDatum(true);
|
2011-05-29 01:52:00 +02:00
|
|
|
values[14] = BoolGetDatum(false);
|
2011-02-07 22:46:51 +01:00
|
|
|
|
|
|
|
tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
|
|
|
|
result = HeapTupleGetDatum(tuple);
|
|
|
|
SRF_RETURN_NEXT(funcctx, result);
|
|
|
|
}
|
|
|
|
|
2002-08-29 19:14:33 +02:00
|
|
|
SRF_RETURN_DONE(funcctx);
|
2002-08-17 15:11:43 +02:00
|
|
|
}
|
2006-09-19 00:40:40 +02:00
|
|
|
|
|
|
|
|
Create a function to reliably identify which sessions block which others.
This patch introduces "pg_blocking_pids(int) returns int[]", which returns
the PIDs of any sessions that are blocking the session with the given PID.
Historically people have obtained such information using a self-join on
the pg_locks view, but it's unreasonably tedious to do it that way with any
modicum of correctness, and the addition of parallel queries has pretty
much broken that approach altogether. (Given some more columns in the view
than there are today, you could imagine handling parallel-query cases with
a 4-way join; but ugh.)
The new function has the following behaviors that are painful or impossible
to get right via pg_locks:
1. Correctly understands which lock modes block which other ones.
2. In soft-block situations (two processes both waiting for conflicting lock
modes), only the one that's in front in the wait queue is reported to
block the other.
3. In parallel-query cases, reports all sessions blocking any member of
the given PID's lock group, and reports a session by naming its leader
process's PID, which will be the pg_backend_pid() value visible to
clients.
The motivation for doing this right now is mostly to fix the isolation
tests. Commit 38f8bdcac4982215beb9f65a19debecaf22fd470 lobotomized
isolationtester's is-it-waiting query by removing its ability to recognize
nonconflicting lock modes, as a crude workaround for the inability to
handle soft-block situations properly. But even without the lock mode
tests, the old query was excessively slow, particularly in
CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new
deadlock-hard test because the deadlock timeout elapses before they can
probe the waiting status of all eight sessions. Replacing the pg_locks
self-join with use of pg_blocking_pids() is not only much more correct, but
a lot faster: I measure it at about 9X faster in a typical dev build with
Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds. That should provide
enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the
test, without having to lengthen deadlock_timeout yet more and thus slow
down the test for everyone else.
2016-02-22 20:31:43 +01:00
|
|
|
/*
|
|
|
|
* pg_blocking_pids - produce an array of the PIDs blocking given PID
|
|
|
|
*
|
|
|
|
* The reported PIDs are those that hold a lock conflicting with blocked_pid's
|
|
|
|
* current request (hard block), or are requesting such a lock and are ahead
|
|
|
|
* of blocked_pid in the lock's wait queue (soft block).
|
|
|
|
*
|
|
|
|
* In parallel-query cases, we report all PIDs blocking any member of the
|
|
|
|
* given PID's lock group, and the reported PIDs are those of the blocking
|
|
|
|
* PIDs' lock group leaders. This allows callers to compare the result to
|
|
|
|
* lists of clients' pg_backend_pid() results even during a parallel query.
|
|
|
|
*
|
|
|
|
* Parallel query makes it possible for there to be duplicate PIDs in the
|
|
|
|
* result (either because multiple waiters are blocked by same PID, or
|
|
|
|
* because multiple blockers have same group leader PID). We do not bother
|
|
|
|
* to eliminate such duplicates from the result.
|
|
|
|
*
|
|
|
|
* We need not consider predicate locks here, since those don't block anything.
|
|
|
|
*/
|
|
|
|
Datum
|
|
|
|
pg_blocking_pids(PG_FUNCTION_ARGS)
|
|
|
|
{
|
|
|
|
int blocked_pid = PG_GETARG_INT32(0);
|
|
|
|
Datum *arrayelems;
|
|
|
|
int narrayelems;
|
|
|
|
BlockedProcsData *lockData; /* state data from lmgr */
|
|
|
|
int i,
|
|
|
|
j;
|
|
|
|
|
|
|
|
/* Collect a snapshot of lock manager state */
|
|
|
|
lockData = GetBlockerStatusData(blocked_pid);
|
|
|
|
|
|
|
|
/* We can't need more output entries than there are reported PROCLOCKs */
|
|
|
|
arrayelems = (Datum *) palloc(lockData->nlocks * sizeof(Datum));
|
|
|
|
narrayelems = 0;
|
|
|
|
|
|
|
|
/* For each blocked proc in the lock group ... */
|
|
|
|
for (i = 0; i < lockData->nprocs; i++)
|
|
|
|
{
|
|
|
|
BlockedProcData *bproc = &lockData->procs[i];
|
|
|
|
LockInstanceData *instances = &lockData->locks[bproc->first_lock];
|
|
|
|
int *preceding_waiters = &lockData->waiter_pids[bproc->first_waiter];
|
|
|
|
LockInstanceData *blocked_instance;
|
|
|
|
LockMethod lockMethodTable;
|
|
|
|
int conflictMask;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Locate the blocked proc's own entry in the LockInstanceData array.
|
|
|
|
* There should be exactly one matching entry.
|
|
|
|
*/
|
|
|
|
blocked_instance = NULL;
|
|
|
|
for (j = 0; j < bproc->num_locks; j++)
|
|
|
|
{
|
|
|
|
LockInstanceData *instance = &(instances[j]);
|
|
|
|
|
|
|
|
if (instance->pid == bproc->pid)
|
|
|
|
{
|
|
|
|
Assert(blocked_instance == NULL);
|
|
|
|
blocked_instance = instance;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
Assert(blocked_instance != NULL);
|
|
|
|
|
|
|
|
lockMethodTable = GetLockTagsMethodTable(&(blocked_instance->locktag));
|
|
|
|
conflictMask = lockMethodTable->conflictTab[blocked_instance->waitLockMode];
|
|
|
|
|
|
|
|
/* Now scan the PROCLOCK data for conflicting procs */
|
|
|
|
for (j = 0; j < bproc->num_locks; j++)
|
|
|
|
{
|
|
|
|
LockInstanceData *instance = &(instances[j]);
|
|
|
|
|
|
|
|
/* A proc never blocks itself, so ignore that entry */
|
|
|
|
if (instance == blocked_instance)
|
|
|
|
continue;
|
|
|
|
/* Members of same lock group never block each other, either */
|
|
|
|
if (instance->leaderPid == blocked_instance->leaderPid)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
if (conflictMask & instance->holdMask)
|
|
|
|
{
|
|
|
|
/* hard block: blocked by lock already held by this entry */
|
|
|
|
}
|
|
|
|
else if (instance->waitLockMode != NoLock &&
|
|
|
|
(conflictMask & LOCKBIT_ON(instance->waitLockMode)))
|
|
|
|
{
|
|
|
|
/* conflict in lock requests; who's in front in wait queue? */
|
|
|
|
bool ahead = false;
|
|
|
|
int k;
|
|
|
|
|
|
|
|
for (k = 0; k < bproc->num_waiters; k++)
|
|
|
|
{
|
|
|
|
if (preceding_waiters[k] == instance->pid)
|
|
|
|
{
|
|
|
|
/* soft block: this entry is ahead of blocked proc */
|
|
|
|
ahead = true;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (!ahead)
|
|
|
|
continue; /* not blocked by this entry */
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/* not blocked by this entry */
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* blocked by this entry, so emit a record */
|
|
|
|
arrayelems[narrayelems++] = Int32GetDatum(instance->leaderPid);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Assert we didn't overrun arrayelems[] */
|
|
|
|
Assert(narrayelems <= lockData->nlocks);
|
|
|
|
|
|
|
|
/* Construct array, using hardwired knowledge about int4 type */
|
|
|
|
PG_RETURN_ARRAYTYPE_P(construct_array(arrayelems, narrayelems,
|
|
|
|
INT4OID,
|
|
|
|
sizeof(int32), true, 'i'));
|
|
|
|
}
|
|
|
|
|
|
|
|
|
2017-04-10 16:26:54 +02:00
|
|
|
/*
|
|
|
|
* pg_safe_snapshot_blocking_pids - produce an array of the PIDs blocking
|
|
|
|
* given PID from getting a safe snapshot
|
|
|
|
*
|
|
|
|
* XXX this does not consider parallel-query cases; not clear how big a
|
|
|
|
* problem that is in practice
|
|
|
|
*/
|
|
|
|
Datum
|
|
|
|
pg_safe_snapshot_blocking_pids(PG_FUNCTION_ARGS)
|
|
|
|
{
|
|
|
|
int blocked_pid = PG_GETARG_INT32(0);
|
|
|
|
int *blockers;
|
|
|
|
int num_blockers;
|
|
|
|
Datum *blocker_datums;
|
|
|
|
|
|
|
|
/* A buffer big enough for any possible blocker list without truncation */
|
|
|
|
blockers = (int *) palloc(MaxBackends * sizeof(int));
|
|
|
|
|
|
|
|
/* Collect a snapshot of processes waited for by GetSafeSnapshot */
|
|
|
|
num_blockers =
|
|
|
|
GetSafeSnapshotBlockingPids(blocked_pid, blockers, MaxBackends);
|
|
|
|
|
|
|
|
/* Convert int array to Datum array */
|
|
|
|
if (num_blockers > 0)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
blocker_datums = (Datum *) palloc(num_blockers * sizeof(Datum));
|
|
|
|
for (i = 0; i < num_blockers; ++i)
|
|
|
|
blocker_datums[i] = Int32GetDatum(blockers[i]);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
blocker_datums = NULL;
|
|
|
|
|
|
|
|
/* Construct array, using hardwired knowledge about int4 type */
|
|
|
|
PG_RETURN_ARRAYTYPE_P(construct_array(blocker_datums, num_blockers,
|
|
|
|
INT4OID,
|
|
|
|
sizeof(int32), true, 'i'));
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
* pg_isolation_test_session_is_blocked - support function for isolationtester
|
|
|
|
*
|
|
|
|
* Check if specified PID is blocked by any of the PIDs listed in the second
|
|
|
|
* argument. Currently, this looks for blocking caused by waiting for
|
|
|
|
 * heavyweight locks or safe snapshots.  We ignore blockage caused by PIDs
 * not directly under the isolationtester's control, e.g., autovacuum.
 *
 * This is an undocumented function intended for use by the isolation tester,
 * and may change in future releases as required for testing purposes.
 */
Datum
pg_isolation_test_session_is_blocked(PG_FUNCTION_ARGS)
{
	int			blocked_pid = PG_GETARG_INT32(0);
	ArrayType  *interesting_pids_a = PG_GETARG_ARRAYTYPE_P(1);
	ArrayType  *blocking_pids_a;
	int32	   *interesting_pids;
	int32	   *blocking_pids;
	int			num_interesting_pids;
	int			num_blocking_pids;
	int			dummy;
	int			i,
				j;

	/* Validate the passed-in array */
	Assert(ARR_ELEMTYPE(interesting_pids_a) == INT4OID);
	if (array_contains_nulls(interesting_pids_a))
		elog(ERROR, "array must not contain nulls");
	interesting_pids = (int32 *) ARR_DATA_PTR(interesting_pids_a);
	num_interesting_pids = ArrayGetNItems(ARR_NDIM(interesting_pids_a),
										  ARR_DIMS(interesting_pids_a));

	/*
	 * Get the PIDs of all sessions blocking the given session's attempt to
	 * acquire heavyweight locks.
	 */
	blocking_pids_a =
		DatumGetArrayTypeP(DirectFunctionCall1(pg_blocking_pids, blocked_pid));

	Assert(ARR_ELEMTYPE(blocking_pids_a) == INT4OID);
	Assert(!array_contains_nulls(blocking_pids_a));
	blocking_pids = (int32 *) ARR_DATA_PTR(blocking_pids_a);
	num_blocking_pids = ArrayGetNItems(ARR_NDIM(blocking_pids_a),
									   ARR_DIMS(blocking_pids_a));

	/*
	 * Check if any of these are in the list of interesting PIDs, that being
	 * the sessions that the isolation tester is running.  We don't use
	 * "arrayoverlaps" here, because it would lead to cache lookups and one of
	 * our goals is to run quickly under CLOBBER_CACHE_ALWAYS.  We expect
	 * blocking_pids to be usually empty and otherwise a very small number in
	 * isolation tester cases, so make that the outer loop of a naive search
	 * for a match.
	 */
	for (i = 0; i < num_blocking_pids; i++)
		for (j = 0; j < num_interesting_pids; j++)
		{
			if (blocking_pids[i] == interesting_pids[j])
				PG_RETURN_BOOL(true);
		}

	/*
	 * Check if blocked_pid is waiting for a safe snapshot.  We could in
	 * theory check the resulting array of blocker PIDs against the list of
	 * interesting PIDs, but since there is no danger of autovacuum blocking
	 * GetSafeSnapshot there seems to be no point in expending cycles on
	 * allocating a buffer and searching for overlap; so it's presently
	 * sufficient for the isolation tester's purposes to use a single element
	 * buffer and check if the number of safe snapshot blockers is non-zero.
	 */
	if (GetSafeSnapshotBlockingPids(blocked_pid, &dummy, 1) > 0)
		PG_RETURN_BOOL(true);

	PG_RETURN_BOOL(false);
}
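
/*
 * Illustrative sketch, not part of the original file: the naive overlap
 * search described above, reduced to plain C arrays so it can be compiled
 * and tried outside the backend.  The helper name demo_int32_arrays_overlap
 * and its signature are hypothetical; the real function works on ArrayType
 * data obtained through ARR_DATA_PTR.  The block is guarded by "#if 0" so
 * that it never takes part in a backend build.
 */
#if 0
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true if any element of outer[] also occurs
 * in inner[].  The array expected to be empty or tiny goes in the outer
 * loop, so the common "nothing is blocking" case does no comparisons.
 */
static bool
demo_int32_arrays_overlap(const int32_t *outer, size_t n_outer,
						  const int32_t *inner, size_t n_inner)
{
	/* A null pointer is acceptable only for an empty array. */
	assert(outer != NULL || n_outer == 0);
	assert(inner != NULL || n_inner == 0);

	for (size_t i = 0; i < n_outer; i++)
		for (size_t j = 0; j < n_inner; j++)
		{
			if (outer[i] == inner[j])
				return true;
		}

	return false;
}
```
#endif
/*
 * With blocking PIDs {4242} and interesting PIDs {100, 4242, 300} the
 * helper reports an overlap; with an empty blocking list it returns false
 * without touching the inner array.  The cost is O(n_outer * n_inner),
 * which is acceptable only because both lists are expected to be short.
 */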


/*
 * Functions for manipulating advisory locks
 *
 * We make use of the locktag fields as follows:
 *
 *	field1: MyDatabaseId ... ensures locks are local to each database
 *	field2: first of 2 int4 keys, or high-order half of an int8 key
 *	field3: second of 2 int4 keys, or low-order half of an int8 key
 *	field4: 1 if using an int8 key, 2 if using 2 int4 keys
 */
#define SET_LOCKTAG_INT64(tag, key64) \
	SET_LOCKTAG_ADVISORY(tag, \
						 MyDatabaseId, \
						 (uint32) ((key64) >> 32), \
						 (uint32) (key64), \
						 1)
#define SET_LOCKTAG_INT32(tag, key1, key2) \
	SET_LOCKTAG_ADVISORY(tag, MyDatabaseId, key1, key2, 2)
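
/*
 * Illustrative sketch, not part of the original file: how the field
 * assignment above plays out for concrete keys.  DemoAdvisoryTag and the
 * two demo_tag_* helpers are hypothetical stand-ins that model only
 * field2..field4 of a lock tag; they are not the real LOCKTAG or
 * SET_LOCKTAG_ADVISORY.  Unlike the macro, the sketch converts the key to
 * an unsigned type before shifting, so the result does not depend on how
 * the compiler shifts negative values.  Guarded by "#if 0" so that it
 * never takes part in a backend build.
 */
#if 0
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the key-carrying fields of an advisory lock tag. */
typedef struct DemoAdvisoryTag
{
	uint32_t	field2;			/* first int4 key, or high half of int8 key */
	uint32_t	field3;			/* second int4 key, or low half of int8 key */
	uint16_t	field4;			/* 1 = one int8 key, 2 = two int4 keys */
} DemoAdvisoryTag;

/* Build a tag from a single 64-bit key, as SET_LOCKTAG_INT64 would. */
static DemoAdvisoryTag
demo_tag_from_int64(int64_t key64)
{
	DemoAdvisoryTag tag;
	uint64_t	bits = (uint64_t) key64;

	tag.field2 = (uint32_t) (bits >> 32);
	tag.field3 = (uint32_t) bits;
	tag.field4 = 1;
	return tag;
}

/* Build a tag from two 32-bit keys, as SET_LOCKTAG_INT32 would. */
static DemoAdvisoryTag
demo_tag_from_int32_pair(int32_t key1, int32_t key2)
{
	DemoAdvisoryTag tag;

	tag.field2 = (uint32_t) key1;
	tag.field3 = (uint32_t) key2;
	tag.field4 = 2;
	return tag;
}

/* Reassemble the 64-bit bit pattern carried by field2 and field3. */
static uint64_t
demo_tag_key_bits(DemoAdvisoryTag tag)
{
	return ((uint64_t) tag.field2 << 32) | tag.field3;
}
```
#endif
/*
 * The int8 key 4294967298 (that is, 1 * 2^32 + 2) and the int4 pair (1, 2)
 * produce the same field2 and field3, so field4 is what keeps the two key
 * spaces from colliding.
 */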

/*
 * pg_advisory_lock(int8) - acquire exclusive lock on an int8 key
 */
Datum
pg_advisory_lock_int8(PG_FUNCTION_ARGS)
{
	int64		key = PG_GETARG_INT64(0);
	LOCKTAG		tag;

	SET_LOCKTAG_INT64(tag, key);

	(void) LockAcquire(&tag, ExclusiveLock, true, false);

	PG_RETURN_VOID();
}

/*
 * pg_advisory_xact_lock(int8) - acquire xact scoped
 * exclusive lock on an int8 key
 */
Datum
pg_advisory_xact_lock_int8(PG_FUNCTION_ARGS)
{
	int64		key = PG_GETARG_INT64(0);
	LOCKTAG		tag;

	SET_LOCKTAG_INT64(tag, key);

	(void) LockAcquire(&tag, ExclusiveLock, false, false);

	PG_RETURN_VOID();
}

/*
 * pg_advisory_lock_shared(int8) - acquire share lock on an int8 key
 */
Datum
pg_advisory_lock_shared_int8(PG_FUNCTION_ARGS)
{
	int64		key = PG_GETARG_INT64(0);
	LOCKTAG		tag;

	SET_LOCKTAG_INT64(tag, key);

	(void) LockAcquire(&tag, ShareLock, true, false);

	PG_RETURN_VOID();
}

/*
 * pg_advisory_xact_lock_shared(int8) - acquire xact scoped
 * share lock on an int8 key
 */
Datum
pg_advisory_xact_lock_shared_int8(PG_FUNCTION_ARGS)
{
	int64		key = PG_GETARG_INT64(0);
	LOCKTAG		tag;

	SET_LOCKTAG_INT64(tag, key);

	(void) LockAcquire(&tag, ShareLock, false, false);

	PG_RETURN_VOID();
}

/*
 * pg_try_advisory_lock(int8) - acquire exclusive lock on an int8 key, no wait
 *
 * Returns true if successful, false if lock not available
 */
Datum
pg_try_advisory_lock_int8(PG_FUNCTION_ARGS)
{
	int64		key = PG_GETARG_INT64(0);
	LOCKTAG		tag;
	LockAcquireResult res;

	SET_LOCKTAG_INT64(tag, key);

	res = LockAcquire(&tag, ExclusiveLock, true, true);

	PG_RETURN_BOOL(res != LOCKACQUIRE_NOT_AVAIL);
}

/*
 * pg_try_advisory_xact_lock(int8) - acquire xact scoped
 * exclusive lock on an int8 key, no wait
 *
 * Returns true if successful, false if lock not available
 */
Datum
pg_try_advisory_xact_lock_int8(PG_FUNCTION_ARGS)
{
	int64		key = PG_GETARG_INT64(0);
	LOCKTAG		tag;
	LockAcquireResult res;

	SET_LOCKTAG_INT64(tag, key);

	res = LockAcquire(&tag, ExclusiveLock, false, true);

	PG_RETURN_BOOL(res != LOCKACQUIRE_NOT_AVAIL);
}

/*
 * pg_try_advisory_lock_shared(int8) - acquire share lock on an int8 key, no wait
 *
 * Returns true if successful, false if lock not available
 */
Datum
pg_try_advisory_lock_shared_int8(PG_FUNCTION_ARGS)
{
	int64		key = PG_GETARG_INT64(0);
	LOCKTAG		tag;
	LockAcquireResult res;

	SET_LOCKTAG_INT64(tag, key);

	res = LockAcquire(&tag, ShareLock, true, true);

	PG_RETURN_BOOL(res != LOCKACQUIRE_NOT_AVAIL);
}

/*
 * pg_try_advisory_xact_lock_shared(int8) - acquire xact scoped
 * share lock on an int8 key, no wait
 *
 * Returns true if successful, false if lock not available
 */
Datum
pg_try_advisory_xact_lock_shared_int8(PG_FUNCTION_ARGS)
{
	int64		key = PG_GETARG_INT64(0);
	LOCKTAG		tag;
	LockAcquireResult res;

	SET_LOCKTAG_INT64(tag, key);

	res = LockAcquire(&tag, ShareLock, false, true);

	PG_RETURN_BOOL(res != LOCKACQUIRE_NOT_AVAIL);
}

/*
 * pg_advisory_unlock(int8) - release exclusive lock on an int8 key
 *
 * Returns true if successful, false if lock was not held
*/
Datum
pg_advisory_unlock_int8(PG_FUNCTION_ARGS)
{
	int64		key = PG_GETARG_INT64(0);
	LOCKTAG		tag;
	bool		res;

	SET_LOCKTAG_INT64(tag, key);

	res = LockRelease(&tag, ExclusiveLock, true);

	PG_RETURN_BOOL(res);
}

/*
 * pg_advisory_unlock_shared(int8) - release share lock on an int8 key
 *
 * Returns true if successful, false if lock was not held
 */
Datum
pg_advisory_unlock_shared_int8(PG_FUNCTION_ARGS)
{
	int64		key = PG_GETARG_INT64(0);
	LOCKTAG		tag;
	bool		res;

	SET_LOCKTAG_INT64(tag, key);

	res = LockRelease(&tag, ShareLock, true);

	PG_RETURN_BOOL(res);
}

/*
 * pg_advisory_lock(int4, int4) - acquire exclusive lock on 2 int4 keys
 */
Datum
pg_advisory_lock_int4(PG_FUNCTION_ARGS)
{
	int32		key1 = PG_GETARG_INT32(0);
	int32		key2 = PG_GETARG_INT32(1);
	LOCKTAG		tag;

	SET_LOCKTAG_INT32(tag, key1, key2);

	(void) LockAcquire(&tag, ExclusiveLock, true, false);

	PG_RETURN_VOID();
}

/*
 * pg_advisory_xact_lock(int4, int4) - acquire xact scoped
 * exclusive lock on 2 int4 keys
 */
Datum
pg_advisory_xact_lock_int4(PG_FUNCTION_ARGS)
{
	int32		key1 = PG_GETARG_INT32(0);
	int32		key2 = PG_GETARG_INT32(1);
	LOCKTAG		tag;

	SET_LOCKTAG_INT32(tag, key1, key2);

	(void) LockAcquire(&tag, ExclusiveLock, false, false);

	PG_RETURN_VOID();
}

/*
 * pg_advisory_lock_shared(int4, int4) - acquire share lock on 2 int4 keys
 */
Datum
pg_advisory_lock_shared_int4(PG_FUNCTION_ARGS)
{
	int32		key1 = PG_GETARG_INT32(0);
	int32		key2 = PG_GETARG_INT32(1);
	LOCKTAG		tag;

	SET_LOCKTAG_INT32(tag, key1, key2);

	(void) LockAcquire(&tag, ShareLock, true, false);

	PG_RETURN_VOID();
}

/*
 * pg_advisory_xact_lock_shared(int4, int4) - acquire xact scoped
 * share lock on 2 int4 keys
 */
Datum
pg_advisory_xact_lock_shared_int4(PG_FUNCTION_ARGS)
{
	int32		key1 = PG_GETARG_INT32(0);
	int32		key2 = PG_GETARG_INT32(1);
	LOCKTAG		tag;

	SET_LOCKTAG_INT32(tag, key1, key2);

	(void) LockAcquire(&tag, ShareLock, false, false);

	PG_RETURN_VOID();
}

/*
 * pg_try_advisory_lock(int4, int4) - acquire exclusive lock on 2 int4 keys, no wait
 *
 * Returns true if successful, false if lock not available
 */
Datum
pg_try_advisory_lock_int4(PG_FUNCTION_ARGS)
{
	int32		key1 = PG_GETARG_INT32(0);
	int32		key2 = PG_GETARG_INT32(1);
	LOCKTAG		tag;
	LockAcquireResult res;

	SET_LOCKTAG_INT32(tag, key1, key2);

	res = LockAcquire(&tag, ExclusiveLock, true, true);

	PG_RETURN_BOOL(res != LOCKACQUIRE_NOT_AVAIL);
}

/*
 * pg_try_advisory_xact_lock(int4, int4) - acquire xact scoped
 * exclusive lock on 2 int4 keys, no wait
 *
 * Returns true if successful, false if lock not available
 */
Datum
pg_try_advisory_xact_lock_int4(PG_FUNCTION_ARGS)
{
	int32		key1 = PG_GETARG_INT32(0);
	int32		key2 = PG_GETARG_INT32(1);
	LOCKTAG		tag;
	LockAcquireResult res;

	SET_LOCKTAG_INT32(tag, key1, key2);

	res = LockAcquire(&tag, ExclusiveLock, false, true);

	PG_RETURN_BOOL(res != LOCKACQUIRE_NOT_AVAIL);
}

/*
 * pg_try_advisory_lock_shared(int4, int4) - acquire share lock on 2 int4 keys, no wait
 *
 * Returns true if successful, false if lock not available
 */
Datum
pg_try_advisory_lock_shared_int4(PG_FUNCTION_ARGS)
{
	int32		key1 = PG_GETARG_INT32(0);
	int32		key2 = PG_GETARG_INT32(1);
	LOCKTAG		tag;
	LockAcquireResult res;

	SET_LOCKTAG_INT32(tag, key1, key2);

	res = LockAcquire(&tag, ShareLock, true, true);

	PG_RETURN_BOOL(res != LOCKACQUIRE_NOT_AVAIL);
}

/*
 * pg_try_advisory_xact_lock_shared(int4, int4) - acquire xact scoped
 * share lock on 2 int4 keys, no wait
 *
 * Returns true if successful, false if lock not available
 */
Datum
pg_try_advisory_xact_lock_shared_int4(PG_FUNCTION_ARGS)
{
	int32		key1 = PG_GETARG_INT32(0);
	int32		key2 = PG_GETARG_INT32(1);
	LOCKTAG		tag;
	LockAcquireResult res;

	SET_LOCKTAG_INT32(tag, key1, key2);

	res = LockAcquire(&tag, ShareLock, false, true);

	PG_RETURN_BOOL(res != LOCKACQUIRE_NOT_AVAIL);
}

/*
 * pg_advisory_unlock(int4, int4) - release exclusive lock on 2 int4 keys
 *
 * Returns true if successful, false if lock was not held
*/
Datum
pg_advisory_unlock_int4(PG_FUNCTION_ARGS)
{
	int32		key1 = PG_GETARG_INT32(0);
	int32		key2 = PG_GETARG_INT32(1);
	LOCKTAG		tag;
	bool		res;

	SET_LOCKTAG_INT32(tag, key1, key2);

	res = LockRelease(&tag, ExclusiveLock, true);

	PG_RETURN_BOOL(res);
}

/*
 * pg_advisory_unlock_shared(int4, int4) - release share lock on 2 int4 keys
 *
 * Returns true if successful, false if lock was not held
 */
Datum
pg_advisory_unlock_shared_int4(PG_FUNCTION_ARGS)
{
	int32		key1 = PG_GETARG_INT32(0);
	int32		key2 = PG_GETARG_INT32(1);
	LOCKTAG		tag;
	bool		res;

	SET_LOCKTAG_INT32(tag, key1, key2);

	res = LockRelease(&tag, ShareLock, true);

	PG_RETURN_BOOL(res);
}

/*
 * pg_advisory_unlock_all() - release all advisory locks
 */
Datum
pg_advisory_unlock_all(PG_FUNCTION_ARGS)
{
	LockReleaseSession(USER_LOCKMETHOD);

	PG_RETURN_VOID();
}