2005-06-18 00:32:51 +02:00
/*-------------------------------------------------------------------------
 *
 * twophase.h
 *    Two-phase-commit related declarations.
 *
 *
2024-01-04 02:49:05 +01:00
 * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
2005-06-18 00:32:51 +02:00
 * Portions Copyright (c) 1994, Regents of the University of California
 *
2010-09-20 22:08:53 +02:00
 * src/include/access/twophase.h
2005-06-18 00:32:51 +02:00
 *
 *-------------------------------------------------------------------------
 */
#ifndef TWOPHASE_H
#define TWOPHASE_H

2018-03-28 18:42:50 +02:00
#include "access/xact.h"
2019-11-25 03:38:57 +01:00
#include "access/xlogdefs.h"
2012-06-25 23:45:15 +02:00
#include "datatype/timestamp.h"
#include "storage/lock.h"
2005-06-18 00:32:51 +02:00

/*
 * GlobalTransactionData is defined in twophase.c; other places have no
 * business knowing the internal definition.
 */
typedef struct GlobalTransactionData *GlobalTransaction;

/* GUC variable */
2017-12-05 15:23:57 +01:00
extern PGDLLIMPORT int max_prepared_xacts;
2005-06-18 00:32:51 +02:00

2005-08-21 01:26:37 +02:00
extern Size TwoPhaseShmemSize(void);
2005-06-18 00:32:51 +02:00
extern void TwoPhaseShmemInit(void);

Fix race condition in preparing a transaction for two-phase commit.
To lock a prepared transaction's shared memory entry, we used to mark it
with the XID of the backend. When the XID was no longer active according
to the proc array, the entry was implicitly considered as not locked
anymore. However, when preparing a transaction, the backend's proc array
entry was cleared before transferring the locks (and some other state) to
the prepared transaction's dummy PGPROC entry, so there was a window where
another backend could finish the transaction before it was in fact fully
prepared.
To fix, rewrite the locking mechanism of global transaction entries. Instead
of an XID, just have a simple locked-or-not flag in each entry (we store the
locking backend's backend id rather than a simple boolean, but that's just
for debugging purposes). The backend is responsible for explicitly unlocking
the entry, and to make sure that that happens, install a callback to unlock
it on abort or process exit.
Backpatch to all supported versions.
2014-05-15 15:37:50 +02:00
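The locking scheme described in the commit message above can be sketched in isolation. This is a minimal single-process toy under stated assumptions, not the code in twophase.c: `ToyGxact`, `toy_lock_gxact`, `toy_unlock_gxact`, and `toy_at_abort` are hypothetical names, and the sketch omits the shared-memory locking that a real implementation would need around these updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INVALID_OWNER (-1)

/* Hypothetical stand-in for one global transaction entry in shared memory. */
typedef struct ToyGxact
{
    /* Id of the backend holding the entry, or INVALID_OWNER if unlocked.
     * Storing an id instead of a plain boolean only helps debugging. */
    int locking_owner;
} ToyGxact;

/* Entry locked by "this backend", remembered so an abort hook can find it. */
static ToyGxact *my_locked_gxact = NULL;

/* Try to lock an entry; fails if any backend already holds it. */
static bool
toy_lock_gxact(ToyGxact *gxact, int my_id)
{
    if (gxact->locking_owner != INVALID_OWNER)
        return false;
    gxact->locking_owner = my_id;
    my_locked_gxact = gxact;
    return true;
}

/* Explicit unlock: the holder is responsible for calling this. */
static void
toy_unlock_gxact(void)
{
    if (my_locked_gxact != NULL)
    {
        my_locked_gxact->locking_owner = INVALID_OWNER;
        my_locked_gxact = NULL;
    }
}

/* Abort or process-exit hook: release whatever entry is still locked. */
static void
toy_at_abort(void)
{
    toy_unlock_gxact();
}
```

Because the lock is an explicit flag, it stays held until the owner or its abort hook clears it, rather than lapsing implicitly when the owner's XID disappears from the proc array.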
extern void AtAbort_Twophase(void);
extern void PostPrepare_Twophase(void);

2021-10-24 03:36:38 +02:00
extern TransactionId TwoPhaseGetXidByVirtualXID(VirtualTransactionId vxid,
                                                bool *have_more);
Make release of 2PC identifier and locks consistent in COMMIT PREPARED
When preparing a transaction in two-phase commit, a dummy PGPROC entry
holding the GID used for the transaction is registered, which gets
released once COMMIT PREPARED is run. Prior to releasing its shared memory
state, all the locks taken in the prepared transaction are released
using a dedicated set of callbacks (pgstat and multixact having similar
callbacks), which may cause the locks to be released before the GID is
set free.
Hence, there is a small window where lock conflicts could happen, for
example:
- Transaction A releases its locks, still holding its GID in shared
memory.
- Transaction B held a lock which conflicted with locks of transaction
A.
- Transaction B continues its processing, reusing the same GID as
transaction A.
- Transaction B fails because of a conflicting GID, already in use by
transaction A.
This commit changes the shared memory state release so that post-commit
callbacks and predicate lock cleanup happen consistently with the shared
memory state cleanup for the dummy PGPROC entry. The race window is
small and 2PC had this issue from the start, so no backpatch is done.
On top of that, the fixes discussed involved ABI breakages, which are not
welcome in stable branches.
Reported-by: Oleksii Kliukin, Ildar Musin
Diagnosed-by: Oleksii Kliukin, Ildar Musin
Author: Michael Paquier
Reviewed-by: Masahiko Sawada, Oleksii Kliukin
Discussion: https://postgr.es/m/BF9B38A4-2BFF-46E8-BA87-A2D00A8047A6@hintbits.com
2019-02-25 06:19:34 +01:00
extern PGPROC *TwoPhaseGetDummyProc(TransactionId xid, bool lock_held);
extern BackendId TwoPhaseGetDummyBackendId(TransactionId xid, bool lock_held);
2005-06-18 00:32:51 +02:00

2005-06-18 21:33:42 +02:00
extern GlobalTransaction MarkAsPreparing(TransactionId xid, const char *gid,
                                         TimestampTz prepared_at,
2005-06-28 07:09:14 +02:00
                                         Oid owner, Oid databaseid);
2005-06-18 00:32:51 +02:00

extern void StartPrepare(GlobalTransaction gxact);
extern void EndPrepare(GlobalTransaction gxact);
Allow read only connections during recovery, known as Hot Standby.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
2009-12-19 02:32:45 +01:00
extern bool StandbyTransactionIdIsPrepared(TransactionId xid);
2005-06-18 00:32:51 +02:00

2009-12-19 02:32:45 +01:00
extern TransactionId PrescanPreparedTransactions(TransactionId **xids_p,
                                                 int *nxids_p);
2017-04-27 14:41:22 +02:00
extern void StandbyRecoverPreparedTransactions(void);
2005-06-18 00:32:51 +02:00
extern void RecoverPreparedTransactions(void);

2005-06-19 22:00:39 +02:00
extern void CheckPointTwoPhase(XLogRecPtr redo_horizon);

2005-06-18 21:33:42 +02:00
extern void FinishPreparedTransaction(const char *gid, bool isCommit);
2005-06-18 00:32:51 +02:00

2017-04-04 21:56:56 +02:00
extern void PrepareRedoAdd(char *buf, XLogRecPtr start_lsn,
2018-03-28 18:42:50 +02:00
                           XLogRecPtr end_lsn, RepOriginId origin_id);
2017-04-04 21:56:56 +02:00
extern void PrepareRedoRemove(TransactionId xid, bool giveWarning);
extern void restoreTwoPhaseData(void);
2022-09-20 04:18:36 +02:00
extern bool LookupGXact(const char *gid, XLogRecPtr prepare_end_lsn,
Add support for prepared transactions to built-in logical replication.
To add support for streaming transactions at prepare time into the
built-in logical replication, we need to do the following things:
* Modify the output plugin (pgoutput) to implement the new two-phase API
callbacks, by leveraging the extended replication protocol.
* Modify the replication apply worker, to properly handle two-phase
transactions by replaying them on prepare.
* Add a new SUBSCRIPTION option "two_phase" to allow users to enable
two-phase transactions. We enable the two_phase once the initial data sync
is over.
We however must explicitly disable replication of two-phase transactions
during replication slot creation, even if the plugin supports it. We
don't need to replicate the changes accumulated during this phase,
and moreover, we don't have a replication connection open so we don't know
where to send the data anyway.
The streaming option is not allowed with this new two_phase option. This
can be done as a separate patch.
We don't allow toggling the two_phase option of a subscription because it can
lead to an inconsistent replica. For the same reason, we don't allow
refreshing the publication once two_phase is enabled for a subscription
unless the copy_data option is false.
Author: Peter Smith, Ajin Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas Kelvich
Reviewed-by: Amit Kapila, Sawada Masahiko, Vignesh C, Dilip Kumar, Takamichi Osumi, Greg Nancarrow
Tested-By: Haiying Tang
Discussion: https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru
Discussion: https://postgr.es/m/CAA4eK1+opiV4aFTmWWUF9h_32=HfPOW9vZASHarT0UA5oBrtGw@mail.gmail.com
2021-07-14 04:03:50 +02:00
                        TimestampTz origin_prepare_timestamp);
2005-06-18 00:32:51 +02:00
#endif /* TWOPHASE_H */