Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 15:53:02 +01:00
/*-------------------------------------------------------------------------
 *
 * commit_ts.c
 *		PostgreSQL commit timestamp manager
 *
 * This module is a pg_xact-like system that stores the commit timestamp
 * for each transaction.
 *
 * XLOG interactions: this module generates an XLOG record whenever a new
 * CommitTs page is initialized to zeroes. Also, one XLOG record is
 * generated for setting of values when the caller requests it; this allows
 * us to support values coming from places other than transaction commit.
 * Other writes of CommitTS come from recording of transaction commit in
 * xact.c, which generates its own XLOG records for these events and will
 * re-perform the status update on redo; so we need make no additional XLOG
 * entry here.
 *
 * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/backend/access/transam/commit_ts.c
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"

#include "access/commit_ts.h"
#include "access/htup_details.h"
#include "access/slru.h"
#include "access/transam.h"
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "miscadmin.h"
#include "pg_trace.h"
#include "storage/shmem.h"
#include "utils/builtins.h"
#include "utils/snapmgr.h"
#include "utils/timestamp.h"

/*
 * Defines for CommitTs page sizes. A page is the same BLCKSZ as is used
 * everywhere else in Postgres.
 *
 * Note: because TransactionIds are 32 bits and wrap around at 0xFFFFFFFF,
 * CommitTs page numbering also wraps around at
 * 0xFFFFFFFF/COMMIT_TS_XACTS_PER_PAGE, and CommitTs segment numbering at
 * 0xFFFFFFFF/COMMIT_TS_XACTS_PER_PAGE/SLRU_PAGES_PER_SEGMENT. We need take no
 * explicit notice of that fact in this module, except when comparing segment
 * and page numbers in TruncateCommitTs (see CommitTsPagePrecedes).
 */
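
The wraparound note above means that page numbers cannot be ordered with a plain `<`. The following is a minimal sketch of a wraparound-aware comparison, assuming the usual modular 32-bit technique (the signed difference of two xids decides which one is older). The names `xid_precedes_sketch`, `page_precedes_sketch` and `XACTS_PER_PAGE_SKETCH` are hypothetical, and the value 819 assumes 8192-byte pages with 10-byte entries. The module's own CommitTsPagePrecedes is not shown in this section and may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: assumes 8192-byte pages and 10-byte entries. */
#define XACTS_PER_PAGE_SKETCH 819u

/*
 * Modular comparison: a precedes b when the 32-bit difference, read as a
 * signed value, is negative.  This relies on two's-complement conversion.
 */
static bool
xid_precedes_sketch(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Order two pages by comparing the first xid each page would hold. */
static bool
page_precedes_sketch(uint32_t page1, uint32_t page2)
{
    uint32_t xid1 = page1 * XACTS_PER_PAGE_SKETCH;
    uint32_t xid2 = page2 * XACTS_PER_PAGE_SKETCH;

    return xid_precedes_sketch(xid1, xid2);
}
```

Under these assumptions the last page before wraparound (0xFFFFFFFF / 819 = 5244160) compares as older than page 0, which is the behavior truncation needs.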
/*
Introduce replication progress tracking infrastructure.
When implementing a replication solution on top of logical decoding, two
related problems exist:
* How to safely keep track of replication progress
* How to change replication behavior, based on the origin of a row;
e.g. to avoid loops in bi-directional replication setups
The solution to these problems, as implemented here, consists of
three parts:
1) 'replication origins', which identify nodes in a replication setup.
2) 'replication progress tracking', which remembers, for each
replication origin, how far replay has progressed in an efficient and
crash-safe manner.
3) The ability to filter out changes performed at the behest of a
replication origin during logical decoding; this allows complex
replication topologies. E.g. by filtering all replayed changes out.
Most of this could also be implemented in "userspace", e.g. by inserting
additional rows containing origin information, but that ends up being much
less efficient and more complicated. We don't want to require various
replication solutions to reimplement logic for this independently. The
infrastructure is intended to be generic enough to be reusable.
This infrastructure also replaces the 'nodeid' infrastructure of commit
timestamps. It is intended to provide all the former capabilities,
except that there's only 2^16 different origins; but now they integrate
with logical decoding. Additionally more functionality is accessible via
SQL. Since the commit timestamp infrastructure has also been introduced
in 9.5 (commit 73c986add) changing the API is not a problem.
For now the number of origins for which the replication progress can be
tracked simultaneously is determined by the max_replication_slots
GUC. That GUC is not a perfect match to configure this, but there
doesn't seem to be sufficient reason to introduce a separate new one.
Bumps both catversion and wal page magic.
Author: Andres Freund, with contributions from Petr Jelinek and Craig Ringer
Reviewed-By: Heikki Linnakangas, Petr Jelinek, Robert Haas, Steve Singer
Discussion: 20150216002155.GI15326@awork2.anarazel.de,
20140923182422.GA15776@alap3.anarazel.de,
20131114172632.GE7522@alap2.anarazel.de
2015-04-29 19:30:53 +02:00
 * We need 8+2 bytes per xact. Note that enlarging this struct might mean
 * the largest possible file name is more than 5 chars long; see
 * SlruScanDirectory.
 */
typedef struct CommitTimestampEntry
{
	TimestampTz time;
	RepOriginId nodeid;
} CommitTimestampEntry;

#define SizeOfCommitTimestampEntry (offsetof(CommitTimestampEntry, nodeid) + \
									sizeof(RepOriginId))

#define COMMIT_TS_XACTS_PER_PAGE \
	(BLCKSZ / SizeOfCommitTimestampEntry)

#define TransactionIdToCTsPage(xid) \
	((xid) / (TransactionId) COMMIT_TS_XACTS_PER_PAGE)
#define TransactionIdToCTsEntry(xid)	\
	((xid) % (TransactionId) COMMIT_TS_XACTS_PER_PAGE)
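
To make the arithmetic behind these macros concrete, here is a small standalone sketch of the xid-to-page mapping. It assumes an 8192-byte block size, a 64-bit timestamp and a 16-bit origin id; all `*_sketch` and `*_SKETCH` names are hypothetical stand-ins, not the module's own definitions. It also illustrates why the entry size is computed from `offsetof` rather than `sizeof`: the struct is normally padded beyond the 8+2 bytes actually stored.

```c
#include <stddef.h>
#include <stdint.h>

#define BLCKSZ_SKETCH 8192u     /* assumed block size */

/* Stand-in for CommitTimestampEntry: 8-byte timestamp plus 2-byte origin. */
typedef struct EntrySketch
{
    int64_t  time;
    uint16_t nodeid;
} EntrySketch;

/* 8 + 2 = 10 bytes; sizeof(EntrySketch) would include trailing padding. */
#define ENTRY_SIZE_SKETCH \
    (offsetof(EntrySketch, nodeid) + sizeof(uint16_t))

/* 8192 / 10 = 819 entries per page under the assumptions above. */
#define XACTS_PER_PAGE_SKETCH (BLCKSZ_SKETCH / ENTRY_SIZE_SKETCH)

/* Page that holds the entry for xid. */
static uint32_t
xid_to_page_sketch(uint32_t xid)
{
    return xid / (uint32_t) XACTS_PER_PAGE_SKETCH;
}

/* Index of the entry for xid within its page. */
static uint32_t
xid_to_entry_sketch(uint32_t xid)
{
    return xid % (uint32_t) XACTS_PER_PAGE_SKETCH;
}

/* Byte offset of the entry for xid within its page. */
static size_t
xid_to_byte_offset_sketch(uint32_t xid)
{
    return (size_t) xid_to_entry_sketch(xid) * ENTRY_SIZE_SKETCH;
}
```

With these numbers, xid 1000 lands on page 1 at entry 181, which is byte offset 1810 within that page.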

/*
 * Link to shared-memory data structures for CommitTs control
 */
static SlruCtlData CommitTsCtlData;

#define CommitTsCtl (&CommitTsCtlData)

/*
 * We keep a cache of the last value set in shared memory.
 *
 * This is also a good place to keep the activation status. We keep this
 * separate from the GUC so that the standby can activate the module if the
 * primary has it active independently of the value of the GUC.
 *
 * This is protected by CommitTsLock. In some places, we use commitTsActive
 * without acquiring the lock; where this happens, a comment explains the
 * rationale for it.
 */
typedef struct CommitTimestampShared
{
	TransactionId xidLastCommit;
	CommitTimestampEntry dataLastCommit;
	bool		commitTsActive;
} CommitTimestampShared;

CommitTimestampShared *commitTsShared;
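
The comment above describes a one-entry cache of the most recently set value, kept alongside the activation flag. A minimal single-threaded sketch of that idea follows. The names `SharedSketch`, `remember_last_commit_sketch` and `lookup_cached_sketch` are hypothetical, the field types are assumptions, and the locking that the real module performs under CommitTsLock is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for CommitTimestampEntry (assumed field types). */
typedef struct EntrySketch
{
    int64_t  time;
    uint16_t nodeid;
} EntrySketch;

/* Stand-in for CommitTimestampShared. */
typedef struct SharedSketch
{
    uint32_t    xidLastCommit;
    EntrySketch dataLastCommit;
    bool        commitTsActive;
} SharedSketch;

/* Remember the most recent commit; does nothing while the module is inactive. */
static void
remember_last_commit_sketch(SharedSketch *shared, uint32_t xid,
                            int64_t ts, uint16_t nodeid)
{
    if (!shared->commitTsActive)
        return;

    shared->xidLastCommit = xid;
    shared->dataLastCommit.time = ts;
    shared->dataLastCommit.nodeid = nodeid;
}

/*
 * Fast path for lookups: returns true and fills *out only when the module
 * is active and xid is the cached one.  A caller would otherwise fall back
 * to reading the on-disk page.
 */
static bool
lookup_cached_sketch(const SharedSketch *shared, uint32_t xid,
                     EntrySketch *out)
{
    if (!shared->commitTsActive || shared->xidLastCommit != xid)
        return false;

    *out = shared->dataLastCommit;
    return true;
}
```

Keeping the flag in the same structure as the cached entry lets a single check cover both "is tracking on" and "is this the cached xid".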

/* GUC variable */
bool		track_commit_timestamp;

static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
					 TransactionId *subxids, TimestampTz ts,
					 RepOriginId nodeid, int pageno);
static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
						 RepOriginId nodeid, int slotno);
static void error_commit_ts_disabled(void);
static int	ZeroCommitTsPage(int pageno, bool writeXlog);
static bool CommitTsPagePrecedes(int page1, int page2);
static void ActivateCommitTs(void);
static void DeactivateCommitTs(void);
static void WriteZeroPageXlogRec(int pageno);
Fix race condition in reading commit timestamps
If a user requests the commit timestamp for a transaction old enough
that its data is concurrently being truncated away by vacuum at just the
right time, they would receive an ugly internal file-not-found error
message from slru.c rather than the expected NULL return value.
In a primary server, the window for the race is very small: the lookup
has to occur exactly between the two calls by vacuum, and there's not a
lot that happens between them (mostly just a multixact truncate). In a
standby server, however, the window is larger because the truncation is
executed as soon as the WAL record for it is replayed, but the advance
of the oldest-Xid is not executed until the next checkpoint record.
To fix in the primary, simply reverse the order of operations in
vac_truncate_clog. To fix in the standby, augment the WAL truncation
record so that the standby is aware of the new oldest-XID value and can
apply the update immediately. WAL version bumped because of this.
No backpatch, because of the low importance of the bug and its rarity.
Author: Craig Ringer
Reviewed-By: Petr Jelínek, Peter Eisentraut
Discussion: https://postgr.es/m/CAMsr+YFhVtRQT1VAwC+WGbbxZZRzNou=N9Ed-FrCqkwQ8H8oJQ@mail.gmail.com
2017-01-19 22:23:09 +01:00
static void WriteTruncateXlogRec(int pageno, TransactionId oldestXid);
static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
						 TransactionId *subxids, TimestampTz timestamp,
						 RepOriginId nodeid);
|
Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 15:53:02 +01:00
|
|
|
|
|
|
|
/*
 * TransactionTreeSetCommitTsData
 *
 * Record the final commit timestamp of transaction entries in the commit log
 * for a transaction and its subtransaction tree, as efficiently as possible.
 *
 * xid is the top level transaction id.
 *
 * subxids is an array of xids of length nsubxids, representing subtransactions
 * in the tree of xid.  In various cases nsubxids may be zero.
 * The reason why tracking just the parent xid commit timestamp is not enough
 * is that the subtrans SLRU does not stay valid across crashes (it's not
 * permanent) so we need to keep the information about them here.  If the
 * subtrans implementation changes in the future, we might want to revisit the
 * decision of storing timestamp info for each subxid.
 *
 * The write_xlog parameter tells us whether to include an XLog record of this
 * or not.  Normally, this is called from transaction commit routines (both
 * normal and prepared) and the information will be stored in the transaction
 * commit XLog record, and so they should pass "false" for this.  The XLog redo
 * code should use "false" here as well.  Other callers probably want to pass
 * true, so that the given values persist in case of crashes.
 */
void
TransactionTreeSetCommitTsData(TransactionId xid, int nsubxids,
							   TransactionId *subxids, TimestampTz timestamp,
							   RepOriginId nodeid, bool write_xlog)
{
	int			i;
	TransactionId headxid;
	TransactionId newestXact;

	/*
	 * No-op if the module is not active.
	 *
	 * An unlocked read here is fine, because in a standby (the only place
	 * where the flag can change in flight) this routine is only called by the
	 * recovery process, which is also the only process which can change the
	 * flag.
	 */
	if (!commitTsShared->commitTsActive)
		return;

	/*
	 * Comply with the WAL-before-data rule: if caller specified it wants this
	 * value to be recorded in WAL, do so before touching the data.
	 */
	if (write_xlog)
		WriteSetTimestampXlogRec(xid, nsubxids, subxids, timestamp, nodeid);

	/*
	 * Figure out the latest Xid in this batch: either the last subxid if
	 * there's any, otherwise the parent xid.
	 */
	if (nsubxids > 0)
		newestXact = subxids[nsubxids - 1];
	else
		newestXact = xid;

	/*
	 * We split the xids to set the timestamp to in groups belonging to the
	 * same SLRU page; the first element in each such set is its head.  The
	 * first group has the main XID as the head; subsequent sets use the first
	 * subxid not on the previous page as head.  This way, we only have to
	 * lock/modify each SLRU page once.
	 */
	for (i = 0, headxid = xid;;)
	{
		int			pageno = TransactionIdToCTsPage(headxid);
		int			j;

		for (j = i; j < nsubxids; j++)
		{
			if (TransactionIdToCTsPage(subxids[j]) != pageno)
				break;
		}
		/* subxids[i..j-1] are on the same page as the head */

		SetXidCommitTsInPage(headxid, j - i, subxids + i, timestamp, nodeid,
							 pageno);

		/*
		 * If we wrote out all subxids, we're done.  Test j rather than
		 * j + 1: when the last subxid sits alone on a new page, it still
		 * needs one more iteration with itself as the head.
		 */
		if (j >= nsubxids)
			break;

		/*
		 * Set the new head and skip over it, as well as over the subxids we
		 * just wrote.
		 */
		headxid = subxids[j];
		i = j + 1;
	}

	/* update the cached value in shared memory */
	LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
	commitTsShared->xidLastCommit = xid;
	commitTsShared->dataLastCommit.time = timestamp;
	commitTsShared->dataLastCommit.nodeid = nodeid;

	/* and move forwards our endpoint, if needed */
	if (TransactionIdPrecedes(ShmemVariableCache->newestCommitTsXid, newestXact))
		ShmemVariableCache->newestCommitTsXid = newestXact;
	LWLockRelease(CommitTsLock);
}

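/*
 * Illustration only, not part of commit_ts.c: a minimal standalone sketch of
 * the page-grouping loop in TransactionTreeSetCommitTsData.  Every sketch_
 * and SKETCH_ name, including the tiny page size, is an assumption made up
 * for the example; the real code derives pages via TransactionIdToCTsPage()
 * and writes each group with SetXidCommitTsInPage().  The sketch stops on
 * j >= nsubxids, so a final subxid sitting alone on its own page still gets
 * a group of its own.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical number of xids per page, kept tiny for illustration. */
#define SKETCH_XIDS_PER_PAGE 4

typedef struct SketchGroup
{
	TransactionId head;		/* first xid written on this page */
	int			nsub;		/* subxids sharing the head's page */
	int			pageno;
} SketchGroup;

static int
sketch_xid_to_page(TransactionId xid)
{
	return (int) (xid / SKETCH_XIDS_PER_PAGE);
}

/*
 * Split xid plus subxids into per-page groups, recording up to maxout of
 * them in out[].  Returns the total number of groups.
 */
static int
sketch_group_by_page(TransactionId xid, int nsubxids,
					 const TransactionId *subxids,
					 SketchGroup *out, int maxout)
{
	int			i;
	int			ngroups = 0;
	TransactionId headxid;

	for (i = 0, headxid = xid;;)
	{
		int			pageno = sketch_xid_to_page(headxid);
		int			j;

		for (j = i; j < nsubxids; j++)
		{
			if (sketch_xid_to_page(subxids[j]) != pageno)
				break;
		}

		/* subxids[i..j-1] share the head's page */
		if (ngroups < maxout)
		{
			out[ngroups].head = headxid;
			out[ngroups].nsub = j - i;
			out[ngroups].pageno = pageno;
		}
		ngroups++;

		/* all subxids consumed: done */
		if (j >= nsubxids)
			break;

		/* subxids[j] starts the next page's group */
		headxid = subxids[j];
		i = j + 1;
	}

	return ngroups;
}
```

/*
 * With xid 5, subxids {6, 7, 8, 13} and four xids per page, the sketch
 * produces three groups: head 5 with two subxids, then 8 and 13 each alone
 * on their page.
 */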
/*
 * Record the commit timestamp of transaction entries in the commit log for all
 * entries on a single page.  Atomic only on this page.
 */
static void
SetXidCommitTsInPage(TransactionId xid, int nsubxids,
					 TransactionId *subxids, TimestampTz ts,
Introduce replication progress tracking infrastructure.
When implementing a replication solution on top of logical decoding, two
related problems exist:
* How to safely keep track of replication progress
* How to change replication behavior, based on the origin of a row;
  e.g. to avoid loops in bi-directional replication setups
The solution to these problems, as implemented here, consists of
three parts:
1) 'replication origins', which identify nodes in a replication setup.
2) 'replication progress tracking', which remembers, for each
   replication origin, how far replay has progressed in an efficient and
   crash-safe manner.
3) The ability to filter out changes performed at the behest of a
   replication origin during logical decoding; this allows complex
   replication topologies. E.g. by filtering all replayed changes out.
Most of this could also be implemented in "userspace", e.g. by inserting
additional rows containing origin information, but that ends up being much
less efficient and more complicated. We don't want to require various
replication solutions to reimplement logic for this independently. The
infrastructure is intended to be generic enough to be reusable.
This infrastructure also replaces the 'nodeid' infrastructure of commit
timestamps. It is intended to provide all the former capabilities,
except that there's only 2^16 different origins; but now they integrate
with logical decoding. Additionally more functionality is accessible via
SQL. Since the commit timestamp infrastructure has also been introduced
in 9.5 (commit 73c986add) changing the API is not a problem.
For now the number of origins for which the replication progress can be
tracked simultaneously is determined by the max_replication_slots
GUC. That GUC is not a perfect match to configure this, but there
doesn't seem to be sufficient reason to introduce a separate new one.
Bumps both catversion and wal page magic.
Author: Andres Freund, with contributions from Petr Jelinek and Craig Ringer
Reviewed-By: Heikki Linnakangas, Petr Jelinek, Robert Haas, Steve Singer
Discussion: 20150216002155.GI15326@awork2.anarazel.de,
20140923182422.GA15776@alap3.anarazel.de,
20131114172632.GE7522@alap2.anarazel.de
2015-04-29 19:30:53 +02:00
					 RepOriginId nodeid, int pageno)
{
	int			slotno;
	int			i;

	LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);

	slotno = SimpleLruReadPage(CommitTsCtl, pageno, true, xid);

	TransactionIdSetCommitTs(xid, ts, nodeid, slotno);
	for (i = 0; i < nsubxids; i++)
		TransactionIdSetCommitTs(subxids[i], ts, nodeid, slotno);

	CommitTsCtl->shared->page_dirty[slotno] = true;

	LWLockRelease(CommitTsControlLock);
}

/*
 * Sets the commit timestamp of a single transaction.
 *
 * Must be called with CommitTsControlLock held
 */
static void
TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
						 RepOriginId nodeid, int slotno)
{
	int			entryno = TransactionIdToCTsEntry(xid);
	CommitTimestampEntry entry;

	Assert(TransactionIdIsNormal(xid));

	entry.time = ts;
	entry.nodeid = nodeid;

	memcpy(CommitTsCtl->shared->page_buffer[slotno] +
		   SizeOfCommitTimestampEntry * entryno,
		   &entry, SizeOfCommitTimestampEntry);
}

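/*
 * Illustration only, not part of commit_ts.c: a minimal standalone sketch of
 * how a fixed-size entry can be packed into a page buffer with memcpy, in the
 * spirit of TransactionIdSetCommitTs.  The SketchEntry layout (a 64-bit
 * timestamp followed by a 16-bit origin id), the 8192-byte page, and all
 * sketch_ and SKETCH_ names are assumptions for the example, not definitions
 * taken from this file.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t TransactionId;

/* Assumed entry layout: timestamp first, then a small origin id. */
typedef struct SketchEntry
{
	int64_t		time;
	uint16_t	nodeid;
} SketchEntry;

/* Packed on-page size: stops after nodeid, so trailing padding is excluded. */
#define SKETCH_ENTRY_SIZE \
	(offsetof(SketchEntry, nodeid) + sizeof(uint16_t))
#define SKETCH_PAGE_SIZE 8192
#define SKETCH_XIDS_PER_PAGE (SKETCH_PAGE_SIZE / SKETCH_ENTRY_SIZE)

/* Page that holds the entry for xid. */
static int
sketch_xid_to_page(TransactionId xid)
{
	return (int) (xid / SKETCH_XIDS_PER_PAGE);
}

/* Index of the entry for xid within its page. */
static int
sketch_xid_to_entry(TransactionId xid)
{
	return (int) (xid % SKETCH_XIDS_PER_PAGE);
}

/* Copy only the packed prefix of the struct into the page buffer. */
static void
sketch_set_entry(char *page, TransactionId xid, int64_t ts, uint16_t nodeid)
{
	SketchEntry entry;

	entry.time = ts;
	entry.nodeid = nodeid;

	memcpy(page + SKETCH_ENTRY_SIZE * sketch_xid_to_entry(xid),
		   &entry, SKETCH_ENTRY_SIZE);
}

/* Read the packed entry back into a zeroed struct. */
static SketchEntry
sketch_get_entry(const char *page, TransactionId xid)
{
	SketchEntry entry;

	memset(&entry, 0, sizeof(entry));
	memcpy(&entry, page + SKETCH_ENTRY_SIZE * sketch_xid_to_entry(xid),
		   SKETCH_ENTRY_SIZE);
	return entry;
}
```

/*
 * Under these assumptions each entry takes 10 bytes, so one 8192-byte page
 * holds 819 entries, and neighbouring xids land in adjacent, non-overlapping
 * slots.  Copying only the packed size is what keeps struct padding out of
 * the page.
 */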
/*
 * Interrogate the commit timestamp of a transaction.
 *
 * The return value indicates whether a commit timestamp record was found for
 * the given xid.  The timestamp value is returned in *ts (which must not be
 * NULL), and the origin node for the Xid is returned in *nodeid, if it's not
 * NULL.
 */
bool
TransactionIdGetCommitTsData(TransactionId xid, TimestampTz *ts,
							 RepOriginId *nodeid)
{
	int			pageno = TransactionIdToCTsPage(xid);
	int			entryno = TransactionIdToCTsEntry(xid);
	int			slotno;
	CommitTimestampEntry entry;
	TransactionId oldestCommitTsXid;
	TransactionId newestCommitTsXid;

	if (!TransactionIdIsValid(xid))
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
				 errmsg("cannot retrieve commit timestamp for transaction %u", xid)));
	else if (!TransactionIdIsNormal(xid))
	{
		/* frozen and bootstrap xids are always committed far in the past */
		*ts = 0;
		if (nodeid)
			*nodeid = 0;
		return false;
	}

	LWLockAcquire(CommitTsLock, LW_SHARED);

	/* Error if module not enabled */
	if (!commitTsShared->commitTsActive)
		error_commit_ts_disabled();

	/*
	 * If we're asked for the cached value, return that.  Otherwise, fall
	 * through to read from SLRU.
	 */
	if (commitTsShared->xidLastCommit == xid)
	{
		*ts = commitTsShared->dataLastCommit.time;
		if (nodeid)
			*nodeid = commitTsShared->dataLastCommit.nodeid;

		LWLockRelease(CommitTsLock);
		return *ts != 0;
	}

	oldestCommitTsXid = ShmemVariableCache->oldestCommitTsXid;
	newestCommitTsXid = ShmemVariableCache->newestCommitTsXid;
	/* neither is invalid, or both are */
	Assert(TransactionIdIsValid(oldestCommitTsXid) == TransactionIdIsValid(newestCommitTsXid));
	LWLockRelease(CommitTsLock);

	/*
	 * Return empty if the requested value is outside our valid range.
	 */
	if (!TransactionIdIsValid(oldestCommitTsXid) ||
		TransactionIdPrecedes(xid, oldestCommitTsXid) ||
		TransactionIdPrecedes(newestCommitTsXid, xid))
	{
		*ts = 0;
		if (nodeid)
Introduce replication progress tracking infrastructure.
When implementing a replication solution on top of logical decoding, two
related problems exist:
* How to safely keep track of replication progress
* How to change replication behavior, based on the origin of a row;
e.g. to avoid loops in bi-directional replication setups
The solution to these problems, as implemented here, consists of
three parts:
1) 'replication origins', which identify nodes in a replication setup.
2) 'replication progress tracking', which remembers, for each
replication origin, how far replay has progressed in an efficient and
crash-safe manner.
3) The ability to filter out changes performed at the behest of a
replication origin during logical decoding; this allows complex
replication topologies. E.g. by filtering all replayed changes out.
Most of this could also be implemented in "userspace", e.g. by inserting
additional rows containing origin information, but that ends up being much
less efficient and more complicated. We don't want to require various
replication solutions to reimplement logic for this independently. The
infrastructure is intended to be generic enough to be reusable.
This infrastructure also replaces the 'nodeid' infrastructure of commit
timestamps. It is intended to provide all the former capabilities,
except that there's only 2^16 different origins; but now they integrate
with logical decoding. Additionally more functionality is accessible via
SQL. Since the commit timestamp infrastructure has also been introduced
in 9.5 (commit 73c986add) changing the API is not a problem.
For now the number of origins for which the replication progress can be
tracked simultaneously is determined by the max_replication_slots
GUC. That GUC is not a perfect match to configure this, but there
doesn't seem to be sufficient reason to introduce a separate new one.
Bumps both catversion and wal page magic.
Author: Andres Freund, with contributions from Petr Jelinek and Craig Ringer
Reviewed-By: Heikki Linnakangas, Petr Jelinek, Robert Haas, Steve Singer
Discussion: 20150216002155.GI15326@awork2.anarazel.de,
20140923182422.GA15776@alap3.anarazel.de,
20131114172632.GE7522@alap2.anarazel.de
2015-04-29 19:30:53 +02:00
|
|
|
*nodeid = InvalidRepOriginId;
|
2014-12-03 15:53:02 +01:00
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* lock is acquired by SimpleLruReadPage_ReadOnly */
|
|
|
|
slotno = SimpleLruReadPage_ReadOnly(CommitTsCtl, pageno, xid);
|
|
|
|
memcpy(&entry,
|
|
|
|
CommitTsCtl->shared->page_buffer[slotno] +
|
|
|
|
SizeOfCommitTimestampEntry * entryno,
|
|
|
|
SizeOfCommitTimestampEntry);
|
|
|
|
|
2015-08-21 19:36:54 +02:00
|
|
|
*ts = entry.time;
|
2014-12-03 15:53:02 +01:00
|
|
|
if (nodeid)
|
|
|
|
*nodeid = entry.nodeid;
|
|
|
|
|
|
|
|
LWLockRelease(CommitTsControlLock);
|
|
|
|
return *ts != 0;
|
|
|
|
}
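The lookup path above (range check, then a `memcpy` out of a packed page) can be illustrated with a minimal standalone sketch. Everything prefixed `Sketch`/`sketch_` is hypothetical and not part of PostgreSQL: it assumes an 8192-byte page, an entry made of an 8-byte timestamp plus a 2-byte origin id, and a single in-memory page with no SLRU or locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192		/* assumed page size */

typedef uint32_t SketchXid;

/* Hypothetical entry layout: 8-byte timestamp plus 2-byte origin id. */
typedef struct SketchEntry
{
	int64_t		time;
	uint16_t	nodeid;
} SketchEntry;

/* Only the used bytes are stored, so struct padding never hits the page. */
#define SKETCH_ENTRY_SIZE (offsetof(SketchEntry, nodeid) + sizeof(uint16_t))
#define SKETCH_PER_PAGE   (SKETCH_BLCKSZ / SKETCH_ENTRY_SIZE)

/* Wraparound-aware "a is older than b" for normal xids (modulo 2^32). */
static bool
sketch_xid_precedes(SketchXid a, SketchXid b)
{
	return (int32_t) (a - b) < 0;
}

/* Write one packed entry into a single in-memory page. */
static void
sketch_store(char *page, SketchXid xid, int64_t ts, uint16_t nodeid)
{
	SketchEntry entry;
	size_t		entryno = xid % SKETCH_PER_PAGE;

	assert(entryno < SKETCH_PER_PAGE);
	memset(&entry, 0, sizeof(entry));
	entry.time = ts;
	entry.nodeid = nodeid;
	memcpy(page + SKETCH_ENTRY_SIZE * entryno, &entry, SKETCH_ENTRY_SIZE);
}

/* Returns false if xid is outside [oldest, newest] or has no timestamp. */
static bool
sketch_lookup(const char *page, SketchXid oldest, SketchXid newest,
			  SketchXid xid, int64_t *ts, uint16_t *nodeid)
{
	SketchEntry entry;
	size_t		entryno = xid % SKETCH_PER_PAGE;

	*ts = 0;
	if (nodeid)
		*nodeid = 0;
	if (sketch_xid_precedes(xid, oldest) || sketch_xid_precedes(newest, xid))
		return false;

	memset(&entry, 0, sizeof(entry));
	memcpy(&entry, page + SKETCH_ENTRY_SIZE * entryno, SKETCH_ENTRY_SIZE);
	*ts = entry.time;
	if (nodeid)
		*nodeid = entry.nodeid;
	return *ts != 0;
}
```

Copying through a local struct with `memcpy`, rather than casting the page pointer, avoids unaligned access, since a 10-byte stride does not keep the timestamp 8-byte aligned.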
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Return the Xid of the latest committed transaction. (As far as this module
|
|
|
|
* is concerned, anyway; it's up to the caller to ensure the value is useful
|
|
|
|
* for its purposes.)
|
|
|
|
*
|
|
|
|
* ts and nodeid are filled with the corresponding data; they can be passed
|
|
|
|
* as NULL if not wanted.
|
|
|
|
*/
|
|
|
|
TransactionId
|
2015-04-29 19:30:53 +02:00
|
|
|
GetLatestCommitTsData(TimestampTz *ts, RepOriginId *nodeid)
|
2014-12-03 15:53:02 +01:00
|
|
|
{
|
2015-05-24 03:35:49 +02:00
|
|
|
TransactionId xid;
|
2014-12-03 15:53:02 +01:00
|
|
|
|
2015-10-27 19:06:50 +01:00
|
|
|
LWLockAcquire(CommitTsLock, LW_SHARED);
|
|
|
|
|
2014-12-03 15:53:02 +01:00
|
|
|
/* Error if module not enabled */
|
2015-10-27 19:06:50 +01:00
|
|
|
if (!commitTsShared->commitTsActive)
|
2015-12-03 23:22:31 +01:00
|
|
|
error_commit_ts_disabled();
|
2014-12-03 15:53:02 +01:00
|
|
|
|
|
|
|
xid = commitTsShared->xidLastCommit;
|
|
|
|
if (ts)
|
|
|
|
*ts = commitTsShared->dataLastCommit.time;
|
|
|
|
if (nodeid)
|
|
|
|
*nodeid = commitTsShared->dataLastCommit.nodeid;
|
|
|
|
LWLockRelease(CommitTsLock);
|
|
|
|
|
|
|
|
return xid;
|
|
|
|
}
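The optional out-parameter convention used by GetLatestCommitTsData (pass NULL for values the caller does not need) can be shown in isolation. This is a sketch only: `SketchLastCommit` and `sketch_get_latest` are hypothetical names, and the shared-memory lock taken by the real function is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the "last commit" fields kept in shared memory. */
typedef struct SketchLastCommit
{
	uint32_t	xid;
	int64_t		time;
	uint16_t	nodeid;
} SketchLastCommit;

/*
 * Return the last committed xid.  ts and nodeid are filled only when the
 * caller passes a non-NULL pointer.
 */
static uint32_t
sketch_get_latest(const SketchLastCommit *shared, int64_t *ts, uint16_t *nodeid)
{
	uint32_t	xid = shared->xid;

	if (ts)
		*ts = shared->time;
	if (nodeid)
		*nodeid = shared->nodeid;
	return xid;
}
```

A caller that only wants the timestamp would call `sketch_get_latest(&shared, &ts, NULL)`, which mirrors how pg_last_committed_xact calls GetLatestCommitTsData(&ts, NULL).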
|
|
|
|
|
2015-12-03 23:22:31 +01:00
|
|
|
static void
|
|
|
|
error_commit_ts_disabled(void)
|
|
|
|
{
|
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
|
|
|
|
errmsg("could not get commit timestamp data"),
|
|
|
|
RecoveryInProgress() ?
|
2015-12-11 04:05:27 +01:00
|
|
|
errhint("Make sure the configuration parameter \"%s\" is set on the master server.",
|
2015-12-03 23:22:31 +01:00
|
|
|
"track_commit_timestamp") :
|
|
|
|
errhint("Make sure the configuration parameter \"%s\" is set.",
|
|
|
|
"track_commit_timestamp")));
|
|
|
|
}
|
|
|
|
|
2014-12-03 15:53:02 +01:00
|
|
|
/*
|
|
|
|
* SQL-callable wrapper to obtain commit time of a transaction
|
|
|
|
*/
|
|
|
|
Datum
|
|
|
|
pg_xact_commit_timestamp(PG_FUNCTION_ARGS)
|
|
|
|
{
|
2015-05-24 03:35:49 +02:00
|
|
|
TransactionId xid = PG_GETARG_UINT32(0);
|
|
|
|
TimestampTz ts;
|
|
|
|
bool found;
|
2014-12-03 15:53:02 +01:00
|
|
|
|
|
|
|
found = TransactionIdGetCommitTsData(xid, &ts, NULL);
|
|
|
|
|
|
|
|
if (!found)
|
|
|
|
PG_RETURN_NULL();
|
|
|
|
|
|
|
|
PG_RETURN_TIMESTAMPTZ(ts);
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
Datum
|
|
|
|
pg_last_committed_xact(PG_FUNCTION_ARGS)
|
|
|
|
{
|
2015-05-24 03:35:49 +02:00
|
|
|
TransactionId xid;
|
|
|
|
TimestampTz ts;
|
|
|
|
Datum values[2];
|
|
|
|
bool nulls[2];
|
|
|
|
TupleDesc tupdesc;
|
2014-12-03 15:53:02 +01:00
|
|
|
HeapTuple htup;
|
|
|
|
|
|
|
|
/* Fetch the latest committed xid and its commit timestamp */
|
|
|
|
xid = GetLatestCommitTsData(&ts, NULL);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Construct a tuple descriptor for the result row. This must match this
|
|
|
|
* function's pg_proc entry!
|
|
|
|
*/
|
|
|
|
tupdesc = CreateTemplateTupleDesc(2, false);
|
|
|
|
TupleDescInitEntry(tupdesc, (AttrNumber) 1, "xid",
|
|
|
|
XIDOID, -1, 0);
|
|
|
|
TupleDescInitEntry(tupdesc, (AttrNumber) 2, "timestamp",
|
|
|
|
TIMESTAMPTZOID, -1, 0);
|
|
|
|
tupdesc = BlessTupleDesc(tupdesc);
|
|
|
|
|
|
|
|
if (!TransactionIdIsNormal(xid))
|
|
|
|
{
|
|
|
|
memset(nulls, true, sizeof(nulls));
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
values[0] = TransactionIdGetDatum(xid);
|
|
|
|
nulls[0] = false;
|
|
|
|
|
|
|
|
values[1] = TimestampTzGetDatum(ts);
|
|
|
|
nulls[1] = false;
|
|
|
|
}
|
|
|
|
|
|
|
|
htup = heap_form_tuple(tupdesc, values, nulls);
|
|
|
|
|
|
|
|
PG_RETURN_DATUM(HeapTupleGetDatum(htup));
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Number of shared CommitTS buffers.
|
|
|
|
*
|
|
|
|
* We use logic very similar to that for the number of CLOG buffers; see comments
|
|
|
|
* in CLOGShmemBuffers.
|
|
|
|
*/
|
|
|
|
Size
|
|
|
|
CommitTsShmemBuffers(void)
|
|
|
|
{
|
|
|
|
return Min(16, Max(4, NBuffers / 1024));
|
|
|
|
}
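The sizing rule in CommitTsShmemBuffers is a plain clamp: one buffer per 1024 shared buffers, never fewer than 4 and never more than 16. A small sketch with hypothetical names (`SKETCH_MIN`, `SKETCH_MAX`, `sketch_commit_ts_buffers`), taking the shared-buffer count as a parameter instead of reading the NBuffers global:

```c
#include <stddef.h>

#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Clamp nbuffers / 1024 into the range [4, 16]. */
static size_t
sketch_commit_ts_buffers(int nbuffers)
{
	return (size_t) SKETCH_MIN(16, SKETCH_MAX(4, nbuffers / 1024));
}
```

Assuming 8 kB pages, 128 MB of shared buffers is 16384 buffers, which already reaches the upper bound of 16; anything at or below 4096 buffers gets the minimum of 4.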
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Shared memory sizing for CommitTs
|
|
|
|
*/
|
|
|
|
Size
|
|
|
|
CommitTsShmemSize(void)
|
|
|
|
{
|
|
|
|
return SimpleLruShmemSize(CommitTsShmemBuffers(), 0) +
|
|
|
|
sizeof(CommitTimestampShared);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Initialize CommitTs at system startup (postmaster start or standalone
|
|
|
|
* backend)
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
CommitTsShmemInit(void)
|
|
|
|
{
|
2015-05-24 03:35:49 +02:00
|
|
|
bool found;
|
2014-12-03 15:53:02 +01:00
|
|
|
|
|
|
|
CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
|
2015-11-12 20:59:09 +01:00
|
|
|
SimpleLruInit(CommitTsCtl, "commit_timestamp", CommitTsShmemBuffers(), 0,
|
2016-02-02 12:42:14 +01:00
|
|
|
CommitTsControlLock, "pg_commit_ts",
|
|
|
|
LWTRANCHE_COMMITTS_BUFFERS);
|
2014-12-03 15:53:02 +01:00
|
|
|
|
|
|
|
commitTsShared = ShmemInitStruct("CommitTs shared",
|
|
|
|
sizeof(CommitTimestampShared),
|
|
|
|
&found);
|
|
|
|
|
|
|
|
if (!IsUnderPostmaster)
|
|
|
|
{
|
|
|
|
Assert(!found);
|
|
|
|
|
|
|
|
commitTsShared->xidLastCommit = InvalidTransactionId;
|
|
|
|
TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
|
2015-04-29 19:30:53 +02:00
|
|
|
commitTsShared->dataLastCommit.nodeid = InvalidRepOriginId;
|
2015-10-27 19:06:50 +01:00
|
|
|
commitTsShared->commitTsActive = false;
|
2014-12-03 15:53:02 +01:00
|
|
|
}
|
|
|
|
else
|
|
|
|
Assert(found);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* This function must be called ONCE on system install.
|
|
|
|
*
|
|
|
|
* (The CommitTs directory is assumed to have been created by initdb, and
|
|
|
|
* CommitTsShmemInit must have been called already.)
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
BootStrapCommitTs(void)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Nothing to do here at present, unlike most other SLRU modules; segments
|
2015-05-24 03:35:49 +02:00
|
|
|
* are created when the server is started with this module enabled. See
|
2015-12-03 23:22:31 +01:00
|
|
|
* ActivateCommitTs.
|
2014-12-03 15:53:02 +01:00
|
|
|
*/
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Initialize (or reinitialize) a page of CommitTs to zeroes.
|
2017-08-16 06:22:32 +02:00
|
|
|
* If writeXlog is true, also emit an XLOG record saying we did this.
|
2014-12-03 15:53:02 +01:00
|
|
|
*
|
|
|
|
* The page is not actually written, just set up in shared memory.
|
|
|
|
* The slot number of the new page is returned.
|
|
|
|
*
|
|
|
|
* Control lock must be held at entry, and will be held at exit.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
ZeroCommitTsPage(int pageno, bool writeXlog)
|
|
|
|
{
|
|
|
|
int slotno;
|
|
|
|
|
|
|
|
slotno = SimpleLruZeroPage(CommitTsCtl, pageno);
|
|
|
|
|
|
|
|
if (writeXlog)
|
|
|
|
WriteZeroPageXlogRec(pageno);
|
|
|
|
|
|
|
|
return slotno;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* This must be called ONCE during postmaster or standalone-backend startup,
|
|
|
|
* after StartupXLOG has initialized ShmemVariableCache->nextXid.
|
|
|
|
*/
|
|
|
|
void
|
2015-12-11 18:30:43 +01:00
|
|
|
StartupCommitTs(void)
|
2014-12-03 15:53:02 +01:00
|
|
|
{
|
2015-12-11 18:30:43 +01:00
|
|
|
ActivateCommitTs();
|
2014-12-03 15:53:02 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* This must be called ONCE during postmaster or standalone-backend startup,
|
2015-10-01 20:06:55 +02:00
|
|
|
* after recovery has finished.
|
2015-03-09 21:44:00 +01:00
|
|
|
*/
|
|
|
|
void
|
|
|
|
CompleteCommitTsInitialization(void)
|
|
|
|
{
|
2015-10-01 20:06:55 +02:00
|
|
|
/*
|
|
|
|
* If the feature is not enabled, turn it off for good. This also removes
|
|
|
|
* any leftover data.
|
2015-12-11 18:30:43 +01:00
|
|
|
*
|
|
|
|
* Conversely, we activate the module if the feature is enabled. This is
|
2018-09-26 03:25:54 +02:00
|
|
|
* necessary for primary and standby as the activation depends on the
|
|
|
|
* control file contents at the beginning of recovery or when an
|
|
|
|
* XLOG_PARAMETER_CHANGE is replayed.
|
2015-10-01 20:06:55 +02:00
|
|
|
*/
|
2015-03-09 21:44:00 +01:00
|
|
|
if (!track_commit_timestamp)
|
2015-10-27 19:06:50 +01:00
|
|
|
DeactivateCommitTs();
|
2015-12-11 18:30:43 +01:00
|
|
|
else
|
|
|
|
ActivateCommitTs();
|
2015-03-09 21:44:00 +01:00
|
|
|
}
|
|
|
|
|
2015-10-01 20:06:55 +02:00
|
|
|
/*
|
|
|
|
* Activate or deactivate CommitTs upon reception of an XLOG_PARAMETER_CHANGE
|
2018-09-26 03:25:54 +02:00
|
|
|
* XLog record during recovery.
|
2015-10-01 20:06:55 +02:00
|
|
|
*/
|
|
|
|
void
|
|
|
|
CommitTsParameterChange(bool newvalue, bool oldvalue)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* If the commit_ts module is disabled in this server and we get word from
|
|
|
|
* the master server that it is enabled there, activate it so that we can
|
|
|
|
* replay future WAL records involving it; also mark it as active on
|
|
|
|
* pg_control. If the old value was already set, we already did this, so
|
|
|
|
* don't do anything.
|
|
|
|
*
|
2015-10-02 17:49:01 +02:00
|
|
|
* If the module is disabled in the master, disable it here too, unless
|
|
|
|
* the module is enabled locally.
|
2015-12-11 18:30:43 +01:00
|
|
|
*
|
|
|
|
* Note this only runs in the recovery process, so an unlocked read is
|
|
|
|
* fine.
|
2015-10-01 20:06:55 +02:00
|
|
|
*/
|
|
|
|
if (newvalue)
|
|
|
|
{
|
2015-10-27 19:06:50 +01:00
|
|
|
if (!commitTsShared->commitTsActive)
|
2015-10-01 20:06:55 +02:00
|
|
|
ActivateCommitTs();
|
|
|
|
}
|
2015-10-27 19:06:50 +01:00
|
|
|
else if (commitTsShared->commitTsActive)
|
|
|
|
DeactivateCommitTs();
|
2015-10-01 20:06:55 +02:00
|
|
|
}
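The decision made in CommitTsParameterChange depends only on the replayed new value and on whether the module is currently active. It can be written as a pure function, which makes the four cases easy to check. The enum and function names here are hypothetical; the real code calls ActivateCommitTs() or DeactivateCommitTs() directly instead of returning an action.

```c
#include <stdbool.h>

typedef enum SketchAction
{
	SKETCH_NOOP,
	SKETCH_ACTIVATE,
	SKETCH_DEACTIVATE
} SketchAction;

/*
 * Decide what to do when a parameter-change record is replayed:
 * activate only if newly enabled and not yet active, deactivate only if
 * newly disabled and still active, otherwise do nothing.
 */
static SketchAction
sketch_parameter_change(bool newvalue, bool currently_active)
{
	if (newvalue)
		return currently_active ? SKETCH_NOOP : SKETCH_ACTIVATE;
	return currently_active ? SKETCH_DEACTIVATE : SKETCH_NOOP;
}
```

Both "no change needed" cases return `SKETCH_NOOP`, which is why replaying the same record twice is harmless.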
|
|
|
|
|
2015-03-09 21:44:00 +01:00
|
|
|
/*
|
|
|
|
* Activate this module whenever necessary.
|
2017-02-06 10:33:58 +01:00
|
|
|
* This must happen during postmaster or standalone-backend startup,
|
2015-05-24 03:35:49 +02:00
|
|
|
* or during WAL replay anytime the track_commit_timestamp setting is
|
|
|
|
* changed in the master.
|
2015-03-09 21:44:00 +01:00
|
|
|
*
|
|
|
|
* The reason why this SLRU needs separate activation/deactivation functions is
|
|
|
|
* that it can be enabled/disabled during start and the activation/deactivation
|
2017-08-07 23:42:47 +02:00
|
|
|
* on master is propagated to standby via replay. Other SLRUs don't have this
|
2015-03-09 21:44:00 +01:00
|
|
|
* property and they can be just initialized during normal startup.
|
2014-12-03 15:53:02 +01:00
|
|
|
*
|
|
|
|
* This is in charge of creating the currently active segment, if it's not
|
|
|
|
* already there. The reason for this is that the server might have been
|
|
|
|
* running with this module disabled for a while and thus might have skipped
|
|
|
|
* the normal creation point.
|
|
|
|
*/
|
2015-10-01 20:06:55 +02:00
|
|
|
static void
|
2015-03-09 21:44:00 +01:00
|
|
|
ActivateCommitTs(void)
|
Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 15:53:02 +01:00
|
|
|
{
|
2015-12-11 18:30:43 +01:00
|
|
|
TransactionId xid;
|
|
|
|
int pageno;
|
|
|
|
|
|
|
|
/* If we've done this already, there's nothing to do */
|
|
|
|
LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
|
|
|
|
if (commitTsShared->commitTsActive)
|
|
|
|
{
|
|
|
|
LWLockRelease(CommitTsLock);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
LWLockRelease(CommitTsLock);
|
|
|
|
|
|
|
|
xid = ShmemVariableCache->nextXid;
|
|
|
|
pageno = TransactionIdToCTsPage(xid);
|
Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 15:53:02 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Re-Initialize our idea of the latest page number.
|
|
|
|
*/
|
|
|
|
LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);
|
|
|
|
CommitTsCtl->shared->latest_page_number = pageno;
|
|
|
|
LWLockRelease(CommitTsControlLock);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If CommitTs is enabled, but it wasn't in the previous server run, we
|
|
|
|
* need to set the oldest and newest values to the next Xid; that way, we
|
|
|
|
* will not try to read data that might not have been set.
|
|
|
|
*
|
|
|
|
* XXX does this have a problem if a server is started with commitTs
|
|
|
|
* enabled, then started with commitTs disabled, then restarted with it
|
|
|
|
* enabled again? It doesn't look like it does, because there should be a
|
|
|
|
* checkpoint that sets the value to InvalidTransactionId at end of
|
|
|
|
* recovery; and so any chance of injecting new transactions without
|
2015-12-28 21:34:11 +01:00
|
|
|
* CommitTs values would occur after the oldestCommitTsXid has been set to
|
Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 15:53:02 +01:00
|
|
|
* Invalid temporarily.
|
|
|
|
*/
|
|
|
|
LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
|
2015-12-28 21:34:11 +01:00
|
|
|
if (ShmemVariableCache->oldestCommitTsXid == InvalidTransactionId)
|
Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 15:53:02 +01:00
|
|
|
{
|
2015-12-28 21:34:11 +01:00
|
|
|
ShmemVariableCache->oldestCommitTsXid =
|
|
|
|
ShmemVariableCache->newestCommitTsXid = ReadNewTransactionId();
|
Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 15:53:02 +01:00
|
|
|
}
|
|
|
|
LWLockRelease(CommitTsLock);
|
|
|
|
|
2015-10-27 19:06:50 +01:00
|
|
|
/* Create the current segment file, if necessary */
|
Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 15:53:02 +01:00
|
|
|
if (!SimpleLruDoesPhysicalPageExist(CommitTsCtl, pageno))
|
|
|
|
{
|
2015-05-24 03:35:49 +02:00
|
|
|
int slotno;
|
Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 15:53:02 +01:00
|
|
|
|
|
|
|
LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);
|
|
|
|
slotno = ZeroCommitTsPage(pageno, false);
|
|
|
|
SimpleLruWritePage(CommitTsCtl, slotno);
|
|
|
|
Assert(!CommitTsCtl->shared->page_dirty[slotno]);
|
|
|
|
LWLockRelease(CommitTsControlLock);
|
|
|
|
}
|
2015-10-01 20:06:55 +02:00
|
|
|
|
2015-10-27 19:06:50 +01:00
|
|
|
/* Change the activation status in shared memory. */
|
|
|
|
LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
|
|
|
|
commitTsShared->commitTsActive = true;
|
|
|
|
LWLockRelease(CommitTsLock);
|
Keep track of transaction commit timestamps
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
2014-12-03 15:53:02 +01:00
|
|
|
}

/*
 * Deactivate this module.
 *
 * This must be called when the track_commit_timestamp parameter is turned off.
 * This happens during postmaster or standalone-backend startup, or during WAL
 * replay.
 *
 * Resets CommitTs into invalid state to make sure we don't hand back
 * possibly-invalid data; also removes segments of old data.
 */
static void
DeactivateCommitTs(void)
{
	/*
	 * Cleanup the status in the shared memory.
	 *
	 * We reset everything in the commitTsShared record to prevent user from
	 * getting confusing data about last committed transaction on the standby
	 * when the module was activated repeatedly on the primary.
	 */
	LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);

	commitTsShared->commitTsActive = false;
	commitTsShared->xidLastCommit = InvalidTransactionId;
	TIMESTAMP_NOBEGIN(commitTsShared->dataLastCommit.time);
	commitTsShared->dataLastCommit.nodeid = InvalidRepOriginId;

	ShmemVariableCache->oldestCommitTsXid = InvalidTransactionId;
	ShmemVariableCache->newestCommitTsXid = InvalidTransactionId;

	LWLockRelease(CommitTsLock);

	/*
	 * Remove *all* files. This is necessary so that there are no leftover
	 * files; in the case where this feature is later enabled after running
	 * with it disabled for some time there may be a gap in the file sequence.
	 * (We can probably tolerate out-of-sequence files, as they are going to
	 * be overwritten anyway when we wrap around, but it seems better to be
	 * tidy.)
	 */
	LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);
	(void) SlruScanDirectory(CommitTsCtl, SlruScanDirCbDeleteAll, NULL);
	LWLockRelease(CommitTsControlLock);
}

/*
 * This must be called ONCE during postmaster or standalone-backend shutdown
 */
void
ShutdownCommitTs(void)
{
	/* Flush dirty CommitTs pages to disk */
	SimpleLruFlush(CommitTsCtl, false);

	/*
	 * fsync pg_commit_ts to ensure that any files flushed previously are
	 * durably on disk.
	 */
	fsync_fname("pg_commit_ts", true);
}

/*
 * Perform a checkpoint --- either during shutdown, or on-the-fly
 */
void
CheckPointCommitTs(void)
{
	/* Flush dirty CommitTs pages to disk */
	SimpleLruFlush(CommitTsCtl, true);

	/*
	 * fsync pg_commit_ts to ensure that any files flushed previously are
	 * durably on disk.
	 */
	fsync_fname("pg_commit_ts", true);
}

/*
 * Make sure that CommitTs has room for a newly-allocated XID.
 *
 * NB: this is called while holding XidGenLock. We want it to be very fast
 * most of the time; even when it's not so fast, no actual I/O need happen
 * unless we're forced to write out a dirty CommitTs or xlog page to make room
 * in shared memory.
 *
 * NB: the current implementation relies on track_commit_timestamp being
 * PGC_POSTMASTER.
 */
void
ExtendCommitTs(TransactionId newestXact)
{
	int			pageno;

	/*
	 * Nothing to do if module not enabled. Note we do an unlocked read of
	 * the flag here, which is okay because this routine is only called from
	 * GetNewTransactionId, which is never called in a standby.
	 */
	Assert(!InRecovery);
	if (!commitTsShared->commitTsActive)
		return;

	/*
	 * No work except at first XID of a page. But beware: just after
	 * wraparound, the first XID of page zero is FirstNormalTransactionId.
	 */
	if (TransactionIdToCTsEntry(newestXact) != 0 &&
		!TransactionIdEquals(newestXact, FirstNormalTransactionId))
		return;

	pageno = TransactionIdToCTsPage(newestXact);

	LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);

	/* Zero the page and make an XLOG entry about it */
	ZeroCommitTsPage(pageno, !InRecovery);

	LWLockRelease(CommitTsControlLock);
}

/*
 * Remove all CommitTs segments before the one holding the passed
 * transaction ID.
 *
 * Note that we don't need to flush XLOG here.
 */
void
TruncateCommitTs(TransactionId oldestXact)
{
	int			cutoffPage;

	/*
	 * The cutoff point is the start of the segment containing oldestXact. We
	 * pass the *page* containing oldestXact to SimpleLruTruncate.
	 */
	cutoffPage = TransactionIdToCTsPage(oldestXact);

	/* Check to see if there's any files that could be removed */
	if (!SlruScanDirectory(CommitTsCtl, SlruScanDirCbReportPresence,
						   &cutoffPage))
		return;					/* nothing to remove */

	/* Write XLOG record */
Fix race condition in reading commit timestamps
If a user requests the commit timestamp for a transaction old enough
that its data is concurrently being truncated away by vacuum at just the
right time, they would receive an ugly internal file-not-found error
message from slru.c rather than the expected NULL return value.
In a primary server, the window for the race is very small: the lookup
has to occur exactly between the two calls by vacuum, and there's not a
lot that happens between them (mostly just a multixact truncate). In a
standby server, however, the window is larger because the truncation is
executed as soon as the WAL record for it is replayed, but the advance
of the oldest-Xid is not executed until the next checkpoint record.
To fix in the primary, simply reverse the order of operations in
vac_truncate_clog. To fix in the standby, augment the WAL truncation
record so that the standby is aware of the new oldest-XID value and can
apply the update immediately. WAL version bumped because of this.
No backpatch, because of the low importance of the bug and its rarity.
Author: Craig Ringer
Reviewed-By: Petr Jelínek, Peter Eisentraut
Discussion: https://postgr.es/m/CAMsr+YFhVtRQT1VAwC+WGbbxZZRzNou=N9Ed-FrCqkwQ8H8oJQ@mail.gmail.com
2017-01-19 22:23:09 +01:00
	WriteTruncateXlogRec(cutoffPage, oldestXact);

	/* Now we can remove the old CommitTs segment(s) */
	SimpleLruTruncate(CommitTsCtl, cutoffPage);
}

/*
 * Set the limit values between which commit TS can be consulted.
 */
void
SetCommitTsLimit(TransactionId oldestXact, TransactionId newestXact)
{
	/*
	 * Be careful not to overwrite values that are either further into the
	 * "future" or signal a disabled committs.
	 */
	LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
	if (ShmemVariableCache->oldestCommitTsXid != InvalidTransactionId)
	{
		if (TransactionIdPrecedes(ShmemVariableCache->oldestCommitTsXid, oldestXact))
			ShmemVariableCache->oldestCommitTsXid = oldestXact;
		if (TransactionIdPrecedes(newestXact, ShmemVariableCache->newestCommitTsXid))
			ShmemVariableCache->newestCommitTsXid = newestXact;
	}
	else
	{
		Assert(ShmemVariableCache->newestCommitTsXid == InvalidTransactionId);
		ShmemVariableCache->oldestCommitTsXid = oldestXact;
		ShmemVariableCache->newestCommitTsXid = newestXact;
	}
	LWLockRelease(CommitTsLock);
}

/*
 * Move forwards the oldest commitTS value that can be consulted
 */
void
AdvanceOldestCommitTsXid(TransactionId oldestXact)
{
	LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
	if (ShmemVariableCache->oldestCommitTsXid != InvalidTransactionId &&
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
		TransactionIdPrecedes(ShmemVariableCache->oldestCommitTsXid, oldestXact))
		ShmemVariableCache->oldestCommitTsXid = oldestXact;
	LWLockRelease(CommitTsLock);
}


/*
 * Decide which of two commitTS page numbers is "older" for truncation
 * purposes.
 *
 * We need to use comparison of TransactionIds here in order to do the right
 * thing with wraparound XID arithmetic. However, if we are asked about
 * page number zero, we don't want to hand InvalidTransactionId to
 * TransactionIdPrecedes: it'll get weird about permanent xact IDs. So,
 * offset both xids by FirstNormalTransactionId to avoid that.
 */
static bool
CommitTsPagePrecedes(int page1, int page2)
{
	TransactionId xid1;
	TransactionId xid2;

	xid1 = ((TransactionId) page1) * COMMIT_TS_XACTS_PER_PAGE;
	xid1 += FirstNormalTransactionId;
	xid2 = ((TransactionId) page2) * COMMIT_TS_XACTS_PER_PAGE;
	xid2 += FirstNormalTransactionId;

	return TransactionIdPrecedes(xid1, xid2);
}


/*
 * Write a ZEROPAGE xlog record
 */
static void
WriteZeroPageXlogRec(int pageno)
{
	XLogBeginInsert();
	XLogRegisterData((char *) (&pageno), sizeof(int));
	(void) XLogInsert(RM_COMMIT_TS_ID, COMMIT_TS_ZEROPAGE);
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Write a TRUNCATE xlog record
|
|
|
|
*/
|
|
|
|
static void
|
Fix race condition in reading commit timestamps
If a user requests the commit timestamp for a transaction old enough
that its data is concurrently being truncated away by vacuum at just the
right time, they would receive an ugly internal file-not-found error
message from slru.c rather than the expected NULL return value.
In a primary server, the window for the race is very small: the lookup
has to occur exactly between the two calls by vacuum, and there's not a
lot that happens between them (mostly just a multixact truncate). In a
standby server, however, the window is larger because the truncation is
executed as soon as the WAL record for it is replayed, but the advance
of the oldest-Xid is not executed until the next checkpoint record.
To fix in the primary, simply reverse the order of operations in
vac_truncate_clog. To fix in the standby, augment the WAL truncation
record so that the standby is aware of the new oldest-XID value and can
apply the update immediately. WAL version bumped because of this.
No backpatch, because of the low importance of the bug and its rarity.
Author: Craig Ringer
Reviewed-By: Petr Jelínek, Peter Eisentraut
Discussion: https://postgr.es/m/CAMsr+YFhVtRQT1VAwC+WGbbxZZRzNou=N9Ed-FrCqkwQ8H8oJQ@mail.gmail.com
2017-01-19 22:23:09 +01:00
WriteTruncateXlogRec(int pageno, TransactionId oldestXid)
{
	xl_commit_ts_truncate xlrec;

	xlrec.pageno = pageno;
	xlrec.oldestXid = oldestXid;

	XLogBeginInsert();
	XLogRegisterData((char *) (&xlrec), SizeOfCommitTsTruncate);
	(void) XLogInsert(RM_COMMIT_TS_ID, COMMIT_TS_TRUNCATE);
}

/*
 * Write a SETTS xlog record
 */
static void
WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
						 TransactionId *subxids, TimestampTz timestamp,
Introduce replication progress tracking infrastructure.
When implementing a replication solution ontop of logical decoding, two
related problems exist:
* How to safely keep track of replication progress
* How to change replication behavior, based on the origin of a row;
e.g. to avoid loops in bi-directional replication setups
The solution to these problems, as implemented here, consist out of
three parts:
1) 'replication origins', which identify nodes in a replication setup.
2) 'replication progress tracking', which remembers, for each
replication origin, how far replay has progressed in a efficient and
crash safe manner.
3) The ability to filter out changes performed on the behest of a
replication origin during logical decoding; this allows complex
replication topologies. E.g. by filtering all replayed changes out.
Most of this could also be implemented in "userspace", e.g. by inserting
additional rows contain origin information, but that ends up being much
less efficient and more complicated. We don't want to require various
replication solutions to reimplement logic for this independently. The
infrastructure is intended to be generic enough to be reusable.
This infrastructure also replaces the 'nodeid' infrastructure of commit
timestamps. It is intended to provide all the former capabilities,
except that there's only 2^16 different origins; but now they integrate
with logical decoding. Additionally more functionality is accessible via
SQL. Since the commit timestamp infrastructure has also been introduced
in 9.5 (commit 73c986add) changing the API is not a problem.
For now the number of origins for which the replication progress can be
tracked simultaneously is determined by the max_replication_slots
GUC. That GUC is not a perfect match to configure this, but there
doesn't seem to be sufficient reason to introduce a separate new one.
Bumps both catversion and wal page magic.
Author: Andres Freund, with contributions from Petr Jelinek and Craig Ringer
Reviewed-By: Heikki Linnakangas, Petr Jelinek, Robert Haas, Steve Singer
Discussion: 20150216002155.GI15326@awork2.anarazel.de,
20140923182422.GA15776@alap3.anarazel.de,
20131114172632.GE7522@alap2.anarazel.de
2015-04-29 19:30:53 +02:00
						 RepOriginId nodeid)
{
	xl_commit_ts_set record;

	record.timestamp = timestamp;
	record.nodeid = nodeid;
	record.mainxid = mainxid;

	XLogBeginInsert();
	XLogRegisterData((char *) &record,
					 offsetof(xl_commit_ts_set, mainxid) +
					 sizeof(TransactionId));
	XLogRegisterData((char *) subxids, nsubxids * sizeof(TransactionId));
	XLogInsert(RM_COMMIT_TS_ID, COMMIT_TS_SETTS);
}
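
The function above registers a fixed header (everything up to and including mainxid, measured with offsetof so trailing struct padding is left out) followed by the array of subtransaction XIDs. The sketch below imitates that layout in a plain byte buffer and shows how a reader can recover the subtransaction count from the total length alone. All names (`set_record`, `SET_RECORD_HEADER`, `encode_set_record`, `decode_nsubxids`) are hypothetical, and the field types are assumptions chosen for illustration; this is not the WAL API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t xid_t;			/* hypothetical stand-in for TransactionId */

/* Hypothetical stand-in for xl_commit_ts_set; field types are assumptions. */
typedef struct
{
	int64_t		timestamp;
	uint16_t	nodeid;
	xid_t		mainxid;
	/* subtransaction XIDs follow in the record payload */
} set_record;

/* Header length up to and including mainxid, excluding trailing padding. */
#define SET_RECORD_HEADER (offsetof(set_record, mainxid) + sizeof(xid_t))

/*
 * Concatenate the header and the subxid array into buf and return the total
 * length.  The caller must provide a buffer that is large enough.
 */
static size_t
encode_set_record(unsigned char *buf, const set_record *rec,
				  const xid_t *subxids, int nsubxids)
{
	memcpy(buf, rec, SET_RECORD_HEADER);
	if (nsubxids > 0)
		memcpy(buf + SET_RECORD_HEADER, subxids,
			   (size_t) nsubxids * sizeof(xid_t));
	return SET_RECORD_HEADER + (size_t) nsubxids * sizeof(xid_t);
}

/* Recover the number of subxids from the record length alone. */
static int
decode_nsubxids(size_t record_len)
{
	return (int) ((record_len - SET_RECORD_HEADER) / sizeof(xid_t));
}
```

Because the count is never stored explicitly, the header length used by the writer and the reader has to agree exactly, which is why a single macro serves both sides in this sketch.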

/*
 * CommitTS resource manager's routines
 */
void
commit_ts_redo(XLogReaderState *record)
{
	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;

	/* Backup blocks are not used in commit_ts records */
	Assert(!XLogRecHasAnyBlockRefs(record));

	if (info == COMMIT_TS_ZEROPAGE)
	{
		int			pageno;
		int			slotno;

		memcpy(&pageno, XLogRecGetData(record), sizeof(int));

		LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);

		slotno = ZeroCommitTsPage(pageno, false);
		SimpleLruWritePage(CommitTsCtl, slotno);
		Assert(!CommitTsCtl->shared->page_dirty[slotno]);

		LWLockRelease(CommitTsControlLock);
	}
	else if (info == COMMIT_TS_TRUNCATE)
	{
		xl_commit_ts_truncate *trunc = (xl_commit_ts_truncate *) XLogRecGetData(record);

		AdvanceOldestCommitTsXid(trunc->oldestXid);

		/*
		 * During XLOG replay, latest_page_number isn't set up yet; insert a
		 * suitable value to bypass the sanity test in SimpleLruTruncate.
		 */
		CommitTsCtl->shared->latest_page_number = trunc->pageno;

		SimpleLruTruncate(CommitTsCtl, trunc->pageno);
	}
	else if (info == COMMIT_TS_SETTS)
	{
		xl_commit_ts_set *setts = (xl_commit_ts_set *) XLogRecGetData(record);
		int			nsubxids;
		TransactionId *subxids;

		nsubxids = ((XLogRecGetDataLen(record) - SizeOfCommitTsSet) /
					sizeof(TransactionId));
		if (nsubxids > 0)
		{
			subxids = palloc(sizeof(TransactionId) * nsubxids);
			memcpy(subxids,
				   XLogRecGetData(record) + SizeOfCommitTsSet,
				   sizeof(TransactionId) * nsubxids);
		}
		else
			subxids = NULL;

		TransactionTreeSetCommitTsData(setts->mainxid, nsubxids, subxids,
									   setts->timestamp, setts->nodeid, true);
		if (subxids)
			pfree(subxids);
	}
	else
		elog(PANIC, "commit_ts_redo: unknown op code %u", info);
}
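
The redo routine above first masks the record's info byte and then dispatches on what is left. The standalone sketch below illustrates that masking step only. The constants are assumptions made for illustration (a 0x0F mask for the bits reserved to the WAL machinery, and opcodes 0x00, 0x10 and 0x20 in the high bits), and `classify` and `redo_action` are hypothetical names. Unlike the real routine, which raises PANIC on an unknown opcode, the sketch returns a sentinel value.

```c
#include <stdint.h>

#define INFO_MASK   0x0F	/* assumption: low bits reserved for WAL machinery */
#define OP_ZEROPAGE 0x00	/* assumption: illustrative opcode values */
#define OP_TRUNCATE 0x10
#define OP_SETTS    0x20

typedef enum
{
	ACT_ZEROPAGE,
	ACT_TRUNCATE,
	ACT_SETTS,
	ACT_UNKNOWN
} redo_action;

/* Strip the reserved low bits, then map the remaining opcode to an action. */
static redo_action
classify(uint8_t raw_info)
{
	uint8_t		info = raw_info & (uint8_t) ~INFO_MASK;

	if (info == OP_ZEROPAGE)
		return ACT_ZEROPAGE;
	else if (info == OP_TRUNCATE)
		return ACT_TRUNCATE;
	else if (info == OP_SETTS)
		return ACT_SETTS;
	return ACT_UNKNOWN;
}
```

Masking before the comparison means that flag bits set in the low nibble do not change which branch is taken.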