postgresql/src
Michael Paquier 8a4237908c Fix snapshot builds during promotion of hot standby node with 2PC
Some specific logic is done at the end of recovery when involving 2PC
transactions:
1) Call RecoverPreparedTransactions(), to recover the state of 2PC
transactions into memory (re-acquire locks, etc.).
2) ShutdownRecoveryTransactionEnvironment(), to move back to normal
operations, mainly cleaning up recovery locks and KnownAssignedXids
(including any 2PC transaction tracked previously).
3) Switch XLogCtl->SharedRecoveryState to RECOVERY_STATE_DONE, which is
the tipping point for any process calling RecoveryInProgress() to check
if the cluster is still in recovery or not.

Any snapshot taken between steps 2) and 3) would be empty, causing any
transaction relying on a snapshot at this point to potentially corrupt
data as there could still be some 2PC transactions to track, with
RecentXmin moving backwards on successive calls to GetSnapshotData() in
the same transaction.

As SharedRecoveryState is the point to take into account to know if it
is safe to discard KnownAssignedXids, this commit moves step 2) after
step 3), so as we can never finish with empty snapshots.

This exists since the introduction of hot standby, so backpatch all the
way down.  The window with incorrect snapshots is extremely small, but I
have seen it when running 023_pitr_prepared_xact.pl, as did buildfarm
member fairywren.  Thomas Munro also found it independently.  Special
thanks to Andres Freund for taking the time to analyze this issue.

Reported-by: Thomas Munro, Michael Paquier
Analyzed-by: Andres Freund
Discussion: https://postgr.es/m/20210422203603.fdnh3fu2mmfp2iov@alap3.anarazel.de
Backpatch-through: 9.6
2021-10-04 14:05:20 +09:00
..
backend Fix snapshot builds during promotion of hot standby node with 2PC 2021-10-04 14:05:20 +09:00
bin Update our mapping of Windows time zone names using CLDR info. 2021-10-02 16:05:42 -04:00
common Fix memory leak in pg_hmac 2021-10-01 22:47:05 +02:00
fe_utils Skip trailing whitespaces when parsing integer options 2021-07-27 10:39:05 +09:00
include Fix checking of query type in plpgsql's RETURN QUERY command. 2021-10-03 13:21:20 -04:00
interfaces Clear conn->errorMessage at successful completion of PQconnectdb(). 2021-09-13 16:53:11 -04:00
makefiles Add NO_INSTALL option to pgxs 2021-05-27 13:58:29 +02:00
pl Fix checking of query type in plpgsql's RETURN QUERY command. 2021-10-03 13:21:20 -04:00
port Treat ETIMEDOUT as indicating a non-recoverable connection failure. 2021-09-30 14:16:08 -04:00
template Further tweaking of PG_SYSROOT heuristics for macOS. 2021-01-20 12:07:23 -05:00
test Fix checking of query type in plpgsql's RETURN QUERY command. 2021-10-03 13:21:20 -04:00
timezone Update time zone data files to tzdata release 2021a. 2021-01-24 16:29:47 -05:00
tools Fix WAL replay in presence of an incomplete record 2021-09-29 11:21:51 -03:00
tutorial doc: Prefer explicit JOIN syntax over old implicit syntax in tutorial 2021-04-08 10:51:26 +02:00
.gitignore
DEVELOPERS
Makefile Remove the option to build thread_test.c outside configure. 2020-10-21 12:08:48 -04:00
Makefile.global.in Update Unicode data to Unicode 14.0.0 2021-09-15 08:16:44 +02:00
Makefile.shlib AIX: Fix missing libpq symbols by respecting SHLIB_EXPORTS. 2021-09-06 11:27:59 -07:00
nls-global.mk Add errhint_plural() function and make use of it 2021-03-31 09:16:25 +02:00