Fix corruption of templates after CREATE DATABASE .. STRATEGY WAL_LOG

WAL_LOG scans the template's pg_class to determine the set of relations
that need to be copied from the template database to the new one.
However, as coded in 9c08aea, this copy strategy loaded the pages of
pg_class without treating it as a permanent relation, so the loaded
pages would never be flushed when they should be.  Any later
modification of the template's pg_class, mostly through DDLs, would then
be missed, causing corruption.

STRATEGY = WAL_LOG has been the default over FILE_COPY since it was
introduced, so any changes made to pg_class on a template database could
be lost.  Updates of template databases should be rare, so the impact of
this bug should hopefully be limited.  The pre-15 default strategy,
FILE_COPY, is safe and can be used as a workaround.

Ryo Matsumura found and analyzed the issue, and Nathan Bossart wrote a
test able to reproduce the failure (with a few tweaks from me).

Backpatch down to 15, where STRATEGY = WAL_LOG was introduced.

Author: Nathan Bossart, Ryo Matsumura
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/TYCPR01MB6868677E499C9AD5123084B5E8A39@TYCPR01MB6868.jpnprd01.prod.outlook.com
Backpatch-through: 15
Michael Paquier 2023-02-22 10:14:52 +09:00
parent b3e184a5d4
commit 8a8661828a
3 changed files with 47 additions and 1 deletions

@@ -296,7 +296,7 @@ ScanSourceDatabasePgClass(Oid tbid, Oid dbid, char *srcpath)
 		CHECK_FOR_INTERRUPTS();
 		buf = ReadBufferWithoutRelcache(rlocator, MAIN_FORKNUM, blkno,
-										RBM_NORMAL, bstrategy, false);
+										RBM_NORMAL, bstrategy, true);
 		LockBuffer(buf, BUFFER_LOCK_SHARE);
 		page = BufferGetPage(buf);

@@ -39,6 +39,7 @@ tests += {
       't/031_recovery_conflict.pl',
       't/032_relfilenode_reuse.pl',
       't/033_replay_tsp_drops.pl',
+      't/034_create_database.pl',
     ],
   },
 }

@@ -0,0 +1,45 @@
+# Copyright (c) 2023, PostgreSQL Global Development Group
+# Test WAL replay for CREATE DATABASE .. STRATEGY WAL_LOG.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init;
+$node->start;
+# This checks that any DDLs run on the template database that modify pg_class
+# are persisted after creating a database from it using the WAL_LOG strategy,
+# as a direct copy of the template database's pg_class is used in this case.
+my $db_template = "template1";
+my $db_new = "test_db_1";
+# Create table. It should persist on the template database.
+$node->safe_psql("postgres",
+	"CREATE DATABASE $db_new STRATEGY WAL_LOG TEMPLATE $db_template;");
+$node->safe_psql($db_template, "CREATE TABLE tab_db_after_create_1 (a INT);");
+# Flush the changes affecting the template database, then replay them.
+$node->safe_psql("postgres", "CHECKPOINT;");
+$node->stop('immediate');
+$node->start;
+my $result = $node->safe_psql($db_template,
+	"SELECT count(*) FROM pg_class WHERE relname LIKE 'tab_db_%';");
+is($result, "1",
+	"check that table exists on template after crash, with checkpoint");
+# The new database should have no tables.
+$result = $node->safe_psql($db_new,
+	"SELECT count(*) FROM pg_class WHERE relname LIKE 'tab_db_%';");
+is($result, "0",
+	"check that there are no tables from template on new database after crash"
+);
+done_testing();