postgresql/contrib/test_decoding/specs/catalog_change_snapshot.spec
Amit Kapila 16b1fe0037 Fix assertion failures while processing NEW_CID record in logical decoding.
When logical decoding restarts from a NEW_CID record, there is not yet any
association between the top-level transaction and its subtransaction, so both
are created as top-level transactions and end up with the same LSN. This
caused the assertion failure in AssertTXNLsnOrder().

This patch skips the assertion check until we reach the LSN at which we start
decoding the contents of the transaction, i.e. the start_decoding_at LSN in
SnapBuild. This is okay because we do not guarantee the association between a
top-level transaction and its subtransaction until we try to decode the actual
contents of the transaction. The ordering of the records prior to the
start_decoding_at LSN should have been checked before the restart.
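
A minimal sketch of this first fix, with approximate names (not necessarily the
exact patch; it assumes ctx is the LogicalDecodingContext reachable through
rb->private_data):

    static void
    AssertTXNLsnOrder(ReorderBuffer *rb)
    {
    #ifdef USE_ASSERT_CHECKING
        LogicalDecodingContext *ctx = rb->private_data;

        /*
         * Skip the check while we are still before start_decoding_at: until
         * that point a subtransaction may be entered as a top-level
         * transaction with the same LSN as its parent, which would trip the
         * LSN-ordering assertion below.
         */
        if (SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr))
            return;

        /* ... existing checks that rb->toplevel_by_lsn is ordered by LSN ... */
    #endif
    }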

The other assertion failure occurred because, when one of its subtransactions
was marked as containing a catalog change, we forgot to also track the
top-level transaction id in the list of committed catalog-changing
transactions.
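
The gist of that second fix, again as a hedged sketch with approximate field
and helper names: when a known subtransaction is flagged as containing catalog
changes, flag and track its top-level transaction as well.

    void
    ReorderBufferXidSetCatalogChanges(ReorderBuffer *rb, TransactionId xid,
                                      XLogRecPtr lsn)
    {
        ReorderBufferTXN *txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);

        if (!rbtxn_has_catalog_changes(txn))
        {
            txn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
            dlist_push_tail(&rb->catchange_txns, &txn->catchange_node);
        }

        /*
         * Also remember the top-level transaction in the list of catalog
         * changing transactions; forgetting this is what led to the second
         * assertion failure.
         */
        if (rbtxn_is_known_subxact(txn))
        {
            ReorderBufferTXN *toptxn = ReorderBufferTXNByXid(rb, txn->toplevel_xid,
                                                             true, NULL, lsn, true);

            if (!rbtxn_has_catalog_changes(toptxn))
            {
                toptxn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
                dlist_push_tail(&rb->catchange_txns, &toptxn->catchange_node);
            }
        }
    }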

Reported-by: Tomas Vondra, Osumi Takamichi
Author: Masahiko Sawada, Kuroda Hayato
Reviewed-by: Amit Kapila, Dilip Kumar, Kuroda Hayato, Kyotaro Horiguchi, Masahiko Sawada
Backpatch-through: 10
Discussion: https://postgr.es/m/a89b46b6-0239-2fd5-71a9-b19b1f7a7145%40enterprisedb.com
Discussion: https://postgr.es/m/TYCPR01MB83733C6CEAE47D0280814D5AED7A9%40TYCPR01MB8373.jpnprd01.prod.outlook.com
2022-10-20 08:49:48 +05:30

# Test decoding only the commit record of the transaction that has
# modified catalogs.
setup
{
    DROP TABLE IF EXISTS tbl1;
    DROP TABLE IF EXISTS tbl2;
    CREATE TABLE tbl1 (val1 integer, val2 integer);
    CREATE TABLE tbl2 (val1 integer, val2 integer);
    CREATE TABLE user_cat (val1 integer) WITH (user_catalog_table = true);
}

teardown
{
    DROP TABLE tbl1;
    DROP TABLE tbl2;
    DROP TABLE user_cat;
    SELECT 'stop' FROM pg_drop_replication_slot('isolation_slot');
}

session "s0"
setup { SET synchronous_commit=on; }
step "s0_init" { SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); }
step "s0_begin" { BEGIN; }
step "s0_savepoint" { SAVEPOINT sp1; }
step "s0_truncate" { TRUNCATE tbl1; }
step "s0_insert" { INSERT INTO tbl1 VALUES (1); }
step "s0_insert2" { INSERT INTO user_cat VALUES (1); }
step "s0_commit" { COMMIT; }
session "s1"
setup { SET synchronous_commit=on; }
step "s1_checkpoint" { CHECKPOINT; }
step "s1_get_changes" { SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0'); }
session "s2"
setup { SET synchronous_commit=on; }
step "s2_begin" { BEGIN; }
step "s2_truncate" { TRUNCATE tbl2; }
step "s2_commit" { COMMIT; }

# For the transaction that TRUNCATEd the table tbl1, the last decoding decodes
# only its COMMIT record, because decoding starts from the RUNNING_XACTS record
# emitted during the first checkpoint execution. This transaction must be marked
# as containing catalog changes while decoding the COMMIT record, and the
# decoding of the INSERT record must read pg_class with the correct historic
# snapshot.
#
# Note that in the case where the bgwriter writes the RUNNING_XACTS record
# between "s0_commit" and "s0_begin", this scenario doesn't arise, because
# decoding then starts from the RUNNING_XACTS record written by the bgwriter.
# One might think we could either stop the bgwriter or increase
# LOG_SNAPSHOT_INTERVAL_MS, but neither is practical in these tests.
permutation "s0_init" "s0_begin" "s0_savepoint" "s0_truncate" "s1_checkpoint" "s1_get_changes" "s0_commit" "s0_begin" "s0_insert" "s1_checkpoint" "s1_get_changes" "s0_commit" "s1_get_changes"

# Test that we can purge the old catalog modifying transactions after restoring
# them from the serialized snapshot. The first checkpoint serializes the list of
# two catalog modifying xacts. The purpose of the second checkpoint is to allow
# partial pruning of the list of catalog modifying xacts. The third checkpoint
# followed by get_changes establishes a restart_point at the first checkpoint's
# LSN. The last get_changes starts decoding from the first checkpoint, which
# restores the list of catalog modifying xacts; while decoding the second
# checkpoint record it prunes one of the xacts in that list, and when decoding
# the next checkpoint it prunes the list completely.
permutation "s0_init" "s0_begin" "s0_truncate" "s2_begin" "s2_truncate" "s1_checkpoint" "s1_get_changes" "s0_commit" "s0_begin" "s0_insert" "s1_checkpoint" "s1_get_changes" "s2_commit" "s1_checkpoint" "s1_get_changes" "s0_commit" "s1_get_changes"

# Test that we can handle the case where there is no association between the
# top-level transaction and its subtransactions. The last decoding restarts from
# the first checkpoint, decodes NEW_CID generated by "s0_insert2", and marks the
# subtransaction as containing catalog changes while adding tuple cids to its
# top-level transaction. During that, both transaction entries are created in
# ReorderBuffer as top-level transactions and have the same LSN. We check that
# the assertion on the order of transaction LSNs in AssertTXNLsnOrder() is
# skipped, since we are still before the LSN at which we start replaying the
# contents of transactions. In addition, when decoding the commit record of the
# top-level transaction, we must force the top-level transaction to do
# timetravel since one of its subtransactions has been marked as containing
# catalog changes.
permutation "s0_init" "s0_begin" "s0_savepoint" "s0_insert" "s1_checkpoint" "s1_get_changes" "s0_insert2" "s0_commit" "s0_begin" "s0_insert" "s1_checkpoint" "s1_get_changes" "s0_commit" "s1_get_changes"