postgresql/src/test
Alvaro Herrera 20b6552242 Fix freezing of a dead HOT-updated tuple
Vacuum calls page-level HOT prune to remove dead HOT tuples before doing
liveness checks (HeapTupleSatisfiesVacuum) on the remaining tuples.  But
concurrent transaction commit/abort may turn DEAD some of the HOT tuples
that survived the prune, before HeapTupleSatisfiesVacuum tests them.
This happens to activate the code that decides to freeze the tuple ...
which resuscitates it, duplicating data.

(This is especially bad if there's any unique constraints, because those
are now internally violated due to the duplicate entries, though you
won't know until you try to REINDEX or dump/restore the table.)

One possible fix would be to simply skip doing anything to the tuple,
and hope that the next HOT prune would remove it.  But there is a
problem: if the tuple is older than freeze horizon, this would leave an
unfrozen XID behind, and if no HOT prune happens to clean it up before
the containing pg_clog segment is truncated away, it'd later cause an
error when the XID is looked up.

Fix the problem by having the tuple freezing routines cope with the
situation: don't freeze the tuple (and keep it dead).  In the cases that
the XID is older than the freeze age, set the HEAP_XMAX_COMMITTED flag
so that there is no need to look up the XID in pg_clog later on.

An isolation test is included, authored by Michael Paquier, loosely
based on Daniel Wood's original reproducer.  It only tests one
particular scenario, though, not all the possible ways for this problem
to surface; it be good to have a more reliable way to test this more
fully, but it'd require more work.
In message https://postgr.es/m/20170911140103.5akxptyrwgpc25bw@alvherre.pgsql
I outlined another test case (more closely matching Dan Wood's) that
exposed a few more ways for the problem to occur.

Backpatch all the way back to 9.3, where this problem was introduced by
multixact juggling.  In branches 9.3 and 9.4, this includes a backpatch
of commit e5ff9fefcd50 (of 9.5 era), since the original is not
correctable without matching the coding pattern in 9.5 up.

Reported-by: Daniel Wood
Diagnosed-by: Daniel Wood
Reviewed-by: Yi Wen Wong, Michaël Paquier
Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com
2017-09-28 16:44:01 +02:00
..
authentication Post-PG 10 beta1 pgperltidy run 2017-05-17 19:01:23 -04:00
examples Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
isolation Fix freezing of a dead HOT-updated tuple 2017-09-28 16:44:01 +02:00
ldap src/test/ldap: Fix test function in Linux port 2017-09-16 00:39:37 +02:00
locale Clean up Perl code according to perlcritic 2017-03-27 08:18:22 -04:00
mb Fix MB regression tests for WAL-logging of hash indexes. 2017-03-15 07:25:36 -04:00
modules Test BRIN autosummarization 2017-09-23 14:15:06 +02:00
perl Turn on log_replication_commands in PostgresNode 2017-09-26 16:05:25 -04:00
recovery Ten-second timeout in 013_crash_restart.pl is not enough, let's try 60. 2017-09-23 12:56:31 -04:00
regress Fix behavior when converting a float infinity to numeric. 2017-09-27 17:05:53 -04:00
ssl Remove incorrect comment 2017-06-17 10:19:48 +02:00
subscription Handle heap rewrites better in logical replication 2017-09-26 10:13:43 -04:00
thread Avoid use of bool in thread_test.c 2017-09-14 13:27:54 -04:00
Makefile Add LDAP authentication test suite 2017-09-15 11:44:29 -04:00
README Add TAP tests for password-based authentication methods. 2017-03-17 11:34:16 +02:00

README

PostgreSQL tests
================

This directory contains a variety of test infrastructure as well as some of the
tests in PostgreSQL. Not all tests are here -- in particular, there are more in
individual contrib/ modules and in src/bin.

Not all these tests get run by "make check". Check src/test/Makefile to see
which tests get run automatically.

authentication/
  Tests for authentication

examples/
  Demonstration programs for libpq that double as regression tests via
  "make check"

isolation/
  Tests for concurrent behavior at the SQL level

locale/
  Sanity checks for locale data, encodings, etc

mb/
  Tests for multibyte encoding (UTF-8) support

modules/
  Extensions used only or mainly for test purposes, generally not suitable
  for installing in production databases

perl/
  Infrastructure for Perl-based TAP tests

recovery/
  Test suite for recovery and replication

regress/
  PostgreSQL's main regression test suite, pg_regress

ssl/
  Tests to exercise and verify SSL certificate handling

subscription/
  Tests for logical replication

thread/
  A thread-safety-testing utility used by configure