The previous method for doing that was to write zeroes into a
predetermined set of page locations. However, there's a roughly
1-in-64K chance that the existing checksum will match by chance,
and yesterday several buildfarm animals started to reproducibly
see that, resulting in test failures because no checksum mismatch
was reported.
Since the checksum includes the page LSN, test success depends on
the length of the installation's WAL history, which is affected by
(at least) the initial catalog contents, the set of locales installed
on the system, and the length of the pathname of the test directory.
Sooner or later we were going to hit a chance match, and today is
that day.
Harden these tests by specifically inverting the checksum field and
leaving all else alone, thereby guaranteeing that the checksum is
incorrect.
In passing, fix places that were using seek() to set up for syswrite(),
a combination that the Perl docs very explicitly warn against. We've
probably escaped problems because no regular buffered I/O is done on
these filehandles; but if it ever breaks, we wouldn't deserve or get
much sympathy.
Although we've only seen problems in HEAD, now that we recognize the
environmental dependencies it seems like it might be just a matter
of time until someone manages to hit this in back-branch testing.
Hence, back-patch to v11 where we started doing this kind of test.
Discussion: https://postgr.es/m/3192026.1648185780@sss.pgh.pa.us
Incrementing the level of the call stack reported is useful for
debugging purposes as it allows to control which part of the test is
exactly failing, especially if a test is structured with subroutines
that call routines from Test::More.
This adds more incrementations of $Test::Builder::Level where debugging
gets improved (for example it does not make sense for some paths like
pg_rewind where long subroutines are used).
A note is added to src/test/perl/README about that, based on a
suggestion from Andrew Dunstan and a wording coming from both of us.
Usage of Test::Builder::Level has spread in 12, so a backpatch down to
this version is done.
Reviewed-by: Andrew Dunstan, Peter Eisentraut, Daniel Gustafsson
Discussion: https://postgr.es/m/YV1CCFwgM1RV1LeS@paquier.xyz
Backpatch-through: 12
Also "make reformat-dat-files".
The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
A couple of tests have been using 0 as magic constant while SEEK_SET can
be used instead. This makes the code easier to understand, and more
consistent with the changes done in 3c5b068.
Per discussion with Andrew Dunstan.
Discussion: https://postgr.es/m/YHrc24AgJQ6tQ1q0@paquier.xyz
Commit 866e24d47d added an assert that HEAP_XMAX_LOCK_ONLY and
HEAP_KEYS_UPDATED cannot appear together, on the faulty assumption that
the latter necessarily referred to an update and not a tuple lock; but
that's wrong, because SELECT FOR UPDATE can use precisely that
combination, as evidenced by the amcheck test case added here.
Remove the Assert(), and also patch amcheck's verify_heapam.c to not
complain if the combination is found. Also, out of overabundance of
caution, update (across all branches) README.tuplock to be more explicit
about this.
Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Mahendra Singh Thalor <mahi6run@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/20210124061758.GA11756@nol
By default VACUUM will skip pages that it can't immediately get
exclusive access to, which means that even activities as harmless
and unpredictable as checkpoint buffer writes might prevent a page
from being processed. Ordinarily this is no big deal, but we have
a small number of test cases that examine the results of VACUUM's
processing and therefore will fail if the page of interest is skipped.
This seems to be the explanation for some rare buildfarm failures.
To fix, add the DISABLE_PAGE_SKIPPING option to the VACUUM commands
in tests where this could be an issue.
In passing, remove a duplicated query in pageinspect/sql/page.sql.
Back-patch as necessary (some of these cases are as old as v10).
Discussion: https://postgr.es/m/413923.1611006484@sss.pgh.pa.us
verify_heapam() wasn't being careful to sanity-check tuple line
pointers before using them, resulting in SIGBUS on alignment-picky
architectures. Fix that, add some more test coverage.
Mark Dilger, some tweaking by me
Discussion: https://postgr.es/m/30B8E99A-2D9C-48D4-A55C-741C9D5F1563@enterprisedb.com