Commit Graph

5 Commits

Author SHA1 Message Date
Tom Lane 741d7f1047 Use annotations to reduce instability of isolation-test results.
We've long contended with isolation test results that aren't entirely
stable.  Some test scripts insert long delays to try to force stable
results, which is not terribly desirable; but other erratic failure
modes remain, causing unrepeatable buildfarm failures.  I've spent a
fair amount of time trying to solve this by improving the server-side
support code, without much success: that way is fundamentally unable
to cope with diffs that stem from chance ordering of arrival of
messages from different server processes.

We can improve matters on the client side, however, by annotating
the test scripts themselves to show the desired reporting order
of events that might occur in different orders.  This patch adds
three types of annotations to deal with (a) test steps that might or
might not complete their waits before the isolationtester can see them
waiting; (b) test steps in different sessions that can legitimately
complete in either order; and (c) NOTIFY messages that might arrive
before or after the completion of a step in another session.  We might
need more annotation types later, but this seems to be enough to deal
with the instabilities we've seen in the buildfarm.  It also lets us
get rid of all the long delays that were previously used, cutting more
than a minute off the runtime of the isolation tests.

Back-patch to all supported branches, because the buildfarm
instabilities affect all the branches, and because it seems desirable
to keep isolationtester's capabilities the same across all branches
to simplify possible future back-patching of tests.

Discussion: https://postgr.es/m/327948.1623725828@sss.pgh.pa.us
2021-06-22 21:43:12 -04:00
Tom Lane 3992188c2a Revert "Still further tweaking of deadlock isolation tests."
This reverts commit d03130d378.
That was dependent on an isolationtester.c change that now proves
to be broken; we will need to find another solution.
2016-02-12 17:02:59 -05:00
Tom Lane d03130d378 Still further tweaking of deadlock isolation tests.
It turns out that there is a second race condition in the new deadlock-hard
test: once the deadlock detector fires, it's uncertain whether step s7a8 or
step s8a1 will report first, because killing s8's transaction unblocks s7.
So far, s7 has only been seen to report first in CLOBBER_CACHE_ALWAYS
builds, but it's pretty reproducible there, and in theory it should
sometimes occur in normal builds too.  If s7 were a bit slower than usual,
that could also break the test, since the existing expected-file assumes
that we'll see s7a8 report the first time we check it after s8a1 completes.
To fix, add a post-lock delay to s7a8.
2016-02-12 14:19:57 -05:00
Tom Lane b11d07b6a3 Make new deadlock isolation test more reproducible.
The original formulation of 4c9864b9b4
was extremely timing-sensitive, because it arranged for the deadlock
detector to be running (and possibly unblocking the current query)
at almost exactly the same time as isolationtester would be probing
to see if the query is blocked.  The committed expected-file assumed
that the deadlock detection would finish first, but we see the opposite
on both fast and slow buildfarm animals.  Adjust the deadlock timeout
settings to make it predictable that isolationtester *will* see the
query as waiting before deadlock detection unblocks it.

I used a 5s timeout for the same reasons mentioned in
a7921f71a3.
2016-02-11 11:59:11 -05:00
Robert Haas 4c9864b9b4 Add some isolation tests for deadlock detection and resolution.
Previously, we had no test coverage for the deadlock detector.
2016-02-11 08:38:09 -05:00