Fix order of shutdown cleanup operations in PostgresNode.pm.

Previously, database clusters created by a TAP test were shut down by
DESTROY methods attached to the PostgresNode objects representing them.
The trouble with that is that if the objects survive into the final global
destruction phase (which they do), Perl executes the DESTROY methods in an
unspecified order.  Thus, the order of shutdown of multiple clusters was
indeterminate, which might lead to not-very-reproducible errors getting
logged (e.g., from a slave whose master might or might not get killed first).
Worse, the File::Temp objects representing the temporary PGDATA directories
might get destroyed before the PostgresNode objects, resulting in attempts
to delete PGDATA directories that still have live servers in them.  On
Windows, this would lead to directory deletion failures; on Unix, it
usually had no effects worse than erratic 'could not open temporary
statistics file "pg_stat/global.tmp": No such file or directory' log
messages.

While none of this would affect the reported result of the TAP test, which
is already determined, it could be very confusing when one is trying to
understand from the logs what went wrong with a failed test.

To fix, do the postmaster shutdowns in an END block rather than at object
destruction time.  The END block will execute at a well-defined (and
reasonable) time during script termination, and it will stop the
postmasters in order of PostgresNode object creation.  (Perhaps we should
change that to be reverse order of creation, but the main point here is
that we now have control which we did not before.)  Use "pg_ctl stop", not
an asynchronous kill(SIGQUIT), so that we wait for the postmasters to shut
down before proceeding with directory deletion.

Deletion of temporary directories still happens in an unspecified order
during global destruction, but I can see no reason to care about that
once the postmasters are stopped.
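
The END-block approach described above can be illustrated with a minimal, self-contained sketch. This is not the PostgresNode.pm code: DemoNode, stop_all, and @stop_log are hypothetical names invented for the illustration, no servers are started, and the "$? = 0" assignment merely stands in for a system() call such as the "pg_ctl stop" invocation overwriting $?.

```perl
use strict;
use warnings;

# Hypothetical stand-in for PostgresNode; no real servers are involved.
package DemoNode;

my @all_nodes;    # registry of nodes, in creation order
our @stop_log;    # names of nodes stopped so far, kept for inspection

sub new
{
	my ($class, $name) = @_;
	my $self = bless { name => $name, running => 1 }, $class;
	push @all_nodes, $self;
	return $self;
}

sub stop
{
	my ($self) = @_;
	return unless $self->{running};    # already stopped: nothing to do
	$self->{running} = 0;
	push @stop_log, $self->{name};
	$? = 0;    # simulate a system() call overwriting $?
	return;
}

# Stop every registered node, in order of creation.
sub stop_all
{
	$_->stop foreach @all_nodes;
	return;
}

END
{
	# Save and restore $? so that cleanup cannot change the exit value.
	my $exit_code = $?;
	stop_all();
	$? = $exit_code;
}

package main;

DemoNode->new('master');
DemoNode->new('standby');
```

If such a script finished with "exit 3", the END block would stop "master" and then "standby", and the process would still exit with status 3, because $? is restored after the simulated cleanup commands have run.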
Tom Lane 2016-04-26 12:43:03 -04:00
parent 82311bcdd7
commit 08af921906

@@ -662,6 +662,7 @@ sub stop
     my $pgdata = $self->data_dir;
     my $name = $self->name;
     $mode = 'fast' unless defined $mode;
+    return unless defined $self->{_pid};
     print "### Stopping node \"$name\" using mode $mode\n";
     TestLib::system_log('pg_ctl', '-D', $pgdata, '-m', $mode, 'stop');
     $self->{_pid} = undef;
@@ -826,8 +827,8 @@ sub _update_pid
 Build a new PostgresNode object, assigning a free port number. Standalone
 function that's automatically imported.

-We also register the node, to avoid the port number from being reused
-for another node even when this one is not active.
+Remembers the node, to prevent its port number from being reused for another
+node, and to ensure that it gets shut down when the test script exits.

 You should generally use this instead of PostgresNode::new(...).

@@ -889,14 +890,21 @@ sub get_new_node
     return $node;
 }

-# Attempt automatic cleanup
-sub DESTROY
+# Automatically shut down any still-running nodes when the test script exits.
+# Note that this just stops the postmasters (in the same order the nodes were
+# created in). Temporary PGDATA directories are deleted, in an unspecified
+# order, later when the File::Temp objects are destroyed.
+END
 {
-    my $self = shift;
-    my $name = $self->name;
-    return unless defined $self->{_pid};
-    print "### Signalling QUIT to $self->{_pid} for node \"$name\"\n";
-    TestLib::system_log('pg_ctl', 'kill', 'QUIT', $self->{_pid});
+    # take care not to change the script's exit value
+    my $exit_code = $?;
+
+    foreach my $node (@all_nodes)
+    {
+        $node->teardown_node;
+    }
+
+    $? = $exit_code;
 }

 =pod