Commit Graph

67 Commits

Author SHA1 Message Date
Magnus Hagander d7754822c5 Silence some warnings in TAP tests
Author: Michael Paquier
2018-04-09 21:46:17 +02:00
Stephen Frost c37b3d08ca Allow group access on PGDATA
Allow the cluster to be optionally init'd with read access for the
group.

This means a relatively non-privileged user can perform a backup of the
cluster without requiring write privileges, which enhances security.

The mode of PGDATA is used to determine whether group permissions are
enabled for directory and file creates.  This method was chosen as it's
simple and works well for the various utilities that write into PGDATA.

Changing the mode of PGDATA manually will not automatically change the
mode of all the files contained therein.  If the user would like to
enable group access on an existing cluster then changing the mode of all
the existing files will be required.  Note that pg_upgrade will
automatically change the mode of all migrated files if the new cluster
is init'd with the -g option.

Tests are included for the backend and all the utilities which operate
on the PG data directory to ensure that the correct mode is set based on
the data directory permissions.

Author: David Steele <david@pgmasters.net>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
2018-04-07 17:45:39 -04:00
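Sketch for c37b3d08ca: the message says the mode of PGDATA decides whether group permissions apply to newly created files and directories. A minimal Python illustration of that rule follows. It is not the backend code; the function name is hypothetical, and the group-enabled modes (0750 for directories, 0640 for files) are assumptions not stated in the commit message.

```python
import stat

# Owner-only modes are the ones named in da9b580d89 (0700 / 0600).
DIR_MODE_OWNER = 0o700
FILE_MODE_OWNER = 0o600
# Assumed group-enabled modes: group may read and traverse, never write.
DIR_MODE_GROUP = 0o750
FILE_MODE_GROUP = 0o640


def create_modes_for(data_dir_mode):
    """Return (dir_mode, file_mode) for new entries under the data directory."""
    perms = stat.S_IMODE(data_dir_mode)  # drop file-type bits, keep permissions
    # Group access counts as enabled only if every bit of the group-enabled
    # directory mode is already set on the data directory itself.
    if (perms & DIR_MODE_GROUP) == DIR_MODE_GROUP:
        return DIR_MODE_GROUP, FILE_MODE_GROUP
    return DIR_MODE_OWNER, FILE_MODE_OWNER
```

Because the decision is read from the directory rather than from a setting, every utility that writes into PGDATA can apply the same rule without extra configuration.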
Stephen Frost da9b580d89 Refactor dir/file permissions
Consolidate directory and file create permissions for tools which work
with the PG data directory by adding a new module (common/file_perm.c)
that contains variables (pg_file_create_mode, pg_dir_create_mode) and
constants to initialize them (0600 for files and 0700 for directories).

Convert mkdir() calls in the backend to MakePGDirectory() if the
original call used default permissions (always the case for regular PG
directories).

Add tests to make sure permissions in PGDATA are set correctly by the
tools which modify the PG data directory.

Authors: David Steele <david@pgmasters.net>,
         Adam Brightwell <adam.brightwell@crunchydata.com>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
2018-04-07 17:45:39 -04:00
Peter Eisentraut fde03e8b55 Use croak instead of die in Perl code when appropriate 2018-02-24 14:54:16 -05:00
Peter Eisentraut bbd3363e12 Refactor subscription tests to use PostgresNode's wait_for_catchup
This was nearly the same code.  Extend wait_for_catchup to allow waiting
for pg_current_wal_lsn() and use that in the subscription tests.  Also
change one use in the pg_rewind tests to use this.

Also remove some broken code in wait_for_catchup and
wait_for_slot_catchup.  The error message in case the waiting failed
wanted to show the current LSN, but the way it was written never
worked.  So since nobody ever cared, just remove it.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2018-01-11 13:35:38 -05:00
Simon Riggs 6271fceb8a Add TIMELINE to backup_label file
Allows new test to confirm timelines match

Author: Michael Paquier
Reviewed-by: David Steele
2018-01-06 12:24:19 +00:00
Peter Eisentraut 5eb8bf2d42 Remove wal_keep_segments from default configuration in PostgresNode.pm
This is only used in the pg_rewind tests, so only set it there.  It's
better if other tests run closer to a default configuration.

Author: Michael Paquier <michael.paquier@gmail.com>
2017-11-02 12:38:59 -04:00
Peter Eisentraut 43588f58aa Turn on log_replication_commands in PostgresNode
This is useful for example for the pg_basebackup and related tests.
2017-09-26 16:05:25 -04:00
Tom Lane ed8a7c6fcf Add much-more-extensive TAP tests for pgbench.
Fabien Coelho, reviewed by Nikolay Shaplov and myself

Discussion: https://postgr.es/m/alpine.DEB.2.20.1704171422500.4025@lancre
2017-09-08 09:32:50 -04:00
Peter Eisentraut 90627cf98a Support retaining data dirs on successful TAP tests
This moves the data directories from using temporary directories with
randomness in the directory name to a static name, to make it easier to
debug.  The data directory will be retained if tests fail or the test
code dies/exits with failure, and is automatically removed on the next
make check.

If the environment variable PG_TEST_NOCLEAN is defined, the data
directories will be retained regardless of test or exit status.

Author: Daniel Gustafsson <daniel@yesql.se>
2017-09-05 12:24:06 -04:00
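Sketch for 90627cf98a: the retention rule in the message reduces to one small decision. The Python function below is a hypothetical illustration of it, not the Perl code from the commit.

```python
import os


def should_keep_data_dir(tests_passed, environ=os.environ):
    """Decide whether a test's data directory is retained after the run."""
    # The message says "defined", so an empty value still counts.
    if "PG_TEST_NOCLEAN" in environ:
        return True
    # Otherwise keep the directory only when something failed.
    return not tests_passed
```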
Tom Lane 21d304dfed Final pgindent + perltidy run for v10. 2017-08-14 17:29:33 -04:00
Alvaro Herrera 54dacc7466 Make PostgresNode easily subclassable
This module becomes much more useful if we allow it to be used as base
class for external projects.  To achieve this, change the exported
get_new_node function into a class method instead, and use the standard
Perl idiom of accepting the class as first argument.  This method works
as expected for subclasses.  The standalone function is kept for
backwards compatibility, though it could be removed in pg11.

Author: Chapman Flack, based on an earlier patch from Craig Ringer
Discussion: https://postgr.es/m/CAMsr+YF8kO+4+K-_U4PtN==2FndJ+5Bn6A19XHhMiBykEwv0wA@mail.gmail.com
2017-07-25 18:51:47 -04:00
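Sketch for 54dacc7466: the Perl idiom of accepting the class as the first argument corresponds to a class method in Python. The sketch below shows why that makes subclassing work; `MyNode` is a hypothetical external subclass and none of this is the actual module code.

```python
class PostgresNode:
    def __init__(self, name):
        self.name = name

    @classmethod
    def get_new_node(cls, name):
        # Construct through cls rather than PostgresNode, so that a
        # subclass calling this method gets an instance of its own type.
        return cls(name)


def get_new_node(name):
    """Standalone function kept for backwards compatibility."""
    return PostgresNode.get_new_node(name)


class MyNode(PostgresNode):  # hypothetical subclass in an external project
    pass
```

`MyNode.get_new_node("primary")` yields a `MyNode`, while the standalone `get_new_node("primary")` still yields a plain `PostgresNode`.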
Andrew Dunstan cde11fa3c0 Improve legibility of numeric literal 2017-07-17 15:35:46 -04:00
Andrew Dunstan 6c6970a280 Use usleep instead of select for timeouts in PostgresNode.pm
select() for pure timeouts is not portable, and in particular doesn't
work on Windows.

Discussion: https://postgr.es/m/186943e0-3405-978d-b19d-9d3335427c86@2ndQuadrant.com
2017-07-17 15:22:37 -04:00
Tom Lane efdb4f29ba Fix bug in PostgresNode::query_hash's split() call.
By default, Perl's split() function drops trailing empty fields,
which is not what we want here.  Oversight in commit fb093e4cb.
We'd managed to miss it thus far thanks to the very limited usage
of this function.

Discussion: https://postgr.es/m/14837.1499029831@sss.pgh.pa.us
2017-07-02 17:22:09 -04:00
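Sketch for efdb4f29ba: Python's `str.split` keeps trailing empty fields, so the sketch below emulates Perl's default behaviour by hand to show how a row with empty trailing columns loses keys. The `|` separator and the column names are assumptions made for illustration.

```python
def split_dropping_trailing_empties(line, sep="|"):
    """Emulate Perl's default split(): trailing empty fields are removed."""
    fields = line.split(sep)
    while fields and fields[-1] == "":
        fields.pop()
    return fields


def row_to_dict(columns, fields):
    # zip() stops at the shorter input, so missing fields vanish silently.
    return dict(zip(columns, fields))


columns = ["a", "b", "c"]
line = "1||"  # columns b and c hold empty strings

lossy = row_to_dict(columns, split_dropping_trailing_empties(line))
exact = row_to_dict(columns, line.split("|"))
```

Here `lossy` contains only the key `a`, while `exact` keeps `b` and `c` with empty-string values.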
Tom Lane de3de0afd7 Improve TAP test function PostgresNode::poll_query_until().
Add an optional "expected" argument to override the default assumption
that we're waiting for the query to return "t".  This allows replacing
a handwritten polling loop in recovery/t/007_sync_rep.pl with use of
poll_query_until(); AFAICS that's the only remaining ad-hoc polling
loop in our TAP tests.

Change poll_query_until() to probe ten times per second not once per
second.  Like some similar changes I've been making recently, the
one-second interval seems to be rooted in ancient traditions rather
than the actual likely wait duration on modern machines.  I'd consider
reducing it further if there were a convenient way to spawn just one
psql for the whole loop rather than one per probe attempt.

Discussion: https://postgr.es/m/12486.1498938782@sss.pgh.pa.us
2017-07-02 14:03:41 -04:00
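Sketch for de3de0afd7: a polling loop with an optional expected value and a 100 ms probe interval might look like the Python below. `run_query` is a hypothetical callable standing in for one psql invocation, and the attempt limit is an assumption.

```python
import time


def poll_query_until(run_query, expected="t", attempts=1800, interval=0.1):
    """Poll run_query() until it returns `expected`; return True on success."""
    for _ in range(attempts):
        if run_query().strip() == expected:
            return True
        time.sleep(interval)  # 0.1 s gives ten probes per second
    return False  # timed out; callers are expected to check for this
```

As b0f069d931 points out, the return value is not optional: a caller that ignores a `False` result carries on against a server that never reached the expected state.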
Tom Lane b0f069d931 Clean up misuse and nonuse of poll_query_until().
Several callers of PostgresNode::poll_query_until() neglected to check
for failure; I do not think that's optional.  Also, rewrite one place
that had reinvented poll_query_until() for no very good reason.
2017-07-01 14:25:09 -04:00
Tom Lane 2710ccd782 Reduce wal_retrieve_retry_interval in applicable TAP tests.
By default, wal_retrieve_retry_interval is five seconds, which is far
more than is needed in any of our TAP tests, leaving the test cases
just twiddling their thumbs for significant stretches.  Moreover,
because it's so large, we get basically no testing of the retry-before-
master-is-ready code path.  Hence, make PostgresNode::init set up
wal_retrieve_retry_interval = '500ms' as part of its customization of
test clusters' postgresql.conf.  This shaves quite a few seconds off
the runtime of the recovery TAP tests.

Back-patch into 9.6.  We have wal_retrieve_retry_interval in 9.5,
but the test infrastructure isn't there.

Discussion: https://postgr.es/m/31624.1498500416@sss.pgh.pa.us
2017-06-26 19:01:26 -04:00
Bruce Momjian ce55481032 Post-PG 10 beta1 pgperltidy run 2017-05-17 19:01:23 -04:00
Peter Eisentraut c1a7f64b4a Replace "transaction log" with "write-ahead log"
This makes documentation and error messages match the renaming of "xlog"
to "wal" in APIs and file naming.
2017-05-12 11:52:43 -04:00
Tom Lane d10c626de4 Rename WAL-related functions and views to use "lsn" not "location".
Per discussion, "location" is a rather vague term that could refer to
multiple concepts.  "LSN" is an unambiguous term for WAL locations and
should be preferred.  Some function names, view column names, and function
output argument names used "lsn" already, but others used "location",
as well as yet other terms such as "wal_position".  Since we've already
renamed a lot of things in this area from "xlog" to "wal" for v10,
we may as well incur a bit more compatibility pain and make these names
all consistent.

David Rowley, minor additional docs hacking by me

Discussion: https://postgr.es/m/CAKJS1f8O0njDKe8ePFQ-LK5-EjwThsDws6ohJ-+c6nWK+oUxtg@mail.gmail.com
2017-05-11 11:49:59 -04:00
Andrew Dunstan 33f3bbc6d3 Fix TAP infrastructure to support Mingw better
archive_command and restore_command need to refer to Windows paths, not
Msys virtual file system paths, as postgres is completely unaware of the
latter, so prefix them with the Windows path to the virtual file system
root. Clean psql and pg_recvlogical output of carriage returns.
2017-04-23 09:21:38 -04:00
Tom Lane 7d68f2281a Make PostgresNode.pm check server status more carefully.
PostgresNode blithely ignored the exit status of pg_ctl, and in general
made no effort to be sure that the server was running when it should be.
This caused it to miss server crashes, which is a serious shortcoming
in a test scaffold.  Make it complain if pg_ctl fails, and modify the
start and stop logic to complain if the server doesn't start, or doesn't
stop, when expected.

Also, have it turn off the "restart_after_crash" configuration parameter
in created clusters, as bitter experience has shown that leaving that on
can mask crashes too.

We might at some point need variant functions that allow for, eg,
server start failure to be expected.  But no existing test case appears
to want that, and it surely shouldn't be the default behavior.

Note that this *will* break the buildfarm, as it will expose known
bugs that the previous testing failed to.  I'm committing it despite
that, to verify that we get the expected failures in the buildfarm
not just in manual testing.

Back-patch into 9.6 where PostgresNode was introduced.  (The 9.6
branch is not expected to show any failures.)

Discussion: https://postgr.es/m/21432.1492886428@sss.pgh.pa.us
2017-04-22 18:18:25 -04:00
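Sketch for 7d68f2281a: the core of the change is to stop ignoring a command's exit status. A minimal Python equivalent is shown below; the helper name is hypothetical, and a child Python process that exits with status 1 stands in for a failing `pg_ctl`.

```python
import subprocess
import sys


def run_or_die(argv):
    """Run a command and raise if it exits with a non-zero status."""
    result = subprocess.run(argv)
    if result.returncode != 0:
        raise RuntimeError(
            f"{argv[0]} failed with exit code {result.returncode}"
        )
    return result


# Stand-in for a failing "pg_ctl start": a child process that exits with 1.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
```

Calling `run_or_die(failing)` raises instead of letting the test continue against a server that is not running.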
Tom Lane 8a19c1a373 Make PostgresNode::append_conf append a newline automatically.
Although the documentation for append_conf said clearly that it didn't
add a newline, many test authors seem to have forgotten that ... or maybe
they just consulted the example at the top of the POD documentation,
which clearly shows adding a config entry without bothering to add a
trailing newline.  The worst part of that is that it works, as long as
you don't do it more than once, since the backend isn't picky about
whether config files end with newlines.  So there's not a strong forcing
function reminding test authors not to do it like that.  Upshot is that
this is a terribly fragile way to go about things, and there's at least
one existing test case that is demonstrably broken and not testing what
it thinks it is.

Let's just make append_conf append a newline, instead; that is clearly
way safer than the old definition.

I also cleaned up a few call sites that were unnecessarily ugly.
(I left things alone in places where it's plausible that additional
config lines would need to be added someday.)

Back-patch the change in append_conf itself to 9.6 where it was added,
as having a definitional inconsistency between branches would obviously
be pretty hazardous for back-patching TAP tests.  The other changes are
just cosmetic and don't need to be back-patched.

Discussion: https://postgr.es/m/19751.1492892376@sss.pgh.pa.us
2017-04-22 16:58:15 -04:00
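Sketch for 8a19c1a373: the Python below illustrates the new definition, where the helper itself supplies the newline so that two successive calls can never run together on one line. The temporary file and the two settings are only examples.

```python
import os
import tempfile


def append_conf(path, text):
    """Append text to a config file, always followed by a newline."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(text + "\n")


conf = os.path.join(tempfile.mkdtemp(), "postgresql.conf")
append_conf(conf, "wal_level = replica")
append_conf(conf, "max_wal_senders = 10")

with open(conf, encoding="utf-8") as f:
    lines = f.read().splitlines()
```

Without the added newline the file would contain the single line `wal_level = replicamax_wal_senders = 10`, which is the fragile case the message describes.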
Peter Eisentraut 3371e4d9b1 Change default of log_directory to 'log'
The previous default 'pg_log' might have indicated by its "pg_" prefix
that it is an internal system directory.  The new default is more in
line with the typical naming of directories with user-facing log files.
Together with the renaming of pg_clog and pg_xlog, this should clear up
that difference.

Author: Andreas Karlsson <andreas@proxel.se>
2017-03-27 10:34:33 -04:00
Peter Eisentraut facde2a98f Clean up Perl code according to perlcritic
Fix all perlcritic warnings of severity level 5, except in
src/backend/utils/Gen_dummy_probes.pl, which is automatically generated.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
2017-03-27 08:18:22 -04:00
Simon Riggs eb2a6131be Add a pg_recvlogical wrapper to PostgresNode
Allows testing of logical decoding using SQL interface and/or pg_recvlogical.
Most logical decoding tests are in contrib/test_decoding. This module
is for work that doesn't fit well there, like where server restarts
are required.

Craig Ringer
2017-03-21 14:04:49 +00:00
Peter Eisentraut be37c2120a Enable replication connections by default in pg_hba.conf
initdb now initializes a pg_hba.conf that allows replication connections
from the local host, same as it does for regular connections.  The
connecting user still needs to have the REPLICATION attribute or be a
superuser.

The intent is to allow pg_basebackup from the local host to succeed
without requiring additional configuration.

Michael Paquier <michael.paquier@gmail.com> and me
2017-03-09 08:39:44 -05:00
Peter Eisentraut 231f48796b Fix timeouts in PostgresNode::psql
Newer Perl or IPC::Run versions default to appending the filename to string
exceptions, e.g. the exception

    psql timed out

is thrown as

    psql timed out at /usr/share/perl5/vendor_perl/IPC/Run.pm line 2961.

To handle this, match exceptions with !~ rather than ne.

From: Craig Ringer <craig@2ndquadrant.com>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
2017-03-01 14:18:51 -05:00
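Sketch for 231f48796b: the difference between an equality test and a pattern match on the exception text can be shown in a few lines of Python. The function name is hypothetical; the two message strings are the ones quoted in the commit.

```python
import re

TIMEOUT_MESSAGE = "psql timed out"


def is_timeout(exception_text):
    """True if the exception text begins with the timeout message."""
    # re.match anchors at the start; re.escape keeps the message literal.
    return re.match(re.escape(TIMEOUT_MESSAGE), exception_text) is not None


plain = "psql timed out"
decorated = (
    "psql timed out at /usr/share/perl5/vendor_perl/IPC/Run.pm line 2961."
)
```

An equality check accepts only `plain`; the pattern match accepts both forms and still rejects unrelated errors.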
Robert Haas 806091c96f Remove all references to "xlog" from SQL-callable functions in pg_proc.
Commit f82ec32ac3 renamed the pg_xlog
directory to pg_wal.  To make things consistent, and because "xlog" is
terrible terminology for either "transaction log" or "write-ahead log",
rename all SQL-callable functions that contain "xlog" in the name to
instead contain "wal".  (Note that this may pose an upgrade hazard for
some users.)

Similarly, rename the xlog_position argument of the functions that
create slots to be called wal_position.

Discussion: https://www.postgresql.org/message-id/CA+Tgmob=YmA=H3DbW1YuOXnFVgBheRmyDkWcD9M8f=5bGWYEoQ@mail.gmail.com
2017-02-09 15:10:09 -05:00
Peter Eisentraut 665d1fad99 Logical replication
- Add PUBLICATION catalogs and DDL
- Add SUBSCRIPTION catalog and DDL
- Define logical replication protocol and output plugin
- Add logical replication workers

From: Petr Jelinek <petr@2ndquadrant.com>
Reviewed-by: Steve Singer <steve@ssinger.info>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Erik Rijkers <er@xs4all.nl>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
2017-01-20 09:04:49 -05:00
Magnus Hagander f6d6d2920d Change default values for backup and replication parameters
This changes the default values of the following parameters:

wal_level = replica
max_wal_senders = 10
max_replication_slots = 10

in order to make it possible to make a backup and set up simple
replication on the default settings, without requiring a system restart.

Discussion: https://postgr.es/m/CABUevEy4PR_EAvZEzsbF5s+V0eEvw7shJ2t-AUwbHOjT+yRb3A@mail.gmail.com

Reviewed by Peter Eisentraut. Benchmark help from Tomas Vondra.
2017-01-14 17:14:56 +01:00
Peter Eisentraut 05cd12ed5b pg_ctl: Change default to wait for all actions
The different actions in pg_ctl had different defaults for -w and -W,
mostly for historical reasons.  Most users will want the -w behavior, so
make that the default.

Remove the -w option in most example and test code, to avoid confusion
and reduce verbosity.  pg_upgrade is not touched, so it can continue to
work with older installations.

Reviewed-by: Beena Emerson <memissemerson@gmail.com>
Reviewed-by: Ryan Murphy <ryanfmurphy@gmail.com>
2017-01-14 09:15:08 -05:00
Peter Eisentraut 750c59d7ec Fix mistake in comment
The node->restart() function doesn't take a mode argument.
2017-01-12 10:24:10 -05:00
Simon Riggs 2e44f379bc Fix format for TAP test docs
Small number of fixes to perl docs for TAP tests.
Plus two comments that use "xlog" rather than WAL

Michael Paquier
2017-01-05 10:07:59 +00:00
Simon Riggs fb093e4cb3 Allow PostgresNode.pm tests to wait for catchup
Add methods to the core test framework PostgresNode.pm to allow us to
test that standby nodes have caught up with the master, as well as
basic LSN handling.  Used in tests recovery/t/001_stream_rep.pl and
recovery/t/004_timeline_switch.pl

Craig Ringer, reviewed by Aleksander Alekseev and Simon Riggs
2017-01-04 16:50:23 +00:00
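Sketch for fb093e4cb3: an LSN is printed as two hexadecimal numbers separated by a slash, so comparing the text directly gives wrong answers. The Python below is one way to compare LSNs numerically; it is an illustration with hypothetical function names, not the method the Perl module uses.

```python
def parse_lsn(text):
    """Convert an LSN such as '0/3000060' to a 64-bit integer."""
    high, low = text.split("/")
    # The part before the slash is the upper 32 bits, the rest the lower 32.
    return (int(high, 16) << 32) | int(low, 16)


def has_caught_up(replay_lsn, target_lsn):
    """True once the standby has replayed at least up to target_lsn."""
    return parse_lsn(replay_lsn) >= parse_lsn(target_lsn)
```

For example, `"0/A"` sorts after `"0/10"` as a string, yet it is the smaller position once both are parsed.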
Magnus Hagander 9a4d51077c Make wal streaming the default mode for pg_basebackup
Since streaming is now supported for all output formats, make this the
default as this is what most people want.

To get the old behavior, the parameter -X none can be specified to turn
it off.

This also removes the parameter -x for fetch, now requiring -X fetch to
be specified to use that.

Reviewed by Vladimir Rusinov, Michael Paquier and Simon Riggs
2017-01-04 10:40:38 +01:00
Peter Eisentraut e5a9bcb529 Use pg_ctl promote -w in TAP tests
Switch TAP tests to use the new wait mode of pg_ctl promote.  This
allows avoiding extra logic with poll_query_until() to be sure that a
promoted standby is ready for read-write queries.

From: Michael Paquier <michael.paquier@gmail.com>
2016-10-19 09:18:50 -04:00
Peter Eisentraut 5d58c07a44 initdb pg_basebackup: Rename --noxxx options to --no-xxx
--noclean and --nosync were the only options spelled without a hyphen,
so change this for consistency with other options.  The options in
pg_basebackup have not been in a release, so we just rename them.  For
initdb, we retain the old variants.

Vik Fearing and me
2016-10-19 08:48:48 -04:00
Robert Haas 61f9e7ba3c Update obsolete comments and perldoc.
Loose ends from commit 2a0f89cd71.

Daniel Gustafsson
2016-10-05 13:09:52 -04:00
Peter Eisentraut a4327296df Set log_line_prefix and application name in test drivers
Before pg_regress runs psql, set the application name to the test name.
Similarly, set the application name to the test file name in the TAP
tests.  Also, set a default log_line_prefix that shows the application
name, as well as the PID and a time stamp.

That way, the server log output can be correlated to the test input
files, making debugging a bit easier.
2016-09-30 21:32:33 -04:00
Peter Eisentraut 728a3e73e9 Switch pg_basebackup commands in PostgresNode.pm to use --nosync
On slow machines, this greatly reduces the I/O pressure induced by the
tests.

From: Michael Paquier <michael.paquier@gmail.com>
2016-09-29 12:00:00 -04:00
Peter Eisentraut 8b845520fb Add tests for various connection string issues
Add tests for consistent support of connection strings in frontend
programs as well as proper handling of unusual characters in database
and user names.  These tests were developed for the issues of
CVE-2016-5424.

To allow testing of names with spaces, change the pg_regress
command-line options --create-role and --dbname to split their arguments
by comma only, not space or comma as before.  Only commas were actually
used in existing uses.

Noah Misch, Michael Paquier, Peter Eisentraut
2016-09-22 12:00:00 -04:00
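Sketch for 8b845520fb: the change to the pg_regress option parsing is easiest to see side by side. The Python functions below are hypothetical stand-ins for the old and new splitting rules, not a translation of the C code.

```python
import re


def split_comma_or_space(arg):
    """Old rule: commas and spaces both separate names."""
    return [name for name in re.split(r"[ ,]+", arg) if name]


def split_comma_only(arg):
    """New rule: only commas separate names, so embedded spaces survive."""
    return [name for name in arg.split(",") if name]
```

With the argument `my db,other`, the old rule produces three names and the new rule produces `my db` and `other`.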
Tom Lane b5bce6c1ec Final pgindent + perltidy run for 9.6. 2016-08-15 13:42:51 -04:00
Alvaro Herrera 2a0f89cd71 Give recovery tests more time to finish
These tests are currently only running in buildfarm member hamster,
which is purposefully very slow.  This suite has failed a couple of
times recently because of timeouts, so increase the allowed number of
iterations to avoid spurious failures.

Author: Michaël Paquier
2016-07-25 01:34:35 -04:00
Tom Lane 30b2731bd2 Fix TAP tests and MSVC scripts for pathnames with spaces.
Change assorted places in our Perl code that did things like
	system("prog $path/file");
to do it more like
	system('prog', "$path/file");
which is safe against spaces and other special characters in the path
variable.  The latter was already the prevailing style, but a few bits
of code hadn't gotten this memo.  Back-patch to 9.4 as relevant.

Michael Paquier, Kyotaro Horiguchi

Discussion: <20160704.160213.111134711.horiguchi.kyotaro@lab.ntt.co.jp>
2016-07-09 16:47:38 -04:00
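Sketch for 30b2731bd2: the same distinction between a single command string and an argument list exists in Python's `subprocess` module. The example below uses a temporary path that contains a space and a child Python process that prints how many arguments it received; all names are illustrative.

```python
import os
import shlex
import subprocess
import sys
import tempfile

# A path containing a space, the failure mode the commit describes.
path = os.path.join(tempfile.mkdtemp(prefix="with space "), "file")

# Child program that reports how many arguments it received.
count_args = "import sys; print(len(sys.argv) - 1)"

# List form: each element arrives as exactly one argument, no shell involved.
safe = subprocess.run(
    [sys.executable, "-c", count_args, path],
    capture_output=True, text=True, check=True,
)

# String form: shell-style word splitting breaks the path at the space.
naive_tokens = shlex.split("prog " + path)
```

The child sees one argument in the list form, while the naively built command line splits into more than the intended two words.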
Noah Misch 3be0a62ffe Finish pgindent run for 9.6: Perl files. 2016-06-12 04:19:56 -04:00
Tom Lane 08af921906 Fix order of shutdown cleanup operations in PostgresNode.pm.
Previously, database clusters created by a TAP test were shut down by
DESTROY methods attached to the PostgresNode objects representing them.
The trouble with that is that if the objects survive into the final global
destruction phase (which they do), Perl executes the DESTROY methods in an
unspecified order.  Thus, the order of shutdown of multiple clusters was
indeterminate, which might lead to not-very-reproducible errors getting
logged (eg from a slave whose master might or might not get killed first).
Worse, the File::Temp objects representing the temporary PGDATA directories
might get destroyed before the PostgresNode objects, resulting in attempts
to delete PGDATA directories that still have live servers in them.  On
Windows, this would lead to directory deletion failures; on Unix, it
usually had no effects worse than erratic "could not open temporary
statistics file "pg_stat/global.tmp": No such file or directory" log
messages.

While none of this would affect the reported result of the TAP test, which
is already determined, it could be very confusing when one is trying to
understand from the logs what went wrong with a failed test.

To fix, do the postmaster shutdowns in an END block rather than at object
destruction time.  The END block will execute at a well-defined (and
reasonable) time during script termination, and it will stop the
postmasters in order of PostgresNode object creation.  (Perhaps we should
change that to be reverse order of creation, but the main point here is
that we now have control which we did not before.)  Use "pg_ctl stop", not
an asynchronous kill(SIGQUIT), so that we wait for the postmasters to shut
down before proceeding with directory deletion.

Deletion of temporary directories still happens in an unspecified order
during global destruction, but I can see no reason to care about that
once the postmasters are stopped.
2016-04-26 12:43:03 -04:00
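Sketch for 08af921906: Python's `atexit` plays a role comparable to a Perl END block. The sketch below stops nodes from one exit hook, in creation order, instead of relying on per-object destructors; the class and the bookkeeping lists are hypothetical.

```python
import atexit

_nodes = []       # every node, in creation order
stop_order = []   # records the order in which nodes were actually stopped


class Node:
    def __init__(self, name):
        self.name = name
        self.running = True
        _nodes.append(self)

    def stop(self):
        # A real harness would wait here for "pg_ctl stop" to finish.
        if self.running:
            self.running = False
            stop_order.append(self.name)


def stop_all_nodes():
    """Stop every node in creation order; safe to call more than once."""
    for node in _nodes:
        node.stop()


# One well-defined hook at interpreter exit replaces per-object destructors,
# whose relative order is not guaranteed.
atexit.register(stop_all_nodes)
```

Reversing the loop would give the reverse-creation order that the message mentions as a possible alternative.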
Tom Lane 40e89e2ab8 Try harder to detect a port conflict in PostgresNode.pm.
Commit fab84c7787 tried to get away without doing an actual bind(),
but buildfarm results show that that doesn't get the job done.  So we must
really bind to the target port --- and at least on my Linux box, we need a
listen() as well, or conflicts won't be detected.  We rely on SO_REUSEADDR
to prevent problems from starting a postmaster on the socket immediately
after we've bound to it in the test code.  (There may be platforms where
that doesn't work too well.  But fortunately, we only really care whether
this works on Windows, and there the default behavior should be OK.)
2016-04-25 12:28:49 -04:00
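Sketch for 40e89e2ab8: a probe that really binds and listens on the candidate port could be written as below in Python. The function name is hypothetical, and behaviour of SO_REUSEADDR differs between platforms, as the message itself cautions.

```python
import socket


def can_bind_port(port, host="127.0.0.1"):
    """Return True if we can bind and listen on host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # SO_REUSEADDR is meant to let a server bind the same port
        # immediately after this probe socket has been closed.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
            sock.listen(1)
        except OSError:
            return False
    return True
```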
Tom Lane fab84c7787 Improve PostgresNode.pm's logic for detecting already-in-use ports.
Buildfarm members bowerbird and jacana have shown intermittent "could not
bind IPv4 socket" failures in the BinInstallCheck stage since mid-December,
shortly after commits 1caef31d9e and 9821492ee4 changed the
logic for selecting which port to use in temporary installations.  One
plausible explanation is that we are randomly selecting ports that are
already in use for some non-Postgres purpose.  Although the code tried
to defend against already-in-use ports, it used pg_isready to probe
the port which is quite unhelpful: if some non-Postgres server responds
at the given address, pg_isready will generally say "no response",
leading to exactly the wrong conclusion about whether the port is free.

Instead, let's use a simple TCP connect() call to see if anything answers
without making assumptions about what it is.  Note that this means there's
no direct check for a conflicting Unix socket, but that should be okay
because there should be no other Unix sockets in use in the temporary
socket directory created for a test run.

This is only a partial solution for the TCP case, since if the port number
is in use for an outgoing connection rather than a listening socket, we'll
fail to detect that.  We could try to bind() to the proposed port as a
means of detecting that case, but that would introduce its own failure
modes, since the system might consider the address to remain reserved for
some period of time after we drop the bound socket.  Close study of the
errors returned by bowerbird and jacana suggests that what we're seeing
there may be conflicts with listening not outgoing sockets, so let's try
this and see if it improves matters.  It's certainly better than what's
there now, in any case.

Michael Paquier, adjusted by me to work on non-Windows as well as Windows
2016-04-24 15:31:45 -04:00
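Sketch for fab84c7787: the plain TCP connect() probe described above can be illustrated in Python as follows. The function name and the one-second timeout are assumptions. As the message notes, this only detects listening sockets, not ports in use for outgoing connections.

```python
import socket


def something_answers(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connect() to host:port succeeds."""
    try:
        # Whatever accepts the connection counts, Postgres or not.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```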