Get rid of the separate "FATAL" log level, as it was applied
so inconsistently as to be meaningless. This mostly involves
s/pg_log_fatal/pg_log_error/g.
Create a macro pg_fatal() to handle the common use-case of
pg_log_error() immediately followed by exit(1). Various
modules had already invented either this or equivalent macros;
standardize on pg_fatal() and apply it where possible.
Invent the ability to add "detail" and "hint" messages to a
frontend message, much as we have long had in the backend.
Except where rewording was needed to convert existing coding
to detail/hint style, I have (mostly) resisted the temptation
to change existing message wording.
Patch by me. Design and patch reviewed at various stages by
Robert Haas, Kyotaro Horiguchi, Peter Eisentraut and
Daniel Gustafsson.
Discussion: https://postgr.es/m/1363732.1636496441@sss.pgh.pa.us
Exclusive-mode backups have been deprecated since 9.6 (when
non-exclusive backups were introduced) due to the issues
they can cause should the system crash while one is running and
generally because non-exclusive provides a much better interface.
Further, exclusive backup mode wasn't really being tested (nor was most
of the related code- like being able to log in just to stop an exclusive
backup and the bits of the state machine related to that) and having to
possibly deal with an exclusive backup and the backup_label file
existing during pg_basebackup, pg_rewind, etc, added other complexities
that we are better off without.
This patch removes the exclusive backup mode, the various special cases
for dealing with it, and greatly simplifies the online backup code and
documentation.
Authors: David Steele, Nathan Bossart
Reviewed-by: Chapman Flack
Discussion: https://postgr.es/m/ac7339ca-3718-3c93-929f-99e725d1172c@pgmasters.nethttps://postgr.es/m/CAHg+QDfiM+WU61tF6=nPZocMZvHDzCK47Kneyb0ZRULYzV5sKQ@mail.gmail.com
Before this patch, there was some code that tried to make sure that the
buffer was always big enough at the start, and then asserted that it
didn't need to be enlarged later. However, the code to make sure it was
big enough at the start doesn't actually work, and therefore it was
possible to fail an assertion and crash later.
Remove the code that tries to make sure the buffer is always big enough
at the start in favor of enlarging the buffer as we go along whenever
that is necessary.
The mistake probably happened because, on the server side, we do
actually need to guarantee that the buffer is big enough at the start
to avoid subsequent resizings. However, in that case, the calling
code makes promises about how much data it will provide at once, but
here, that's not the case.
Report by Justin Pryzby. Analysis by me. Patch by Dipesh Pandit.
Discussion: http://postgr.es/m/20220330143536.GG28503@telsasoft.com
libzstd allows transparent parallel compression just by setting
an option when creating the compression context, so permit that
for both client and server-side backup compression. To use this,
use something like pg_basebackup --compress WHERE-zstd:workers=N
where WHERE is "client" or "server" and N is an integer.
When compression is performed on the server side, this will spawn
threads inside the PostgreSQL backend. While there is almost no
PostgreSQL server code which is thread-safe, the threads here are used
internally by libzstd and touch only data structures controlled by
libzstd.
Patch by me, based in part on earlier work by Dipesh Pandit
and Jeevan Ladhe. Reviewed by Justin Pryzby.
Discussion: http://postgr.es/m/CA+Tgmobj6u-nWF-j=FemygUhobhryLxf9h-wJN7W-2rSsseHNA@mail.gmail.com
When we try to set the zstd compression level either on the client
or on the server, check for errors.
For any algorithm, on the client side, don't try to set the compression
level unless the user specified one. This was visibly broken for
zstd, which managed to set -1 rather than 0 in this case, but tidy
up the code for the other methods, too.
On the client side, if we fail to create a ZSTD_CCtx, exit after
reporting the error. Otherwise we'll dereference a null pointer.
Patch by me, reviewed by Dipesh Pandit.
Discussion: http://postgr.es/m/CA+TgmoZK3zLQUCGi1h4XZw4jHiAWtcACc+GsdJR1_Mc19jUjXA@mail.gmail.com
The previous method for doing that was to write zeroes into a
predetermined set of page locations. However, there's a roughly
1-in-64K chance that the existing checksum will match by chance,
and yesterday several buildfarm animals started to reproducibly
see that, resulting in test failures because no checksum mismatch
was reported.
Since the checksum includes the page LSN, test success depends on
the length of the installation's WAL history, which is affected by
(at least) the initial catalog contents, the set of locales installed
on the system, and the length of the pathname of the test directory.
Sooner or later we were going to hit a chance match, and today is
that day.
Harden these tests by specifically inverting the checksum field and
leaving all else alone, thereby guaranteeing that the checksum is
incorrect.
In passing, fix places that were using seek() to set up for syswrite(),
a combination that the Perl docs very explicitly warn against. We've
probably escaped problems because no regular buffered I/O is done on
these filehandles; but if it ever breaks, we wouldn't deserve or get
much sympathy.
Although we've only seen problems in HEAD, now that we recognize the
environmental dependencies it seems like it might be just a matter
of time until someone manages to hit this in back-branch testing.
Hence, back-patch to v11 where we started doing this kind of test.
Discussion: https://postgr.es/m/3192026.1648185780@sss.pgh.pa.us
The Config and Cwd modules were no longer used, but remained imported,
in a number of tests. Remove to keep the imports to the actually used
modules.
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/A5A074CD-3198-492B-BE5E-7961EFC3733F@yesql.se
Commit ffd53659c4 messed up the
mechanism that was being used to pass parameters to LogStreamerMain()
on Windows. It worked on Linux because only Windows was using threads.
Repair by moving the additional parameters added by that commit into
the 'logstreamer_param' struct.
Along the way, fix a compiler warning on builds without HAVE_LIBZ.
Discussion: http://postgr.es/m/CA+TgmoY5=AmWOtMj3v+cySP2rR=Bt6EGyF_joAq4CfczMddKtw@mail.gmail.com
There are more compression parameters that can be specified than just
an integer compression level, so rename the new COMPRESSION_LEVEL
option to COMPRESSION_DETAIL before it gets released. Introduce a
flexible syntax for that option to allow arbitrary options to be
specified without needing to adjust the main replication grammar,
and common code to parse it that is shared between the client and
the server.
This commit doesn't actually add any new compression parameters,
so the only user-visible change is that you can now type something
like pg_basebackup --compress gzip:level=5 instead of writing just
pg_basebackup --compress gzip:5. However, it should make it easy to
add new options. If for example gzip starts offering fries, we can
support pg_basebackup --compress gzip:level=5,fries=true for the
benefit of users who want fries with that.
Along the way, this fixes a few things in pg_basebackup so that the
pg_basebackup can be used with a server-side compression algorithm
that pg_basebackup itself does not understand. For example,
pg_basebackup --compress server-lz4 could still succeed even if
only the server and not the client has LZ4 support, provided that
the other options to pg_basebackup don't require the client to
decompress the archive.
Patch by me. Reviewed by Justin Pryzby and Dagfinn Ilmari Mannsåker.
Discussion: http://postgr.es/m/CA+TgmoYvpetyRAbbg1M8b3-iHsaN4nsgmWPjOENu5-doHuJ7fA@mail.gmail.com
lz4frame.h was getting declared after the headers specific to Postgres,
but it needs to be included between postgres_fe.h and the internal
headers.
Issue introduced by babbbb5.
Reported-by: Justin Prysby
Discussion: https://postgr.es/m/20220317111220.GI28503@telsasoft.com
These tests were added recently, but older code tests USE_LZ4 rathr
than HAVE_LIBLZ4, so let's follow the established precedent. It
also seems more consistent with the intent of the configure tests,
since I think that the USE_* symbols are intended to correspond to
what the user requested, and the HAVE_* symbols to what configure
found while probing.
Discussion: http://postgr.es/m/CA+Tgmoap+hTD2-QNPJLH4tffeFE8MX5+xkbFKMU3FKBy=ZSNKA@mail.gmail.com
Fail with a suitable error message instead. We can't inject the backup
manifest into the output tarfile without decompressing it, and if
we did that, we'd have to recompress the tarfile afterwards to produce
the result the user is expecting. While we have enough infrastructure
in pg_basebackup now to accomplish that whole series of steps without
much additional code, it seems like excessively surprising behavior.
The user probably did not select server-side compression with the idea
that the client was going to end up decompressing it and then
recompressing.
Report from Justin Pryzby. Fix by me.
Discussion: http://postgr.es/m/CA+Tgmob6Rnjz-Qv32h3yJn8nnUkLhrtQDAS4y5AtsgtorAFHRA@mail.gmail.com
Both client-side compression and server-side compression are now
supported for zstd. In addition, a backup compressed by the server
using zstd can now be decompressed by the client in order to
accommodate the use of -Fp.
Jeevan Ladhe, with some edits by me.
Discussion: http://postgr.es/m/CA+Tgmobyzfbz=gyze2_LL1ZumZunmaEKbHQxjrFkOR7APZGu-g@mail.gmail.com
If the log streaming child process (thread on Windows) dies during
backup then the whole backup will be aborted at the end of the
backup. Instead, trap ungraceful termination of the log streaming
child and exit early. This also adds a TAP test for simulating this
by terminating the responsible backend.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Discussion: https://postgr.es/m/0F69E282-97F9-4DB7-8D6D-F927AA6340C8@yesql.se
Discussion: https://postgr.es/m/VI1PR83MB0189818B82C19059CB62E26199A89@VI1PR83MB0189.EURPRD83.prod.outlook.com
I have not been able to reproduce the occasional failures of
019_replslot_limit.pl we are seeing in the buildfarm and not for lack of
trying. The additional logging and increased log level will hopefully help.
Will be reverted once the cause is identified.
Discussion: https://postgr.es/m/20220218231415.c4plkp4i3reqcwip@alap3.anarazel.de
Ill-considered refactoring in 23a1c6578 led to progress_filename
sometimes pointing to data that had gone out of scope. The most
bulletproof fix is to hang onto a copy of whatever's passed in.
Compared to the work spent elsewhere per file, that's not very
expensive, plus we can skip it except in verbose logging mode.
Per buildfarm.
Discussion: https://postgr.es/m/20220212211316.GK31460@telsasoft.com
Rather than doing manual book keeping to plan the number of tests to run
in each TAP suite, conclude each run with done_testing() summing up the
the number of tests that ran. This removes the need for maintaning and
updating the plan count at the expense of an accurate count of remaining
during the test suite runtime.
This patch has been discussed a number of times, often in the context of
other patches which updates tests, so a larger number of discussions can
be found in the archives.
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/DD399313-3D56-4666-8079-88949DAC870F@yesql.se
LZ4 compression can now be performed on the client using
pg_basebackup -Ft --compress client-lz4, and LZ4 decompression of
a backup compressed on the server can be performed on the client
using pg_basebackup -Fp --compress server-lz4.
Dipesh Pandit, reviewed and tested by Jeevan Ladhe and Tushar Ahuja,
with a few corrections - and some documentation - by me.
Discussion: http://postgr.es/m/CAN1g5_FeDmiA9D8wdG2W6Lkq5CpubxOAqTmd2et9hsinTJtsMQ@mail.gmail.com
LZ4 compression can be a lot faster than gzip compression, so users
may prefer it even if the compression ratio is not as good. We will
want pg_basebackup to support LZ4 compression and decompression on the
client side as well, and there is a pending patch for that, but it's
by a different author, so I am committing this part separately for
that reason.
Jeevan Ladhe, reviewed by Tushar Ahuja and by me.
Discussion: http://postgr.es/m/CANm22Cg9cArXEaYgHVZhCnzPLfqXCZLAzjwTq7Fc0quXRPfbxA@mail.gmail.com
Commit 8e2b6d45a0 added a new unprivileged user for testing
pg_basebackup, but omitted to add them to the cluster's authorized
logins, breaking Windows tests run without using Unix sockets.
When this code executed as superuser it appeared to work because no
system catalog lookups happened, but otherwise it crashes because there
is no transaction environment. Fix that.
Report and code change by me. Test case by Dagfinn Ilmari Mannsåker.
Discussion: http://postgr.es/m/CA+TgmobiKLXne-2AVzYyWRiO8=rChBQ=7ywoxp=2SmcFw=oDDw@mail.gmail.com
Tushar Ahuja discovered that if you use both --compress and --gzip,
or --compress multiple times, the last instance of one of these
options doesn't in all cases overwrite the compression level set by
an earlier option. That's not a serious bug, but it also has nothing
to recommend it. Repair.
Discussion: http://postgr.es/m/CA+TgmoZfP=rsZB_9vDGfhuNgSu_M_09UWu8SjvsP65y_1pQFCg@mail.gmail.com
I intended to include a change to the "skip" count in the test
case, but it didn't get folded into the commit. Do that now,
so that non-zlib builds don't break.
The new file bbstreamer_gzip.c needs <unistd.h> to avoid
complaints about dup() not having a prototype, as per buildfarm
returns.
If you have a low-bandwidth connection between the client and the
server, it's reasonable to want to compress on the server side but
then decompress and extract the backup on the client side. This
commit allows you do to do just that.
Dipesh Pandit, with minor and mostly cosmetic changes by me.
Discussion: http://postgr.es/m/CAN1g5_HiSh8ajUMd4ePtGyCXo89iKZTzaNyzP_qv1eJbi4YHXA@mail.gmail.com
The server expects the compression level to be between 1 and 9, but
Z_DEFAULT_COMPRESSION is -1, so we must not try to send that value
to the server.
Because pg_basebackup's -R option is implemented on the client side,
it can't be used in combination with a backup target. Error out if
someone tries that, instead of silently ignoring the option.
Both issues were reported by Tushar Ahuja; patch by me.
Discussion: http://postgr.es/m/CA+TgmoaMwgdx8HxBjF8hmbohVvPL_0H5LqNrSq0uU+7BKp_Q2A@mail.gmail.com
Commit 0ad8032910 failed to update
the pg_basebackup documentation to mention that "client-" or
"server-" can now be prepended to the compression method name. Fix
it there, and also in the --help output that you get from running
the binary.
Also in the documentation, there's an old issue that the arguments to
--checkpoint shouldn't be marked as parameters, because "fast" and
"spread" are literal strings. Fix that too.
Dagfinn Ilmari Mannsåker, partly as per a report from
Shinoda Noriyoshi.
Discussion: http://postgr.es/m/PH7PR84MB1885C1CF433057807551172BEE5F9@PH7PR84MB1885.NAMPRD84.PROD.OUTLOOK.COM
pg_basebackup's --compression option now lets you write either
"client-gzip" or "server-gzip" instead of just "gzip" to specify
where the compression should be performed. If you write simply
"gzip" it's taken to mean "client-gzip" unless you also use
--target, in which case it is interpreted to mean "server-gzip",
because that's the only thing that makes any sense in that case.
To make this work, the BASE_BACKUP command now takes new
COMPRESSION and COMPRESSION_LEVEL options.
At present, pg_basebackup cannot decompress .gz files, so
server-side compression will cause a failure if (1) -Ft is not
used or (2) -R is used or (3) -D- is used without --no-manifest.
Along the way, I removed the information message added by commit
5c649fe153 which occurred if you
specified no compression level and told you that the default level
had been used instead. That seemed like more output than most
people would want.
Also along the way, this adds a check to the server for
unrecognized base backup options. This repairs a bug introduced
by commit 0ba281cb4b.
This commit also adds some new test cases for pg_verifybackup.
They take a server-side backup with and without compression, and
then extract the backup if we have the OS facilities available
to do so, and then run pg_verifybackup on the extracted
directory. That is a good test of the functionality added by
this commit and also improves test coverage for the backup target
patch (commit 3500ccc39b) and for
pg_verifybackup itself.
Patch by me, with a bug fix by Jeevan Ladhe. The patch set of which
this is a part has also had review and/or testing from Tushar Ahuja,
Suraj Kharage, Dipesh Pandit, and Mark Dilger.
Discussion: http://postgr.es/m/CA+Tgmoa-ST7fMLsVJduOB7Eub=2WjfpHS+QxHVEpUoinf4bOSg@mail.gmail.com
Commit 5c649fe15 introduced a memory leak into pg_basebackup's
parse_compress_options. (I simplified nearby code while at it.)
Commit 9a974cbcb introduced a memory leak into pg_dump's
binary_upgrade_set_pg_class_oids.
Coverity also complained about a call of SnapBuildProcessChange that
ignored the result, unlike every other call of that function. This
is evidently intentional, so add a (void) cast to indicate that.
(It's also old, dating to b89e15105; I suppose the reason it showed
up now is 7a5f6b474's recent rearrangement of nearby code.)
The option --compress is extended to accept a compression method and an
optional compression level, as of the grammar METHOD[:LEVEL]. The
methods currently support are "none" and "gzip", for client-side
compression. Any of those methods use only an integer value for the
compression level, but any method implemented in the future could use
more specific keywords if necessary.
This commit keeps the logic backward-compatible. Hence, the following
compatibility rules apply for the new format of the option --compress:
* -z/--gzip is a synonym of --compress=gzip.
* --compress=NUM implies:
** --compress=none if NUM = 0.
** --compress=gzip:NUM if NUM > 0.
Note that there are also plans to extend more this grammar with
server-side compression.
Reviewed-by: Robert Haas, Magnus Hagander, Álvaro Herrera, David
G. Johnston, Georgios Kokolatos
Discussion: https://postgr.es/m/Yb3GEgWwcu4wZDuA@paquier.xyz
pg_basebackup now has a --target=TARGET[:DETAIL] option. If specfied,
it is sent to the server as the value of the TARGET option to the
BASE_BACKUP command. If DETAIL is included, it is sent as the value of
the new TARGET_DETAIL option to the BASE_BACKUP command. If the
target is anything other than 'client', pg_basebackup assumes that it
will now be the server's job to write the backup in a location somehow
defined by the target, and that it therefore needs to write nothing
locally. However, the server will still send messages to the client
for progress reporting purposes.
On the server side, we now support two additional types of backup
targets. There is a 'blackhole' target, which just throws away the
backup data without doing anything at all with it. Naturally, this
should only be used for testing and debugging purposes, since you will
not actually have a backup when it finishes running. More usefully,
there is also a 'server' target, so you can now use something like
'pg_basebackup -Xnone -t server:/SOME/PATH' to write a backup to some
location on the server. We can extend this to more types of targets
in the future, and might even want to create an extensibility
mechanism for adding new target types.
Since WAL fetching is handled with separate client-side logic, it's
not part of this mechanism; thus, backups with non-default targets
must use -Xnone or -Xfetch.
Patch by me, with a bug fix by Jeevan Ladhe. The patch set of which
this is a part has also had review and/or testing from Tushar Ahuja,
Suraj Kharage, Dipesh Pandit, and Mark Dilger.
Discussion: http://postgr.es/m/CA+TgmoaYZbz0=Yk797aOJwkGJC-LK3iXn+wzzMx7KdwNpZhS5g@mail.gmail.com
In the new approach, all files across all tablespaces are sent in a
single COPY OUT operation. The CopyData messages are no longer raw
archive content; rather, each message is prefixed with a type byte
that describes its purpose, e.g. 'n' signifies the start of a new
archive and 'd' signifies archive or manifest data. This protocol
is significantly more extensible than the old approach, since we can
later create more message types, though not without concern for
backward compatibility.
The new protocol sends a few things to the client that the old one
did not. First, it sends the name of each archive explicitly, instead
of letting the client compute it. This is intended to make it easier
to write future patches that might send archives in a format other
that tar (e.g. cpio, pax, tar.gz). Second, it sends explicit progress
messages rather than allowing the client to assume that progress is
defined by the number of bytes received. This will help with future
features where the server compresses the data, or sends it someplace
directly rather than transmitting it to the client.
The old protocol is still supported for compatibility with previous
releases. The new protocol is selected by means of a new
TARGET option to the BASE_BACKUP command. Currently, the
only supported target is 'client'. Support for additional
targets will be added in a later commit.
Patch by me. The patch set of which this is a part has had review
and/or testing from Jeevan Ladhe, Tushar Ahuja, Suraj Kharage,
Dipesh Pandit, and Mark Dilger.
Discussion: http://postgr.es/m/CA+TgmoaYZbz0=Yk797aOJwkGJC-LK3iXn+wzzMx7KdwNpZhS5g@mail.gmail.com
Most tests invoking pg_basebackup themselves did not yet use -cfast, which
makes pg_basebackup take considerably longer. The only reason this didn't
cause the tests to take many minutes is that spread checkpoints only throttle
when writing out a buffer and there aren't that many dirty buffers in the
tests...
Discussion: https://postgr.es/m/20220117195711.xx4qbxutrrlmo2dg@alap3.anarazel.de
By default, wait_for_catchup() waits for the replication connection
to reach the primary's write LSN. That's fine, but in an apparent
attempt to save one query round-trip, it was coded so that we
executed pg_current_wal_lsn() again during each probe query.
Thus, we presented the standby with a moving target to be reached.
(While the test script itself couldn't be causing the write LSN
to advance while it's blocked in wait_for_catchup(), it's plenty
plausible that background activity such as autovacuum is emitting
more WAL.) That could make the test take longer than necessary,
and potentially it could mask bugs by allowing the standby to process
more WAL than a strict interpretation of the test scenario allows.
So, change wait_for_catchup() to do it "by the book", explicitly
collecting the write LSN to wait for at the outset.
Also, various call sites were instructing wait_for_catchup() to
wait for the standby to reach the primary's insert LSN rather than
its write LSN. This also seems like a bad idea. While in most
test scenarios those are the same, if they are different then the
inserted-but-not-yet-written WAL is not presently available to the
standby. The test isn't doing anything to make it become so, so
again we have the potential for unwanted test delay, perhaps even
a test timeout. (Again, background activity would be needed to
make this more than a hypothetical problem.) Hence, change the
callers where necessary so that the wait target is always the
primary's write LSN.
While at it, simplify callers by making use of wait_for_catchup's
default arguments wherever possible (the preceding change makes
this possible in more places than it was before). And rewrite
wait_for_catchup's documentation a bit.
Patch by me; thanks to Julien Rouhaud for review.
Discussion: https://postgr.es/m/2368336.1641843098@sss.pgh.pa.us
pg_basebackup.c relies on the compression level to not be 0 to decide if
compression should be used, but 000f3adf missed the fact that the
default compression (Z_DEFAULT_COMPRESSION) is -1, which would be used
if specifying --gzip without --compress.
While on it, add some coverage for --gzip, as this is rather easy to
miss.
Reported-by: Christoph Berg
Discussion: https://postgr.es/m/YdhRDMLjabtXOnhY@msg.df7cb.de
pg_basebackup is able to use gzip to compress the contents of backups
with the tar format, but there were no tests for that. This adds a
minimalistic set of tests to check the contents of such base backups,
including sanity checks on the contents generated with gzip commands.
The tests are skipped if Postgres is compiled --without-zlib, and the
checks based on the gzip command are skipped if nothing can be found,
following the same model as the existing tests for pg_receivewal.
Reviewed-by: Georgios Kokolatos
Discussion: https://postgr.es/m/Yb3GEgWwcu4wZDuA@paquier.xyz
Since d62bcc8, the directory method of walmethods.c uses the compression
method to determine which code path to take. The tar method, used by
pg_basebackup --format=t, was inconsistent regarding that, as it relied
on the compression level to check if no compression or gzip should be
used. This commit makes the code more consistent as a whole in this
file, making the tar logic use a compression method rather than
assigning COMPRESSION_NONE that would be ignored.
The options of pg_basebackup are planned to be reworked but we are not
sure yet of the shape they should have as this has some dependency with
the integration of the server-side compression for base backups, so this
is left out for the moment. This change has as benefit to make easier
the future integration of new compression methods for the tar method of
walmethods.c, for the client-side compression.
Reviewed-by: Georgios Kokolatos
Discussion: https://postgr.es/m/Yb3GEgWwcu4wZDuA@paquier.xyz
If an error occurs when writing the LZ4 file header, LZ4F_compressEnd()
was called in the error code path of write(), followed by
LZ4F_freeCompressionContext() to finish the cleanup. The code as-is was
not broken, but the LZ4F_compressEnd() proves to not be necessary as
there are no contents to flush at this stage, so remove it.
Per gripe from Jeevan Ladhe and Robert Haas.
Discussion: https://postgr.es/m/CAOgcT0PE33wbD7giAT1OSkNJt=p-vu8huq++qh=ny9O=SCP5aA@mail.gmail.com
Previously it was impossible to terminate these programs via control-C
while they were prompting for a password. We can fix that trivially
for their initial password prompts, by moving setup of the SIGINT
handler from just before to just after their initial GetConnection()
calls.
This fix doesn't permit escaping out of later re-prompts, but those
should be exceedingly rare, since the user's password or the server's
authentication setup would have to have changed meanwhile. We
considered applying a fix similar to commit 46d665bc2, but that
seemed more complicated than it'd be worth. Moreover, this way is
back-patchable, which that wasn't.
The misbehavior exists in all supported versions, so back-patch to all.
Tom Lane and Nathan Bossart
Discussion: https://postgr.es/m/747443.1635536754@sss.pgh.pa.us
The error handling here was a mess, as a result of a fundamentally
bad design (relying on errno to keep its value much longer than is
safe to assume) as well as a lot of just plain sloppiness, both as
to noticing errors at all and as to reporting the correct errno.
Moreover, the recent addition of LZ4 compression broke things
completely, because liblz4 doesn't use errno to report errors.
To improve matters, keep the error state in the DirectoryMethodData or
TarMethodData struct, and add a string field so we can handle cases
that don't set errno. (The tar methods already had a version of this,
but it can be done more efficiently since all these cases use a
constant error string.) Make the dir and tar methods handle errors
in basically identical ways, which they didn't before.
This requires copying errno into the state struct in a lot of places,
which is a bit tedious, but it has the virtue that we can get rid of
ad-hoc code to save and restore errno in a number of places ... not
to mention that it fixes other places that should've saved/restored
errno but neglected to.
In passing, fix some pointlessly static buffers to be ordinary
local variables.
There remains an issue about exactly how to handle errors from
fsync(), but that seems like material for its own patch.
While the LZ4 problems are new, all the rest of this is fixes for
old bugs, so backpatch to v10 where walmethods.c was introduced.
Patch by me; thanks to Michael Paquier for review.
Discussion: https://postgr.es/m/1343113.1636489231@sss.pgh.pa.us
Coverity complained that applying get_gz_error after a failed gzclose,
as we did in one place in pg_basebackup, is unsafe. I think it's
right: it's entirely likely that the call is touching freed memory.
Change that to inspect errno, as we do for other gzclose calls.
Also, be careful to initialize errno to zero immediately before any
gzclose() call where we care about the error status. (There are
some calls where we don't, because we already failed at some previous
step.) This ensures that we don't get a misleadingly irrelevant
error code if gzclose() fails in a way that doesn't set errno.
We could work harder at that, but it looks to me like all such cases
are basically can't-happen if we're not misusing zlib, so it's
not worth the extra notational cruft that would be required.
Also, fix several places that simply failed to check for close-time
errors at all, mostly at some remove from the close or gzclose itself;
and one place that did check but didn't bother to report the errno.
Back-patch to v12. These mistakes are older than that, but between
the frontend logging API changes that happened in v12 and the fact
that frontend code can't rely on %m before that, the patch would need
substantial revision to work in older branches. It doesn't quite
seem worth the trouble given the lack of related field complaints.
Patch by me; thanks to Michael Paquier for review.
Discussion: https://postgr.es/m/1343113.1636489231@sss.pgh.pa.us
Earlier versions of PostgreSQL featured a version of pg_basebackup
that wanted to edit tar archives but was too dumb to parse them
properly. The server made things easier for the client by failing
to add the two blocks of zero bytes that ought to end a tar file,
leaving it up to the client to do that.
But since commit 23a1c6578c, we
don't need this hack any more, because pg_basebackup is now smarter
and can parse tar files even if they are properly terminated! So
change the server to always properly terminate the tar files. Older
versions of pg_basebackup can't talk to new servers anyway, so
there's no compatibility break.
On the pg_basebackup side, we see still need to add the terminating
zero bytes if we're talking to an older server, but not when the
server is v15+. Hopefully at some point we'll be able to remove
some of this compatibility cruft, but it seems best to hang on to
it for now.
In passing, add a file header comment to bbstreamer_tar.c, to make
it clearer what's going on here.
Discussion: http://postgr.es/m/CA+TgmoZbNzsWwM4BE5Jb_qHncY817DYZwGf+2-7hkMQ27ZwsMQ@mail.gmail.com
Commit 23a1c6578c improved
pg_basebackup's ability to parse tar archives, but also arranged
to parse them only when we need to make some modification to the
contents of the archive. That's a problem, because the server
doesn't actually terminate tar archives. When the new parsing
logic was engaged, pg_basebackup would properly terminate the
tar file, but when it was skipped, pg_basebackup would just write
whatever it got from the server, meaning that the terminator
was missing.
Most versions of tar are willing to overlook the missing terminator, but
the AIX buildfarm animals were not. Fix by inventing a new kind of
bbstreamer that just blindly adds a terminator, and using it whenever we
don't parse the tar archive.
Discussion: http://postgr.es/m/CA+TgmoZbNzsWwM4BE5Jb_qHncY817DYZwGf+2-7hkMQ27ZwsMQ@mail.gmail.com
pg_basebackup knows how to do quite a few things with a backup that it
gets from the server, like just write out the files, or compress them
first, or even parse the tar format and inject a modified
postgresql.auto.conf file into the archive generated by the server.
Unforatunely, this makes pg_basebackup.c a very large source file, and
also somewhat difficult to enhance, because for example the knowledge
that the server is sending us a 'tar' file rather than some other sort
of archive is spread all over the place rather than centralized.
In an effort to improve this situation, this commit invents a new
'bbstreamer' abstraction. Each archive received from the server is
fed to a bbstreamer which may choose to dispose of it or pass it
along to some other bbstreamer. Chunks may also be "labelled"
according to whether they are part of the payload data of a file
in the archive or part of the archive metadata.
So, for example, if we want to take a tar file, modify the
postgresql.auto.conf file it contains, and the gzip the result
and write it out, we can use a bbstreamer_tar_parser to parse the
tar file received from the server, a bbstreamer_recovery_injector
to modify the contents of postgresql.auto.conf, a
bbstreamer_tar_archiver to replace the tar headers for the file
modified in the previous step with newly-built ones that are
correct for the modified file, and a bbstreamer_gzip_writer to
gzip and write the resulting data. Only the objects with "tar"
in the name know anything about the tar archive format, and in
theory we could re-archive using some other format rather than
"tar" if somebody wanted to write the code.
These chances do add a substantial amount of code, but I think the
result is a lot more maintainable and extensible. pg_basebackup.c
itself shrinks by roughly a third, with a lot of the complexity
previously contained there moving into the newly-added files.
Patch by me. The larger patch series of which this is a part has been
reviewed and tested at various times by Andres Freund, Sumanta
Mukherjee, Dilip Kumar, Suraj Kharage, Dipesh Pandit, Tushar Ahuja,
Mark Dilger, Sergei Kornilov, and Jeevan Ladhe.
Discussion: https://postgr.es/m/CA+TgmoZGwR=ZVWFeecncubEyPdwghnvfkkdBe9BLccLSiqdf9Q@mail.gmail.com
Discussion: https://postgr.es/m/CA+TgmoZvqk7UuzxsX1xjJRmMGkqoUGYTZLDCH8SmU1xTPr1Xig@mail.gmail.com
pg_receivewal gains a new option, --compression-method=lz4, available
when the code is compiled with --with-lz4. Similarly to gzip, this
gives the possibility to compress archived WAL segments with LZ4. This
option is not compatible with --compress.
The implementation uses LZ4 frames, and is compatible with simple lz4
commands. Like gzip, using --synchronous ensures that any data will be
flushed to disk within the current .partial segment, so as it is
possible to retrieve as much WAL data as possible even from a
non-completed segment (this requires completing the partial file with
zeros up to the WAL segment size supported by the backend after
decompression, but this is the same as gzip).
The calculation of the streaming start LSN is able to transparently find
and check LZ4-compressed segments. Contrary to gzip where the
uncompressed size is directly stored in the object read, the LZ4 chunk
protocol does not store the uncompressed data by default. There is
contentSize that can be used with LZ4 frames by that would not help if
using an archive that includes segments compressed with the defaults of
a "lz4" command, where this is not stored. So, this commit has taken
the most extensible approach by decompressing the already-archived
segment to check its uncompressed size, through a blank output buffer in
chunks of 64kB (no actual performance difference noticed with 8kB, 16kB
or 32kB, and the operation in itself is actually fast).
Tests have been added to verify the creation and correctness of the
generated LZ4 files. The latter is achieved by the use of command
"lz4", if found in the environment.
The tar-based WAL method in walmethods.c, used now only by
pg_basebackup, does not know yet about LZ4. Its code could be extended
for this purpose.
Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Jian Guo, Magnus Hagander, Dilip Kumar
Discussion: https://postgr.es/m/ZCm1J5vfyQ2E6dYvXz8si39HQ2gwxSZ3IpYaVgYa3lUwY88SLapx9EEnOf5uEwrddhx2twG7zYKjVeuP5MwZXCNPybtsGouDsAD1o2L_I5E=@pm.me
The option name was incorrect in one of the error messages, and the
short option 'I' was used in the code but we did not intend things to be
this way. While on it, fix the documentation to refer to a "method",
and not a "level.
Oversights in commit d62bcc8, that I have detected after more review of
the LZ4 patch for pg_receivewal.
pg_receivewal includes since cada1af the option --compress, to allow the
compression of WAL segments using gzip, with a value of 0 (the default)
meaning that no compression can be used.
This commit introduces a new option, called --compression-method, able
to use as values "none", the default, and "gzip", to make things more
extensible. The case of --compress=0 becomes fuzzy with this option
layer, so we have made the choice to make pg_receivewal return an error
when using "none" and a non-zero compression level, meaning that the
authorized values of --compress are now [1,9] instead of [0,9]. Not
specifying --compress with "gzip" as compression method makes
pg_receivewal use the default of zlib instead (Z_DEFAULT_COMPRESSION).
The code in charge of finding the streaming start LSN when scanning the
existing archives is refactored and made more extensible. While on it,
rename "compression" to "compression_level" in walmethods.c, to reduce
the confusion with the introduction of the compression method, even if
the tar method used by pg_basebackup does not rely on the compression
method (yet, at least), but just on the compression level (this area
could be improved more, actually).
This is in preparation for an upcoming patch that adds LZ4 support to
pg_receivewal.
Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Jian Guo, Magnus Hagander, Dilip Kumar,
Robert Haas
Discussion: https://postgr.es/m/ZCm1J5vfyQ2E6dYvXz8si39HQ2gwxSZ3IpYaVgYa3lUwY88SLapx9EEnOf5uEwrddhx2twG7zYKjVeuP5MwZXCNPybtsGouDsAD1o2L_I5E=@pm.me
pg_receivewal is able to follow a timeline switch, but this was not
tested. This test uses an empty archive location with a restart done
from a slot, making its implementation a tad simpler than if we would
reuse an existing archive directory.
Author: Ronan Dunklau
Reviewed-by: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/18708360.4lzOvYHigE@aivenronan
This commit improves the speed of those tests by 25~30%, using some
simple ideas to reduce the amount of data written by pg_receivewal:
- Use a segment size of 1MB. While reducing the amount of data zeroed
by pg_receivewal for the new segments, this improves the code coverage
with a non-default segment size.
- In the last test involving a slot's restart_lsn, generate a checkpoint
to advance the redo LSN and the WAL retained by the slot created,
reducing the number of segments that need to be archived. This counts
for most of the gain.
- Minimize the amount of data inserted into the dummy table.
Reviewed-by: Ronan Dunklau
Discussion: https://postgr.es/m/YXqYKAdVEqmyTltK@paquier.xyz
Prior to this patch, when running pg_receivewal, the streaming start
point would be the current location of the archives if anything is
found in the local directory where WAL segments are written, and
pg_receivewal would fall back to the current WAL flush location if there
are no archives, as of the result of an IDENTIFY_SYSTEM command.
If for some reason the WAL files from pg_receivewal were moved, it is
better to try a restart where we left at, which is the replication
slot's restart_lsn instead of skipping right to the current flush
location, to avoid holes in the WAL backed up. This commit changes
pg_receivewal to use the following sequence of methods to determine the
starting streaming LSN:
- Scan the local archives.
- Use the slot's restart_lsn, if supported by the backend and if a slot
is defined.
- Fallback to the current flush LSN as reported by IDENTIFY_SYSTEM.
To keep compatibility with older server versions, we only attempt to use
READ_REPLICATION_SLOT if the backend version is at least 15, and
fallback to the older behavior of streaming from the current flush
LSN if the command is not supported.
Some TAP tests are added to cover this feature.
Author: Ronan Dunklau
Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/18708360.4lzOvYHigE@aivenronan
The five modules in our TAP test framework all had names in the top
level namespace. This is unwise because, even though we're not
exporting them to CPAN, the names can leak, for example if they are
exported by the RPM build process. We therefore move the modules to the
PostgreSQL::Test namespace. In the process PostgresNode is renamed to
Cluster, and TestLib is renamed to Utils. PostgresVersion becomes simply
PostgreSQL::Version, to avoid possible confusion about what it's the
version of.
Discussion: https://postgr.es/m/aede93a4-7d92-ef26-398f-5094944c2504@dunslane.net
Reviewed by Erik Rijkers and Michael Paquier
Make sure that the string parsing is limited by the size of the
destination buffer.
In pg_basebackup the available values sent from the server
is limited to two characters so there was no risk of overflow.
In pg_dump the buffer is bounded by MAXPGPATH, and thus the limit
must be inserted via preprocessor expansion and the buffer increased
by one to account for the terminator. There is no risk of overflow
here, since in this case, the buffer scanned is smaller than the
destination buffer.
Backpatch the pg_basebackup fix to 11 where it was introduced, and
the pg_dump fix all the way down to 9.6.
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/B14D3D7B-F98C-4E20-9459-C122C67647FB@yesql.se
Backpatch-through: 11 and 9.6
Like BASE_BACKUP, CREATE_REPLICATION_SLOT has historically used a
hard-coded syntax. To improve future extensibility, adopt a flexible
options syntax here, too.
In the new syntax, instead of three mutually exclusive options
EXPORT_SNAPSHOT, USE_SNAPSHOT, and NOEXPORT_SNAPSHOT, there is now a single
SNAPSHOT option with three possible values: 'export', 'use', and 'nothing'.
This commit does not remove support for the old syntax. It just adds
the new one as an additional option, makes pg_receivewal,
pg_recvlogical, and walreceiver processes use it.
Patch by me, reviewed by Fabien Coelho, Sergei Kornilov, and
Fujii Masao.
Discussion: http://postgr.es/m/CA+TgmobAczXDRO_Gr2euo_TxgzaH1JxbNxvFx=HYvBinefNH8Q@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoZGwR=ZVWFeecncubEyPdwghnvfkkdBe9BLccLSiqdf9Q@mail.gmail.com
Previously, BASE_BACKUP used an entirely hard-coded syntax, but that's
hard to extend. Instead, adopt the same kind of syntax we've used for
SQL commands such as VACUUM, ANALYZE, COPY, and EXPLAIN, where it's
not necessary for all of the option names to be parser keywords.
In the new syntax, most of the options now take an optional Boolean
argument. To match our practice in other in places, the options which
the old syntax called NOWAIT and NOVERIFY_CHECKSUMS options are in the
new syntax called WAIT and VERIFY_CHECKUMS, and the default value is
false. In the new syntax, the FAST option has been replaced by a
CHECKSUM option whose value may be 'fast' or 'spread'.
This commit does not remove support for the old syntax. It just adds
the new one as an additional option, and makes pg_basebackup prefer
the new syntax when the server is new enough to support it.
Patch by me, reviewed and tested by Fabien Coelho, Sergei Kornilov,
Fujii Masao, and Tushar Ahuja.
Discussion: http://postgr.es/m/CA+TgmobAczXDRO_Gr2euo_TxgzaH1JxbNxvFx=HYvBinefNH8Q@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoZGwR=ZVWFeecncubEyPdwghnvfkkdBe9BLccLSiqdf9Q@mail.gmail.com
The following things are improved:
- Fetch the system identifier from the source server before any
WAL streaming loop. This triggers extra checks to make sure that
pg_receivewal is still connected to a server with the same system ID
with a correct timeline.
- Switch umask() (for file creation mode mask) and RetrieveWalSegSize()
(to fetch the size of WAL segments) a bit later before the initial
stream attempt. If the connection was done with a database,
pg_receivewal would fail but those commands were still executed, which
was a waste. The slot creation and drop are now done before retrieving
the segment size.
Author: Bharath Rupireddy
Reviewed-by: Ronan Dunklau, Michael Paquier
Discussion: https://postgr.es/m/CALj2ACX00YYeyBfoi55Cy=NrP-FcfMgiYYx1qRUEib3yjCVoaA@mail.gmail.com
A WAL segment closed during a WAL stream for pg_receivewal would
generate incorrect error messages depending on the context, as the file
name used when referring to a WAL segment ignored partial files or the
compression method used. In such cases, the error message generated
(failure on close, seek or rename) would not match a physical file
name. The same code paths are used by pg_basebackup, but it uses no
partial suffix so it is not impacted.
7fbe0c8 has introduced in walmethods.c a callback to get the exact
physical file name used for a given context, this commit makes use of it
to improve those error messages. This could be extended to more code
paths of pg_basebackup/ in the future, if necessary.
Extracted from a larger patch by the same author.
Author: Georgios Kokolatos
Discussion: https://postgr.es/m/ZCm1J5vfyQ2E6dYvXz8si39HQ2gwxSZ3IpYaVgYa3lUwY88SLapx9EEnOf5uEwrddhx2twG7zYKjVeuP5MwZXCNPybtsGouDsAD1o2L_I5E=@pm.me
Also remove obsolete comments about why the 64-bit integers need to be
printed in a separate buffer. The reason used to be portability, but
now the remaining reason is that we need the string lengths for the
progress displays. That is evident by looking at the code right
below, so a new comment doesn't seem necessary.
0c013e0 has done a large refactoring to unify all the code paths using
replication commands, but forgot one code path doing WAL streaming that
checks the validity of a cluster connecting to with IDENTIFY_SYSTEM.
There is a generic routine able to handle that, so make use of it in
this code path. This impacts pg_receivewal and pg_basebackup.
Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVKKYUMC8GE72Y7BP9g1batrrq3sEwUh+1_i2krWZC_2Q@mail.gmail.com
This reverts commit 6a2c532. fairywren and bowerbird failed those tests
because of incorrect versions of ZLIB linked to, causing errors like
SIGBREAKs that stopped buildfarm runs or EACCES failures when writing
compressed WAL segments.
Andrew Dunstan has done all the investigation here, so he deserves all
the credit for being able to enable those tests on Windows.
Discussion: https://postgr.es/m/9040d5ed-6462-66a4-07ac-2923785ae563@dunslane.net
Those tests are not designed to fail, but if they do, like on some cases
for Windows because of ZLIB (?), they could remain stuck. Using
--no-loop makes the test fail immediately. The oldest test with
--endpos already did that.
Those tests have been added in ffc9dda.
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/ec093ff1-a53c-0091-46a2-4537354b0dd4@dunslane.net
Certain versions of msys2/Windows have been observed to resolve symlinks
in perl2host rather than just follow them. This defeats using a
symlinked shorter path to a longer path, and makes certain tests fail.
We therefore call perl2host on the parent directory of the symlink and
thereafter just use that result.
Apply to release 14 where the problem has been observed.
There is only one constructor now for PostgresNode, with the idiomatic
name 'new'. The method is not exported by the class, and must be called
as "PostgresNode->new('name',[args])". All the TAP tests that use
PostgresNode are modified accordingly. Third party scripts will need
adjusting, which is a fairly mechanical process (I just used a sed
script).
The following changes are done:
- In pg_archivecleanup, the cleanup of older WAL segments would never
fail immediately.
- In pgbench, the initialization of a thread barrier would not fail
hard.
- In pg_recvlogical, a stat() failure never got the call.
- In pg_basebackup, two chmod() reported a failure without exit()'ing
when unpacking some tar data freshly received. It may be possible to
continue writing some data even after this failure, but that could be
confusing to the user at the end.
These are arguably bugs, but they would happen for code paths where a
failure is unlikely going to happen, so no backpatch is done.
Reviewed-by: Robert Haas, Fabien Coelho
Discussion: https://postgr.es/m/YQDMdB+B68yePFeT@paquier.xyz
A check in the ZLIB portion of the test to match the name of a
non-compressed partial segment with a completed compressed segment was
using m//, while a simple equality check is enough. This makes the test
a bit stricter without impacting its coverage.
Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20210726.174622.826565852378770261.horikyota.ntt@gmail.com
strtoint(), via strtol(), would skip leading whitespaces but the same
rule was not applied for trailing whitespaces, leading to an
inconsistent behavior. Some tests are changed to cover more this area.
Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/YP5Pv0d13Ct+03ve@paquier.xyz
These have been introduced by 7fbe0c8, and could happen for
pg_basebackup and pg_receivewal.
Per report from Coverity for the ones in walmethods.c, I have spotted
the ones in receivelog.c after more review.
Backpatch-through: 10
This fixes two compilation failures caused by 6f164e6. Interesting to
see that missing <limits.h> dies not fail in Linux or even Windows. On
MacOS, it fails, though.
Per various buildfarm members.
Most of the integer options for command-line binaries now make use of a
single routine able to do the job, fixing issues with the detection of
sloppy values caused for example by the use of atoi(), that fails on
strings beginning with numerical characters with junk trailing
characters.
This commit cuts down the number of strings requiring translation by 26
per my count, switching the code to have two error types for invalid and
out-of-range values instead.
Much more could be done here, with float or even int64 options, but
int32 was the most appealing case as it is possible to rely on strtol()
to do the job reliably. Note that there are some exceptions for now,
like pg_ctl or pg_upgrade that use their own logging logic. A couple of
negative TAP tests required some adjustments for the new errors
generated.
pg_dump and pg_restore tracked the maximum number of parallel jobs
within the option parsing. The code is refactored a bit to track that
in the code dedicated to parallelism instead.
Author: Kyotaro Horiguchi, Michael Paquier
Reviewed-by: David Rowley, Álvaro Herrera
Discussion: https://postgr.es/m/CALj2ACXqdG9WhqVoJ9zYf-iZt7sgK7Szv5USs=he6NnWQ2ofTA@mail.gmail.com
This reverts commit 91d395f, to avoid running those tests on Windows.
The tests are globally stable across all buildfarm members, except
fairywren (crash of pg_receivewal) and bowerdird (SIGBREAK preventing
the buildfarm run to complete). Those errors are rather strange, as
other hosts with very similar characteristics are able to run those
tests without breaking a sweat.
For now, disable those tests on Windows to turn back the buildfarm to
green.
Per discussion with Andrew Dunstan.
Discussion: https://postgr.es/m/9040d5ed-6462-66a4-07ac-2923785ae563@dunslane.net
This is a revert of 6cea447, that disabled those tests temporarily on
Windows due to failures with bowerbird where gzflush() would fail when
executed on a freshly-opened compressed and partial segment. This
problem should be taken care of now thanks to 7fbe0c8, so let's see what
the buildfarm has to say on Windows for those tests.
Discussion: https://postgr.es/m/YPDLz2x3o1aX2wRh@paquier.xyz
The logic handling the opening of new WAL segments was fuzzy when using
--compress if a partial, non-compressed, segment with the same base name
existed in the repository storing those files. In this case, using
--compress would cause the code to first check for the existence and the
size of a non-compressed segment, followed by the opening of a new
compressed, partial, segment. The code was accidentally working
correctly on most platforms as the buildfarm has proved, except
bowerbird where gzflush() could fail in this code path. It is wrong
anyway to take the code path used pre-padding when creating a new
partial, non-compressed, segment, so let's fix it.
Note that this issue exists when users mix successive runs of
pg_receivewal with or without compression, as discovered with the tests
introduced by ffc9dda.
While on it, this refactors the code so as code paths that need to know
about the ".gz" suffix are down from four to one in walmethods.c, easing
a bit the introduction of new compression methods. This addresses a
second issue where log messages generated for an unexpected failure
would not show the compressed segment name involved, which was
confusing, printing instead the name of the non-compressed equivalent.
Reported-by: Georgios Kokolatos
Discussion: https://postgr.es/m/YPDLz2x3o1aX2wRh@paquier.xyz
Backpatch-through: 10
As reported by buildfarm member bowerbird, those tests are unstable on
Windows. The failure produced there points to a problem with gzflush(),
that fails to sync a file freshly-opened, with a gzFile properly
opened. While testing this myself with MSVC, I bumped into a different
error where a file could simply not be opened, so this makes me rather
doubtful that testing this area on Windows is a good idea if this
finishes with random concurrency failures. This requires more
investigation, and keeping this buildfarm member red is not a good thing
in the long-term, so for now this just disables this set of tests on
Windows.
Discussion: https://postgr.es/m/YPDLz2x3o1aX2wRh@paquier.xyz
The OpenBSD implementation of gzip considers only files suffixed by "Z",
"gz", "z", "tgz" or "taz" as valid targets, discarding anything else and
making a command using --test exit with an error code of 512 if anything
invalid is found. The test introduced in ffc9dda tested a WAL segment
suffixed as .gz.partial, enough to make the test fail.
Testing only a full segment is fine enough in terms of coverage, so
simplify the code by discarding the .gz.partial segment in this check.
This should be enough to make the test pass with OpenBSD environments.
Per report from curculio.
Discussion: https://postgr.es/m/YPAdf9r5aJbDoHoq@paquier.xyz
There is a non-trivial amount of code that handles ZLIB compression in
pg_receivewal, from basics like the format name, the calculation of the
start streaming position and of course the compression itself, but there
was no automated coverage for it.
This commit introduces a set of conditional tests (if the build supports
ZLIB) to cover the creation of ZLIB-compressed WAL segments, the
handling of the partial, compressed, WAL segments and the compression
operation in itself. Note that there is an extra phase checking the
validity of the generated files by using directly a gzip command, passed
down by the Makefile of pg_receivewal. This part is skipped if the
command cannot be found, something likely going to happen on Windows
with MSVC except if one sets the variable GZIP_PROGRAM in the
environment of the test.
This set of tests will become handy for upcoming patches that add more
options for the compression methods used by pg_receivewal, like LZ4, to
make sure that no existing facilities are broken.
Author: Georgios Kokolatos
Reviewed-by: Gilles Darold, Michael Paquier
Discussion: https://postgr.es/m/07BK3Mk5aEOsTwGaY77qBVyf9GjoEzn8TMgHLyPGfEFPIpTEmoQuP2P4c7teesjSg-LPeUafsp1flnPeQYINMSMB_UpggJDoduB5EDYBqaQ=@protonmail.com
Extend the replication command CREATE_REPLICATION_SLOT to support the
TWO_PHASE option. This will allow decoding commands like PREPARE
TRANSACTION, COMMIT PREPARED and ROLLBACK PREPARED for slots created with
this option. The decoding of the transaction happens at prepare command.
This patch also adds support of two-phase in pg_recvlogical via a new
option --two-phase.
This option will also be used by future patches that allow streaming of
transactions at prepare time for built-in logical replication. With this,
the out-of-core logical replication solutions can enable replication of
two-phase transactions via replication protocol.
Author: Ajin Cherian
Reviewed-By: Jeff Davis, Vignesh C, Amit Kapila
Discussion:
https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ruhttps://postgr.es/m/64b9f783c6e125f18f88fbc0c0234e34e71d8639.camel@j-davis.com
Also "make reformat-dat-files".
The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
Instead, put them in via a format placeholder. This reduces the
number of distinct translatable messages and also reduces the chances
of typos during translation. We already did this for the system call
arguments in a number of cases, so this is just the same thing taken a
bit further.
Discussion: https://www.postgresql.org/message-id/flat/92d6f545-5102-65d8-3c87-489f71ea0a37%40enterprisedb.com
A couple of tests have been using 0 as magic constant while SEEK_SET can
be used instead. This makes the code easier to understand, and more
consistent with the changes done in 3c5b068.
Per discussion with Andrew Dunstan.
Discussion: https://postgr.es/m/YHrc24AgJQ6tQ1q0@paquier.xyz
Buildfarm member fairywren doesn't like the test case I added
in commit 081876d75. I'm guessing the reason is that I shouldn't
be using a perl2host-ified path in the tar command line.
The existing test script does run pg_basebackup with the -Ft option,
but it makes no real attempt to verify the sanity of the results.
We wouldn't know if the output is incompatible with standard "tar"
programs, nor if the server fails to start from the restored output.
Notably, this means that xlog.c's read_tablespace_map() is not being
meaningfully tested, since that code is used only in the tar-format
case. (We do have reasonable coverage of restoring from plain-format
output, though it's over in src/test/recovery not here.)
Hence, attempt to untar the output and start a server from it,
rather just hoping it's OK.
This test assumes that the local "tar" has the "-C directory"
switch. Although that's not promised by POSIX, my research
suggests that all non-extinct tar implementations have it.
Should the buildfarm's opinion differ, we can complicate the
test a bit to avoid requiring that.
Possibly this should be back-patched, but I'm unsure about
whether it could work on Windows before d66b23b03.
CREATE TABLE AS has been preferred over SELECT INTO (outside of ecpg
and PL/pgSQL) for a long time. There were still a few uses of SELECT
INTO in tests and documentation, some old, some more recent. This
changes them to CREATE TABLE AS. Some occurrences in the tests remain
where they are specifically testing SELECT INTO parsing or similar.
Discussion: https://www.postgresql.org/message-id/flat/96dc0df3-e13a-a85d-d045-d6e2c85218da%40enterprisedb.com
Existing code used various inconsistent ways to printf struct stat's
st_size member. The type of that is off_t, which is in most cases a
signed 64-bit integer, so use the long long int format for it.
This patch started out with the goal of harmonizing various arbitrary
limits on password length, but after awhile a better idea emerged:
let's just get rid of those fixed limits.
recv_password_packet() has an arbitrary limit on the packet size,
which we don't really need, so just drop it. (Note that this doesn't
really affect anything for MD5 or SCRAM password verification, since
those will hash the user's password to something shorter anyway.
It does matter for auth methods that require a cleartext password.)
Likewise remove the arbitrary error condition in pg_saslprep().
The remaining limits are mostly in client-side code that prompts
for passwords. To improve those, refactor simple_prompt() so that
it allocates its own result buffer that can be made as big as
necessary. Actually, it proves best to make a separate routine
pg_get_line() that has essentially the semantics of fgets(), except
that it allocates a suitable result buffer and hence will never
return a truncated line. (pg_get_line has a lot of potential
applications to replace randomly-sized fgets buffers elsewhere,
but I'll leave that for another patch.)
I built pg_get_line() atop stringinfo.c, which requires moving
that code to src/common/; but that seems fine since it was a poor
fit for src/port/ anyway.
This patch is mostly mine, but it owes a good deal to Nathan Bossart
who pressed for a solution to the password length problem and
created a predecessor patch. Also thanks to Peter Eisentraut and
Stephen Frost for ideas and discussion.
Discussion: https://postgr.es/m/09512C4F-8CB9-4021-B455-EF4C4F0D55A0@amazon.com
A number of client programs have a "--progress" option that when printing
to a TTY, updates the current line by printing a '\r' and overwriting it.
After the last line, '\n' needs to be printed to move the cursor to the
next line. pg_basebackup and pgbench got this right, but pg_rewind and
pg_checksums were slightly wrong. pg_rewind printed the newline to stdout
instead of stderr, and pg_checksums printed the newline even when not
printing to a TTY. Fix them, and also add a 'finished' argument to
pg_basebackup's progress_report() function, to keep it consistent with
the other programs.
Backpatch to v12. pg_rewind's newline was broken with the logging changes
in commit cc8d415117 in v12, and pg_checksums was introduced in v12.
Discussion: https://www.postgresql.org/message-id/82b539e5-ae33-34b0-1aee-22b3379fd3eb@iki.fi
Any libpq client can use the header. Clients include backend components
postgres_fdw, dblink, and logical replication apply worker. Back-patch
to v10, because another fix needs this. In released branches, just copy
the header and keep the original.
Windows has junction points which function as symbolic links for
directories. This patch introduces a new function TestLib::dir_symlink()
which creates a junction point on Windows and a standard Unix type
symbolic link elsewhere.
The function TestLib::perl2host is also modified, first to use cygpath
where it's available (e.g. msys2) and second to allow it to succeed if
the gandparent directory exists but the parent does not.
Given these changes the only symlink tests that need to be skipped on
Windows are those related to permissions or to use of readlink. The
relevant tests for pg_basebackup and pg_rewind are therefore adjusted
accordingly.
Andrew Dunstan, reviewed by Peter Eisentraut and Michael Paquier.
Discussion: https://postgr.es/m/c50a646c-d9bb-7c62-a4bf-8256ff6ff338@2ndquadrant.com
A few places calling fwrite and gzwrite were not setting errno to ENOSPC
when reporting errors, as is customary; this led to some failures being
reported as
"could not write file: Success"
which makes us look silly. Make a few of these places in pg_dump and
pg_basebackup use our customary pattern.
Backpatch-to: 9.5
Author: Justin Pryzby <pryzby@telsasoft.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200611153753.GU14879@telsasoft.com
Introduce TAR_BLOCK_SIZE and replace many instances of 512 with
the new constant. Introduce function tarPaddingBytesRequired
and use it to replace numerous repetitions of (x + 511) & ~511.
Add preprocessor guards against multiple inclusion to pgtar.h.
Reformat the prototype for tarCreateHeader so it doesn't extend
beyond 80 characters.
Discussion: http://postgr.es/m/CA+TgmobWbfReO9-XFk8urR1K4wTNwqoHx_v56t7=T8KaiEoKNw@mail.gmail.com
Includes some manual cleanup of places that pgindent messed up,
most of which weren't per project style anyway.
Notably, it seems some people didn't absorb the style rules of
commit c9d297751, because there were a bunch of new occurrences
of function calls with a newline just after the left paren, all
with faulty expectations about how the rest of the call would get
indented.
The defect suppressed a Standby Status Update message when bytes flushed
to disk had changed but bytes received had not changed. If
pg_recvlogical then exited with no intervening Standby Status Update,
the next pg_recvlogical repeated already-flushed records. The defect
could also cause superfluous messages, which are functionally harmless.
Back-patch to 9.5 (all supported versions).
Discussion: https://postgr.es/m/20200502221647.GA3941274@rfd.leadboat.com
pg_recvlogical merely called PQfinish(), so the backend sent messages
after the disconnect. When that caused EPIPE in internal_flush(),
before a LogicalConfirmReceivedLocation(), the next pg_recvlogical would
repeat already-acknowledged records. Whether or not the defect causes
EPIPE, post-disconnect messages could contain an ErrorResponse that the
user should see. One properly ends PGRES_COPY_OUT by repeating
PQgetCopyData() until it returns a negative value. Augment one of the
tests to cover the case of WAL past --endpos. Back-patch to v10, where
commit 7c030783a5 first appeared. Before
that commit, pg_recvlogical never reached PGRES_COPY_OUT.
Reported by Thomas Munro.
Discussion: https://postgr.es/m/CAEepm=1MzM2Z_xNe4foGwZ1a+MO_2S9oYDq3M5D11=JDU_+0Nw@mail.gmail.com
When pg_basebackup -R is used, we inject standby.signal into the
tar file for the main tablespace. The proper thing to do is to pad
each file injected into the tar file out to a 512-byte boundary
by appending nulls, but here the file is of length 0 and we add
511 zero bytes. Since 0 is already a multiple of 512, we should
not add any zero bytes. Do that instead.
Patch by me, reviewed by Tom Lane.
Discussion: http://postgr.es/m/CA+TgmobWbfReO9-XFk8urR1K4wTNwqoHx_v56t7=T8KaiEoKNw@mail.gmail.com
The result of the query used to retrieve the WAL segment size from the
backend was not getting freed in two code paths. Both pg_basebackup and
pg_receivewal exit immediately if a failure happened on this query, so
this was not an actual problem, but it could be an issue if this code
gets used for other tools in different ways, be they future tools in
this code tree or external, existing, ones.
Oversight in commit fc49e24, so backpatch down to 11.
Author: Jie Zhang
Discussion: https://postgr.es/m/970ad9508461469b9450b64027842331@G08CNEXMBPEKD06.g08.fujitsu.local
Backpatch-through: 11
Since 0d8c9c1, pg_basebackup would generate an error if connected to a
backend version older than 12 where backup manifests are not supported.
Avoiding this error is possible by using the --no-manifest option.
This error handling could be confusing for some users, where patching a
backup script that interacts with multiple backend versions would cause
the addition of --no-manifest to potentially not generate a backup
manifest even for Postgres 13 and newer versions. As we want to
encourage the use of backup manifests as much as possible, this commit
silently disables manifests where not supported, instead of generating
an error.
While on it, rework a bit the code to make it more consistent with the
surroundings when generating the BASE_BACKUP command.
Per discussion with Andres Freund, Stephen Frost, Robert Haas, Álvaro
Herrera, Kyotaro Horiguchi, Tom Lane, David Steele, and me.
Author: Michael Paquier
Discussion: https://postgr.es/m/20200410080910.GZ1606@paquier.xyz
This commit prevents pg_basebackup from receiving backup_manifest file
when --no-manifest is specified. Previously, when pg_basebackup was
writing a tarfile to stdout, it tried to receive backup_manifest file even
when --no-manifest was specified, and reported an error.
Also remove unused -m option from pg_basebackup.
Also fix typo in BASE_BACKUP command documentation.
Author: Fujii Masao
Reviewed-by: Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/01e3ed3a-8729-5aaa-ca84-e60e3ca59db8@oss.nttdata.com
A manifest is a JSON document which includes (1) the file name, size,
last modification time, and an optional checksum for each file backed
up, (2) timelines and LSNs for whatever WAL will need to be replayed
to make the backup consistent, and (3) a checksum for the manifest
itself. By default, we use CRC-32C when checksumming data files,
because we are trying to detect corruption and user error, not foil an
adversary. However, pg_basebackup and the server-side BASE_BACKUP
command now have options to select a different algorithm, so users
wanting a cryptographic hash function can select SHA-224, SHA-256,
SHA-384, or SHA-512. Users not wanting file checksums at all can
disable them, or disable generating of the backup manifest altogether.
Using a cryptographic hash function in place of CRC-32C consumes
significantly more CPU cycles, which may slow down backups in some
cases.
A new tool called pg_validatebackup can validate a backup against the
manifest. If no checksums are present, it can still check that the
right files exist and that they have the expected sizes. If checksums
are present, it can also verify that each file has the expected
checksum. Additionally, it calls pg_waldump to verify that the
expected WAL files are present and parseable. Only plain format
backups can be validated directly, but tar format backups can be
validated after extracting them.
Robert Haas, with help, ideas, review, and testing from David Steele,
Stephen Frost, Andrew Dunstan, Rushabh Lathia, Suraj Kharage, Tushar
Ahuja, Rajkumar Raghuwanshi, Mark Dilger, Davinder Singh, Jeevan
Chalke, Amit Kapila, Andres Freund, and Noah Misch.
Discussion: http://postgr.es/m/CA+TgmoZV8dw1H2bzZ9xkKwdrk8+XYa+DC9H=F7heO2zna5T6qg@mail.gmail.com
This commit changes pg_basebackup so that it specifies PROGRESS option in
BASE_BACKUP replication command whether --progress is specified or not.
This causes the server to estimate the total backup size and report it in
pg_stat_progress_basebackup.backup_total, by default. This is reasonable
default because the time required for the estimation would not be so large
in most cases.
Also this commit adds new option --no-estimate-size to pg_basebackup.
This option prevents the server from the estimation, and so is useful to
avoid such estimation time if it's too long.
Author: Fujii Masao
Reviewed-by: Magnus Hagander, Amit Langote
Discussion: https://postgr.es/m/CABUevEyDPPSjP7KRvfTXPdqOdY5aWNkqsB5aAXs3bco5ZwtGHg@mail.gmail.com
An instance of PostgreSQL crashing with a bad timing could leave behind
temporary pg_internal.init files, potentially causing failures when
verifying checksums. As the same exclusion lists are used between
pg_rewind, pg_checksums and basebackup.c, all those tools are extended
with prefix checks to keep everything in sync, with dedicated checks
added for pg_internal.init.
Backpatch down to 11, where pg_checksums (pg_verify_checksums in 11) and
checksum verification for base backups have been introduced.
Reported-by: Michael Banck
Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
Backpatch-through: 11
This reverts commit 7bae0ad, as this is not ideal with the tar format,
and we may want to explore more options like what is done by tar with
some equivalents of --owner and --group, but for pg_basebackup.
Per complaints from Magnus Hagander and Stephen Frost.
Discussion: https://postgr.es/m/20200205172259.GW3195@tamriel.snowman.net
First, this code did not bother checking for a failure when calling
dup(). Then, per zlib, gzerror() returns NULL for a NULL input, which
can happen if passing down to gzdopen() an invalid file descriptor or if
there was an allocation failure.
No back-patch is done as this would unlikely be a problem in the field.
Per Coverity.
Reported-by: Tom Lane
Similarly to pg_upgrade, pg_ctl and initdb, a root user is able to use
--version and --help, but cannot execute the actual operation to avoid
the creation of files with permissions incompatible with the
postmaster.
This is a behavior change, so not back-patching is done.
Author: Ian Barwick
Discussion: https://postgr.es/m/CABvVfJVqOdD2neLkYdygdOHvbWz_5K_iWiqY+psMfA=FeAa3qQ@mail.gmail.com
Add a new function ReceiveCopyData that does just that, taking a
callback as an argument to specify what should be done with each chunk
as it is received. This allows a single copy of the logic to be shared
between ReceiveTarFile and ReceiveAndUnpackTarFile, and eliminates
a few #ifdef conditions based on HAVE_LIBZ.
While this is slightly more code, it's arguably clearer, and
there is a pending patch that introduces additional calls to
ReceiveCopyData.
This commit is not intended to result in any functional change.
Discussion: http://postgr.es/m/CA+TgmoYZDTHbSpwZtW=JDgAhwVAYvmdSrRUjOd+AYdfNNXVBDg@mail.gmail.com
When maintaining or merging patches, one of the most common sources
for conflicts are the list of objects in makefiles. Especially when
the split across lines has been changed on both sides, which is
somewhat common due to attempting to stay below 80 columns, those
conflicts are unnecessarily laborious to resolve.
By splitting, and alphabetically sorting, OBJS style lines into one
object per line, conflicts should be less frequent, and easier to
resolve when they still occur.
Author: Andres Freund
Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de
Since the addition of fsync requests in bc34223 to make base backup data
consistent on disk once pg_basebackup finishes, each tablespace tar file
is individually flushed once completed, with an additional flush of the
parent directory when the base backup finishes. While holding a
connection to the server, a fsync request taking a long time may cause a
failure of the base backup, which is annoying for any integration. A
recent example of breakage can involve tcp_user_timeout, but
wal_sender_timeout can cause similar problems.
While reviewing the code, there was a second issue causing too many
fsync requests to be done for the same WAL data. As recursive fsyncs
are done at the end of the backup for both the plain and tar formats
from the base target directory where everything is written, it is fine
to disable fsyncs when fetching or streaming WAL.
Reported-by: Ryohei Takahashi
Author: Michael Paquier
Reviewed-by: Ryohei Takahashi
Discussion: https://postgr.es/m/OSBPR01MB4550DAE2F8C9502894A45AAB82BE0@OSBPR01MB4550.jpnprd01.prod.outlook.com
Backpatch-through: 10
Document that the tablespace sizes are in units of kilobytes. Make
the pg_basebackup source code a bit clearer about this, too.
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Several buildfarm members have been complaining about that with gcc,
like jacana. Weirdly enough, Visual Studio's compilers do not find this
issue.
Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20190719050830.GK1859@paquier.xyz
This changes various places where appendPQExpBuffer was used in places
where it was possible to use appendPQExpBufferStr, and likewise for
appendStringInfo and appendStringInfoString. This is really just a
stylistic improvement, but there are also small performance gains to be
had from doing this.
Discussion: http://postgr.es/m/CAKJS1f9P=M-3ULmPvr8iCno8yvfDViHibJjpriHU8+SXUgeZ=w@mail.gmail.com