Commit Graph

266 Commits

Author SHA1 Message Date
Robert Haas 0ad8032910 Server-side gzip compression.
pg_basebackup's --compression option now lets you write either
"client-gzip" or "server-gzip" instead of just "gzip" to specify
where the compression should be performed. If you write simply
"gzip" it's taken to mean "client-gzip" unless you also use
--target, in which case it is interpreted to mean "server-gzip",
because that's the only thing that makes any sense in that case.

To make this work, the BASE_BACKUP command now takes new
COMPRESSION and COMPRESSION_LEVEL options.

At present, pg_basebackup cannot decompress .gz files, so
server-side compression will cause a failure if (1) -Ft is not
used or (2) -R is used or (3) -D- is used without --no-manifest.

Along the way, I removed the information message added by commit
5c649fe153 which occurred if you
specified no compression level and told you that the default level
had been used instead. That seemed like more output than most
people would want.

Also along the way, this adds a check to the server for
unrecognized base backup options. This repairs a bug introduced
by commit 0ba281cb4b.

This commit also adds some new test cases for pg_verifybackup.
They take a server-side backup with and without compression, and
then extract the backup if we have the OS facilities available
to do so, and then run pg_verifybackup on the extracted
directory. That is a good test of the functionality added by
this commit and also improves test coverage for the backup target
patch (commit 3500ccc39b) and for
pg_verifybackup itself.

Patch by me, with a bug fix by Jeevan Ladhe.  The patch set of which
this is a part has also had review and/or testing from Tushar Ahuja,
Suraj Kharage, Dipesh Pandit, and Mark Dilger.

Discussion: http://postgr.es/m/CA+Tgmoa-ST7fMLsVJduOB7Eub=2WjfpHS+QxHVEpUoinf4bOSg@mail.gmail.com
2022-01-24 15:13:18 -05:00
Andres Freund 9c86d9337e pg_basebackup: Skip a few more fsyncs if --no-sync is specified.
This is mostly interesting for running the regression tests on machines with
slow / overloaded IO.

Discussion: https://postgr.es/m/20220119041646.rhuo3youiqxqjmo2@alap3.anarazel.de
2022-01-23 14:09:27 -08:00
Tom Lane 353708e1fb Clean up recent Coverity complaints.
Commit 5c649fe15 introduced a memory leak into pg_basebackup's
parse_compress_options.  (I simplified nearby code while at it.)

Commit 9a974cbcb introduced a memory leak into pg_dump's
binary_upgrade_set_pg_class_oids.

Coverity also complained about a call of SnapBuildProcessChange that
ignored the result, unlike every other call of that function.  This
is evidently intentional, so add a (void) cast to indicate that.
(It's also old, dating to b89e15105; I suppose the reason it showed
up now is 7a5f6b474's recent rearrangement of nearby code.)
2022-01-23 12:51:38 -05:00
Michael Paquier 5c649fe153 Extend the options of pg_basebackup to control compression
The option --compress is extended to accept a compression method and an
optional compression level, as of the grammar METHOD[:LEVEL].  The
methods currently support are "none" and "gzip", for client-side
compression.  Any of those methods use only an integer value for the
compression level, but any method implemented in the future could use
more specific keywords if necessary.

This commit keeps the logic backward-compatible.  Hence, the following
compatibility rules apply for the new format of the option --compress:
* -z/--gzip is a synonym of --compress=gzip.
* --compress=NUM implies:
** --compress=none if NUM = 0.
** --compress=gzip:NUM if NUM > 0.

Note that there are also plans to extend more this grammar with
server-side compression.

Reviewed-by: Robert Haas, Magnus Hagander, Álvaro Herrera, David
G. Johnston, Georgios Kokolatos
Discussion: https://postgr.es/m/Yb3GEgWwcu4wZDuA@paquier.xyz
2022-01-21 11:08:43 +09:00
Robert Haas 3500ccc39b Support base backup targets.
pg_basebackup now has a --target=TARGET[:DETAIL] option. If specfied,
it is sent to the server as the value of the TARGET option to the
BASE_BACKUP command. If DETAIL is included, it is sent as the value of
the new TARGET_DETAIL option to the BASE_BACKUP command.  If the
target is anything other than 'client', pg_basebackup assumes that it
will now be the server's job to write the backup in a location somehow
defined by the target, and that it therefore needs to write nothing
locally. However, the server will still send messages to the client
for progress reporting purposes.

On the server side, we now support two additional types of backup
targets.  There is a 'blackhole' target, which just throws away the
backup data without doing anything at all with it. Naturally, this
should only be used for testing and debugging purposes, since you will
not actually have a backup when it finishes running. More usefully,
there is also a 'server' target, so you can now use something like
'pg_basebackup -Xnone -t server:/SOME/PATH' to write a backup to some
location on the server. We can extend this to more types of targets
in the future, and might even want to create an extensibility
mechanism for adding new target types.

Since WAL fetching is handled with separate client-side logic, it's
not part of this mechanism; thus, backups with non-default targets
must use -Xnone or -Xfetch.

Patch by me, with a bug fix by Jeevan Ladhe.  The patch set of which
this is a part has also had review and/or testing from Tushar Ahuja,
Suraj Kharage, Dipesh Pandit, and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaYZbz0=Yk797aOJwkGJC-LK3iXn+wzzMx7KdwNpZhS5g@mail.gmail.com
2022-01-20 10:46:33 -05:00
Robert Haas cc333f3233 Modify pg_basebackup to use a new COPY subprotocol for base backups.
In the new approach, all files across all tablespaces are sent in a
single COPY OUT operation. The CopyData messages are no longer raw
archive content; rather, each message is prefixed with a type byte
that describes its purpose, e.g. 'n' signifies the start of a new
archive and 'd' signifies archive or manifest data. This protocol
is significantly more extensible than the old approach, since we can
later create more message types, though not without concern for
backward compatibility.

The new protocol sends a few things to the client that the old one
did not. First, it sends the name of each archive explicitly, instead
of letting the client compute it. This is intended to make it easier
to write future patches that might send archives in a format other
that tar (e.g. cpio, pax, tar.gz). Second, it sends explicit progress
messages rather than allowing the client to assume that progress is
defined by the number of bytes received. This will help with future
features where the server compresses the data, or sends it someplace
directly rather than transmitting it to the client.

The old protocol is still supported for compatibility with previous
releases. The new protocol is selected by means of a new
TARGET option to the BASE_BACKUP command. Currently, the
only supported target is 'client'. Support for additional
targets will be added in a later commit.

Patch by me. The patch set of which this is a part has had review
and/or testing from Jeevan Ladhe, Tushar Ahuja, Suraj Kharage,
Dipesh Pandit, and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaYZbz0=Yk797aOJwkGJC-LK3iXn+wzzMx7KdwNpZhS5g@mail.gmail.com
2022-01-18 13:47:49 -05:00
Michael Paquier d0d62262d3 Fix thinko coming from 000f3adf
pg_basebackup.c relies on the compression level to not be 0 to decide if
compression should be used, but 000f3adf missed the fact that the
default compression (Z_DEFAULT_COMPRESSION) is -1, which would be used
if specifying --gzip without --compress.

While on it, add some coverage for --gzip, as this is rather easy to
miss.

Reported-by: Christoph Berg
Discussion: https://postgr.es/m/YdhRDMLjabtXOnhY@msg.df7cb.de
2022-01-08 09:12:21 +09:00
Bruce Momjian 27b77ecf9f Update copyright for 2022
Backpatch-through: 10
2022-01-07 19:04:57 -05:00
Michael Paquier 000f3adfdc Refactor tar method of walmethods.c to rely on the compression method
Since d62bcc8, the directory method of walmethods.c uses the compression
method to determine which code path to take.  The tar method, used by
pg_basebackup --format=t, was inconsistent regarding that, as it relied
on the compression level to check if no compression or gzip should be
used.  This commit makes the code more consistent as a whole in this
file, making the tar logic use a compression method rather than
assigning COMPRESSION_NONE that would be ignored.

The options of pg_basebackup are planned to be reworked but we are not
sure yet of the shape they should have as this has some dependency with
the integration of the server-side compression for base backups, so this
is left out for the moment.  This change has as benefit to make easier
the future integration of new compression methods for the tar method of
walmethods.c, for the client-side compression.

Reviewed-by: Georgios Kokolatos
Discussion: https://postgr.es/m/Yb3GEgWwcu4wZDuA@paquier.xyz
2022-01-07 13:48:59 +09:00
Robert Haas 5a1007a508 Have the server properly terminate tar archives.
Earlier versions of PostgreSQL featured a version of pg_basebackup
that wanted to edit tar archives but was too dumb to parse them
properly. The server made things easier for the client by failing
to add the two blocks of zero bytes that ought to end a tar file,
leaving it up to the client to do that.

But since commit 23a1c6578c, we
don't need this hack any more, because pg_basebackup is now smarter
and can parse tar files even if they are properly terminated! So
change the server to always properly terminate the tar files. Older
versions of pg_basebackup can't talk to new servers anyway, so
there's no compatibility break.

On the pg_basebackup side, we see still need to add the terminating
zero bytes if we're talking to an older server, but not when the
server is v15+. Hopefully at some point we'll be able to remove
some of this compatibility cruft, but it seems best to hang on to
it for now.

In passing, add a file header comment to bbstreamer_tar.c, to make
it clearer what's going on here.

Discussion: http://postgr.es/m/CA+TgmoZbNzsWwM4BE5Jb_qHncY817DYZwGf+2-7hkMQ27ZwsMQ@mail.gmail.com
2021-11-09 14:29:15 -05:00
Robert Haas 57b5a9646d Minimal fix for unterminated tar archive problem.
Commit 23a1c6578c improved
pg_basebackup's ability to parse tar archives, but also arranged
to parse them only when we need to make some modification to the
contents of the archive. That's a problem, because the server
doesn't actually terminate tar archives. When the new parsing
logic was engaged, pg_basebackup would properly terminate the
tar file, but when it was skipped, pg_basebackup would just write
whatever it got from the server, meaning that the terminator
was missing.

Most versions of tar are willing to overlook the missing terminator, but
the AIX buildfarm animals were not. Fix by inventing a new kind of
bbstreamer that just blindly adds a terminator, and using it whenever we
don't parse the tar archive.

Discussion: http://postgr.es/m/CA+TgmoZbNzsWwM4BE5Jb_qHncY817DYZwGf+2-7hkMQ27ZwsMQ@mail.gmail.com
2021-11-08 16:36:06 -05:00
Robert Haas 23a1c6578c Introduce 'bbstreamer' abstraction to modularize pg_basebackup.
pg_basebackup knows how to do quite a few things with a backup that it
gets from the server, like just write out the files, or compress them
first, or even parse the tar format and inject a modified
postgresql.auto.conf file into the archive generated by the server.
Unforatunely, this makes pg_basebackup.c a very large source file, and
also somewhat difficult to enhance, because for example the knowledge
that the server is sending us a 'tar' file rather than some other sort
of archive is spread all over the place rather than centralized.

In an effort to improve this situation, this commit invents a new
'bbstreamer' abstraction. Each archive received from the server is
fed to a bbstreamer which may choose to dispose of it or pass it
along to some other bbstreamer. Chunks may also be "labelled"
according to whether they are part of the payload data of a file
in the archive or part of the archive metadata.

So, for example, if we want to take a tar file, modify the
postgresql.auto.conf file it contains, and the gzip the result
and write it out, we can use a bbstreamer_tar_parser to parse the
tar file received from the server, a bbstreamer_recovery_injector
to modify the contents of postgresql.auto.conf, a
bbstreamer_tar_archiver to replace the tar headers for the file
modified in the previous step with newly-built ones that are
correct for the modified file, and a bbstreamer_gzip_writer to
gzip and write the resulting data. Only the objects with "tar"
in the name know anything about the tar archive format, and in
theory we could re-archive using some other format rather than
"tar" if somebody wanted to write the code.

These chances do add a substantial amount of code, but I think the
result is a lot more maintainable and extensible. pg_basebackup.c
itself shrinks by roughly a third, with a lot of the complexity
previously contained there moving into the newly-added files.

Patch by me. The larger patch series of which this is a part has been
reviewed and tested at various times by Andres Freund, Sumanta
Mukherjee, Dilip Kumar, Suraj Kharage, Dipesh Pandit, Tushar Ahuja,
Mark Dilger, Sergei Kornilov, and Jeevan Ladhe.

Discussion: https://postgr.es/m/CA+TgmoZGwR=ZVWFeecncubEyPdwghnvfkkdBe9BLccLSiqdf9Q@mail.gmail.com
Discussion: https://postgr.es/m/CA+TgmoZvqk7UuzxsX1xjJRmMGkqoUGYTZLDCH8SmU1xTPr1Xig@mail.gmail.com
2021-11-05 10:26:18 -04:00
Michael Paquier d62bcc8b07 Rework compression options of pg_receivewal
pg_receivewal includes since cada1af the option --compress, to allow the
compression of WAL segments using gzip, with a value of 0 (the default)
meaning that no compression can be used.

This commit introduces a new option, called --compression-method, able
to use as values "none", the default, and "gzip", to make things more
extensible.  The case of --compress=0 becomes fuzzy with this option
layer, so we have made the choice to make pg_receivewal return an error
when using "none" and a non-zero compression level, meaning that the
authorized values of --compress are now [1,9] instead of [0,9].  Not
specifying --compress with "gzip" as compression method makes
pg_receivewal use the default of zlib instead (Z_DEFAULT_COMPRESSION).

The code in charge of finding the streaming start LSN when scanning the
existing archives is refactored and made more extensible.  While on it,
rename "compression" to "compression_level" in walmethods.c, to reduce
the confusion with the introduction of the compression method, even if
the tar method used by pg_basebackup does not rely on the compression
method (yet, at least), but just on the compression level (this area
could be improved more, actually).

This is in preparation for an upcoming patch that adds LZ4 support to
pg_receivewal.

Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Jian Guo, Magnus Hagander, Dilip Kumar,
Robert Haas
Discussion: https://postgr.es/m/ZCm1J5vfyQ2E6dYvXz8si39HQ2gwxSZ3IpYaVgYa3lUwY88SLapx9EEnOf5uEwrddhx2twG7zYKjVeuP5MwZXCNPybtsGouDsAD1o2L_I5E=@pm.me
2021-11-04 11:10:31 +09:00
Robert Haas 0ba281cb4b Flexible options for BASE_BACKUP.
Previously, BASE_BACKUP used an entirely hard-coded syntax, but that's
hard to extend. Instead, adopt the same kind of syntax we've used for
SQL commands such as VACUUM, ANALYZE, COPY, and EXPLAIN, where it's
not necessary for all of the option names to be parser keywords.

In the new syntax, most of the options now take an optional Boolean
argument. To match our practice in other in places, the options which
the old syntax called NOWAIT and NOVERIFY_CHECKSUMS options are in the
new syntax called WAIT and VERIFY_CHECKUMS, and the default value is
false. In the new syntax, the FAST option has been replaced by a
CHECKSUM option whose value may be 'fast' or 'spread'.

This commit does not remove support for the old syntax. It just adds
the new one as an additional option, and makes pg_basebackup prefer
the new syntax when the server is new enough to support it.

Patch by me, reviewed and tested by Fabien Coelho, Sergei Kornilov,
Fujii Masao, and Tushar Ahuja.

Discussion: http://postgr.es/m/CA+TgmobAczXDRO_Gr2euo_TxgzaH1JxbNxvFx=HYvBinefNH8Q@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoZGwR=ZVWFeecncubEyPdwghnvfkkdBe9BLccLSiqdf9Q@mail.gmail.com
2021-10-05 11:50:21 -04:00
Peter Eisentraut e03b807e12 Fix incorrect format placeholders
Also remove obsolete comments about why the 64-bit integers need to be
printed in a separate buffer.  The reason used to be portability, but
now the remaining reason is that we need the string lengths for the
progress displays.  That is evident by looking at the code right
below, so a new comment doesn't seem necessary.
2021-09-15 09:19:01 +02:00
Michael Paquier 856de3b39c Add some missing exit() calls in error paths for various binaries
The following changes are done:
- In pg_archivecleanup, the cleanup of older WAL segments would never
fail immediately.
- In pgbench, the initialization of a thread barrier would not fail
hard.
- In pg_recvlogical, a stat() failure never got the call.
- In pg_basebackup, two chmod() reported a failure without exit()'ing
when unpacking some tar data freshly received.  It may be possible to
continue writing some data even after this failure, but that could be
confusing to the user at the end.

These are arguably bugs, but they would happen for code paths where a
failure is unlikely going to happen, so no backpatch is done.

Reviewed-by: Robert Haas, Fabien Coelho
Discussion: https://postgr.es/m/YQDMdB+B68yePFeT@paquier.xyz
2021-07-29 11:42:58 +09:00
Michael Paquier bc0cc68f8a Add missing header declarations for pg_basebackup and pg_{dump,restore}
This fixes two compilation failures caused by 6f164e6.  Interesting to
see that missing <limits.h> dies not fail in Linux or even Windows.  On
MacOS, it fails, though.

Per various buildfarm members.
2021-07-24 19:05:14 +09:00
Michael Paquier 6f164e6d17 Unify parsing logic for command-line integer options
Most of the integer options for command-line binaries now make use of a
single routine able to do the job, fixing issues with the detection of
sloppy values caused for example by the use of atoi(), that fails on
strings beginning with numerical characters with junk trailing
characters.

This commit cuts down the number of strings requiring translation by 26
per my count, switching the code to have two error types for invalid and
out-of-range values instead.

Much more could be done here, with float or even int64 options, but
int32 was the most appealing case as it is possible to rely on strtol()
to do the job reliably.  Note that there are some exceptions for now,
like pg_ctl or pg_upgrade that use their own logging logic.  A couple of
negative TAP tests required some adjustments for the new errors
generated.

pg_dump and pg_restore tracked the maximum number of parallel jobs
within the option parsing.  The code is refactored a bit to track that
in the code dedicated to parallelism instead.

Author: Kyotaro Horiguchi, Michael Paquier
Reviewed-by: David Rowley, Álvaro Herrera
Discussion: https://postgr.es/m/CALj2ACXqdG9WhqVoJ9zYf-iZt7sgK7Szv5USs=he6NnWQ2ofTA@mail.gmail.com
2021-07-24 18:35:03 +09:00
Amit Kapila cda03cfed6 Allow enabling two-phase option via replication protocol.
Extend the replication command CREATE_REPLICATION_SLOT to support the
TWO_PHASE option. This will allow decoding commands like PREPARE
TRANSACTION, COMMIT PREPARED and ROLLBACK PREPARED for slots created with
this option. The decoding of the transaction happens at prepare command.

This patch also adds support of two-phase in pg_recvlogical via a new
option --two-phase.

This option will also be used by future patches that allow streaming of
transactions at prepare time for built-in logical replication. With this,
the out-of-core logical replication solutions can enable replication of
two-phase transactions via replication protocol.

Author: Ajin Cherian
Reviewed-By: Jeff Davis, Vignesh C, Amit Kapila
Discussion:
https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru
https://postgr.es/m/64b9f783c6e125f18f88fbc0c0234e34e71d8639.camel@j-davis.com
2021-06-30 08:45:47 +05:30
Bruce Momjian ca3b37487b Update copyright for 2021
Backpatch-through: 9.5
2021-01-02 13:06:25 -05:00
Heikki Linnakangas 734478200a Avoid non-constant format string argument to fprintf().
As Tom Lane pointed out, it could defeat the compiler's printf() format
string verification.

Backpatch to v12, like that patch that introduced it.

Discussion: https://www.postgresql.org/message-id/1069283.1597672779%40sss.pgh.pa.us
2020-08-18 13:13:09 +03:00
Heikki Linnakangas d7ec8337f9 Fix printing last progress report line in client programs.
A number of client programs have a "--progress" option that when printing
to a TTY, updates the current line by printing a '\r' and overwriting it.
After the last line, '\n' needs to be printed to move the cursor to the
next line. pg_basebackup and pgbench got this right, but pg_rewind and
pg_checksums were slightly wrong. pg_rewind printed the newline to stdout
instead of stderr, and pg_checksums printed the newline even when not
printing to a TTY. Fix them, and also add a 'finished' argument to
pg_basebackup's progress_report() function, to keep it consistent with
the other programs.

Backpatch to v12. pg_rewind's newline was broken with the logging changes
in commit cc8d415117 in v12, and pg_checksums was introduced in v12.

Discussion: https://www.postgresql.org/message-id/82b539e5-ae33-34b0-1aee-22b3379fd3eb@iki.fi
2020-08-17 09:27:29 +03:00
Alvaro Herrera ae3259c550
Ensure write failure reports no-disk-space
A few places calling fwrite and gzwrite were not setting errno to ENOSPC
when reporting errors, as is customary; this led to some failures being
reported as
"could not write file: Success"
which makes us look silly.  Make a few of these places in pg_dump and
pg_basebackup use our customary pattern.

Backpatch-to: 9.5
Author: Justin Pryzby <pryzby@telsasoft.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200611153753.GU14879@telsasoft.com
2020-06-19 16:46:07 -04:00
Robert Haas 2961c9711c Assorted cleanup of tar-related code.
Introduce TAR_BLOCK_SIZE and replace many instances of 512 with
the new constant. Introduce function tarPaddingBytesRequired
and use it to replace numerous repetitions of (x + 511) & ~511.

Add preprocessor guards against multiple inclusion to pgtar.h.

Reformat the prototype for tarCreateHeader so it doesn't extend
beyond 80 characters.

Discussion: http://postgr.es/m/CA+TgmobWbfReO9-XFk8urR1K4wTNwqoHx_v56t7=T8KaiEoKNw@mail.gmail.com
2020-06-15 15:28:49 -04:00
Peter Eisentraut 47d4d0cfad Error message refactoring
Take some untranslatable things out of the message and replace by
format placeholders, to reduce translatable strings and reduce
translation mistakes.
2020-06-15 08:46:56 +02:00
Tom Lane 5cbfce562f Initial pgindent and pgperltidy run for v13.
Includes some manual cleanup of places that pgindent messed up,
most of which weren't per project style anyway.

Notably, it seems some people didn't absorb the style rules of
commit c9d297751, because there were a bunch of new occurrences
of function calls with a newline just after the left paren, all
with faulty expectations about how the rest of the call would get
indented.
2020-05-14 13:06:50 -04:00
Peter Eisentraut 3c800ae0b9 Put new command-line options into alphabetical order in help output 2020-05-01 11:49:52 +02:00
Robert Haas 0278d3f79a Fix bogus tar-file padding logic for standby.signal.
When pg_basebackup -R is used, we inject standby.signal into the
tar file for the main tablespace. The proper thing to do is to pad
each file injected into the tar file out to a 512-byte boundary
by appending nulls, but here the file is of length 0 and we add
511 zero bytes.  Since 0 is already a multiple of 512, we should
not add any zero bytes. Do that instead.

Patch by me, reviewed by Tom Lane.

Discussion: http://postgr.es/m/CA+TgmobWbfReO9-XFk8urR1K4wTNwqoHx_v56t7=T8KaiEoKNw@mail.gmail.com
2020-04-27 13:04:35 -04:00
Michael Paquier 542d7817f7 Disable silently generation of manifests with servers <= 12 in pg_basebackup
Since 0d8c9c1, pg_basebackup would generate an error if connected to a
backend version older than 12 where backup manifests are not supported.
Avoiding this error is possible by using the --no-manifest option.

This error handling could be confusing for some users, where patching a
backup script that interacts with multiple backend versions would cause
the addition of --no-manifest to potentially not generate a backup
manifest even for Postgres 13 and newer versions.  As we want to
encourage the use of backup manifests as much as possible, this commit
silently disables manifests where not supported, instead of generating
an error.

While on it, rework a bit the code to make it more consistent with the
surroundings when generating the BASE_BACKUP command.

Per discussion with Andres Freund, Stephen Frost, Robert Haas, Álvaro
Herrera, Kyotaro Horiguchi, Tom Lane, David Steele, and me.

Author: Michael Paquier
Discussion: https://postgr.es/m/20200410080910.GZ1606@paquier.xyz
2020-04-16 13:57:07 +09:00
Fujii Masao a2ac73e7be Code review for backup manifest.
This commit prevents pg_basebackup from receiving backup_manifest file
when --no-manifest is specified. Previously, when pg_basebackup was
writing a tarfile to stdout, it tried to receive backup_manifest file even
when --no-manifest was specified, and reported an error.

Also remove unused -m option from pg_basebackup.

Also fix typo in BASE_BACKUP command documentation.

Author: Fujii Masao
Reviewed-by: Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/01e3ed3a-8729-5aaa-ca84-e60e3ca59db8@oss.nttdata.com
2020-04-15 11:15:12 +09:00
Robert Haas 0d8c9c1210 Generate backup manifests for base backups, and validate them.
A manifest is a JSON document which includes (1) the file name, size,
last modification time, and an optional checksum for each file backed
up, (2) timelines and LSNs for whatever WAL will need to be replayed
to make the backup consistent, and (3) a checksum for the manifest
itself. By default, we use CRC-32C when checksumming data files,
because we are trying to detect corruption and user error, not foil an
adversary. However, pg_basebackup and the server-side BASE_BACKUP
command now have options to select a different algorithm, so users
wanting a cryptographic hash function can select SHA-224, SHA-256,
SHA-384, or SHA-512. Users not wanting file checksums at all can
disable them, or disable generating of the backup manifest altogether.
Using a cryptographic hash function in place of CRC-32C consumes
significantly more CPU cycles, which may slow down backups in some
cases.

A new tool called pg_validatebackup can validate a backup against the
manifest. If no checksums are present, it can still check that the
right files exist and that they have the expected sizes. If checksums
are present, it can also verify that each file has the expected
checksum. Additionally, it calls pg_waldump to verify that the
expected WAL files are present and parseable. Only plain format
backups can be validated directly, but tar format backups can be
validated after extracting them.

Robert Haas, with help, ideas, review, and testing from David Steele,
Stephen Frost, Andrew Dunstan, Rushabh Lathia, Suraj Kharage, Tushar
Ahuja, Rajkumar Raghuwanshi, Mark Dilger, Davinder Singh, Jeevan
Chalke, Amit Kapila, Andres Freund, and Noah Misch.

Discussion: http://postgr.es/m/CA+TgmoZV8dw1H2bzZ9xkKwdrk8+XYa+DC9H=F7heO2zna5T6qg@mail.gmail.com
2020-04-03 15:05:59 -04:00
Fujii Masao fab13dc50b Make pg_basebackup ask the server to estimate the total backup size, by default.
This commit changes pg_basebackup so that it specifies PROGRESS option in
BASE_BACKUP replication command whether --progress is specified or not.
This causes the server to estimate the total backup size and report it in
pg_stat_progress_basebackup.backup_total, by default. This is reasonable
default because the time required for the estimation would not be so large
in most cases.

Also this commit adds new option --no-estimate-size to pg_basebackup.
This option prevents the server from the estimation, and so is useful to
avoid such estimation time if it's too long.

Author: Fujii Masao
Reviewed-by: Magnus Hagander, Amit Langote
Discussion: https://postgr.es/m/CABUevEyDPPSjP7KRvfTXPdqOdY5aWNkqsB5aAXs3bco5ZwtGHg@mail.gmail.com
2020-03-19 17:09:00 +09:00
Peter Eisentraut 1933ae629e Add PostgreSQL home page to --help output
Per emerging standard in GNU programs and elsewhere.  Autoconf already
has support for specifying a home page, so we can just that.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
Peter Eisentraut 864934131e Refer to bug report address by symbol rather than hardcoding
Use the PACKAGE_BUGREPORT macro that is created by Autoconf for
referring to the bug reporting address rather than hardcoding it
everywhere.  This makes it easier to change the address and it reduces
translation work.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
Michael Paquier dcddc3f813 Revert "Prevent running pg_basebackup as root"
This reverts commit 7bae0ad, as this is not ideal with the tar format,
and we may want to explore more options like what is done by tar with
some equivalents of --owner and --group, but for pg_basebackup.

Per complaints from Magnus Hagander and Stephen Frost.

Discussion: https://postgr.es/m/20200205172259.GW3195@tamriel.snowman.net
2020-02-07 10:51:17 +09:00
Michael Paquier 177be9edf4 Fix fuzzy error handling in pg_basebackup when opening gzFile
First, this code did not bother checking for a failure when calling
dup().  Then, per zlib, gzerror() returns NULL for a NULL input, which
can happen if passing down to gzdopen() an invalid file descriptor or if
there was an allocation failure.

No back-patch is done as this would unlikely be a problem in the field.

Per Coverity.

Reported-by: Tom Lane
2020-02-04 13:56:04 +09:00
Michael Paquier 7bae0ad9fc Prevent running pg_basebackup as root
Similarly to pg_upgrade, pg_ctl and initdb, a root user is able to use
--version and --help, but cannot execute the actual operation to avoid
the creation of files with permissions incompatible with the
postmaster.

This is a behavior change, so not back-patching is done.

Author: Ian Barwick
Discussion: https://postgr.es/m/CABvVfJVqOdD2neLkYdygdOHvbWz_5K_iWiqY+psMfA=FeAa3qQ@mail.gmail.com
2020-02-01 18:30:25 +09:00
Bruce Momjian 7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
Robert Haas 431ba7bebf pg_basebackup: Refactor code for reading COPY and tar data.
Add a new function ReceiveCopyData that does just that, taking a
callback as an argument to specify what should be done with each chunk
as it is received. This allows a single copy of the logic to be shared
between ReceiveTarFile and ReceiveAndUnpackTarFile, and eliminates
a few #ifdef conditions based on HAVE_LIBZ.

While this is slightly more code, it's arguably clearer, and
there is a pending patch that introduces additional calls to
ReceiveCopyData.

This commit is not intended to result in any functional change.

Discussion: http://postgr.es/m/CA+TgmoYZDTHbSpwZtW=JDgAhwVAYvmdSrRUjOd+AYdfNNXVBDg@mail.gmail.com
2019-12-05 15:14:09 -05:00
Amit Kapila dddf4cdc33 Make the order of the header file includes consistent in non-backend modules.
Similar to commit 7e735035f2, this commit makes the order of header file
inclusion consistent for non-backend modules.

In passing, fix the case where we were using angle brackets (<>) for the
local module includes instead of quotes ("").

Author: Vignesh C
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
2019-10-25 07:41:52 +05:30
Alvaro Herrera caba97a9d9 Split out recovery confing-writing code from pg_basebackup
... into a new file, fe_utils/recovery_gen.c.

This can later be used by pg_rewind.

Authors: Paul Guo, Jimmy Yih, Ashwin Agrawal.  A few tweaks by Álvaro Herrera
Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/CAEET0ZEffUkXc48pg2iqARQgGRYDiiVxDu+yYek_bTwJF+q=Uw@mail.gmail.com
2019-09-25 14:35:24 -03:00
Michael Paquier 522baf1484 Delay fsyncs of pg_basebackup until the end of backup
Since the addition of fsync requests in bc34223 to make base backup data
consistent on disk once pg_basebackup finishes, each tablespace tar file
is individually flushed once completed, with an additional flush of the
parent directory when the base backup finishes.  While holding a
connection to the server, a fsync request taking a long time may cause a
failure of the base backup, which is annoying for any integration.  A
recent example of breakage can involve tcp_user_timeout, but
wal_sender_timeout can cause similar problems.

While reviewing the code, there was a second issue causing too many
fsync requests to be done for the same WAL data.  As recursive fsyncs
are done at the end of the backup for both the plain and tar formats
from the base target directory where everything is written, it is fine
to disable fsyncs when fetching or streaming WAL.

Reported-by: Ryohei Takahashi
Author: Michael Paquier
Reviewed-by: Ryohei Takahashi
Discussion: https://postgr.es/m/OSBPR01MB4550DAE2F8C9502894A45AAB82BE0@OSBPR01MB4550.jpnprd01.prod.outlook.com
Backpatch-through: 10
2019-09-04 13:21:11 +09:00
Peter Eisentraut bde8c2d319 Improve base backup protocol documentation
Document that the tablespace sizes are in units of kilobytes.  Make
the pg_basebackup source code a bit clearer about this, too.

Reviewed-by: Magnus Hagander <magnus@hagander.net>
2019-09-03 11:59:36 +02:00
Alvaro Herrera 0994cfc0ac Don't uselessly escape a string that doesn't need escaping
Per gripe from Ian Barwick

Co-authored-by: Ian Barwick <ian@2ndquadrant.com>
Discussion: https://postgr.es/m/CABvVfJWNnNKb8cHsTLhkTsvL1+G6BVcV+57+w1JZ61p8YGPdWQ@mail.gmail.com
2019-07-26 17:46:40 -04:00
Michael Paquier 90317ab7e6 Fix compilation warning of pg_basebackup with MinGW
Several buildfarm members have been complaining about that with gcc,
like jacana.  Weirdly enough, Visual Studio's compilers do not find this
issue.

Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20190719050830.GK1859@paquier.xyz
2019-07-21 22:27:11 +09:00
Peter Eisentraut 24c7000f64 Remove redundant newlines from error messages
These are no longer needed/allowed with the new logging API.
2019-07-02 23:18:43 +01:00
Noah Misch 31d250e049 Update stale comments, and fix comment typos. 2019-06-08 10:12:26 -07:00
Tom Lane 8255c7a5ee Phase 2 pgindent run for v12.
Switch to 2.1 version of pg_bsd_indent.  This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.

Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
2019-05-22 13:04:48 -04:00
Tom Lane fc9a62af3f Move logging.h and logging.c from src/fe_utils/ to src/common/.
The original placement of this module in src/fe_utils/ is ill-considered,
because several src/common/ modules have dependencies on it, meaning that
libpgcommon and libpgfeutils now have mutual dependencies.  That makes it
pointless to have distinct libraries at all.  The intended design is that
libpgcommon is lower-level than libpgfeutils, so only dependencies from
the latter to the former are acceptable.

We already have the precedent that fe_memutils and a couple of other
modules in src/common/ are frontend-only, so it's not stretching anything
out of whack to treat logging.c as a frontend-only module in src/common/.
To the extent that such modules help provide a common frontend/backend
environment for the rest of common/ to use, it's a reasonable design.
(logging.c does not yet provide an ereport() emulation, but one can
dream.)

Hence, move these files over, and revert basically all of the build-system
changes made by commit cc8d41511.  There are no places that need to grow
new dependencies on libpgcommon, further reinforcing the idea that this
is the right solution.

Discussion: https://postgr.es/m/a912ffff-f6e4-778a-c86a-cf5c47a12933@2ndquadrant.com
2019-05-14 14:20:10 -04:00
Peter Eisentraut cc8d415117 Unified logging system for command-line programs
This unifies the various ad hoc logging (message printing, error
printing) systems used throughout the command-line programs.

Features:

- Program name is automatically prefixed.

- Message string does not end with newline.  This removes a common
  source of inconsistencies and omissions.

- Additionally, a final newline is automatically stripped, simplifying
  use of PQerrorMessage() etc., another common source of mistakes.

- I converted error message strings to use %m where possible.

- As a result of the above several points, more translatable message
  strings can be shared between different components and between
  frontends and backend, without gratuitous punctuation or whitespace
  differences.

- There is support for setting a "log level".  This is not meant to be
  user-facing, but can be used internally to implement debug or
  verbose modes.

- Lazy argument evaluation, so no significant overhead if logging at
  some level is disabled.

- Some color in the messages, similar to gcc and clang.  Set
  PG_COLOR=auto to try it out.  Some colors are predefined, but can be
  customized by setting PG_COLORS.

- Common files (common/, fe_utils/, etc.) can handle logging much more
  simply by just using one API without worrying too much about the
  context of the calling program, requiring callbacks, or having to
  pass "progname" around everywhere.

- Some programs called setvbuf() to make sure that stderr is
  unbuffered, even on Windows.  But not all programs did that.  This
  is now done centrally.

Soft goals:

- Reduces vertical space use and visual complexity of error reporting
  in the source code.

- Encourages more deliberate classification of messages.  For example,
  in some cases it wasn't clear without analyzing the surrounding code
  whether a message was meant as an error or just an info.

- Concepts and terms are vaguely aligned with popular logging
  frameworks such as log4j and Python logging.

This is all just about printing stuff out.  Nothing affects program
flow (e.g., fatal exits).  The uses are just too varied to do that.
Some existing code had wrappers that do some kind of print-and-exit,
and I adapted those.

I tried to keep the output mostly the same, but there is a lot of
historical baggage to unwind and special cases to consider, and I
might not always have succeeded.  One significant change is that
pg_rewind used to write all error messages to stdout.  That is now
changed to stderr.

Reviewed-by: Donald Dong <xdong@csumb.edu>
Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com
2019-04-01 20:01:35 +02:00
Michael Paquier beeb8e2e07 Fix compatibility of pg_basebackup -R with 11 and older versions
When 2dedf4d9 has integrated recovery.conf into postgresql.conf, it also
changed pg_basebackup -R in the way recovery configuration is
generated.  However this implementation forgot the fact that
pg_basebackup needs to keep compatibility with older server versions as
well.

Reported-by: Devrim Gündüz
Author: Sergei Kornilov, Michael Paquier
Discussion: https://postgr.es/m/3458f7cd12d74acd90180a671c8d5a081d60e162.camel@gunduz.org
2019-03-08 10:17:23 +09:00
Peter Eisentraut 37d9916020 More unconstify use
Replace casts whose only purpose is to cast away const with the
unconstify() macro.

Discussion: https://www.postgresql.org/message-id/flat/53a28052-f9f3-1808-fed9-460fd43035ab%402ndquadrant.com
2019-02-13 11:50:16 +01:00
Magnus Hagander 0301db623d Replace @postgresql.org with @lists.postgresql.org for mailinglists
Commit c0d0e54084 replaced the ones in the documentation, but missed out
on the ones in the code. Replace those as well, but unlike c0d0e54084,
don't backpatch the code changes to avoid breaking translations.
2019-01-19 19:06:35 +01:00
Peter Eisentraut a4205fa00d pg_basebackup: Use atexit()
Instead of using our custom disconnect_and_exit(), just register the
desired cleanup using atexit() and use the standard exit() to leave
the program.

Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/ec4135ba-84e9-28bf-b584-0e78d47448d5@2ndquadrant.com/
2019-01-07 16:21:47 +01:00
Bruce Momjian 97c39498e5 Update copyright for 2019
Backpatch-through: certain files through 9.4
2019-01-02 12:44:25 -05:00
Peter Eisentraut f4eabaf3e0 Fix ancient compiler warnings and typos in !HAVE_SYMLINK code
This has never been correct since this code was introduced.
2018-12-22 07:21:40 +01:00
Tom Lane a73d083195 Modernize our code for looking up descriptive strings for Unix signals.
At least as far back as the 2008 spec, POSIX has defined strsignal(3)
for looking up descriptive strings for signal numbers.  We hadn't gotten
the word though, and were still using the crufty old sys_siglist array,
which is in no standard even though most Unixen provide it.

Aside from not being formally standards-compliant, this was just plain
ugly because it involved #ifdef's at every place using the code.

To eliminate the #ifdef's, create a portability function pg_strsignal,
which wraps strsignal(3) if available and otherwise falls back to
sys_siglist[] if available.  The set of Unixen with neither API is
probably empty these days, but on any platform with neither, you'll
just get "unrecognized signal".  All extant callers print the numeric
signal number too, so no need to work harder than that.

Along the way, upgrade pg_basebackup's child-error-exit reporting
to match the rest of the system.

Discussion: https://postgr.es/m/25758.1544983503@sss.pgh.pa.us
2018-12-16 19:38:57 -05:00
Peter Eisentraut 2dedf4d9a8 Integrate recovery.conf into postgresql.conf
recovery.conf settings are now set in postgresql.conf (or other GUC
sources).  Currently, all the affected settings are PGC_POSTMASTER;
this could be refined in the future case by case.

Recovery is now initiated by a file recovery.signal.  Standby mode is
initiated by a file standby.signal.  The standby_mode setting is
gone.  If a recovery.conf file is found, an error is issued.

The trigger_file setting has been renamed to promote_trigger_file as
part of the move.

The documentation chapter "Recovery Configuration" has been integrated
into "Server Configuration".

pg_basebackup -R now appends settings to postgresql.auto.conf and
creates a standby.signal file.

Author: Fujii Masao <masao.fujii@gmail.com>
Author: Simon Riggs <simon@2ndquadrant.com>
Author: Abhijit Menon-Sen <ams@2ndquadrant.com>
Author: Sergei Kornilov <sk@zsrv.org>
Discussion: https://www.postgresql.org/message-id/flat/607741529606767@web3g.yandex.ru/
2018-11-25 16:33:40 +01:00
Magnus Hagander a9da329be0 Fix speling error
Reported by Alexander Lakhin in bug #15423
2018-10-08 08:57:24 +02:00
Michael Paquier fa7d5b704a Add verbosity to pg_basebackup for sync
This is useful to know when the data copy has been finished.  The
current situation can be confusing for users as the last message is
"waiting for background process to finish streaming", so it looks like
this is taking time but the final sync is instead.

Author: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1ypeoMJ=tFBG8vP13sxEtXd4Pm_x1SqsJdW_RvzpcvN=A@mail.gmail.com
2018-07-29 07:53:11 +09:00
Peter Eisentraut 3ce7f72529 pg_basebackup: Remove short option -k
-k meant --no-verify-checksums, which is the opposite of what initdb
uses -k for.  After discussion, a short option does not seem necessary,
so just keep the long option.

Discussion: https://www.postgresql.org/message-id/flat/d510f8aa-19e1-d06e-7630-ad27f7441d68%402ndquadrant.com
2018-05-21 10:01:49 -04:00
Peter Eisentraut 9effb63e0d Message wording and pluralization improvements 2018-05-17 23:05:27 -04:00
Stephen Frost c37b3d08ca Allow group access on PGDATA
Allow the cluster to be optionally init'd with read access for the
group.

This means a relatively non-privileged user can perform a backup of the
cluster without requiring write privileges, which enhances security.

The mode of PGDATA is used to determine whether group permissions are
enabled for directory and file creates.  This method was chosen as it's
simple and works well for the various utilities that write into PGDATA.

Changing the mode of PGDATA manually will not automatically change the
mode of all the files contained therein.  If the user would like to
enable group access on an existing cluster then changing the mode of all
the existing files will be required.  Note that pg_upgrade will
automatically change the mode of all migrated files if the new cluster
is init'd with the -g option.

Tests are included for the backend and all the utilities which operate
on the PG data directory to ensure that the correct mode is set based on
the data directory permissions.

Author: David Steele <david@pgmasters.net>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
2018-04-07 17:45:39 -04:00
Stephen Frost da9b580d89 Refactor dir/file permissions
Consolidate directory and file create permissions for tools which work
with the PG data directory by adding a new module (common/file_perm.c)
that contains variables (pg_file_create_mode, pg_dir_create_mode) and
constants to initialize them (0600 for files and 0700 for directories).

Convert mkdir() calls in the backend to MakePGDirectory() if the
original call used default permissions (always the case for regular PG
directories).

Add tests to make sure permissions in PGDATA are set correctly by the
tools which modify the PG data directory.

Authors: David Steele <david@pgmasters.net>,
         Adam Brightwell <adam.brightwell@crunchydata.com>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
2018-04-07 17:45:39 -04:00
Magnus Hagander 4eb77d50c2 Validate page level checksums in base backups
When base backups are run over the replication protocol (for example
using pg_basebackup), verify the checksums of all data blocks if
checksums are enabled. If checksum failures are encountered, log them
as warnings but don't abort the backup.

This becomes the default behaviour in pg_basebackup (provided checksums
are enabled on the server), so add a switch (-k) to disable the checks
if necessary.

Author: Michael Banck
Reviewed-By: Magnus Hagander, David Steele
Discussion: https://postgr.es/m/20180228180856.GE13784@nighthawk.caipicrew.dd-dns.de
2018-04-03 13:47:16 +02:00
Bruce Momjian 9d4649ca49 Update copyright for 2018
Backpatch-through: certain files through 9.3
2018-01-02 23:30:12 -05:00
Peter Eisentraut 143b54d21d pg_basebackup: Fix progress messages when writing to a file
The progress messages print out \r to keep overwriting the same line on
the screen.  But this does not yield useful results when writing the
output to a file.  So in that case, print out \n instead.

Author: Martín Marqués <martin@2ndquadrant.com>
Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru>
2017-12-01 09:21:34 -05:00
Tom Lane 0772c152b9 Mark some more functions as pg_attribute_noreturn().
Doing this suppresses Coverity warnings and might allow improved
code in some cases.  The prospects of that are not so bright as
to warrant back-patching, though.

Michael Paquier, per Coverity
2017-11-27 20:56:46 -05:00
Peter Eisentraut 067a2259fd pg_basebackup: Fix comparison handling of tablespace mappings on Windows
A candidate path needs to be canonicalized before being checked against
the mappings, because the mappings are also canonicalized.  This is
especially relevant on Windows

Reported-by: nb <nbedxp@gmail.com>
Author: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
2017-11-01 10:20:05 -04:00
Peter Eisentraut 3709ca1cf0 pg_basebackup: Add option to create replication slot
When requesting a particular replication slot, the new pg_basebackup
option -C/--create-slot creates it before starting to replicate from it.

Further refactor the slot creation logic to include the temporary slot
creation logic into the same function.  Add new arguments is_temporary
and preserve_wal to CreateReplicationSlot().  Print in --verbose mode
that a slot has been created.

Author: Michael Banck <michael.banck@credativ.de>
2017-09-27 08:49:47 -04:00
Peter Eisentraut 15a8010ed6 Sort pg_basebackup options better
The --slot option somehow ended up under options controlling the output,
and some other options were in a nonsensical place or were not moved
after recent renamings, so tidy all that up a bit.
2017-09-26 11:58:22 -04:00
Andres Freund fc49e24fa6 Make WAL segment size configurable at initdb time.
For performance reasons a larger segment size than the default 16MB
can be useful. A larger segment size has two main benefits: Firstly,
in setups using archiving, it makes it easier to write scripts that
can keep up with higher amounts of WAL, secondly, the WAL has to be
written and synced to disk less frequently.

But at the same time large segment size are disadvantageous for
smaller databases. So far the segment size had to be configured at
compile time, often making it unrealistic to choose one fitting to a
particularly load. Therefore change it to a initdb time setting.

This includes a breaking changes to the xlogreader.h API, which now
requires the current segment size to be configured.  For that and
similar reasons a number of binaries had to be taught how to recognize
the current segment size.

Author: Beena Emerson, editorialized by Andres Freund
Reviewed-By: Andres Freund, David Steele, Kuntal Ghosh, Michael
    Paquier, Peter Eisentraut, Robert Hass, Tushar Ahuja
Discussion: https://postgr.es/m/CAOG9ApEAcQ--1ieKbhFzXSQPw_YLmepaa4hNdnY5+ZULpt81Mw@mail.gmail.com
2017-09-19 22:03:48 -07:00
Peter Eisentraut 8e67380126 Remove useless empty string initializations
This coding style probably stems from the days of shell scripts.

Reviewed-by: Aleksandr Parfenov <a.parfenov@postgrespro.ru>
2017-09-08 12:37:05 -04:00
Heikki Linnakangas 8046465c2e Fix pg_basebackup output to stdout on Windows.
When writing a backup to stdout with pg_basebackup on Windows, put stdout
to binary mode. Any CR bytes in the output will otherwise be output
incorrectly as CR+LF.

In the passing, standardize on using "_setmode" instead of "setmode", for
the sake of consistency. They both do the same thing, but according to
MSDN documentation, setmode is deprecated.

Fixes bug #14634, reported by Henry Boehlert. Patch by Haribabu Kommi.
Backpatch to all supported versions.

Discussion: https://www.postgresql.org/message-id/20170428082818.24366.13134@wrigleys.postgresql.org
2017-07-14 16:02:53 +03:00
Tom Lane 382ceffdf7 Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.

By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis.  However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent.  That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.

This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:35:54 -04:00
Tom Lane c7b8998ebb Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.

Commit e3860ffa4d wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code.  The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there.  BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs.  So the
net result is that in about half the cases, such comments are placed
one tab stop left of before.  This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.

Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:19:25 -04:00
Peter Eisentraut 4e88fe8f8f Add missing serial comma 2017-06-14 14:43:54 -04:00
Magnus Hagander 2712da8b64 Generate pg_basebackup temporary slot name using backend pid
Using the client pid can easily be non-unique when used on different
hosts. Using the backend pid should be guaranteed unique, since the
temporary slot gets removed when the client disconnects so it will be
gone even if the pid is renewed.

Reported by Ludovic Vaugeois-Pepin
2017-05-31 21:00:37 +02:00
Bruce Momjian a6fd7b7a5f Post-PG 10 beta1 pgindent run
perltidy run not included.
2017-05-17 16:31:56 -04:00
Tom Lane 05b5feb60e Revert changes to pg_basebackup and pg_waldump usage() code.
Partially revert commit c079673dcb.
There were complaints that splitting switch descriptions would
complicate translation efforts.  There are probably ways to resolve
the formatting problem without doing that, but undo it while we're
discussing.
2017-05-17 13:04:03 -04:00
Tom Lane c079673dcb Preventive maintenance in advance of pgindent run.
Reformat various places in which pgindent will make a mess, and
fix a few small violations of coding style that I happened to notice
while perusing the diffs from a pgindent dry run.

There is one actual bug fix here: the need-to-enlarge-the-buffer code
path in icu_convert_case was obviously broken.  Perhaps it's unreachable
in our usage?  Or maybe this is just sadly undertested.
2017-05-16 20:36:35 -04:00
Magnus Hagander b1c45afb01 Fix typo in comment
Michael Paquier
2017-05-15 11:08:02 +02:00
Peter Eisentraut d496a65790 Standardize "WAL location" terminology
Other previously used terms were "WAL position" or "log position".
2017-05-12 13:51:27 -04:00
Peter Eisentraut c1a7f64b4a Replace "transaction log" with "write-ahead log"
This makes documentation and error messages match the renaming of "xlog"
to "wal" in APIs and file naming.
2017-05-12 11:52:43 -04:00
Tom Lane 7834d20b57 Avoid slow shutdown of pg_basebackup.
pg_basebackup's child process did not pay any attention to the pipe
from its parent while waiting for input from the source server.
If no server data was arriving, it would only wake up and check the
pipe every standby_message_timeout or so.  This creates a problem
since the parent process might determine and send the desired stop
position only after the server has reached end-of-WAL and stopped
sending data.  In the src/test/recovery regression tests, the timing
is repeatably such that it takes nearly 10 seconds for the child
process to realize that it should shut down.  It's not clear how
often that would happen in real-world cases, but it sure seems like
a bug --- and if the user turns off standby_message_timeout or sets
it very large, the delay could be a lot worse.

To fix, expand the StreamCtl API to allow the pipe input FD to be
passed down to the low-level wait routine, and watch both sockets
when sleeping.

(Note: AFAICS this issue doesn't affect the Windows port, since
it doesn't rely on a pipe to transfer the stop position to the
child thread.)

Discussion: https://postgr.es/m/6456.1493263884@sss.pgh.pa.us
2017-04-27 18:27:02 -04:00
Magnus Hagander 7220c7b3e5 Write "waiting for checkpoint" on regular progress row
When reporting progress, make the "waiting for checkpoint" test be
overwritten by the file-based progress once it's completed. This is more
consistent with how we report the rest of the progress.

Suggested by Jeff Janes
2017-04-01 17:04:14 +02:00
Peter Eisentraut 788af6f854 Move atooid() definition to a central place 2017-03-01 11:55:28 -05:00
Magnus Hagander 1513dbea7f Add missing progname prefix to some messages
Author: Michael Banck
2017-02-26 21:32:00 +01:00
Magnus Hagander 51e26c9c3d Clarify the role of checkpoint at the begininng of base backups
Output a message about checkpoint starting in verbose mode of
pg_basebackup, and make the documentation state more clearly that this
happens.

Author: Michael Banck
2017-02-26 21:31:54 +01:00
Tom Lane 9e3755ecb2 Remove useless duplicate inclusions of system header files.
c.h #includes a number of core libc header files, such as <stdio.h>.
There's no point in re-including these after having read postgres.h,
postgres_fe.h, or c.h; so remove code that did so.

While at it, also fix some places that were ignoring our standard pattern
of "include postgres[_fe].h, then system header files, then other Postgres
header files".  While there's not any great magic in doing it that way
rather than system headers last, it's silly to have just a few files
deviating from the general pattern.  (But I didn't attempt to enforce this
globally, only in files I was touching anyway.)

I'd be the first to say that this is mostly compulsive neatnik-ism,
but over time it might save enough compile cycles to be useful.
2017-02-25 16:12:55 -05:00
Magnus Hagander 1a16af8b35 Fix help message for pg_basebackup -R
The recovery.conf file that's generated is specifically for replication,
and not needed (or wanted) for regular backup restore, so indicate that
in the message.
2017-02-18 13:45:52 +01:00
Fujii Masao 0dfa89ba29 Replace reference to "xlog-method" with "wal-method" in error message.
Commit 62e8b38 renamed "--xlog-method" option for pg_basebackup to
"--wal-method", but forgot to update the error message mentioning that option.
2017-02-15 01:26:44 +09:00
Robert Haas 62e8b38751 Rename command line options for ongoing xlog -> wal conversion.
initdb and pg_basebackup now have a --waldir option rather --xlogdir,
and pg_basebackup now has --wal-method rather than --xlog-method.
2017-02-09 16:42:51 -05:00
Magnus Hagander cada1af31d Add compression support to pg_receivexlog
Author: Michael Paquier, review and small changes by me
2017-01-17 12:10:26 +01:00
Magnus Hagander fcf708623e Fix incorrect comparison due to bad merge
Noted by Fujii Masao
2017-01-16 18:20:57 +01:00
Magnus Hagander e7b020f786 Make pg_basebackup use temporary replication slots
Temporary replication slots will be used by default when wal streaming
is used and no slot name is specified with -S. If a slot name is
specified, then a permanent slot with that name is used. If --no-slot is
specified, then no permanent or temporary slot will be used.

Temporary slots are only used on 10.0 and newer, of course.
2017-01-16 13:56:43 +01:00
Magnus Hagander 534b6f3ef2 Use an enum instead of two bools to indicate wal inclusion in base backups
This makes the code easier to read as it becomes more explicit what the
different allowed combinations really are.

Suggested by Michael Paquier
2017-01-09 16:03:47 +01:00
Magnus Hagander 9a4d51077c Make wal streaming the default mode for pg_basebackup
Since streaming is now supported for all output formats, make this the
default as this is what most people want.

To get the old behavior, the parameter -X none can be specified to turn
it off.

This also removes the parameter -x for fetch, now requiring -X fetch to
be specified to use that.

Reviewed by Vladimir Rusinov, Michael Paquier and Simon Riggs
2017-01-04 10:40:38 +01:00
Bruce Momjian 1d25779284 Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
Fujii Masao ecbdc4c555 Forbid invalid combination of options in pg_basebackup.
Commit 56c7d8d455 allowed pg_basebackup
to stream WAL in tar mode. But there is the restriction that WAL
streaming in tar mode works only when the value - (dash) is not
specified as output directory. This means that the combination of
three options "-D -", "-F t" and "-X stream" is invalid. However,
previously, even when those options were specified at the same time,
pg_basebackup background process unexpectedly started streaming WAL.
And then it exited with an error.

This commit changes pg_basebackup so that it errors out on such
invalid combination of options at the beginning.

Reviewed by Magnus Hagander, and patch by me.
2016-12-21 20:27:37 +09:00