The patch needs test cases, reorganization, and cfbot testing.
Technically reverts commits 5c31afc49d..e35b2bad1a (exclusive/inclusive)
and 08db7c63f3..ccbe34139b.
Reported-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/E1ktAAG-0002V2-VB@gemulon.postgresql.org
A manifest is a JSON document which includes (1) the file name, size,
last modification time, and an optional checksum for each file backed
up, (2) timelines and LSNs for whatever WAL will need to be replayed
to make the backup consistent, and (3) a checksum for the manifest
itself. By default, we use CRC-32C when checksumming data files,
because we are trying to detect corruption and user error, not foil an
adversary. However, pg_basebackup and the server-side BASE_BACKUP
command now have options to select a different algorithm, so users
wanting a cryptographic hash function can select SHA-224, SHA-256,
SHA-384, or SHA-512. Users not wanting file checksums at all can
disable them, or disable generating of the backup manifest altogether.
Using a cryptographic hash function in place of CRC-32C consumes
significantly more CPU cycles, which may slow down backups in some
cases.
A new tool called pg_validatebackup can validate a backup against the
manifest. If no checksums are present, it can still check that the
right files exist and that they have the expected sizes. If checksums
are present, it can also verify that each file has the expected
checksum. Additionally, it calls pg_waldump to verify that the
expected WAL files are present and parseable. Only plain format
backups can be validated directly, but tar format backups can be
validated after extracting them.
Robert Haas, with help, ideas, review, and testing from David Steele,
Stephen Frost, Andrew Dunstan, Rushabh Lathia, Suraj Kharage, Tushar
Ahuja, Rajkumar Raghuwanshi, Mark Dilger, Davinder Singh, Jeevan
Chalke, Amit Kapila, Andres Freund, and Noah Misch.
Discussion: http://postgr.es/m/CA+TgmoZV8dw1H2bzZ9xkKwdrk8+XYa+DC9H=F7heO2zna5T6qg@mail.gmail.com
The current tool name is too restrictive and focuses only on verifying
checksums. As more options to control checksums for an offline cluster
are planned to be added, switch to a more generic name. Documentation
as well as all past references to the tool are updated.
Author: Michael Paquier
Reviewed-by: Michael Banck, Fabien Coelho, Seigei Kornilov
Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
This makes it possible to turn checksums on in a live cluster, without
the previous need for dump/reload or logical replication (and to turn it
off).
Enabling checkusm starts a background process in the form of a
launcher/worker combination that goes through the entire database and
recalculates checksums on each and every page. Only when all pages have
been checksummed are they fully enabled in the cluster. Any failure of
the process will revert to checksums off and the process has to be
started.
This adds a new WAL record that indicates the state of checksums, so
the process works across replicated clusters.
Authors: Magnus Hagander and Daniel Gustafsson
Review: Tomas Vondra, Michael Banck, Heikki Linnakangas, Andrey Borodin
Earlier versions of this tool were available (and still are) on github.
Thanks to Michael Paquier, Alvaro Herrera, Peter Eisentraut, Amit Kapila,
and Satoshi Nagayasu for review.
Certain subdirectories do not get built if corresponding options are not
selected at configure time. However, "make distprep" should visit such
directories anyway, so that constructing derived files to be included in
the tarball happens without requiring all configure options to be given
in the tarball build script. Likewise, it's better if cleanup actions
unconditionally visit all directories (for example, this ensures proper
cleanup if someone has done a manual make in such a subdirectory).
To handle this, set up a convention that subdirectories that are
conditionally included in SUBDIRS should be added to ALWAYS_SUBDIRS
instead when they are excluded.
Back-patch to 9.1, so that plpython's spiexceptions.h will get provided
in 9.1 tarballs. There don't appear to be any instances where distprep
actions got missed in previous releases, and anyway this fix requires
gmake 3.80 so we don't want to apply it before 9.1.
This tool makes it possible to do the pg_start_backup/
copy files/pg_stop_backup step in a single command.
There are still some steps to be done before this is a
complete backup solution, such as the ability to stream
the required WAL logs, but it's still usable, and
could do with some buildfarm coverage.
In passing, make the checkpoint request optionally
fast instead of hardcoding it.
Magnus Hagander, reviewed by Fujii Masao and Dimitri Fontaine
Replace for loops in makefiles with proper dependencies. Parallel
make can now span across directories. Also, make -k and make -q work
properly.
GNU make 3.80 or newer is now required.
Test coverage support now covers the entire source tree, including
contrib, instead of just src/backend. In a related but independent
development, the commands make coverage and make coverage-html can be run
in any directory.
This turned out to be much easier than feared. Besides a few ad hoc fixes
to pass the make target down the tree, change all affected makefiles to
list their directories in the SUBDIRS variable, changed from variants like
DIRS and WANTED_DIRS. MSVC build fix was attempted as well.
errors in any commands, including in various clean targets that have so far
been handled inconsistently. make -i is available to ignore all errors in
a consistent and official way.