Clarify documentation on PITR and warm standby on the fact that the standby
restore_command should report failure on non-existent .backup and .history
files. Tidy up some related text along the way.

Patch by Markus Bertheau, with some editing by Simon Riggs and myself.
Heikki Linnakangas 2008-03-28 15:00:28 +00:00
parent da215f05ec
commit 958db06181
1 changed files with 27 additions and 23 deletions


@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.115 2008/03/07 01:46:41 momjian Exp $ --> <!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.116 2008/03/28 15:00:28 heikki Exp $ -->
<chapter id="backup"> <chapter id="backup">
<title>Backup and Restore</title> <title>Backup and Restore</title>
@ -577,11 +577,10 @@ cp -i pg_xlog/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900
<para> <para>
It is important that the archive command return zero exit status if and It is important that the archive command return zero exit status if and
only if it succeeded. Upon getting a zero result, only if it succeeded. Upon getting a zero result,
<productname>PostgreSQL</> will assume that the WAL segment file has been <productname>PostgreSQL</> will assume that the file has been
successfully archived, and will remove or recycle it. successfully archived, and will remove or recycle it. However, a nonzero
However, a nonzero status tells status tells <productname>PostgreSQL</> that the file was not archived;
<productname>PostgreSQL</> that the file was not archived; it will try it will try again periodically until it succeeds.
again periodically until it succeeds.
</para> </para>
<para> <para>
@ -1001,11 +1000,13 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
<para> <para>
It is important that the command return nonzero exit status on failure. It is important that the command return nonzero exit status on failure.
The command <emphasis>will</> be asked for log files that are not present The command <emphasis>will</> be asked for files that are not present
in the archive; it must return nonzero when so asked. This is not an in the archive; it must return nonzero when so asked. This is not an
error condition. Be aware also that the base name of the <literal>%p</> error condition. Not all of the requested files will be WAL segment
path will be different from <literal>%f</>; do not expect them to be files; you should also expect requests for files with a suffix of
interchangeable. <literal>.backup</> or <literal>.history</>. Also be aware that
the base name of the <literal>%p</> path will be different from
<literal>%f</>; do not expect them to be interchangeable.
</para> </para>
<para> <para>
@ -1576,19 +1577,21 @@ archive_command = 'local_backup_script.sh'
<para> <para>
The magic that makes the two loosely coupled servers work together is The magic that makes the two loosely coupled servers work together is
simply a <varname>restore_command</> used on the standby that waits simply a <varname>restore_command</> used on the standby that,
for the next WAL file to become available from the primary. The when asked for the next WAL file, waits for it to become available from
<varname>restore_command</> is specified in the the primary. The <varname>restore_command</> is specified in the
<filename>recovery.conf</> file on the standby server. Normal recovery <filename>recovery.conf</> file on the standby server. Normal recovery
processing would request a file from the WAL archive, reporting failure processing would request a file from the WAL archive, reporting failure
if the file was unavailable. For standby processing it is normal for if the file was unavailable. For standby processing it is normal for
the next file to be unavailable, so we must be patient and wait for the next WAL file to be unavailable, so we must be patient and wait for
it to appear. A waiting <varname>restore_command</> can be written as it to appear. For files ending in <literal>.backup</> or
a custom script that loops after polling for the existence of the next <literal>.history</> there is no need to wait, and a non-zero return
WAL file. There must also be some way to trigger failover, which should code must be returned. A waiting <varname>restore_command</> can be
interrupt the <varname>restore_command</>, break the loop and return written as a custom script that loops after polling for the existence of
a file-not-found error to the standby server. This ends recovery and the next WAL file. There must also be some way to trigger failover, which
the standby will then come up as a normal server. should interrupt the <varname>restore_command</>, break the loop and
return a file-not-found error to the standby server. This ends recovery
and the standby will then come up as a normal server.
</para> </para>
<para> <para>
@ -1608,9 +1611,10 @@ if (!triggered)
<para> <para>
A working example of a waiting <varname>restore_command</> is provided A working example of a waiting <varname>restore_command</> is provided
as a <filename>contrib</> module named <application>pg_standby</>. This as a <filename>contrib</> module named <application>pg_standby</>. It
example can be extended as needed to support specific configurations or should be used as a reference on how to correctly implement the logic
environments. described above. It can also be extended as needed to support specific
configurations or environments.
</para> </para>
<para> <para>
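
The behaviour the revised text asks of a standby restore_command can be sketched as follows. This is a minimal illustration, not the pg_standby implementation: the function name restore_file, its parameters, and the use of a trigger file to signal failover are all assumptions made for the example. It also does not guard against copying a file that the primary is still in the middle of writing, which a production script would need to handle.

```python
import os
import shutil
import sys
import time


def restore_file(archive_dir, filename, dest_path, trigger_path, poll_seconds=1.0):
    """Hypothetical waiting restore_command. Returns 0 on success, 1 otherwise."""
    source = os.path.join(archive_dir, filename)
    # Only WAL segments are worth waiting for. Requests for .backup and
    # .history files must fail immediately when the file is absent.
    wait_for_file = not filename.endswith((".backup", ".history"))
    while True:
        if os.path.isfile(source):
            shutil.copyfile(source, dest_path)
            return 0
        if not wait_for_file:
            return 1
        if os.path.exists(trigger_path):
            # Failover requested (trigger file is an assumed mechanism):
            # report file-not-found so that recovery ends.
            return 1
        time.sleep(poll_seconds)


# Assumed invocation: script.py ARCHIVE_DIR %f %p TRIGGER_FILE
if __name__ == "__main__" and len(sys.argv) == 5:
    sys.exit(restore_file(sys.argv[1], sys.argv[2], sys.argv[3], sys.argv[4]))
```

Saved as a script (the path is hypothetical), it could be referenced from recovery.conf in the same way as the cp example in the diff, passing the archive directory, %f, %p and a trigger file path as arguments.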