Report progress of streaming base backup.

This commit adds the pg_stat_progress_basebackup view, which reports
progress while an application like pg_basebackup is taking
a base backup. It uses the progress reporting infrastructure
added by c16dc1aca5, extending it to support streaming base backups.

Bump catversion.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Amit Langote, Sergei Kornilov
Discussion: https://postgr.es/m/9ed8b801-8215-1f3d-62d7-65bff53f6e94@oss.nttdata.com
Fujii Masao 2020-03-03 12:03:43 +09:00
parent d79fb88ac7
commit e65497df8f
11 changed files with 339 additions and 6 deletions


@ -376,6 +376,14 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
<row>
<entry><structname>pg_stat_progress_basebackup</structname><indexterm><primary>pg_stat_progress_basebackup</primary></indexterm></entry>
<entry>One row for each WAL sender process streaming a base backup,
showing current progress.
See <xref linkend='basebackup-progress-reporting'/>.
</entry>
</row>
</tbody>
</tgroup>
</table>
@ -3535,7 +3543,10 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
certain commands during command execution. Currently, the only commands
which support progress reporting are <command>ANALYZE</command>,
<command>CLUSTER</command>,
<command>CREATE INDEX</command>, and <command>VACUUM</command>.
<command>CREATE INDEX</command>, <command>VACUUM</command>,
and <xref linkend="protocol-replication-base-backup"/> (i.e., the replication
command that <xref linkend="app-pgbasebackup"/> issues to take
a base backup).
This may be expanded in the future.
</para>
@ -4336,6 +4347,156 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
</tbody>
</tgroup>
</table>
</sect2>
<sect2 id="basebackup-progress-reporting">
<title>Base Backup Progress Reporting</title>
<para>
Whenever an application like <application>pg_basebackup</application>
is taking a base backup, the
<structname>pg_stat_progress_basebackup</structname>
view will contain a row for each WAL sender process that is currently
running the <command>BASE_BACKUP</command> replication command
and streaming the backup. The tables below describe the information
that will be reported and provide information about how to interpret it.
</para>
<table id="pg-stat-progress-basebackup-view" xreflabel="pg_stat_progress_basebackup">
<title><structname>pg_stat_progress_basebackup</structname> View</title>
<tgroup cols="3">
<thead>
<row>
<entry>Column</entry>
<entry>Type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><structfield>pid</structfield></entry>
<entry><type>integer</type></entry>
<entry>Process ID of a WAL sender process.</entry>
</row>
<row>
<entry><structfield>phase</structfield></entry>
<entry><type>text</type></entry>
<entry>Current processing phase. See <xref linkend="basebackup-phases" />.</entry>
</row>
<row>
<entry><structfield>backup_total</structfield></entry>
<entry><type>bigint</type></entry>
<entry>
Total amount of data that will be streamed. If progress reporting
is not enabled in <application>pg_basebackup</application>
(i.e., the <literal>--progress</literal> option is not specified),
this is <literal>0</literal>. Otherwise, this is estimated and
reported as of the beginning of the
<literal>streaming database files</literal> phase. Note that
this is only an approximation since the database
may change during the <literal>streaming database files</literal> phase
and WAL files may be included in the backup later. This is always
the same value as <structfield>backup_streamed</structfield>
once the amount of data streamed exceeds the estimated
total size.
</entry>
</row>
<row>
<entry><structfield>backup_streamed</structfield></entry>
<entry><type>bigint</type></entry>
<entry>
Amount of data streamed. This counter only advances
when the phase is <literal>streaming database files</literal> or
<literal>transferring wal files</literal>.
</entry>
</row>
<row>
<entry><structfield>tablespaces_total</structfield></entry>
<entry><type>bigint</type></entry>
<entry>
Total number of tablespaces that will be streamed.
</entry>
</row>
<row>
<entry><structfield>tablespaces_streamed</structfield></entry>
<entry><type>bigint</type></entry>
<entry>
Number of tablespaces streamed. This counter only
advances when the phase is <literal>streaming database files</literal>.
</entry>
</row>
</tbody>
</tgroup>
</table>
<table id="basebackup-phases">
<title>Base backup phases</title>
<tgroup cols="2">
<thead>
<row>
<entry>Phase</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>initializing</literal></entry>
<entry>
The WAL sender process is preparing to begin the backup.
This phase is expected to be very brief.
</entry>
</row>
<row>
<entry><literal>waiting for checkpoint to finish</literal></entry>
<entry>
The WAL sender process is currently performing
<function>pg_start_backup</function> to set up for
taking a base backup, and is waiting for the backup start
checkpoint to finish.
</entry>
</row>
<row>
<entry><literal>estimating backup size</literal></entry>
<entry>
The WAL sender process is currently estimating the total amount
of database files that will be streamed as a base backup.
</entry>
</row>
<row>
<entry><literal>streaming database files</literal></entry>
<entry>
The WAL sender process is currently streaming database files
as a base backup.
</entry>
</row>
<row>
<entry><literal>waiting for wal archiving to finish</literal></entry>
<entry>
The WAL sender process is currently performing
<function>pg_stop_backup</function> to finish the backup,
and waiting for all the WAL files required for the base backup
to be successfully archived.
If either <literal>--wal-method=none</literal> or
<literal>--wal-method=stream</literal> is specified in
<application>pg_basebackup</application>, the backup will end
when this phase is completed.
</entry>
</row>
<row>
<entry><literal>transferring wal files</literal></entry>
<entry>
The WAL sender process is currently transferring all WAL files
generated during the backup. This phase occurs after the
<literal>waiting for wal archiving to finish</literal> phase if
<literal>--wal-method=fetch</literal> is specified in
<application>pg_basebackup</application>. The backup will end
when this phase is completed.
</entry>
</row>
</tbody>
</tgroup>
</table>
</sect2>
</sect1>
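
The column descriptions above imply that a client must treat backup_total = 0 as "no estimate available" rather than as an empty backup. The following is a minimal sketch of how a monitoring tool might turn the two size columns into a percentage. The helper name backup_percent is hypothetical and is not part of PostgreSQL or of this commit.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of PostgreSQL): compute percent complete
 * from the backup_total and backup_streamed columns of
 * pg_stat_progress_basebackup.
 *
 * Returns -1.0 when backup_total is 0, which the view reports when the
 * backup was started without --progress and no estimate exists.
 */
static double
backup_percent(int64_t backup_total, int64_t backup_streamed)
{
    if (backup_total <= 0)
        return -1.0;    /* no estimate available */

    return 100.0 * (double) backup_streamed / (double) backup_total;
}
```

Because the server raises backup_total to match backup_streamed once the estimate is exceeded, a value computed this way should not go above 100.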


@ -2465,7 +2465,7 @@ The commands accepted in replication mode are:
</listitem>
</varlistentry>
<varlistentry>
<varlistentry id="protocol-replication-base-backup" xreflabel="BASE_BACKUP">
<term><literal>BASE_BACKUP</literal> [ <literal>LABEL</literal> <replaceable>'label'</replaceable> ] [ <literal>PROGRESS</literal> ] [ <literal>FAST</literal> ] [ <literal>WAL</literal> ] [ <literal>NOWAIT</literal> ] [ <literal>MAX_RATE</literal> <replaceable>rate</replaceable> ] [ <literal>TABLESPACE_MAP</literal> ] [ <literal>NOVERIFY_CHECKSUMS</literal> ]
<indexterm><primary>BASE_BACKUP</primary></indexterm>
</term>


@ -104,6 +104,13 @@ PostgreSQL documentation
</listitem>
</itemizedlist>
</para>
<para>
Whenever <application>pg_basebackup</application> is taking a base
backup, the <structname>pg_stat_progress_basebackup</structname>
view will report the progress of the backup.
See <xref linkend="basebackup-progress-reporting"/> for details.
</para>
</refsect1>
<refsect1>
@ -459,6 +466,15 @@ PostgreSQL documentation
This may make the backup take slightly longer, and in particular it
will take longer before the first data is sent.
</para>
<para>
Whether this is enabled or not, the
<structname>pg_stat_progress_basebackup</structname> view
reports the progress of the backup on the server side. But note
that the total amount of data that will be streamed is estimated
and reported only when this option is enabled. In other words,
the <literal>backup_total</literal> column in the view always
indicates <literal>0</literal> if this option is disabled.
</para>
</listitem>
</varlistentry>


@ -39,6 +39,7 @@
#include "catalog/catversion.h"
#include "catalog/pg_control.h"
#include "catalog/pg_database.h"
#include "commands/progress.h"
#include "commands/tablespace.h"
#include "common/controldata_utils.h"
#include "miscadmin.h"
@ -10228,6 +10229,10 @@ issue_xlog_fsync(int fd, XLogSegNo segno)
* active at the same time, and they don't conflict with an exclusive backup
* either.
*
* tablespaces is required only when this function is called while
* the streaming base backup requested by pg_basebackup is running.
* NULL should be specified otherwise.
*
* tblspcmapfile is required mainly for tar format in windows as native windows
* utilities are not able to create symlinks while extracting files from tar.
* However for consistency, the same is used for all platforms.
@ -10470,6 +10475,14 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
datadirpathlen = strlen(DataDir);
/*
* Report that we are now estimating the total backup size
* if we're streaming base backup as requested by pg_basebackup
*/
if (tablespaces)
pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE);
/* Collect information about all tablespaces */
tblspcdir = AllocateDir("pg_tblspc");
while ((de = ReadDir(tblspcdir, "pg_tblspc")) != NULL)


@ -1060,6 +1060,22 @@ CREATE VIEW pg_stat_progress_create_index AS
FROM pg_stat_get_progress_info('CREATE INDEX') AS S
LEFT JOIN pg_database D ON S.datid = D.oid;
CREATE VIEW pg_stat_progress_basebackup AS
SELECT
S.pid AS pid,
CASE S.param1 WHEN 0 THEN 'initializing'
WHEN 1 THEN 'waiting for checkpoint to finish'
WHEN 2 THEN 'estimating backup size'
WHEN 3 THEN 'streaming database files'
WHEN 4 THEN 'waiting for wal archiving to finish'
WHEN 5 THEN 'transferring wal files'
END AS phase,
S.param2 AS backup_total,
S.param3 AS backup_streamed,
S.param4 AS tablespaces_total,
S.param5 AS tablespaces_streamed
FROM pg_stat_get_progress_info('BASEBACKUP') AS S;
CREATE VIEW pg_user_mappings AS
SELECT
U.oid AS umid,


@ -19,6 +19,7 @@
#include "access/xlog_internal.h" /* for pg_start/stop_backup */
#include "catalog/pg_type.h"
#include "common/file_perm.h"
#include "commands/progress.h"
#include "lib/stringinfo.h"
#include "libpq/libpq.h"
#include "libpq/pqformat.h"
@ -70,6 +71,7 @@ static void parse_basebackup_options(List *options, basebackup_options *opt);
static void SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli);
static int compareWalFileNames(const ListCell *a, const ListCell *b);
static void throttle(size_t increment);
static void update_basebackup_progress(int64 delta);
static bool is_checksummed_file(const char *fullpath, const char *filename);
/* Was the backup currently in-progress initiated in recovery mode? */
@ -121,6 +123,12 @@ static long long int total_checksum_failures;
/* Do not verify checksums. */
static bool noverify_checksums = false;
/* Total amount of backup data that will be streamed */
static int64 backup_total = 0;
/* Amount of backup data already streamed */
static int64 backup_streamed = 0;
/*
* Definition of one element part of an exclusion list, used for paths part
* of checksum validation or base backups. "name" is the name of the file
@ -246,6 +254,10 @@ perform_base_backup(basebackup_options *opt)
int datadirpathlen;
List *tablespaces = NIL;
backup_total = 0;
backup_streamed = 0;
pgstat_progress_start_command(PROGRESS_COMMAND_BASEBACKUP, InvalidOid);
datadirpathlen = strlen(DataDir);
backup_started_in_recovery = RecoveryInProgress();
@ -255,6 +267,8 @@ perform_base_backup(basebackup_options *opt)
total_checksum_failures = 0;
pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT);
startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
labelfile, &tablespaces,
tblspc_map_file,
@ -271,8 +285,7 @@ perform_base_backup(basebackup_options *opt)
{
ListCell *lc;
tablespaceinfo *ti;
SendXlogRecPtrResult(startptr, starttli);
int tblspc_streamed = 0;
/*
* Calculate the relative path of temporary statistics directory in
@ -291,6 +304,38 @@ perform_base_backup(basebackup_options *opt)
ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, true) : -1;
tablespaces = lappend(tablespaces, ti);
/*
* Calculate the total backup size by summing up the size of each
* tablespace
*/
if (opt->progress)
{
foreach(lc, tablespaces)
{
tablespaceinfo *tmp = (tablespaceinfo *) lfirst(lc);
backup_total += tmp->size;
}
}
/* Report that we are now streaming database files as a base backup */
{
const int index[] = {
PROGRESS_BASEBACKUP_PHASE,
PROGRESS_BASEBACKUP_BACKUP_TOTAL,
PROGRESS_BASEBACKUP_TBLSPC_TOTAL
};
const int64 val[] = {
PROGRESS_BASEBACKUP_PHASE_STREAM_BACKUP,
backup_total, list_length(tablespaces)
};
pgstat_progress_update_multi_param(3, index, val);
}
/* Send the starting position of the backup */
SendXlogRecPtrResult(startptr, starttli);
/* Send tablespace header */
SendBackupHeader(tablespaces);
@ -372,8 +417,14 @@ perform_base_backup(basebackup_options *opt)
}
else
pq_putemptymessage('c'); /* CopyDone */
tblspc_streamed++;
pgstat_progress_update_param(PROGRESS_BASEBACKUP_TBLSPC_STREAMED,
tblspc_streamed);
}
pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
PROGRESS_BASEBACKUP_PHASE_WAIT_WAL_ARCHIVE);
endptr = do_pg_stop_backup(labelfile->data, !opt->nowait, &endtli);
}
PG_END_ENSURE_ERROR_CLEANUP(do_pg_abort_backup, BoolGetDatum(false));
@ -399,6 +450,9 @@ perform_base_backup(basebackup_options *opt)
ListCell *lc;
TimeLineID tli;
pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
PROGRESS_BASEBACKUP_PHASE_TRANSFER_WAL);
/*
* I'd rather not worry about timelines here, so scan pg_wal and
* include all WAL files in the range between 'startptr' and 'endptr',
@ -548,6 +602,7 @@ perform_base_backup(basebackup_options *opt)
if (pq_putmessage('d', buf, cnt))
ereport(ERROR,
(errmsg("base backup could not send data, aborting backup")));
update_basebackup_progress(cnt);
len += cnt;
throttle(cnt);
@ -623,6 +678,7 @@ perform_base_backup(basebackup_options *opt)
errmsg("checksum verification failure during base backup")));
}
pgstat_progress_end_command();
}
/*
@ -949,6 +1005,7 @@ sendFileWithContent(const char *filename, const char *content)
_tarWriteHeader(filename, NULL, &statbuf, false);
/* Send the contents as a CopyData message */
pq_putmessage('d', content, len);
update_basebackup_progress(len);
/* Pad to 512 byte boundary, per tar format requirements */
pad = ((len + 511) & ~511) - len;
@ -958,6 +1015,7 @@ sendFileWithContent(const char *filename, const char *content)
MemSet(buf, 0, pad);
pq_putmessage('d', buf, pad);
update_basebackup_progress(pad);
}
}
@ -1565,6 +1623,7 @@ sendFile(const char *readfilename, const char *tarfilename, struct stat *statbuf
if (pq_putmessage('d', buf, cnt))
ereport(ERROR,
(errmsg("base backup could not send data, aborting backup")));
update_basebackup_progress(cnt);
len += cnt;
throttle(cnt);
@ -1590,6 +1649,7 @@ sendFile(const char *readfilename, const char *tarfilename, struct stat *statbuf
{
cnt = Min(sizeof(buf), statbuf->st_size - len);
pq_putmessage('d', buf, cnt);
update_basebackup_progress(cnt);
len += cnt;
throttle(cnt);
}
@ -1604,6 +1664,7 @@ sendFile(const char *readfilename, const char *tarfilename, struct stat *statbuf
{
MemSet(buf, 0, pad);
pq_putmessage('d', buf, pad);
update_basebackup_progress(pad);
}
FreeFile(fp);
@ -1658,6 +1719,7 @@ _tarWriteHeader(const char *filename, const char *linktarget,
}
pq_putmessage('d', h, sizeof(h));
update_basebackup_progress(sizeof(h));
}
return sizeof(h);
@ -1755,3 +1817,36 @@ throttle(size_t increment)
*/
throttled_last = GetCurrentTimestamp();
}
/*
* Increment the counter for the amount of data already streamed
* by the given number of bytes, and update the progress report for
* pg_stat_progress_basebackup.
*/
static void
update_basebackup_progress(int64 delta)
{
const int index[] = {
PROGRESS_BASEBACKUP_BACKUP_STREAMED,
PROGRESS_BASEBACKUP_BACKUP_TOTAL
};
int64 val[2];
int nparam = 0;
backup_streamed += delta;
val[nparam++] = backup_streamed;
/*
* Avoid overflowing past 100% or the full size. This may make the total
* size number change as we approach the end of the backup (the estimate
* will always be wrong if WAL is included), but that's better than having
* the done column be bigger than the total.
*/
if (backup_total > 0 && backup_streamed > backup_total)
{
backup_total = backup_streamed;
val[nparam++] = backup_total;
}
pgstat_progress_update_multi_param(nparam, index, val);
}
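
The clamping rule in update_basebackup_progress() can be tried in isolation. The sketch below is a standalone model under stated assumptions: the BackupProgress struct and progress_advance() are hypothetical stand-ins for the file-level counters and the pgstat call, and nothing here touches the real progress reporting API.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two static counters in basebackup.c. */
typedef struct BackupProgress
{
    int64_t total;      /* estimated bytes; 0 means no estimate was made */
    int64_t streamed;   /* bytes sent so far */
} BackupProgress;

/*
 * Advance the streamed counter by delta bytes. If an estimate exists and
 * has been exceeded, raise the estimate to match the streamed amount, as
 * update_basebackup_progress() does. Returns 1 when the total was
 * adjusted (the case where the real function reports two parameters),
 * otherwise 0.
 */
static int
progress_advance(BackupProgress *p, int64_t delta)
{
    p->streamed += delta;

    if (p->total > 0 && p->streamed > p->total)
    {
        p->total = p->streamed;
        return 1;
    }
    return 0;
}
```

With an estimate of 100 bytes, advancing by 60 twice leaves both counters at 120. With a total of 0 (no --progress), the total stays 0 however much is streamed, which matches the documented behavior of the backup_total column.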


@ -474,6 +474,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
cmdtype = PROGRESS_COMMAND_CLUSTER;
else if (pg_strcasecmp(cmd, "CREATE INDEX") == 0)
cmdtype = PROGRESS_COMMAND_CREATE_INDEX;
else if (pg_strcasecmp(cmd, "BASEBACKUP") == 0)
cmdtype = PROGRESS_COMMAND_BASEBACKUP;
else
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),


@ -53,6 +53,6 @@
*/
/* yyyymmddN */
#define CATALOG_VERSION_NO 202002271
#define CATALOG_VERSION_NO 202003031
#endif


@ -119,4 +119,18 @@
#define PROGRESS_SCAN_BLOCKS_TOTAL 15
#define PROGRESS_SCAN_BLOCKS_DONE 16
/* Progress parameters for pg_basebackup */
#define PROGRESS_BASEBACKUP_PHASE 0
#define PROGRESS_BASEBACKUP_BACKUP_TOTAL 1
#define PROGRESS_BASEBACKUP_BACKUP_STREAMED 2
#define PROGRESS_BASEBACKUP_TBLSPC_TOTAL 3
#define PROGRESS_BASEBACKUP_TBLSPC_STREAMED 4
/* Phases of pg_basebackup (as advertised via PROGRESS_BASEBACKUP_PHASE) */
#define PROGRESS_BASEBACKUP_PHASE_WAIT_CHECKPOINT 1
#define PROGRESS_BASEBACKUP_PHASE_ESTIMATE_BACKUP_SIZE 2
#define PROGRESS_BASEBACKUP_PHASE_STREAM_BACKUP 3
#define PROGRESS_BASEBACKUP_PHASE_WAIT_WAL_ARCHIVE 4
#define PROGRESS_BASEBACKUP_PHASE_TRANSFER_WAL 5
#endif
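
The phase codes defined above start at 1, while the view also maps 0 to 'initializing'. That value needs no #define because the progress parameters are zeroed when a command starts reporting. Note also that parameter index N in this header surfaces as column param(N+1) in pg_stat_get_progress_info(). The sketch below mirrors the CASE expression of the view in C. The function basebackup_phase_name is hypothetical and exists only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical lookup (not part of PostgreSQL): translate a
 * PROGRESS_BASEBACKUP_PHASE value into the text shown by the
 * pg_stat_progress_basebackup view. Returns NULL for unknown codes,
 * like the implicit ELSE NULL of the view's CASE expression.
 */
static const char *
basebackup_phase_name(int64_t code)
{
    static const char *const names[] = {
        "initializing",                         /* 0: zeroed default */
        "waiting for checkpoint to finish",     /* ..._PHASE_WAIT_CHECKPOINT */
        "estimating backup size",               /* ..._PHASE_ESTIMATE_BACKUP_SIZE */
        "streaming database files",             /* ..._PHASE_STREAM_BACKUP */
        "waiting for wal archiving to finish",  /* ..._PHASE_WAIT_WAL_ARCHIVE */
        "transferring wal files"                /* ..._PHASE_TRANSFER_WAL */
    };
    const int64_t count = (int64_t) (sizeof(names) / sizeof(names[0]));

    if (code < 0 || code >= count)
        return NULL;

    return names[code];
}
```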


@ -958,7 +958,8 @@ typedef enum ProgressCommandType
PROGRESS_COMMAND_VACUUM,
PROGRESS_COMMAND_ANALYZE,
PROGRESS_COMMAND_CLUSTER,
PROGRESS_COMMAND_CREATE_INDEX
PROGRESS_COMMAND_CREATE_INDEX,
PROGRESS_COMMAND_BASEBACKUP
} ProgressCommandType;
#define PGSTAT_NUM_PROGRESS_PARAM 20


@ -1876,6 +1876,21 @@ pg_stat_progress_analyze| SELECT s.pid,
(s.param8)::oid AS current_child_table_relid
FROM (pg_stat_get_progress_info('ANALYZE'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
LEFT JOIN pg_database d ON ((s.datid = d.oid)));
pg_stat_progress_basebackup| SELECT s.pid,
CASE s.param1
WHEN 0 THEN 'initializing'::text
WHEN 1 THEN 'waiting for checkpoint to finish'::text
WHEN 2 THEN 'estimating backup size'::text
WHEN 3 THEN 'streaming database files'::text
WHEN 4 THEN 'waiting for wal archiving to finish'::text
WHEN 5 THEN 'transferring wal files'::text
ELSE NULL::text
END AS phase,
s.param2 AS backup_total,
s.param3 AS backup_streamed,
s.param4 AS tablespaces_total,
s.param5 AS tablespaces_streamed
FROM pg_stat_get_progress_info('BASEBACKUP'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20);
pg_stat_progress_cluster| SELECT s.pid,
s.datid,
d.datname,