postgresql/src/bin/pg_dump/pg_backup_archiver.c

/*-------------------------------------------------------------------------
*
* pg_backup_archiver.c
*
* Private implementation of the archiver routines.
*
* See the headers to pg_restore for more details.
*
* Copyright (c) 2000, Philip Warner
* Rights are granted to use this software in any way so long
* as this notice is not removed.
*
* The author is not responsible for loss or damages that may
* result from its use.
*
*
* IDENTIFICATION
* src/bin/pg_dump/pg_backup_archiver.c
*
*-------------------------------------------------------------------------
*/
#include "postgres_fe.h"
#include <ctype.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/wait.h>
#ifdef WIN32
#include <io.h>
#endif
#include "common/string.h"
#include "compress_io.h"
#include "dumputils.h"
#include "fe_utils/string_utils.h"
#include "lib/stringinfo.h"
#include "libpq/libpq-fs.h"
#include "parallel.h"
#include "pg_backup_archiver.h"
#include "pg_backup_db.h"
#include "pg_backup_utils.h"
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
/*
* State for tracking TocEntrys that are ready to process during a parallel
* restore. (This used to be a list, and we still call it that, though now
* it's really an array so that we can apply qsort to it.)
*
* tes[] is sized large enough that we can't overrun it.
* The valid entries are indexed first_te .. last_te inclusive.
* We periodically sort the array to bring larger-by-dataLength entries to
* the front; "sorted" is true if the valid entries are known sorted.
*/
typedef struct _parallelReadyList
{
TocEntry **tes; /* Ready-to-dump TocEntrys */
int first_te; /* index of first valid entry in tes[] */
int last_te; /* index of last valid entry in tes[] */
bool sorted; /* are valid entries currently sorted? */
} ParallelReadyList;
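/*
 * Illustrative sketch only, not a code path that appears in this file: a
 * scheduling loop is expected to drive a ParallelReadyList through the
 * helpers declared below, roughly as
 *
 *		ParallelReadyList ready_list;
 *
 *		ready_list_init(&ready_list, AH->tocCount);
 *		ready_list_insert(&ready_list, te);		(once per ready TocEntry)
 *		...
 *		te = pop_next_work_item(&ready_list, pstate);
 *		...
 *		ready_list_free(&ready_list);
 *
 * An empty list is represented by first_te > last_te.
 */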
static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt,
const pg_compress_specification compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr);
static void _getObjectDescription(PQExpBuffer buf, const TocEntry *te);
static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData);
static char *sanitize_line(const char *str, bool want_hyphen);
static void _doSetFixedOutputState(ArchiveHandle *AH);
static void _doSetSessionAuth(ArchiveHandle *AH, const char *user);
static void _reconnectToDB(ArchiveHandle *AH, const char *dbname);
static void _becomeUser(ArchiveHandle *AH, const char *user);
static void _becomeOwner(ArchiveHandle *AH, TocEntry *te);
static void _selectOutputSchema(ArchiveHandle *AH, const char *schemaName);
static void _selectTablespace(ArchiveHandle *AH, const char *tablespace);
static void _selectTableAccessMethod(ArchiveHandle *AH, const char *tableam);
static void processEncodingEntry(ArchiveHandle *AH, TocEntry *te);
static void processStdStringsEntry(ArchiveHandle *AH, TocEntry *te);
static void processSearchPathEntry(ArchiveHandle *AH, TocEntry *te);
static int _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH);
static RestorePass _tocEntryRestorePass(TocEntry *te);
static bool _tocEntryIsACL(TocEntry *te);
static void _disableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te);
static void _enableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te);
static bool is_load_via_partition_root(TocEntry *te);
static void buildTocEntryArrays(ArchiveHandle *AH);
static void _moveBefore(TocEntry *pos, TocEntry *te);
static int _discoverArchiveFormat(ArchiveHandle *AH);
static int RestoringToDB(ArchiveHandle *AH);
static void dump_lo_buf(ArchiveHandle *AH);
static void dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim);
static void SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec);
static CompressFileHandle *SaveOutput(ArchiveHandle *AH);
static void RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput);
static int restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel);
static void restore_toc_entries_prefork(ArchiveHandle *AH,
TocEntry *pending_list);
static void restore_toc_entries_parallel(ArchiveHandle *AH,
ParallelState *pstate,
TocEntry *pending_list);
static void restore_toc_entries_postfork(ArchiveHandle *AH,
TocEntry *pending_list);
static void pending_list_header_init(TocEntry *l);
static void pending_list_append(TocEntry *l, TocEntry *te);
static void pending_list_remove(TocEntry *te);
static void ready_list_init(ParallelReadyList *ready_list, int tocCount);
static void ready_list_free(ParallelReadyList *ready_list);
static void ready_list_insert(ParallelReadyList *ready_list, TocEntry *te);
static void ready_list_remove(ParallelReadyList *ready_list, int i);
static void ready_list_sort(ParallelReadyList *ready_list);
static int TocEntrySizeCompare(const void *p1, const void *p2);
static void move_to_ready_list(TocEntry *pending_list,
ParallelReadyList *ready_list,
RestorePass pass);
static TocEntry *pop_next_work_item(ParallelReadyList *ready_list,
ParallelState *pstate);
static void mark_dump_job_done(ArchiveHandle *AH,
TocEntry *te,
int status,
void *callback_data);
static void mark_restore_job_done(ArchiveHandle *AH,
TocEntry *te,
int status,
void *callback_data);
static void fix_dependencies(ArchiveHandle *AH);
static bool has_lock_conflicts(TocEntry *te1, TocEntry *te2);
static void repoint_table_dependencies(ArchiveHandle *AH);
static void identify_locking_dependencies(ArchiveHandle *AH, TocEntry *te);
static void reduce_dependencies(ArchiveHandle *AH, TocEntry *te,
ParallelReadyList *ready_list);
static void mark_create_done(ArchiveHandle *AH, TocEntry *te);
static void inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry *te);
static void StrictNamesCheck(RestoreOptions *ropt);
/*
* Allocate a new DumpOptions block containing all default values.
*/
DumpOptions *
NewDumpOptions(void)
{
DumpOptions *opts = (DumpOptions *) pg_malloc(sizeof(DumpOptions));
InitDumpOptions(opts);
return opts;
}
/*
* Initialize a DumpOptions struct to all default values
*/
void
InitDumpOptions(DumpOptions *opts)
{
memset(opts, 0, sizeof(DumpOptions));
/* set any fields that shouldn't default to zeroes */
opts->include_everything = true;
opts->cparams.promptPassword = TRI_DEFAULT;
opts->dumpSections = DUMP_UNSECTIONED;
}
/*
* Create a freshly allocated DumpOptions with options equivalent to those
* found in the given RestoreOptions.
*/
DumpOptions *
dumpOptionsFromRestoreOptions(RestoreOptions *ropt)
{
DumpOptions *dopt = NewDumpOptions();
/* this is the inverse of what's at the end of pg_dump.c's main() */
dopt->cparams.dbname = ropt->cparams.dbname ? pg_strdup(ropt->cparams.dbname) : NULL;
dopt->cparams.pgport = ropt->cparams.pgport ? pg_strdup(ropt->cparams.pgport) : NULL;
dopt->cparams.pghost = ropt->cparams.pghost ? pg_strdup(ropt->cparams.pghost) : NULL;
dopt->cparams.username = ropt->cparams.username ? pg_strdup(ropt->cparams.username) : NULL;
dopt->cparams.promptPassword = ropt->cparams.promptPassword;
dopt->outputClean = ropt->dropSchema;
dopt->dataOnly = ropt->dataOnly;
dopt->schemaOnly = ropt->schemaOnly;
dopt->if_exists = ropt->if_exists;
dopt->column_inserts = ropt->column_inserts;
dopt->dumpSections = ropt->dumpSections;
dopt->aclsSkip = ropt->aclsSkip;
dopt->outputSuperuser = ropt->superuser;
dopt->outputCreateDB = ropt->createDB;
dopt->outputNoOwner = ropt->noOwner;
dopt->outputNoTableAm = ropt->noTableAm;
dopt->outputNoTablespaces = ropt->noTablespace;
dopt->disable_triggers = ropt->disable_triggers;
dopt->use_setsessauth = ropt->use_setsessauth;
dopt->disable_dollar_quoting = ropt->disable_dollar_quoting;
dopt->dump_inserts = ropt->dump_inserts;
dopt->no_comments = ropt->no_comments;
dopt->no_publications = ropt->no_publications;
dopt->no_security_labels = ropt->no_security_labels;
dopt->no_subscriptions = ropt->no_subscriptions;
dopt->lockWaitTimeout = ropt->lockWaitTimeout;
dopt->include_everything = ropt->include_everything;
dopt->enable_row_security = ropt->enable_row_security;
dopt->sequence_data = ropt->sequence_data;
return dopt;
}
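/*
 * Usage sketch: SetArchiveOptions() below calls this when the caller (for
 * instance pg_restore) supplies only RestoreOptions, as in
 *
 *		SetArchiveOptions(AH, NULL, ropt);
 *
 * so that a matching DumpOptions block is synthesized from ropt.
 */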
/*
* Wrapper functions.
*
* The objective is to make writing new formats and dumpers as simple
* as possible, if necessary at the expense of extra function calls etc.
*
*/
/*
* The dump worker setup needs lots of knowledge of the internals of pg_dump,
* so it's defined in pg_dump.c and passed into OpenArchive. The restore worker
* setup doesn't need to know anything much, so it's defined here.
*/
static void
setupRestoreWorker(Archive *AHX)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
AH->ReopenPtr(AH);
}
/* Create a new archive */
/* Public */
Archive *
CreateArchive(const char *FileSpec, const ArchiveFormat fmt,
const pg_compress_specification compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupDumpWorker)
{
ArchiveHandle *AH = _allocAH(FileSpec, fmt, compression_spec,
dosync, mode, setupDumpWorker);
return (Archive *) AH;
}
/* Open an existing archive */
/* Public */
Archive *
OpenArchive(const char *FileSpec, const ArchiveFormat fmt)
{
ArchiveHandle *AH;
pg_compress_specification compression_spec = {0};
compression_spec.algorithm = PG_COMPRESSION_NONE;
AH = _allocAH(FileSpec, fmt, compression_spec, true,
archModeRead, setupRestoreWorker);
return (Archive *) AH;
}
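/*
 * Sketch of the expected calling sequences for these public entry points
 * (the worker-setup hook name is illustrative; see pg_dump.c for the real
 * dump-side hook):
 *
 * To write an archive:
 *		AH = CreateArchive(fileSpec, fmt, compression_spec, dosync,
 *						   archModeWrite, setupDumpWorker);
 *		... build TOC entries and write data ...
 *		CloseArchive(AH);
 *
 * To read one back:
 *		AH = OpenArchive(fileSpec, fmt);
 *		SetArchiveOptions(AH, NULL, ropt);
 *		ProcessArchiveRestoreOptions(AH);
 *		RestoreArchive(AH);
 *		CloseArchive(AH);
 */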
/* Public */
void
CloseArchive(Archive *AHX)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
AH->ClosePtr(AH);
/* Close the output */
errno = 0;
if (!EndCompressFileHandle(AH->OF))
pg_fatal("could not close output file: %m");
}
/* Public */
void
SetArchiveOptions(Archive *AH, DumpOptions *dopt, RestoreOptions *ropt)
{
/* Caller can omit dump options, in which case we synthesize them */
if (dopt == NULL && ropt != NULL)
dopt = dumpOptionsFromRestoreOptions(ropt);
/* Save options for later access */
AH->dopt = dopt;
AH->ropt = ropt;
}
/* Public */
void
ProcessArchiveRestoreOptions(Archive *AHX)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
TocEntry *te;
teSection curSection;
/* Decide which TOC entries will be dumped/restored, and mark them */
curSection = SECTION_PRE_DATA;
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
/*
* When writing an archive, we also take this opportunity to check
* that we have generated the entries in a sane order that respects
* the section divisions. When reading, don't complain, since buggy
* old versions of pg_dump might generate out-of-order archives.
*/
if (AH->mode != archModeRead)
{
switch (te->section)
{
case SECTION_NONE:
/* ok to be anywhere */
break;
case SECTION_PRE_DATA:
if (curSection != SECTION_PRE_DATA)
pg_log_warning("archive items not in correct section order");
break;
case SECTION_DATA:
if (curSection == SECTION_POST_DATA)
pg_log_warning("archive items not in correct section order");
break;
case SECTION_POST_DATA:
/* ok no matter which section we were in */
break;
default:
pg_fatal("unexpected section code %d",
(int) te->section);
break;
}
}
if (te->section != SECTION_NONE)
curSection = te->section;
te->reqs = _tocEntryRequired(te, curSection, AH);
}
/* Enforce strict names checking */
if (ropt->strict_names)
StrictNamesCheck(ropt);
}
/* Public */
void
RestoreArchive(Archive *AHX)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
bool parallel_mode;
TocEntry *te;
CompressFileHandle *sav;
AH->stage = STAGE_INITIALIZING;
/*
* If we're going to do parallel restore, there are some restrictions.
*/
parallel_mode = (AH->public.numWorkers > 1 && ropt->useDB);
if (parallel_mode)
{
		/* We haven't gotten around to making this work for all archive formats */
if (AH->ClonePtr == NULL || AH->ReopenPtr == NULL)
pg_fatal("parallel restore is not supported with this archive file format");
/* Doesn't work if the archive represents dependencies as OIDs */
if (AH->version < K_VERS_1_8)
pg_fatal("parallel restore is not supported with archives made by pre-8.0 pg_dump");
/*
		 * It's also not going to work if we can't reopen the input file, so
* let's try that immediately.
*/
AH->ReopenPtr(AH);
}
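	/*
	 * Illustrative sketch: a format module that supports parallel restore
	 * registers both hooks when it sets up the ArchiveHandle, along the
	 * lines of (hypothetical function names):
	 *
	 *     AH->ClonePtr = _Clone;
	 *     AH->ReopenPtr = _ReopenArchive;
	 *
	 * The custom and directory formats do this; a format lacking either
	 * hook is rejected above.
	 */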
/*
* Make sure we won't need (de)compression we haven't got
*/
if (AH->PrintTocDataPtr != NULL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->hadDumper && (te->reqs & REQ_DATA) != 0)
{
char *errmsg = supports_compression(AH->compression_spec);
if (errmsg)
pg_fatal("cannot restore from compressed archive (%s)",
errmsg);
else
break;
}
}
}
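	/*
	 * Example: restoring an archive whose data was compressed with gzip
	 * using a pg_restore built without zlib fails here with something like
	 * "cannot restore from compressed archive (this build does not support
	 * compression with gzip)", instead of failing midway through the data.
	 */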
/*
* Prepare index arrays, so we can assume we have them throughout restore.
* It's possible we already did this, though.
*/
if (AH->tocsByDumpId == NULL)
buildTocEntryArrays(AH);
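	/*
	 * tocsByDumpId[] gives O(1) lookup from a dump ID to its TOC entry,
	 * which the restore logic relies on when chasing dependencies.
	 */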
/*
* If we're using a DB connection, then connect it.
*/
if (ropt->useDB)
{
pg_log_info("connecting to database for restore");
if (AH->version < K_VERS_1_3)
pg_fatal("direct database connections are not supported in pre-1.3 archives");
/*
* We don't want to guess at whether the dump will successfully
* restore; allow the attempt regardless of the version of the restore
* target.
*/
AHX->minRemoteVersion = 0;
AHX->maxRemoteVersion = 9999999;
ConnectDatabase(AHX, &ropt->cparams, false);
/*
* If we're talking to the DB directly, don't send comments since they
* obscure SQL when displaying errors
*/
AH->noTocComments = 1;
}
/*
* Work out if we have an implied data-only restore. This can happen if
	 * the dump was data-only or if the user has used a TOC list to exclude
	 * all of the schema data. All we do is look for schema entries; if none
	 * are found, we set the dataOnly flag.
*
* We could scan for wanted TABLE entries, but that is not the same as
* dataOnly. At this stage, it seems unnecessary (6-Mar-2001).
*/
if (!ropt->dataOnly)
{
int impliedDataOnly = 1;
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if ((te->reqs & REQ_SCHEMA) != 0)
{ /* It's schema, and it's wanted */
impliedDataOnly = 0;
break;
}
}
if (impliedDataOnly)
{
ropt->dataOnly = impliedDataOnly;
pg_log_info("implied data-only restore");
}
}
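	/*
	 * Example: a list file passed with -L/--use-list that retains only
	 * data lines, such as
	 *
	 *     2398; 0 24566 TABLE DATA public t1 postgres
	 *
	 * contains no schema entries, so the restore is treated as data-only.
	 */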
/*
	 * Set up the output file if necessary.
*/
sav = SaveOutput(AH);
if (ropt->filename || ropt->compression_spec.algorithm != PG_COMPRESSION_NONE)
SetOutput(AH, ropt->filename, ropt->compression_spec);
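	/*
	 * Example usage: "pg_restore -f out.sql my.dump" reaches this point
	 * with ropt->filename set, so the generated script is written to
	 * out.sql rather than sent down a database connection.
	 */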
ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
if (AH->archiveRemoteVersion)
ahprintf(AH, "-- Dumped from database version %s\n",
AH->archiveRemoteVersion);
if (AH->archiveDumpVersion)
ahprintf(AH, "-- Dumped by pg_dump version %s\n",
AH->archiveDumpVersion);
ahprintf(AH, "\n");
if (AH->public.verbose)
dumpTimestamp(AH, "Started on", AH->createDate);
if (ropt->single_txn)
{
if (AH->connection)
StartTransaction(AHX);
else
ahprintf(AH, "BEGIN;\n\n");
}
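	/*
	 * With --single-transaction we either open a real transaction on the
	 * connection or, in script mode, just emit "BEGIN;"; the matching
	 * commit is issued near the end of this function.
	 */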
/*
* Establish important parameter values right away.
*/
_doSetFixedOutputState(AH);
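	/*
	 * A representative (not exhaustive) sample of what the fixed output
	 * state looks like in a generated script:
	 *
	 *     SET statement_timeout = 0;
	 *     SET client_encoding = 'UTF8';
	 *     SET standard_conforming_strings = on;
	 *     SET check_function_bodies = false;
	 *
	 * The actual values come from the archive header and restore options.
	 */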
AH->stage = STAGE_PROCESSING;
/*
* Drop the items at the start, in reverse order
*/
if (ropt->dropSchema)
{
for (te = AH->toc->prev; te != AH->toc; te = te->prev)
{
AH->currentTE = te;
/*
* In createDB mode, issue a DROP *only* for the database as a
* whole. Issuing drops against anything else would be wrong,
* because at this point we're connected to the wrong database.
* (The DATABASE PROPERTIES entry, if any, should be treated like
* the DATABASE entry.)
*/
if (ropt->createDB)
{
if (strcmp(te->desc, "DATABASE") != 0 &&
strcmp(te->desc, "DATABASE PROPERTIES") != 0)
continue;
}
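			/*
			 * Example: with --create --clean, drops for tables, indexes,
			 * and so on are all skipped here; only a statement like
			 *
			 *     DROP DATABASE mydb;
			 *
			 * (from the DATABASE entry, or the DATABASE PROPERTIES entry)
			 * is emitted before the database is re-created.
			 */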
/* Otherwise, drop anything that's selected and has a dropStmt */
if (((te->reqs & (REQ_SCHEMA | REQ_DATA)) != 0) && te->dropStmt)
{
pg_log_info("dropping %s %s", te->desc, te->tag);
/* Select owner and schema as necessary */
_becomeOwner(AH, te);
_selectOutputSchema(AH, te->namespace);
/*
* Now emit the DROP command, if the object has one. Note we
* don't necessarily emit it verbatim; at this point we add an
* appropriate IF EXISTS clause, if the user requested it.
*/
if (*te->dropStmt != '\0')
{
if (!ropt->if_exists ||
strncmp(te->dropStmt, "--", 2) == 0)
{
/*
* Without --if-exists, or if it's just a comment (as
* happens for the public schema), print the dropStmt
* as-is.
*/
ahprintf(AH, "%s", te->dropStmt);
}
else
{
/*
* Inject an appropriate spelling of "if exists". For
* large objects, we have a separate routine that
* knows how to do it, without depending on
* te->dropStmt; use that. For other objects we need
* to parse the command.
*/
if (strncmp(te->desc, "BLOB", 4) == 0)
{
DropLOIfExists(AH, te->catalogId.oid);
}
else
{
char *dropStmt = pg_strdup(te->dropStmt);
char *dropStmtOrig = dropStmt;
PQExpBuffer ftStmt = createPQExpBuffer();
/*
* Need to inject IF EXISTS clause after ALTER
* TABLE part in ALTER TABLE .. DROP statement
*/
if (strncmp(dropStmt, "ALTER TABLE", 11) == 0)
{
appendPQExpBufferStr(ftStmt,
"ALTER TABLE IF EXISTS");
dropStmt = dropStmt + 11;
}
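							/*
							 * Example: for an original command of
							 *     ALTER TABLE public.t DROP CONSTRAINT c;
							 * ftStmt now holds "ALTER TABLE IF EXISTS" and
							 * dropStmt points at the remainder,
							 * " public.t DROP CONSTRAINT c;".
							 */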
/*
* ALTER TABLE..ALTER COLUMN..DROP DEFAULT does
* not support the IF EXISTS clause, and therefore
* we simply emit the original command for DEFAULT
* objects (modulo the adjustment made above).
*
* Likewise, don't mess with DATABASE PROPERTIES.
*
* If we used CREATE OR REPLACE VIEW as a means of
* quasi-dropping an ON SELECT rule, that should
* be emitted unchanged as well.
*
* For other object types, we need to extract the
* first part of the DROP which includes the
* object type. Most of the time this matches
* te->desc, so search for that; however for the
* different kinds of CONSTRAINTs, we know to
* search for hardcoded "DROP CONSTRAINT" instead.
*/
if (strcmp(te->desc, "DEFAULT") == 0 ||
strcmp(te->desc, "DATABASE PROPERTIES") == 0 ||
strncmp(dropStmt, "CREATE OR REPLACE VIEW", 22) == 0)
appendPQExpBufferStr(ftStmt, dropStmt);
else
{
char buffer[40];
char *mark;
if (strcmp(te->desc, "CONSTRAINT") == 0 ||
strcmp(te->desc, "CHECK CONSTRAINT") == 0 ||
strcmp(te->desc, "FK CONSTRAINT") == 0)
strcpy(buffer, "DROP CONSTRAINT");
else
snprintf(buffer, sizeof(buffer), "DROP %s",
te->desc);
mark = strstr(dropStmt, buffer);
if (mark)
{
*mark = '\0';
appendPQExpBuffer(ftStmt, "%s%s IF EXISTS%s",
dropStmt, buffer,
mark + strlen(buffer));
}
else
{
/* complain and emit unmodified command */
pg_log_warning("could not find where to insert IF EXISTS in statement \"%s\"",
dropStmtOrig);
appendPQExpBufferStr(ftStmt, dropStmt);
}
}
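							/*
							 * Net effect, illustrated: a dropStmt of
							 *     DROP TABLE public.foo;
							 * is emitted as
							 *     DROP TABLE IF EXISTS public.foo;
							 * and a constraint's
							 *     ALTER TABLE public.t DROP CONSTRAINT c;
							 * becomes
							 *     ALTER TABLE IF EXISTS public.t DROP CONSTRAINT IF EXISTS c;
							 */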
ahprintf(AH, "%s", ftStmt->data);
destroyPQExpBuffer(ftStmt);
pg_free(dropStmtOrig);
}
}
}
}
}
/*
* _selectOutputSchema may have set currSchema to reflect the effect
* of a "SET search_path" command it emitted. However, by now we may
* have dropped that schema; or it might not have existed in the first
* place. In either case the effective value of search_path will not
* be what we think. Forcibly reset currSchema so that we will
* re-establish the search_path setting when needed (after creating
* the schema).
*
* If we treated users as pg_dump'able objects then we'd need to reset
* currUser here too.
*/
free(AH->currSchema);
AH->currSchema = NULL;
}
if (parallel_mode)
{
/*
* In parallel mode, turn control over to the parallel-restore logic.
*/
ParallelState *pstate;
TocEntry pending_list;
/* The archive format module may need some setup for this */
if (AH->PrepParallelRestorePtr)
AH->PrepParallelRestorePtr(AH);
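		/*
		 * For example, seekable formats use this hook to collect each TOC
		 * entry's data offset and length up front, so that worker processes
		 * can jump straight to the data they are assigned.
		 */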
pending_list_header_init(&pending_list);
/* This runs PRE_DATA items and then disconnects from the database */
restore_toc_entries_prefork(AH, &pending_list);
Assert(AH->connection == NULL);
/* ParallelBackupStart() will actually fork the processes */
pstate = ParallelBackupStart(AH);
restore_toc_entries_parallel(AH, pstate, &pending_list);
ParallelBackupEnd(AH, pstate);
/* reconnect the leader and see if we missed something */
restore_toc_entries_postfork(AH, &pending_list);
Assert(AH->connection != NULL);
}
else
{
/*
* In serial mode, process everything in three phases: normal items,
* then ACLs, then post-ACL items. We might be able to skip one or
		 * both extra phases in some cases, e.g. data-only restores.
*/
bool haveACL = false;
bool havePostACL = false;
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if ((te->reqs & (REQ_SCHEMA | REQ_DATA)) == 0)
continue; /* ignore if not to be dumped at all */
switch (_tocEntryRestorePass(te))
{
case RESTORE_PASS_MAIN:
(void) restore_toc_entry(AH, te, false);
break;
case RESTORE_PASS_ACL:
haveACL = true;
break;
case RESTORE_PASS_POST_ACL:
havePostACL = true;
break;
}
}
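
		/*
		 * Second pass: restore the ACL (GRANT/REVOKE) entries we skipped
		 * above.  ACLs are deliberately applied only after all normal
		 * objects have been restored.
		 */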
if (haveACL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if ((te->reqs & (REQ_SCHEMA | REQ_DATA)) != 0 &&
_tocEntryRestorePass(te) == RESTORE_PASS_ACL)
(void) restore_toc_entry(AH, te, false);
}
}
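
		/*
		 * Third pass: restore post-ACL entries (event triggers and matview
		 * refreshes), which should run only once the ACLs are in place.
		 */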
if (havePostACL)
{
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if ((te->reqs & (REQ_SCHEMA | REQ_DATA)) != 0 &&
_tocEntryRestorePass(te) == RESTORE_PASS_POST_ACL)
(void) restore_toc_entry(AH, te, false);
}
}
}
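
	/* If the whole restore was wrapped in one transaction, commit it now. */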
if (ropt->single_txn)
{
if (AH->connection)
CommitTransaction(AHX);
else
ahprintf(AH, "COMMIT;\n\n");
}
if (AH->public.verbose)
dumpTimestamp(AH, "Completed on", time(NULL));
ahprintf(AH, "--\n-- PostgreSQL database dump complete\n--\n\n");
/*
* Clean up & we're done.
*/
AH->stage = STAGE_FINALIZING;
if (ropt->filename || ropt->compression_spec.algorithm != PG_COMPRESSION_NONE)
RestoreOutput(AH, sav);
if (ropt->useDB)
DisconnectDatabase(&AH->public);
}
/*
* Restore a single TOC item. Used in both parallel and non-parallel restore;
* is_parallel is true if we are in a worker child process.
*
* Returns 0 normally, but WORKER_CREATE_DONE or WORKER_INHIBIT_DATA if
* the parallel parent has to make the corresponding status update.
*/
static int
restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel)
{
RestoreOptions *ropt = AH->public.ropt;
int status = WORKER_OK;
int reqs;
bool defnDumped;
AH->currentTE = te;
/* Dump any relevant dump warnings to stderr */
if (!ropt->suppressDumpWarnings && strcmp(te->desc, "WARNING") == 0)
{
if (!ropt->dataOnly && te->defn != NULL && strlen(te->defn) != 0)
pg_log_warning("warning from original dump file: %s", te->defn);
else if (te->copyStmt != NULL && strlen(te->copyStmt) != 0)
pg_log_warning("warning from original dump file: %s", te->copyStmt);
}
/* Work out what, if anything, we want from this entry */
reqs = te->reqs;
defnDumped = false;
/*
* If it has a schema component that we want, then process that
*/
if ((reqs & REQ_SCHEMA) != 0)
{
/* Show namespace in log message if available */
if (te->namespace)
pg_log_info("creating %s \"%s.%s\"",
te->desc, te->namespace, te->tag);
else
pg_log_info("creating %s \"%s\"",
te->desc, te->tag);
_printTocEntry(AH, te, false);
defnDumped = true;
if (strcmp(te->desc, "TABLE") == 0)
{
if (AH->lastErrorTE == te)
{
/*
* We failed to create the table. If
* --no-data-for-failed-tables was given, mark the
* corresponding TABLE DATA to be ignored.
*
* In the parallel case this must be done in the parent, so we
* just set the return value.
*/
if (ropt->noDataForFailedTables)
{
if (is_parallel)
status = WORKER_INHIBIT_DATA;
else
inhibit_data_for_failed_table(AH, te);
}
}
else
{
/*
* We created the table successfully. Mark the corresponding
* TABLE DATA for possible truncation.
*
* In the parallel case this must be done in the parent, so we
* just set the return value.
*/
if (is_parallel)
status = WORKER_CREATE_DONE;
else
mark_create_done(AH, te);
}
}
/*
* If we created a DB, connect to it. Also, if we changed DB
* properties, reconnect to ensure that relevant GUC settings are
* applied to our session.
*/
if (strcmp(te->desc, "DATABASE") == 0 ||
strcmp(te->desc, "DATABASE PROPERTIES") == 0)
{
pg_log_info("connecting to new database \"%s\"", te->tag);
_reconnectToDB(AH, te->tag);
}
}
/*
* If it has a data component that we want, then process that
*/
if ((reqs & REQ_DATA) != 0)
{
/*
		 * hadDumper will be set if there is a genuine data component for this
* node. Otherwise, we need to check the defn field for statements
* that need to be executed in data-only restores.
*/
if (te->hadDumper)
{
/*
* If we can output the data, then restore it.
*/
if (AH->PrintTocDataPtr != NULL)
{
_printTocEntry(AH, te, true);
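
				/* Large-object data needs special schema and output handling */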
if (strcmp(te->desc, "BLOBS") == 0 ||
strcmp(te->desc, "BLOB COMMENTS") == 0)
{
pg_log_info("processing %s", te->desc);
_selectOutputSchema(AH, "pg_catalog");
/* Send BLOB COMMENTS data to ExecuteSimpleCommands() */
if (strcmp(te->desc, "BLOB COMMENTS") == 0)
AH->outputKind = OUTPUT_OTHERDATA;
AH->PrintTocDataPtr(AH, te);
AH->outputKind = OUTPUT_SQLCMDS;
}
else
{
bool use_truncate;
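
					/* Optionally disable triggers while loading this table's data */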
_disableTriggersIfNecessary(AH, te);
/* Select owner and schema as necessary */
_becomeOwner(AH, te);
_selectOutputSchema(AH, te->namespace);
pg_log_info("processing data for table \"%s.%s\"",
te->namespace, te->tag);
/*
* In parallel restore, if we created the table earlier in
* this run (so that we know it is empty) and we are not
* restoring a load-via-partition-root data item then we
* wrap the COPY in a transaction and precede it with a
* TRUNCATE. If wal_level is set to minimal this prevents
* WAL-logging the COPY. This obtains a speedup similar
* to that from using single_txn mode in non-parallel
* restores.
*
* We mustn't do this for load-via-partition-root cases
* because some data might get moved across partition
* boundaries, risking deadlock and/or loss of previously
* loaded data. (We assume that all partitions of a
* partitioned table will be treated the same way.)
*/
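/*
 * For illustration (table name hypothetical): when the conditions above
 * hold for "public.t1", the worker effectively issues
 *
 *     BEGIN;
 *     TRUNCATE TABLE ONLY public.t1;
 *     COPY public.t1 ... FROM stdin;
 *     COMMIT;
 *
 * so that the COPY can skip WAL when wal_level is minimal.
 */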
use_truncate = is_parallel && te->created &&
!is_load_via_partition_root(te);
if (use_truncate)
{
/*
* Parallel restore is always talking directly to a
* server, so no need to see if we should issue BEGIN.
*/
StartTransaction(&AH->public);
/*
* Issue TRUNCATE with ONLY so that child tables are
* not wiped.
*/
ahprintf(AH, "TRUNCATE TABLE ONLY %s;\n\n",
fmtQualifiedId(te->namespace, te->tag));
}
/*
* If we have a copy statement, use it.
*/
if (te->copyStmt && strlen(te->copyStmt) > 0)
{
ahprintf(AH, "%s", te->copyStmt);
AH->outputKind = OUTPUT_COPYDATA;
}
else
AH->outputKind = OUTPUT_OTHERDATA;
AH->PrintTocDataPtr(AH, te);
/*
* Terminate COPY if needed.
*/
if (AH->outputKind == OUTPUT_COPYDATA &&
RestoringToDB(AH))
EndDBCopyMode(&AH->public, te->tag);
AH->outputKind = OUTPUT_SQLCMDS;
/* close out the transaction started above */
if (use_truncate)
CommitTransaction(&AH->public);
_enableTriggersIfNecessary(AH, te);
}
}
}
else if (!defnDumped)
{
/* If we haven't already dumped the defn part, do so now */
pg_log_info("executing %s %s", te->desc, te->tag);
_printTocEntry(AH, te, false);
}
}
if (AH->public.n_errors > 0 && status == WORKER_OK)
status = WORKER_IGNORED_ERRORS;
return status;
}
/*
* Allocate a new RestoreOptions block.
* This is mainly so we can initialize it, but also for future expansion.
*/
RestoreOptions *
NewRestoreOptions(void)
{
RestoreOptions *opts;
opts = (RestoreOptions *) pg_malloc0(sizeof(RestoreOptions));
/* set any fields that shouldn't default to zeroes */
opts->format = archUnknown;
opts->cparams.promptPassword = TRI_DEFAULT;
opts->dumpSections = DUMP_UNSECTIONED;
opts->compression_spec.algorithm = PG_COMPRESSION_NONE;
opts->compression_spec.level = 0;
return opts;
}
static void
_disableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te)
{
RestoreOptions *ropt = AH->public.ropt;
/* This hack is only needed in a data-only restore */
if (!ropt->dataOnly || !ropt->disable_triggers)
return;
pg_log_info("disabling triggers for %s", te->tag);
/*
* Become superuser if possible, since they are the only ones who can
* disable constraint triggers. If -S was not given, assume the initial
* user identity is a superuser. (XXX would it be better to become the
* table owner?)
*/
_becomeUser(AH, ropt->superuser);
/*
* Disable them.
*/
ahprintf(AH, "ALTER TABLE %s DISABLE TRIGGER ALL;\n\n",
fmtQualifiedId(te->namespace, te->tag));
}
static void
_enableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te)
{
RestoreOptions *ropt = AH->public.ropt;
/* This hack is only needed in a data-only restore */
if (!ropt->dataOnly || !ropt->disable_triggers)
return;
pg_log_info("enabling triggers for %s", te->tag);
/*
* Become superuser if possible, since they are the only ones who can
* enable or disable constraint triggers. If -S was not given, assume the initial
* user identity is a superuser. (XXX would it be better to become the
* table owner?)
*/
_becomeUser(AH, ropt->superuser);
/*
* Enable them.
*/
ahprintf(AH, "ALTER TABLE %s ENABLE TRIGGER ALL;\n\n",
fmtQualifiedId(te->namespace, te->tag));
}
/*
* Detect whether a TABLE DATA TOC item is performing "load via partition
* root", that is the target table is an ancestor partition rather than the
* table the TOC item is nominally for.
*
* In newer archive files this can be detected by checking for a special
* comment placed in te->defn. In older files we have to fall back to seeing
* if the COPY statement targets the named table or some other one. This
* will not work for data dumped as INSERT commands, so we could give a false
* negative in that case; fortunately, that's a rarely-used option.
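*
* Example (names hypothetical): a TOC item for partition "public.p1" whose
* COPY statement begins "COPY public.root " does not match the expected
* "COPY public.p1 " prefix and is therefore reported as
* load-via-partition-root.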
*/
static bool
is_load_via_partition_root(TocEntry *te)
{
if (te->defn &&
strncmp(te->defn, "-- load via partition root ", 27) == 0)
return true;
if (te->copyStmt && *te->copyStmt)
{
PQExpBuffer copyStmt = createPQExpBuffer();
bool result;
/*
* Build the initial part of the COPY as it would appear if the
* nominal target table is the actual target. If we see anything
* else, it must be a load-via-partition-root case.
*/
appendPQExpBuffer(copyStmt, "COPY %s ",
fmtQualifiedId(te->namespace, te->tag));
result = strncmp(te->copyStmt, copyStmt->data, copyStmt->len) != 0;
destroyPQExpBuffer(copyStmt);
return result;
}
/* Assume it's not load-via-partition-root */
return false;
}
/*
* This is a routine that is part of the dumper interface, hence the 'Archive*' parameter.
*/
/* Public */
void
WriteData(Archive *AHX, const void *data, size_t dLen)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
if (!AH->currToc)
pg_fatal("internal error -- WriteData cannot be called outside the context of a DataDumper routine");
AH->WriteDataPtr(AH, data, dLen);
}
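/*
 * Sketch of a hypothetical caller: a DataDumperPtr routine registered via
 * ArchiveEntry's dumpFn receives the Archive pointer and streams its rows
 * through WriteData, e.g.
 *
 *     static int
 *     myDataDumper(Archive *AHX, const void *arg)
 *     {
 *         const char *row = "1\tfoo\n";
 *
 *         WriteData(AHX, row, strlen(row));
 *         return 1;
 *     }
 */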
/*
* Create a new TOC entry. The TOC was designed as a TOC, but is now the
* repository for all metadata. But the name has stuck.
*
* The new entry is added to the Archive's TOC list. Most callers can ignore
* the result value because nothing else need be done, but a few want to
* manipulate the TOC entry further.
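*
* A representative call from pg_dump (field values illustrative, not
* exhaustive) looks roughly like:
*
*     ArchiveEntry(fout, tdinfo->dobj.catId, tdinfo->dobj.dumpId,
*                  ARCHIVE_OPTS(.tag = tbinfo->dobj.name,
*                               .description = "TABLE DATA",
*                               .section = SECTION_DATA,
*                               .copyStmt = copyStmt,
*                               .dumpFn = dumpFn,
*                               .dumpArg = tdinfo));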
*/
/* Public */
TocEntry *
ArchiveEntry(Archive *AHX, CatalogId catalogId, DumpId dumpId,
ArchiveOpts *opts)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
TocEntry *newToc;
newToc = (TocEntry *) pg_malloc0(sizeof(TocEntry));
AH->tocCount++;
if (dumpId > AH->maxDumpId)
AH->maxDumpId = dumpId;
newToc->prev = AH->toc->prev;
newToc->next = AH->toc;
AH->toc->prev->next = newToc;
AH->toc->prev = newToc;
newToc->catalogId = catalogId;
newToc->dumpId = dumpId;
newToc->section = opts->section;
newToc->tag = pg_strdup(opts->tag);
newToc->namespace = opts->namespace ? pg_strdup(opts->namespace) : NULL;
newToc->tablespace = opts->tablespace ? pg_strdup(opts->tablespace) : NULL;
newToc->tableam = opts->tableam ? pg_strdup(opts->tableam) : NULL;
newToc->owner = opts->owner ? pg_strdup(opts->owner) : NULL;
newToc->desc = pg_strdup(opts->description);
newToc->defn = opts->createStmt ? pg_strdup(opts->createStmt) : NULL;
newToc->dropStmt = opts->dropStmt ? pg_strdup(opts->dropStmt) : NULL;
newToc->copyStmt = opts->copyStmt ? pg_strdup(opts->copyStmt) : NULL;
if (opts->nDeps > 0)
{
newToc->dependencies = (DumpId *) pg_malloc(opts->nDeps * sizeof(DumpId));
memcpy(newToc->dependencies, opts->deps, opts->nDeps * sizeof(DumpId));
newToc->nDeps = opts->nDeps;
}
else
{
newToc->dependencies = NULL;
newToc->nDeps = 0;
}
newToc->dataDumper = opts->dumpFn;
newToc->dataDumperArg = opts->dumpArg;
newToc->hadDumper = opts->dumpFn ? true : false;
newToc->formatData = NULL;
newToc->dataLength = 0;
if (AH->ArchiveEntryPtr != NULL)
AH->ArchiveEntryPtr(AH, newToc);
return newToc;
}
/* Public */
void
PrintTOCSummary(Archive *AHX)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
TocEntry *te;
pg_compress_specification out_compression_spec = {0};
teSection curSection;
CompressFileHandle *sav;
const char *fmtName;
char stamp_str[64];
/* TOC is always uncompressed */
out_compression_spec.algorithm = PG_COMPRESSION_NONE;
sav = SaveOutput(AH);
if (ropt->filename)
SetOutput(AH, ropt->filename, out_compression_spec);
if (strftime(stamp_str, sizeof(stamp_str), PGDUMP_STRFTIME_FMT,
localtime(&AH->createDate)) == 0)
strcpy(stamp_str, "[unknown]");
ahprintf(AH, ";\n; Archive created at %s\n", stamp_str);
ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %s\n",
sanitize_line(AH->archdbname, false),
AH->tocCount,
get_compress_algorithm_name(AH->compression_spec.algorithm));
switch (AH->format)
{
case archCustom:
fmtName = "CUSTOM";
break;
case archDirectory:
fmtName = "DIRECTORY";
break;
case archTar:
fmtName = "TAR";
break;
default:
fmtName = "UNKNOWN";
}
ahprintf(AH, "; Dump Version: %d.%d-%d\n",
ARCHIVE_MAJOR(AH->version), ARCHIVE_MINOR(AH->version), ARCHIVE_REV(AH->version));
ahprintf(AH, "; Format: %s\n", fmtName);
ahprintf(AH, "; Integer: %d bytes\n", (int) AH->intSize);
ahprintf(AH, "; Offset: %d bytes\n", (int) AH->offSize);
if (AH->archiveRemoteVersion)
ahprintf(AH, "; Dumped from database version: %s\n",
AH->archiveRemoteVersion);
if (AH->archiveDumpVersion)
ahprintf(AH, "; Dumped by pg_dump version: %s\n",
AH->archiveDumpVersion);
ahprintf(AH, ";\n;\n; Selected TOC Entries:\n;\n");
curSection = SECTION_PRE_DATA;
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (te->section != SECTION_NONE)
curSection = te->section;
if (ropt->verbose ||
(_tocEntryRequired(te, curSection, AH) & (REQ_SCHEMA | REQ_DATA)) != 0)
{
char *sanitized_name;
char *sanitized_schema;
char *sanitized_owner;
/*
 * Sanitize the fields we print: sanitize_line() replaces newlines
 * with spaces (and substitutes a hyphen for a missing schema), so
 * each TOC entry stays on one physical output line.
 */
sanitized_name = sanitize_line(te->tag, false);
sanitized_schema = sanitize_line(te->namespace, true);
sanitized_owner = sanitize_line(te->owner, false);
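/*
 * Each entry becomes one line of the listing, e.g. (values
 * illustrative):
 *
 *     6214; 1259 16385 TABLE DATA public t1 postgres
 */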
ahprintf(AH, "%d; %u %u %s %s %s %s\n", te->dumpId,
te->catalogId.tableoid, te->catalogId.oid,
te->desc, sanitized_schema, sanitized_name,
sanitized_owner);
free(sanitized_name);
free(sanitized_schema);
free(sanitized_owner);
}
if (ropt->verbose && te->nDeps > 0)
{
int i;
ahprintf(AH, ";\tdepends on:");
for (i = 0; i < te->nDeps; i++)
ahprintf(AH, " %d", te->dependencies[i]);
ahprintf(AH, "\n");
}
}
/* Enforce strict names checking */
if (ropt->strict_names)
StrictNamesCheck(ropt);
if (ropt->filename)
RestoreOutput(AH, sav);
}
/***********
* Large Object Archival
***********/
/* Called by a dumper to signal start of a LO */
int
StartLO(Archive *AHX, Oid oid)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
if (!AH->StartLOPtr)
pg_fatal("large-object output not supported in chosen format");
AH->StartLOPtr(AH, AH->currToc, oid);
return 1;
}
/* Called by a dumper to signal end of a LO */
int
EndLO(Archive *AHX, Oid oid)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
if (AH->EndLOPtr)
AH->EndLOPtr(AH, AH->currToc, oid);
return 1;
}
/**********
* Large Object Restoration
**********/
/*
* Called by a format handler before any LOs are restored
*/
void
StartRestoreLOs(ArchiveHandle *AH)
{
RestoreOptions *ropt = AH->public.ropt;
if (!ropt->single_txn)
{
if (AH->connection)
StartTransaction(&AH->public);
else
ahprintf(AH, "BEGIN;\n\n");
}
AH->loCount = 0;
}
/*
* Called by a format handler after all LOs are restored
*/
void
EndRestoreLOs(ArchiveHandle *AH)
{
RestoreOptions *ropt = AH->public.ropt;
if (!ropt->single_txn)
{
if (AH->connection)
CommitTransaction(&AH->public);
else
ahprintf(AH, "COMMIT;\n\n");
}
pg_log_info(ngettext("restored %d large object",
"restored %d large objects",
AH->loCount),
AH->loCount);
}
/*
* Called by a format handler to initiate restoration of a LO
*/
void
StartRestoreLO(ArchiveHandle *AH, Oid oid, bool drop)
{
bool old_lo_style = (AH->version < K_VERS_1_12);
Oid loOid;
AH->loCount++;
/* Initialize the LO Buffer */
AH->lo_buf_used = 0;
pg_log_info("restoring large object with OID %u", oid);
/* With an old archive we must do drop and create logic here */
if (old_lo_style && drop)
DropLOIfExists(AH, oid);
if (AH->connection)
{
if (old_lo_style)
{
loOid = lo_create(AH->connection, oid);
if (loOid == 0 || loOid != oid)
pg_fatal("could not create large object %u: %s",
oid, PQerrorMessage(AH->connection));
}
AH->loFd = lo_open(AH->connection, oid, INV_WRITE);
if (AH->loFd == -1)
pg_fatal("could not open large object %u: %s",
oid, PQerrorMessage(AH->connection));
}
else
{
if (old_lo_style)
ahprintf(AH, "SELECT pg_catalog.lo_open(pg_catalog.lo_create('%u'), %d);\n",
oid, INV_WRITE);
else
ahprintf(AH, "SELECT pg_catalog.lo_open('%u', %d);\n",
oid, INV_WRITE);
}
AH->writingLO = true;
}
void
EndRestoreLO(ArchiveHandle *AH, Oid oid)
{
if (AH->lo_buf_used > 0)
{
/* Write remaining bytes from the LO buffer */
dump_lo_buf(AH);
}
AH->writingLO = false;
if (AH->connection)
{
lo_close(AH->connection, AH->loFd);
AH->loFd = -1;
}
else
{
ahprintf(AH, "SELECT pg_catalog.lo_close(0);\n\n");
}
}
/***********
* Sorting and Reordering
***********/
void
SortTocFromFile(Archive *AHX)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
RestoreOptions *ropt = AH->public.ropt;
FILE *fh;
StringInfoData linebuf;
/* Allocate space for the 'wanted' array, and init it */
ropt->idWanted = (bool *) pg_malloc0(sizeof(bool) * AH->maxDumpId);
/* Setup the file */
fh = fopen(ropt->tocFile, PG_BINARY_R);
if (!fh)
pg_fatal("could not open TOC file \"%s\": %m", ropt->tocFile);
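/*
 * Parse the file. Each useful line starts with a dump ID; everything
 * from ";" on is a comment. Hence the listing emitted by
 * PrintTOCSummary, with lines like (values illustrative)
 *
 *     6214; 1259 16385 TABLE DATA public t1 postgres
 *
 * can be fed back in unchanged, reordered, or with lines commented out.
 */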
initStringInfo(&linebuf);
while (pg_get_line_buf(fh, &linebuf))
{
char *cmnt;
char *endptr;
DumpId id;
TocEntry *te;
/* Truncate line at comment, if any */
cmnt = strchr(linebuf.data, ';');
if (cmnt != NULL)
{
cmnt[0] = '\0';
linebuf.len = cmnt - linebuf.data;
}
/* Ignore if all blank */
if (strspn(linebuf.data, " \t\r\n") == linebuf.len)
continue;
/* Get an ID, check it's valid and not already seen */
id = strtol(linebuf.data, &endptr, 10);
if (endptr == linebuf.data || id <= 0 || id > AH->maxDumpId ||
ropt->idWanted[id - 1])
{
pg_log_warning("line ignored: %s", linebuf.data);
continue;
}
/* Find TOC entry */
te = getTocEntryByDumpId(AH, id);
if (!te)
pg_fatal("could not find entry for ID %d",
id);
/* Mark it wanted */
ropt->idWanted[id - 1] = true;
/*
* Move each item to the end of the list as it is selected, so that
* they are placed in the desired order. Any unwanted items will end
* up at the front of the list, which may seem unintuitive but it's
* what we need. In an ordinary serial restore that makes no
* difference, but in a parallel restore we need to mark unrestored
* items' dependencies as satisfied before we start examining
* restorable items. Otherwise they could have surprising
* side-effects on the order in which restorable items actually get
* restored.
*/
_moveBefore(AH->toc, te);
}
pg_free(linebuf.data);
if (fclose(fh) != 0)
pg_fatal("could not close TOC file: %m");
}
/**********************
* Convenience functions that look like standard IO functions
* for writing data when in dump mode.
**********************/
/* Public */
void
archputs(const char *s, Archive *AH)
{
WriteData(AH, s, strlen(s));
}
/* Public */
int
archprintf(Archive *AH, const char *fmt,...)
{
int save_errno = errno;
char *p;
size_t len = 128; /* initial assumption about buffer size */
size_t cnt;
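/*
* Grow-and-retry loop: pvsnprintf returns the number of bytes emitted
* when the result fits (cnt < len), or an estimate of the space needed
* when it does not (cnt >= len), in which case we reallocate at the
* larger size and try again.  Illustrative example: a result needing
* roughly 300 bytes fails against the initial 128-byte buffer, and the
* second pass runs with the size pvsnprintf reported and succeeds.
*/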
for (;;)
{
va_list args;
/* Allocate work buffer. */
p = (char *) pg_malloc(len);
/* Try to format the data. */
errno = save_errno;
va_start(args, fmt);
cnt = pvsnprintf(p, len, fmt, args);
va_end(args);
if (cnt < len)
break; /* success */
/* Release buffer and loop around to try again with larger len. */
free(p);
len = cnt;
}
WriteData(AH, p, cnt);
free(p);
return (int) cnt;
}
/*******************************
* Stuff below here should be 'private' to the archiver routines
*******************************/
static void
SetOutput(ArchiveHandle *AH, const char *filename,
const pg_compress_specification compression_spec)
{
CompressFileHandle *CFH;
const char *mode;
int fn = -1;
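/*
* Output-target resolution, as coded below: an explicit filename wins
* ("-" meaning stdout); otherwise fall back to the already-open AH->FH,
* then to AH->fSpec, and finally to stdout.  Either a file descriptor
* (fn) or a filename is then handed to the compression layer's
* open_func.
*/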
if (filename)
{
if (strcmp(filename, "-") == 0)
fn = fileno(stdout);
}
else if (AH->FH)
fn = fileno(AH->FH);
else if (AH->fSpec)
{
filename = AH->fSpec;
}
else
fn = fileno(stdout);
if (AH->mode == archModeAppend)
mode = PG_BINARY_A;
else
mode = PG_BINARY_W;
CFH = InitCompressFileHandle(compression_spec);
if (!CFH->open_func(filename, fn, mode, CFH))
{
if (filename)
pg_fatal("could not open output file \"%s\": %m", filename);
else
pg_fatal("could not open output file: %m");
}
AH->OF = CFH;
}
static CompressFileHandle *
SaveOutput(ArchiveHandle *AH)
{
return (CompressFileHandle *) AH->OF;
}
static void
RestoreOutput(ArchiveHandle *AH, CompressFileHandle *savedOutput)
{
errno = 0;
if (!EndCompressFileHandle(AH->OF))
pg_fatal("could not close output file: %m");
AH->OF = savedOutput;
}
/*
* Print formatted text to the output file (usually stdout).
*/
int
ahprintf(ArchiveHandle *AH, const char *fmt,...)
{
int save_errno = errno;
char *p;
size_t len = 128; /* initial assumption about buffer size */
size_t cnt;
for (;;)
{
va_list args;
/* Allocate work buffer. */
p = (char *) pg_malloc(len);
/* Try to format the data. */
errno = save_errno;
va_start(args, fmt);
cnt = pvsnprintf(p, len, fmt, args);
va_end(args);
if (cnt < len)
break; /* success */
/* Release buffer and loop around to try again with larger len. */
free(p);
len = cnt;
}
ahwrite(p, 1, cnt, AH);
free(p);
return (int) cnt;
}
/*
* Single place for logic which says 'We are restoring to a direct DB connection'.
*/
static int
RestoringToDB(ArchiveHandle *AH)
{
RestoreOptions *ropt = AH->public.ropt;
return (ropt && ropt->useDB && AH->connection);
}
/*
* Dump the current contents of the LO data buffer while writing a LO
*/
static void
dump_lo_buf(ArchiveHandle *AH)
{
if (AH->connection)
{
int res;
res = lo_write(AH->connection, AH->loFd, AH->lo_buf, AH->lo_buf_used);
pg_log_debug(ngettext("wrote %zu byte of large object data (result = %d)",
"wrote %zu bytes of large object data (result = %d)",
AH->lo_buf_used),
AH->lo_buf_used, res);
/* We assume there are no short writes, only errors */
if (res != AH->lo_buf_used)
warn_or_exit_horribly(AH, "could not write to large object: %s",
PQerrorMessage(AH->connection));
}
else
{
PQExpBuffer buf = createPQExpBuffer();
appendByteaLiteralAHX(buf,
(const unsigned char *) AH->lo_buf,
AH->lo_buf_used,
AH);
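/*
* Illustrative example (byte values made up): with three buffered bytes
* 0x01 0x02 0x03, the statement emitted below would look roughly like
*     SELECT pg_catalog.lowrite(0, '\x010203');
* with the exact literal syntax determined by appendByteaLiteralAHX.
*/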
/* Hack: turn off writingLO so ahwrite doesn't recurse to here */
AH->writingLO = false;
ahprintf(AH, "SELECT pg_catalog.lowrite(0, %s);\n", buf->data);
AH->writingLO = true;
destroyPQExpBuffer(buf);
}
AH->lo_buf_used = 0;
}
/*
* Write buffer to the output file (usually stdout). This is used for
* outputting 'restore' scripts etc. It is even possible for an archive
* format to create a custom output routine to 'fake' a restore if it
* wants to generate a script (see TAR output).
*/
void
ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle *AH)
{
int bytes_written = 0;
if (AH->writingLO)
{
size_t remaining = size * nmemb;
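/*
* Fill the LO buffer, flushing each time it reaches capacity.  Worked
* example (buffer size illustrative): with a 16384-byte lo_buf already
* holding 16000 bytes, a 1000-byte write copies 384 bytes, flushes via
* dump_lo_buf (resetting lo_buf_used to 0), then stores the remaining
* 616 bytes below the loop.
*/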
while (AH->lo_buf_used + remaining > AH->lo_buf_size)
{
size_t avail = AH->lo_buf_size - AH->lo_buf_used;
memcpy((char *) AH->lo_buf + AH->lo_buf_used, ptr, avail);
ptr = (const void *) ((const char *) ptr + avail);
remaining -= avail;
AH->lo_buf_used += avail;
dump_lo_buf(AH);
}
memcpy((char *) AH->lo_buf + AH->lo_buf_used, ptr, remaining);
AH->lo_buf_used += remaining;
bytes_written = size * nmemb;
}
else if (AH->CustomOutPtr)
bytes_written = AH->CustomOutPtr(AH, ptr, size * nmemb);
/*
* If we're doing a restore, it's direct to the DB, and we're connected,
* then send the data to the DB.
*/
else if (RestoringToDB(AH))
bytes_written = ExecuteSqlCommandBuf(&AH->public, (const char *) ptr, size * nmemb);
else
{
CompressFileHandle *CFH = (CompressFileHandle *) AH->OF;
if (CFH->write_func(ptr, size * nmemb, CFH))
bytes_written = size * nmemb;
}
if (bytes_written != size * nmemb)
WRITE_ERROR_EXIT;
}
/* On some errors, we may decide to go on... */
void
warn_or_exit_horribly(ArchiveHandle *AH, const char *fmt,...)
{
va_list ap;
switch (AH->stage)
{
case STAGE_NONE:
/* Do nothing special */
break;
case STAGE_INITIALIZING:
if (AH->stage != AH->lastErrorStage)
pg_log_info("while INITIALIZING:");
break;
case STAGE_PROCESSING:
if (AH->stage != AH->lastErrorStage)
pg_log_info("while PROCESSING TOC:");
break;
case STAGE_FINALIZING:
if (AH->stage != AH->lastErrorStage)
pg_log_info("while FINALIZING:");
break;
}
if (AH->currentTE != NULL && AH->currentTE != AH->lastErrorTE)
{
pg_log_info("from TOC entry %d; %u %u %s %s %s",
AH->currentTE->dumpId,
AH->currentTE->catalogId.tableoid,
AH->currentTE->catalogId.oid,
AH->currentTE->desc ? AH->currentTE->desc : "(no desc)",
AH->currentTE->tag ? AH->currentTE->tag : "(no tag)",
AH->currentTE->owner ? AH->currentTE->owner : "(no owner)");
}
AH->lastErrorStage = AH->stage;
AH->lastErrorTE = AH->currentTE;
va_start(ap, fmt);
pg_log_generic_v(PG_LOG_ERROR, PG_LOG_PRIMARY, fmt, ap);
va_end(ap);
if (AH->public.exit_on_error)
exit_nicely(1);
else
AH->public.n_errors++;
}
#ifdef NOT_USED
static void
_moveAfter(ArchiveHandle *AH, TocEntry *pos, TocEntry *te)
{
/* Unlink te from list */
te->prev->next = te->next;
te->next->prev = te->prev;
/* and insert it after "pos" */
te->prev = pos;
te->next = pos->next;
pos->next->prev = te;
pos->next = te;
}
#endif
static void
_moveBefore(TocEntry *pos, TocEntry *te)
{
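/*
* AH->toc is a circular doubly-linked list headed by a dummy TocEntry
* (see _allocAH, which points toc->next and toc->prev at itself), so
* _moveBefore(AH->toc, te) effectively moves te to the tail of the
* list.
*/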
/* Unlink te from list */
te->prev->next = te->next;
te->next->prev = te->prev;
/* and insert it before "pos" */
te->prev = pos->prev;
te->next = pos;
pos->prev->next = te;
pos->prev = te;
}
/*
* Build index arrays for the TOC list
*
* This should be invoked only after we have created or read in all the TOC
* items.
*
* The arrays are indexed by dump ID (so entry zero is unused). Note that the
* array entries run only up to maxDumpId. We might see dependency dump IDs
* beyond that (if the dump was partial); so always check the array bound
* before trying to touch an array entry.
*/
static void
buildTocEntryArrays(ArchiveHandle *AH)
{
DumpId maxDumpId = AH->maxDumpId;
TocEntry *te;
AH->tocsByDumpId = (TocEntry **) pg_malloc0((maxDumpId + 1) * sizeof(TocEntry *));
AH->tableDataId = (DumpId *) pg_malloc0((maxDumpId + 1) * sizeof(DumpId));
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
/* this check is purely paranoia; maxDumpId should be correct */
if (te->dumpId <= 0 || te->dumpId > maxDumpId)
pg_fatal("bad dumpId");
/* tocsByDumpId indexes all TOCs by their dump ID */
AH->tocsByDumpId[te->dumpId] = te;
/*
* tableDataId provides the TABLE DATA item's dump ID for each TABLE
* TOC entry that has a DATA item. We compute this by reversing the
* TABLE DATA item's dependency, knowing that a TABLE DATA item has
* just one dependency and it is the TABLE item.
*/
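/*
* Example (dump IDs illustrative): if TOC entry 42 is a TABLE DATA
* item whose single dependency is TABLE entry 17, then
* tableDataId[17] is set to 42.
*/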
if (strcmp(te->desc, "TABLE DATA") == 0 && te->nDeps > 0)
{
DumpId tableId = te->dependencies[0];
/*
* The TABLE item might not have been in the archive, if this was
* a data-only dump; but its dump ID should be less than its data
* item's dump ID, so there should be a place for it in the array.
*/
if (tableId <= 0 || tableId > maxDumpId)
pg_fatal("bad table dumpId for TABLE DATA item");
AH->tableDataId[tableId] = te->dumpId;
}
}
}
TocEntry *
getTocEntryByDumpId(ArchiveHandle *AH, DumpId id)
{
/* build index arrays if we didn't already */
if (AH->tocsByDumpId == NULL)
buildTocEntryArrays(AH);
if (id > 0 && id <= AH->maxDumpId)
return AH->tocsByDumpId[id];
return NULL;
}
int
TocIDRequired(ArchiveHandle *AH, DumpId id)
{
TocEntry *te = getTocEntryByDumpId(AH, id);
if (!te)
return 0;
return te->reqs;
}
size_t
WriteOffset(ArchiveHandle *AH, pgoff_t o, int wasSet)
{
int off;
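/*
* On-disk layout, as implemented below: one flag byte (wasSet)
* followed by sizeof(pgoff_t) bytes of the offset, least-significant
* byte first.  E.g., with an 8-byte pgoff_t, offset 0x0102 with flag
* K_OFFSET_POS_SET is stored as the flag byte followed by
* 02 01 00 00 00 00 00 00.
*/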
/* Save the flag */
AH->WriteBytePtr(AH, wasSet);
/* Write out pgoff_t smallest byte first, to avoid endianness mismatch */
for (off = 0; off < sizeof(pgoff_t); off++)
{
AH->WriteBytePtr(AH, o & 0xFF);
o >>= 8;
}
return sizeof(pgoff_t) + 1;
}
int
ReadOffset(ArchiveHandle *AH, pgoff_t * o)
{
int i;
int off;
int offsetFlg;
/* Initialize to zero */
*o = 0;
/* Check for old version */
if (AH->version < K_VERS_1_7)
{
/* Prior versions wrote offsets using WriteInt */
i = ReadInt(AH);
/* -1 means not set */
if (i < 0)
return K_OFFSET_POS_NOT_SET;
else if (i == 0)
return K_OFFSET_NO_DATA;
/* Cast to pgoff_t because it was written as an int. */
*o = (pgoff_t) i;
return K_OFFSET_POS_SET;
}
/*
* Read the flag indicating the state of the data pointer. Check if valid
* and die if not.
*
* This used to be handled by a negative or zero pointer; now we use an
* extra byte specifically for the state.
*/
offsetFlg = AH->ReadBytePtr(AH) & 0xFF;
switch (offsetFlg)
{
case K_OFFSET_POS_NOT_SET:
case K_OFFSET_NO_DATA:
case K_OFFSET_POS_SET:
break;
default:
pg_fatal("unexpected data offset flag %d", offsetFlg);
}
/*
* Read the bytes
*/
for (off = 0; off < AH->offSize; off++)
{
if (off < sizeof(pgoff_t))
*o |= ((pgoff_t) (AH->ReadBytePtr(AH))) << (off * 8);
else
{
if (AH->ReadBytePtr(AH) != 0)
pg_fatal("file offset in dump file is too large");
}
}
return offsetFlg;
}
size_t
WriteInt(ArchiveHandle *AH, int i)
{
int b;
/*
* This is a bit yucky, but I don't want to make the binary format very
* dependent on representation, and not knowing much about it, I write out
* a sign byte. If you change this, don't forget to change the file
* version #, and modify ReadInt to read the new format AS WELL AS the old
* formats.
*/
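/*
* Example of the format written below: with intSize == 4,
* WriteInt(AH, 5) emits 00 05 00 00 00 (sign byte, then the magnitude
* least-significant byte first), and WriteInt(AH, -258) emits
* 01 02 01 00 00.
*/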
/* SIGN byte */
if (i < 0)
{
AH->WriteBytePtr(AH, 1);
i = -i;
}
else
AH->WriteBytePtr(AH, 0);
for (b = 0; b < AH->intSize; b++)
{
AH->WriteBytePtr(AH, i & 0xFF);
i >>= 8;
}
return AH->intSize + 1;
}
int
ReadInt(ArchiveHandle *AH)
{
int res = 0;
int bv,
b;
int sign = 0; /* Default positive */
int bitShift = 0;
if (AH->version > K_VERS_1_0)
/* Read a sign byte */
sign = AH->ReadBytePtr(AH);
for (b = 0; b < AH->intSize; b++)
{
bv = AH->ReadBytePtr(AH) & 0xFF;
if (bv != 0)
res = res + (bv << bitShift);
bitShift += 8;
}
if (sign)
res = -res;
return res;
}
size_t
WriteStr(ArchiveHandle *AH, const char *c)
{
size_t res;
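/*
* Format: WriteInt(length) followed by the raw bytes with no
* terminating zero; a NULL pointer is stored as length -1, which
* ReadStr maps back to NULL.  E.g., with intSize == 4,
* WriteStr(AH, "abc") emits 00 03 00 00 00 'a' 'b' 'c'.
*/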
if (c)
{
int len = strlen(c);
res = WriteInt(AH, len);
AH->WriteBufPtr(AH, c, len);
res += len;
}
else
res = WriteInt(AH, -1);
return res;
}
char *
ReadStr(ArchiveHandle *AH)
{
char *buf;
int l;
l = ReadInt(AH);
if (l < 0)
buf = NULL;
else
{
buf = (char *) pg_malloc(l + 1);
AH->ReadBufPtr(AH, (void *) buf, l);
buf[l] = '\0';
}
return buf;
}
static bool
_fileExistsInDirectory(const char *dir, const char *filename)
{
struct stat st;
char buf[MAXPGPATH];
if (snprintf(buf, MAXPGPATH, "%s/%s", dir, filename) >= MAXPGPATH)
pg_fatal("directory name too long: \"%s\"", dir);
return (stat(buf, &st) == 0 && S_ISREG(st.st_mode));
}
static int
_discoverArchiveFormat(ArchiveHandle *AH)
{
FILE *fh;
char sig[6]; /* More than enough */
size_t cnt;
int wantClose = 0;
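/*
* Detection order, as implemented below: (1) if fSpec names a
* directory containing toc.dat (possibly compressed), it's a
* directory-format archive; (2) if the first five bytes are "PGDMP",
* it's a custom-format archive; (3) if the first 512 bytes begin with
* a text dump header, bail out and suggest psql; (4) if they form a
* valid tar header, it's a tar archive; anything else is fatal.
*/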
pg_log_debug("attempting to ascertain archive format");
free(AH->lookahead);
AH->readHeader = 0;
AH->lookaheadSize = 512;
AH->lookahead = pg_malloc0(512);
AH->lookaheadLen = 0;
AH->lookaheadPos = 0;
if (AH->fSpec)
{
struct stat st;
wantClose = 1;
/*
* Check if the specified archive is a directory. If so, check if
* there's a "toc.dat" (or "toc.dat.{gz,lz4,zst}") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
AH->format = archDirectory;
if (_fileExistsInDirectory(AH->fSpec, "toc.dat"))
return AH->format;
#ifdef HAVE_LIBZ
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.gz"))
return AH->format;
#endif
#ifdef USE_LZ4
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.lz4"))
return AH->format;
#endif
#ifdef USE_ZSTD
if (_fileExistsInDirectory(AH->fSpec, "toc.dat.zst"))
return AH->format;
#endif
pg_fatal("directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)",
AH->fSpec);
fh = NULL; /* keep compiler quiet */
}
else
{
fh = fopen(AH->fSpec, PG_BINARY_R);
if (!fh)
pg_fatal("could not open input file \"%s\": %m", AH->fSpec);
}
}
else
{
fh = stdin;
if (!fh)
pg_fatal("could not open input file: %m");
}
if ((cnt = fread(sig, 1, 5, fh)) != 5)
{
if (ferror(fh))
pg_fatal("could not read input file: %m");
else
pg_fatal("input file is too short (read %lu, expected 5)",
(unsigned long) cnt);
}
/* Save it, just in case we need it later */
memcpy(&AH->lookahead[0], sig, 5);
AH->lookaheadLen = 5;
if (strncmp(sig, "PGDMP", 5) == 0)
{
/* It's custom format, stop here */
AH->format = archCustom;
AH->readHeader = 1;
}
else
{
/*
* *Maybe* we have a tar-format archive or a text dump ... so read
* the first 512-byte header...
*/
cnt = fread(&AH->lookahead[AH->lookaheadLen], 1, 512 - AH->lookaheadLen, fh);
/* read failure is checked below */
AH->lookaheadLen += cnt;
if (AH->lookaheadLen >= strlen(TEXT_DUMPALL_HEADER) &&
(strncmp(AH->lookahead, TEXT_DUMP_HEADER, strlen(TEXT_DUMP_HEADER)) == 0 ||
strncmp(AH->lookahead, TEXT_DUMPALL_HEADER, strlen(TEXT_DUMPALL_HEADER)) == 0))
{
/*
* Looks like it's probably a text-format dump, so suggest they
* try psql.
*/
pg_fatal("input file appears to be a text format dump. Please use psql.");
}
if (AH->lookaheadLen != 512)
{
if (feof(fh))
pg_fatal("input file does not appear to be a valid archive (too short?)");
else
READ_ERROR_EXIT(fh);
}
if (!isValidTarHeader(AH->lookahead))
pg_fatal("input file does not appear to be a valid archive");
AH->format = archTar;
}
/* Close the file if we opened it */
if (wantClose)
{
if (fclose(fh) != 0)
pg_fatal("could not close input file: %m");
/* Forget lookahead, since we'll re-read header after re-opening */
AH->readHeader = 0;
AH->lookaheadLen = 0;
}
return AH->format;
}
/*
* Allocate an archive handle
*/
static ArchiveHandle *
_allocAH(const char *FileSpec, const ArchiveFormat fmt,
const pg_compress_specification compression_spec,
bool dosync, ArchiveMode mode,
SetupWorkerPtrType setupWorkerPtr)
{
ArchiveHandle *AH;
CompressFileHandle *CFH;
pg_compress_specification out_compress_spec = {0};
pg_log_debug("allocating AH for %s, format %d",
FileSpec ? FileSpec : "(stdio)", fmt);
AH = (ArchiveHandle *) pg_malloc0(sizeof(ArchiveHandle));
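/*
 * Start with the current archiver format version; when reading an
 * existing archive, ReadHead() overwrites this with the version found in
 * the file.
 */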
AH->version = K_VERS_SELF;
/* initialize for backwards compatible string processing */
AH->public.encoding = 0; /* PG_SQL_ASCII */
AH->public.std_strings = false;
/* SQL error handling */
AH->public.exit_on_error = true;
AH->public.n_errors = 0;
AH->archiveDumpVersion = PG_VERSION;
AH->createDate = time(NULL);
AH->intSize = sizeof(int);
AH->offSize = sizeof(pgoff_t);
if (FileSpec)
{
AH->fSpec = pg_strdup(FileSpec);
/*
* Not used; maybe later....
*
* AH->workDir = pg_strdup(FileSpec); for(i=strlen(FileSpec) ; i > 0 ;
* i--) if (AH->workDir[i-1] == '/')
*/
}
else
AH->fSpec = NULL;
AH->currUser = NULL; /* unknown */
AH->currSchema = NULL; /* ditto */
AH->currTablespace = NULL; /* ditto */
AH->currTableAm = NULL; /* ditto */
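/* The TOC is kept as a circular doubly-linked list with a dummy head node */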
AH->toc = (TocEntry *) pg_malloc0(sizeof(TocEntry));
AH->toc->next = AH->toc;
AH->toc->prev = AH->toc;
AH->mode = mode;
AH->compression_spec = compression_spec;
AH->dosync = dosync;
memset(&(AH->sqlparse), 0, sizeof(AH->sqlparse));
/* Open stdout with no compression for AH output handle */
out_compress_spec.algorithm = PG_COMPRESSION_NONE;
CFH = InitCompressFileHandle(out_compress_spec);
if (!CFH->open_func(NULL, fileno(stdout), PG_BINARY_A, CFH))
pg_fatal("could not open stdout for appending: %m");
AH->OF = CFH;
/*
* On Windows, we need to use binary mode to read/write non-text files,
* which include all archive formats as well as compressed plain text.
* Force stdin/stdout into binary mode if that is what we are using.
*/
#ifdef WIN32
if ((fmt != archNull || compression_spec.algorithm != PG_COMPRESSION_NONE) &&
(AH->fSpec == NULL || strcmp(AH->fSpec, "") == 0))
{
if (mode == archModeWrite)
_setmode(fileno(stdout), O_BINARY);
else
_setmode(fileno(stdin), O_BINARY);
}
#endif
AH->SetupWorkerPtr = setupWorkerPtr;
if (fmt == archUnknown)
AH->format = _discoverArchiveFormat(AH);
else
AH->format = fmt;
switch (AH->format)
{
case archCustom:
InitArchiveFmt_Custom(AH);
break;
case archNull:
InitArchiveFmt_Null(AH);
break;
case archDirectory:
InitArchiveFmt_Directory(AH);
break;
case archTar:
InitArchiveFmt_Tar(AH);
break;
default:
pg_fatal("unrecognized file format \"%d\"", fmt);
}
return AH;
}
/*
* Write out all data (tables & LOs)
*/
void
WriteDataChunks(ArchiveHandle *AH, ParallelState *pstate)
{
TocEntry *te;
if (pstate && pstate->numWorkers > 1)
{
/*
* In parallel mode, this code runs in the leader process. We
* construct an array of candidate TEs, then sort it into decreasing
* size order, then dispatch each TE to a data-transfer worker. By
* dumping larger tables first, we avoid getting into a situation
* where we're down to one job and it's big, losing parallelism.
*/
TocEntry **tes;
int ntes;
tes = (TocEntry **) pg_malloc(AH->tocCount * sizeof(TocEntry *));
ntes = 0;
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
/* Consider only TEs with dataDumper functions ... */
if (!te->dataDumper)
continue;
/* ... and ignore ones not enabled for dump */
if ((te->reqs & REQ_DATA) == 0)
continue;
tes[ntes++] = te;
}
if (ntes > 1)
qsort(tes, ntes, sizeof(TocEntry *), TocEntrySizeCompare);
for (int i = 0; i < ntes; i++)
DispatchJobForTocEntry(AH, pstate, tes[i], ACT_DUMP,
mark_dump_job_done, NULL);
pg_free(tes);
/* Now wait for workers to finish. */
WaitForWorkers(AH, pstate, WFW_ALL_IDLE);
}
else
{
/* Non-parallel mode: just dump all candidate TEs sequentially. */
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
/* Must have same filter conditions as above */
if (!te->dataDumper)
continue;
if ((te->reqs & REQ_DATA) == 0)
continue;
WriteDataChunksForTocEntry(AH, te);
}
}
}
/*
* Callback function that's invoked in the leader process after a step has
* been parallel dumped.
*
* We don't need to do anything except check for worker failure.
*/
static void
mark_dump_job_done(ArchiveHandle *AH,
TocEntry *te,
int status,
void *callback_data)
{
pg_log_info("finished item %d %s %s",
te->dumpId, te->desc, te->tag);
if (status != 0)
pg_fatal("worker process failed: exit code %d",
status);
}
void
WriteDataChunksForTocEntry(ArchiveHandle *AH, TocEntry *te)
{
StartDataPtrType startPtr;
EndDataPtrType endPtr;
AH->currToc = te;
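/*
 * BLOBS entries are emitted through the large-object callbacks; all other
 * data entries use the regular table-data callbacks.
 */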
if (strcmp(te->desc, "BLOBS") == 0)
{
startPtr = AH->StartLOsPtr;
endPtr = AH->EndLOsPtr;
}
else
{
startPtr = AH->StartDataPtr;
endPtr = AH->EndDataPtr;
}
if (startPtr != NULL)
(*startPtr) (AH, te);
/*
* The user-provided DataDumper routine needs to call AH->WriteData
*/
te->dataDumper((Archive *) AH, te->dataDumperArg);
if (endPtr != NULL)
(*endPtr) (AH, te);
AH->currToc = NULL;
}
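/*
 * The TOC is written as an entry count followed by one record per entry.
 * For the current archive version each record contains, in order: dump ID,
 * hadDumper flag, catalog table OID and object OID (as strings), tag, desc,
 * section, definition, drop and copy statements, namespace, tablespace,
 * table AM, owner, a literal "false" (formerly the WITH OIDS flag), the
 * dependency list terminated by a NULL string, and any format-specific
 * extra data.
 */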
void
WriteToc(ArchiveHandle *AH)
{
TocEntry *te;
char workbuf[32];
int tocCount;
int i;
/* count entries that will actually be dumped */
tocCount = 0;
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if ((te->reqs & (REQ_SCHEMA | REQ_DATA | REQ_SPECIAL)) != 0)
tocCount++;
}
/* printf("%d TOC Entries to save\n", tocCount); */
WriteInt(AH, tocCount);
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if ((te->reqs & (REQ_SCHEMA | REQ_DATA | REQ_SPECIAL)) == 0)
continue;
WriteInt(AH, te->dumpId);
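/* "hadDumper" flag tells readers whether this entry carried data */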
WriteInt(AH, te->dataDumper ? 1 : 0);
/* OID is recorded as a string for historical reasons */
sprintf(workbuf, "%u", te->catalogId.tableoid);
WriteStr(AH, workbuf);
sprintf(workbuf, "%u", te->catalogId.oid);
WriteStr(AH, workbuf);
WriteStr(AH, te->tag);
WriteStr(AH, te->desc);
WriteInt(AH, te->section);
WriteStr(AH, te->defn);
WriteStr(AH, te->dropStmt);
WriteStr(AH, te->copyStmt);
WriteStr(AH, te->namespace);
WriteStr(AH, te->tablespace);
WriteStr(AH, te->tableam);
WriteStr(AH, te->owner);
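/* This field formerly recorded WITH OIDS; it is always "false" now */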
WriteStr(AH, "false");
/* Dump list of dependencies */
for (i = 0; i < te->nDeps; i++)
{
sprintf(workbuf, "%d", te->dependencies[i]);
WriteStr(AH, workbuf);
}
WriteStr(AH, NULL); /* Terminate List */
if (AH->WriteExtraTocPtr)
AH->WriteExtraTocPtr(AH, te);
}
}
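/*
 * Read back a TOC as written by WriteToc(), applying K_VERS_* checks so
 * that archives produced by older pg_dump versions remain readable.
 */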
void
ReadToc(ArchiveHandle *AH)
{
int i;
char *tmp;
DumpId *deps;
int depIdx;
int depSize;
TocEntry *te;
bool is_supported;
AH->tocCount = ReadInt(AH);
AH->maxDumpId = 0;
for (i = 0; i < AH->tocCount; i++)
{
te = (TocEntry *) pg_malloc0(sizeof(TocEntry));
te->dumpId = ReadInt(AH);
if (te->dumpId > AH->maxDumpId)
AH->maxDumpId = te->dumpId;
/* Sanity check */
if (te->dumpId <= 0)
pg_fatal("entry ID %d out of range -- perhaps a corrupt TOC",
te->dumpId);
te->hadDumper = ReadInt(AH);
if (AH->version >= K_VERS_1_8)
{
tmp = ReadStr(AH);
sscanf(tmp, "%u", &te->catalogId.tableoid);
free(tmp);
}
else
te->catalogId.tableoid = InvalidOid;
tmp = ReadStr(AH);
sscanf(tmp, "%u", &te->catalogId.oid);
free(tmp);
te->tag = ReadStr(AH);
te->desc = ReadStr(AH);
if (AH->version >= K_VERS_1_11)
{
te->section = ReadInt(AH);
}
else
{
/*
* Rules for pre-8.4 archives wherein pg_dump hasn't classified
* the entries into sections. This list need not cover entry
* types added later than 8.4.
*/
if (strcmp(te->desc, "COMMENT") == 0 ||
strcmp(te->desc, "ACL") == 0 ||
strcmp(te->desc, "ACL LANGUAGE") == 0)
te->section = SECTION_NONE;
else if (strcmp(te->desc, "TABLE DATA") == 0 ||
strcmp(te->desc, "BLOBS") == 0 ||
strcmp(te->desc, "BLOB COMMENTS") == 0)
te->section = SECTION_DATA;
else if (strcmp(te->desc, "CONSTRAINT") == 0 ||
strcmp(te->desc, "CHECK CONSTRAINT") == 0 ||
strcmp(te->desc, "FK CONSTRAINT") == 0 ||
strcmp(te->desc, "INDEX") == 0 ||
strcmp(te->desc, "RULE") == 0 ||
strcmp(te->desc, "TRIGGER") == 0)
te->section = SECTION_POST_DATA;
else
te->section = SECTION_PRE_DATA;
}
te->defn = ReadStr(AH);
te->dropStmt = ReadStr(AH);
if (AH->version >= K_VERS_1_3)
te->copyStmt = ReadStr(AH);
if (AH->version >= K_VERS_1_6)
te->namespace = ReadStr(AH);
if (AH->version >= K_VERS_1_10)
te->tablespace = ReadStr(AH);
if (AH->version >= K_VERS_1_14)
te->tableam = ReadStr(AH);
te->owner = ReadStr(AH);
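/*
 * Check the legacy WITH OIDS flag: archives older than version 1.9 did
 * not record it, newer ones store "true" or "false".  Restoring tables
 * WITH OIDS is no longer supported.
 */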
is_supported = true;
if (AH->version < K_VERS_1_9)
is_supported = false;
else
{
tmp = ReadStr(AH);
if (strcmp(tmp, "true") == 0)
is_supported = false;
free(tmp);
}
if (!is_supported)
pg_log_warning("restoring tables WITH OIDS is not supported anymore");
/* Read TOC entry dependencies */
if (AH->version >= K_VERS_1_5)
{
depSize = 100;
deps = (DumpId *) pg_malloc(sizeof(DumpId) * depSize);
depIdx = 0;
for (;;)
{
tmp = ReadStr(AH);
if (!tmp)
break; /* end of list */
if (depIdx >= depSize)
{
depSize *= 2;
deps = (DumpId *) pg_realloc(deps, sizeof(DumpId) * depSize);
}
sscanf(tmp, "%d", &deps[depIdx]);
free(tmp);
depIdx++;
}
if (depIdx > 0) /* We have a non-null entry */
{
deps = (DumpId *) pg_realloc(deps, sizeof(DumpId) * depIdx);
te->dependencies = deps;
te->nDeps = depIdx;
}
else
{
free(deps);
te->dependencies = NULL;
te->nDeps = 0;
}
}
else
{
te->dependencies = NULL;
te->nDeps = 0;
}
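/* Size estimate used when scheduling parallel jobs; zero means unknown */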
te->dataLength = 0;
if (AH->ReadExtraTocPtr)
AH->ReadExtraTocPtr(AH, te);
pg_log_debug("read TOC entry %d (ID %d) for %s %s",
i, te->dumpId, te->desc, te->tag);
/* link completed entry into TOC circular list */
te->prev = AH->toc->prev;
AH->toc->prev->next = te;
AH->toc->prev = te;
te->next = AH->toc;
/* special processing immediately upon read for some items */
if (strcmp(te->desc, "ENCODING") == 0)
processEncodingEntry(AH, te);
else if (strcmp(te->desc, "STDSTRINGS") == 0)
processStdStringsEntry(AH, te);
else if (strcmp(te->desc, "SEARCHPATH") == 0)
processSearchPathEntry(AH, te);
}
}
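/* Extract the client_encoding setting from an ENCODING TOC entry */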
static void
processEncodingEntry(ArchiveHandle *AH, TocEntry *te)
{
/* te->defn should have the form SET client_encoding = 'foo'; */
char *defn = pg_strdup(te->defn);
char *ptr1;
char *ptr2 = NULL;
int encoding;
ptr1 = strchr(defn, '\'');
if (ptr1)
ptr2 = strchr(++ptr1, '\'');
if (ptr2)
{
*ptr2 = '\0';
encoding = pg_char_to_encoding(ptr1);
if (encoding < 0)
pg_fatal("unrecognized encoding \"%s\"",
ptr1);
AH->public.encoding = encoding;
}
else
pg_fatal("invalid ENCODING item: %s",
te->defn);
free(defn);
}
static void
processStdStringsEntry(ArchiveHandle *AH, TocEntry *te)
{
/* te->defn should have the form SET standard_conforming_strings = 'x'; */
char *ptr1;
ptr1 = strchr(te->defn, '\'');
if (ptr1 && strncmp(ptr1, "'on'", 4) == 0)
AH->public.std_strings = true;
else if (ptr1 && strncmp(ptr1, "'off'", 5) == 0)
AH->public.std_strings = false;
else
pg_fatal("invalid STDSTRINGS item: %s",
te->defn);
}
static void
processSearchPathEntry(ArchiveHandle *AH, TocEntry *te)
{
/*
* te->defn should contain a command to set search_path. We just copy it
* verbatim for use later.
*/
AH->public.searchpath = pg_strdup(te->defn);
}
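/*
 * With --strict-names, fail if any user-supplied selection pattern
 * matched no archive entry.
 */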
static void
StrictNamesCheck(RestoreOptions *ropt)
{
const char *missing_name;
Assert(ropt->strict_names);
if (ropt->schemaNames.head != NULL)
{
missing_name = simple_string_list_not_touched(&ropt->schemaNames);
if (missing_name != NULL)
pg_fatal("schema \"%s\" not found", missing_name);
}
if (ropt->tableNames.head != NULL)
{
missing_name = simple_string_list_not_touched(&ropt->tableNames);
if (missing_name != NULL)
pg_fatal("table \"%s\" not found", missing_name);
}
if (ropt->indexNames.head != NULL)
{
missing_name = simple_string_list_not_touched(&ropt->indexNames);
if (missing_name != NULL)
pg_fatal("index \"%s\" not found", missing_name);
}
if (ropt->functionNames.head != NULL)
{
missing_name = simple_string_list_not_touched(&ropt->functionNames);
if (missing_name != NULL)
pg_fatal("function \"%s\" not found", missing_name);
}
if (ropt->triggerNames.head != NULL)
{
missing_name = simple_string_list_not_touched(&ropt->triggerNames);
if (missing_name != NULL)
pg_fatal("trigger \"%s\" not found", missing_name);
}
}
/*
* Determine whether we want to restore this TOC entry.
*
* Returns 0 if entry should be skipped, or some combination of the
* REQ_SCHEMA and REQ_DATA bits if we want to restore schema and/or data
* portions of this TOC entry, or REQ_SPECIAL if it's a special entry.
*/
static int
_tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH)
{
int res = REQ_SCHEMA | REQ_DATA;
RestoreOptions *ropt = AH->public.ropt;
/* These items are treated specially */
if (strcmp(te->desc, "ENCODING") == 0 ||
strcmp(te->desc, "STDSTRINGS") == 0 ||
strcmp(te->desc, "SEARCHPATH") == 0)
return REQ_SPECIAL;
/*
* DATABASE and DATABASE PROPERTIES also have a special rule: they are
* restored in createDB mode, and not restored otherwise, independently of
* all else.
*/
if (strcmp(te->desc, "DATABASE") == 0 ||
strcmp(te->desc, "DATABASE PROPERTIES") == 0)
{
if (ropt->createDB)
return REQ_SCHEMA;
else
return 0;
}
/*
* Process exclusions that affect certain classes of TOC entries.
*/
/* If it's an ACL, maybe ignore it */
if (ropt->aclsSkip && _tocEntryIsACL(te))
return 0;
/* If it's a comment, maybe ignore it */
if (ropt->no_comments && strcmp(te->desc, "COMMENT") == 0)
return 0;
/*
* If it's a publication or a table part of a publication, maybe ignore
* it.
*/
if (ropt->no_publications &&
(strcmp(te->desc, "PUBLICATION") == 0 ||
strcmp(te->desc, "PUBLICATION TABLE") == 0 ||
strcmp(te->desc, "PUBLICATION TABLES IN SCHEMA") == 0))
return 0;
/* If it's a security label, maybe ignore it */
if (ropt->no_security_labels && strcmp(te->desc, "SECURITY LABEL") == 0)
return 0;
/* If it's a subscription, maybe ignore it */
if (ropt->no_subscriptions && strcmp(te->desc, "SUBSCRIPTION") == 0)
return 0;
/* Ignore it if section is not to be dumped/restored */
switch (curSection)
{
case SECTION_PRE_DATA:
if (!(ropt->dumpSections & DUMP_PRE_DATA))
return 0;
break;
case SECTION_DATA:
if (!(ropt->dumpSections & DUMP_DATA))
return 0;
break;
case SECTION_POST_DATA:
if (!(ropt->dumpSections & DUMP_POST_DATA))
return 0;
break;
default:
/* shouldn't get here, really, but ignore it */
return 0;
}
/* Ignore it if rejected by idWanted[] (cf. SortTocFromFile) */
if (ropt->idWanted && !ropt->idWanted[te->dumpId - 1])
return 0;
/*
* Check options for selective dump/restore.
*/
if (strcmp(te->desc, "ACL") == 0 ||
strcmp(te->desc, "COMMENT") == 0 ||
strcmp(te->desc, "SECURITY LABEL") == 0)
{
/* Database properties react to createDB, not selectivity options. */
if (strncmp(te->tag, "DATABASE ", 9) == 0)
{
if (!ropt->createDB)
return 0;
}
else if (ropt->schemaNames.head != NULL ||
ropt->schemaExcludeNames.head != NULL ||
ropt->selTypes)
{
/*
* In a selective dump/restore, we want to restore these dependent
* TOC entry types only if their parent object is being restored.
* Without selectivity options, we let through everything in the
* archive. Note there may be such entries with no parent, eg
* non-default ACLs for built-in objects.
*
* This code depends on the parent having been marked already,
* which should be the case; if it isn't, perhaps due to
* SortTocFromFile rearrangement, skipping the dependent entry
* seems prudent anyway.
*
* Ideally we'd handle, eg, table CHECK constraints this way too.
* But it's hard to tell which of their dependencies is the one to
* consult.
*/
if (te->nDeps != 1 ||
TocIDRequired(AH, te->dependencies[0]) == 0)
return 0;
}
}
else
{
/* Apply selective-restore rules for standalone TOC entries. */
if (ropt->schemaNames.head != NULL)
{
/* If no namespace is specified, it means all. */
if (!te->namespace)
return 0;
if (!simple_string_list_member(&ropt->schemaNames, te->namespace))
return 0;
}
if (ropt->schemaExcludeNames.head != NULL &&
te->namespace &&
simple_string_list_member(&ropt->schemaExcludeNames, te->namespace))
return 0;
if (ropt->selTypes)
{
if (strcmp(te->desc, "TABLE") == 0 ||
strcmp(te->desc, "TABLE DATA") == 0 ||
strcmp(te->desc, "VIEW") == 0 ||
strcmp(te->desc, "FOREIGN TABLE") == 0 ||
strcmp(te->desc, "MATERIALIZED VIEW") == 0 ||
strcmp(te->desc, "MATERIALIZED VIEW DATA") == 0 ||
strcmp(te->desc, "SEQUENCE") == 0 ||
strcmp(te->desc, "SEQUENCE SET") == 0)
{
if (!ropt->selTable)
return 0;
if (ropt->tableNames.head != NULL &&
!simple_string_list_member(&ropt->tableNames, te->tag))
return 0;
}
else if (strcmp(te->desc, "INDEX") == 0)
{
if (!ropt->selIndex)
return 0;
if (ropt->indexNames.head != NULL &&
!simple_string_list_member(&ropt->indexNames, te->tag))
return 0;
}
else if (strcmp(te->desc, "FUNCTION") == 0 ||
strcmp(te->desc, "AGGREGATE") == 0 ||
strcmp(te->desc, "PROCEDURE") == 0)
{
if (!ropt->selFunction)
return 0;
if (ropt->functionNames.head != NULL &&
!simple_string_list_member(&ropt->functionNames, te->tag))
return 0;
}
else if (strcmp(te->desc, "TRIGGER") == 0)
{
if (!ropt->selTrigger)
return 0;
if (ropt->triggerNames.head != NULL &&
!simple_string_list_member(&ropt->triggerNames, te->tag))
return 0;
}
else
return 0;
}
}
/*
* Determine whether the TOC entry contains schema and/or data components,
* and mask off inapplicable REQ bits. If it had a dataDumper, assume
* it's both schema and data. Otherwise it's probably schema-only, but
* there are exceptions.
*/
if (!te->hadDumper)
{
/*
* Special Case: If 'SEQUENCE SET' or anything to do with LOs, then it
* is considered a data entry. We don't need to check for the BLOBS
* entry or old-style BLOB COMMENTS, because they will have hadDumper
* = true ... but we do need to check new-style BLOB ACLs, comments,
* etc.
*/
if (strcmp(te->desc, "SEQUENCE SET") == 0 ||
strcmp(te->desc, "BLOB") == 0 ||
(strcmp(te->desc, "ACL") == 0 &&
strncmp(te->tag, "LARGE OBJECT ", 13) == 0) ||
(strcmp(te->desc, "COMMENT") == 0 &&
strncmp(te->tag, "LARGE OBJECT ", 13) == 0) ||
(strcmp(te->desc, "SECURITY LABEL") == 0 &&
strncmp(te->tag, "LARGE OBJECT ", 13) == 0))
res = res & REQ_DATA;
else
res = res & ~REQ_DATA;
}
/*
* If there's no definition command, there's no schema component. Treat
* "load via partition root" comments as not schema.
*/
if (!te->defn || !te->defn[0] ||
strncmp(te->defn, "-- load via partition root ", 27) == 0)
res = res & ~REQ_SCHEMA;
/*
* Special case: <Init> type with <Max OID> tag; this is obsolete and we
* always ignore it.
*/
if ((strcmp(te->desc, "<Init>") == 0) && (strcmp(te->tag, "Max OID") == 0))
return 0;
/* Mask it if we only want schema */
if (ropt->schemaOnly)
{
/*
* The sequence_data option overrides schemaOnly for SEQUENCE SET.
*
* In binary-upgrade mode, even with schemaOnly set, we do not mask
* out large objects. (Only large object definitions, comments and
* other metadata should be generated in binary-upgrade mode, not the
* actual data, but that need not concern us here.)
*/
if (!(ropt->sequence_data && strcmp(te->desc, "SEQUENCE SET") == 0) &&
!(ropt->binary_upgrade &&
(strcmp(te->desc, "BLOB") == 0 ||
(strcmp(te->desc, "ACL") == 0 &&
strncmp(te->tag, "LARGE OBJECT ", 13) == 0) ||
(strcmp(te->desc, "COMMENT") == 0 &&
strncmp(te->tag, "LARGE OBJECT ", 13) == 0) ||
(strcmp(te->desc, "SECURITY LABEL") == 0 &&
strncmp(te->tag, "LARGE OBJECT ", 13) == 0))))
res = res & REQ_SCHEMA;
}
/* Mask it if we only want data */
if (ropt->dataOnly)
res = res & REQ_DATA;
return res;
}
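/*
 * Illustrative trace of the masking above (entry contents hypothetical):
 * a "TABLE" entry has hadDumper = false and is not one of the special
 * data-ish cases, so REQ_DATA is cleared and only REQ_SCHEMA can remain,
 * while a "SEQUENCE SET" entry is masked down to REQ_DATA instead.  Then
 * --schema-only applies "res & REQ_SCHEMA" and --data-only applies
 * "res & REQ_DATA", so each entry survives only in the matching mode.
 */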
/*
* Identify which pass we should restore this TOC entry in.
*
* See notes with the RestorePass typedef in pg_backup_archiver.h.
*/
static RestorePass
_tocEntryRestorePass(TocEntry *te)
{
/* "ACL LANGUAGE" was a crock emitted only in PG 7.4 */
if (strcmp(te->desc, "ACL") == 0 ||
strcmp(te->desc, "ACL LANGUAGE") == 0 ||
strcmp(te->desc, "DEFAULT ACL") == 0)
return RESTORE_PASS_ACL;
if (strcmp(te->desc, "EVENT TRIGGER") == 0 ||
strcmp(te->desc, "MATERIALIZED VIEW DATA") == 0)
return RESTORE_PASS_POST_ACL;
/*
* Comments need to be emitted in the same pass as their parent objects.
* ACLs haven't got comments, and neither do matview data objects, but
* event triggers do. (Fortunately, event triggers haven't got ACLs, or
* we'd need yet another weird special case.)
*/
if (strcmp(te->desc, "COMMENT") == 0 &&
strncmp(te->tag, "EVENT TRIGGER ", 14) == 0)
return RESTORE_PASS_POST_ACL;
/* All else can be handled in the main pass. */
return RESTORE_PASS_MAIN;
}
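/*
 * For example, per the tests above: "ACL", "ACL LANGUAGE", and
 * "DEFAULT ACL" entries restore in RESTORE_PASS_ACL; "EVENT TRIGGER",
 * "MATERIALIZED VIEW DATA", and COMMENT entries tagged "EVENT TRIGGER ..."
 * restore in RESTORE_PASS_POST_ACL; everything else (tables, indexes,
 * functions, and so on) restores in RESTORE_PASS_MAIN.
 */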
/*
* Identify TOC entries that are ACLs.
*
* Note: it seems worth duplicating some code here to avoid a hard-wired
* assumption that these are exactly the same entries that we restore during
* the RESTORE_PASS_ACL phase.
*/
static bool
_tocEntryIsACL(TocEntry *te)
{
/* "ACL LANGUAGE" was a crock emitted only in PG 7.4 */
if (strcmp(te->desc, "ACL") == 0 ||
strcmp(te->desc, "ACL LANGUAGE") == 0 ||
strcmp(te->desc, "DEFAULT ACL") == 0)
return true;
return false;
}
/*
* Issue SET commands for parameters that we want to have set the same way
* at all times during execution of a restore script.
*/
static void
_doSetFixedOutputState(ArchiveHandle *AH)
{
RestoreOptions *ropt = AH->public.ropt;
/*
* Disable timeouts to allow for slow commands, idle parallel workers, etc
*/
ahprintf(AH, "SET statement_timeout = 0;\n");
ahprintf(AH, "SET lock_timeout = 0;\n");
ahprintf(AH, "SET idle_in_transaction_session_timeout = 0;\n");
/* Select the correct character set encoding */
ahprintf(AH, "SET client_encoding = '%s';\n",
pg_encoding_to_char(AH->public.encoding));
/* Select the correct string literal syntax */
ahprintf(AH, "SET standard_conforming_strings = %s;\n",
AH->public.std_strings ? "on" : "off");
/* Select the role to be used during restore */
if (ropt && ropt->use_role)
ahprintf(AH, "SET ROLE %s;\n", fmtId(ropt->use_role));
/* Select the dump-time search_path */
if (AH->public.searchpath)
ahprintf(AH, "%s", AH->public.searchpath);
/* Make sure function checking is disabled */
ahprintf(AH, "SET check_function_bodies = false;\n");
/* Ensure that all valid XML data will be accepted */
ahprintf(AH, "SET xmloption = content;\n");
/* Avoid annoying notices etc */
ahprintf(AH, "SET client_min_messages = warning;\n");
if (!AH->public.std_strings)
ahprintf(AH, "SET escape_string_warning = off;\n");
/* Adjust row-security state */
if (ropt && ropt->enable_row_security)
ahprintf(AH, "SET row_security = on;\n");
else
ahprintf(AH, "SET row_security = off;\n");
ahprintf(AH, "\n");
}
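/*
 * Taken together, the SETs above give the restore script a fixed preamble
 * along these lines (encoding and string-syntax values vary with the dump):
 *
 *	SET statement_timeout = 0;
 *	SET lock_timeout = 0;
 *	SET idle_in_transaction_session_timeout = 0;
 *	SET client_encoding = 'UTF8';
 *	SET standard_conforming_strings = on;
 *	SET check_function_bodies = false;
 *	SET xmloption = content;
 *	SET client_min_messages = warning;
 *	SET row_security = off;
 */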
/*
* Issue a SET SESSION AUTHORIZATION command. Caller is responsible
* for updating state if appropriate. If user is NULL or an empty string,
* the specification DEFAULT will be used.
*/
static void
_doSetSessionAuth(ArchiveHandle *AH, const char *user)
{
PQExpBuffer cmd = createPQExpBuffer();
appendPQExpBufferStr(cmd, "SET SESSION AUTHORIZATION ");
/*
* SQL requires a string literal here. Might as well be correct.
*/
if (user && *user)
appendStringLiteralAHX(cmd, user, AH);
else
appendPQExpBufferStr(cmd, "DEFAULT");
appendPQExpBufferChar(cmd, ';');
if (RestoringToDB(AH))
{
PGresult *res;
res = PQexec(AH->connection, cmd->data);
if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
/* NOT warn_or_exit_horribly... use -O instead to skip this. */
pg_fatal("could not set session user to \"%s\": %s",
user, PQerrorMessage(AH->connection));
PQclear(res);
}
else
ahprintf(AH, "%s\n\n", cmd->data);
destroyPQExpBuffer(cmd);
}
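/*
 * Example output (user name hypothetical): a non-empty user produces
 *	SET SESSION AUTHORIZATION 'alice';
 * while NULL or "" produces
 *	SET SESSION AUTHORIZATION DEFAULT;
 */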
/*
* Issue the commands to connect to the specified database.
*
* If we're currently restoring right into a database, this will
* actually establish a connection. Otherwise it puts a \connect into
* the script output.
*/
static void
_reconnectToDB(ArchiveHandle *AH, const char *dbname)
{
if (RestoringToDB(AH))
ReconnectToServer(AH, dbname);
else
{
PQExpBufferData connectbuf;
initPQExpBuffer(&connectbuf);
appendPsqlMetaConnect(&connectbuf, dbname);
ahprintf(AH, "%s\n", connectbuf.data);
termPQExpBuffer(&connectbuf);
}
/*
* NOTE: currUser keeps track of what the imaginary session user in our
* script is. It's now effectively reset to the original userID.
*/
free(AH->currUser);
AH->currUser = NULL;
/* don't assume we still know the output schema, tablespace, etc either */
free(AH->currSchema);
AH->currSchema = NULL;
free(AH->currTableAm);
AH->currTableAm = NULL;
free(AH->currTablespace);
AH->currTablespace = NULL;
/* re-establish fixed state */
_doSetFixedOutputState(AH);
}
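/*
 * In script mode the reconnection is emitted as a psql meta-command, e.g.
 * (database name hypothetical):
 *	\connect mydb
 */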
/*
* Become the specified user, and update state to avoid redundant commands
*
* NULL or empty argument is taken to mean restoring the session default
*/
static void
_becomeUser(ArchiveHandle *AH, const char *user)
{
if (!user)
user = ""; /* avoid null pointers */
if (AH->currUser && strcmp(AH->currUser, user) == 0)
return; /* no need to do anything */
_doSetSessionAuth(AH, user);
/*
* NOTE: currUser keeps track of what the imaginary session user in our
* script is
*/
free(AH->currUser);
AH->currUser = pg_strdup(user);
}
/*
* Become the owner of the given TOC entry object. If
* changes in ownership are not allowed, this doesn't do anything.
*/
static void
_becomeOwner(ArchiveHandle *AH, TocEntry *te)
{
RestoreOptions *ropt = AH->public.ropt;
if (ropt && (ropt->noOwner || !ropt->use_setsessauth))
return;
_becomeUser(AH, te->owner);
}
/*
* Issue the commands to select the specified schema as the current schema
* in the target database.
*/
static void
_selectOutputSchema(ArchiveHandle *AH, const char *schemaName)
{
PQExpBuffer qry;
/*
* If there was a SEARCHPATH TOC entry, we're supposed to just stay with
* that search_path rather than switching to entry-specific paths.
* Otherwise, it's an old archive that will not restore correctly unless
* we set the search_path as it's expecting.
*/
if (AH->public.searchpath)
return;
if (!schemaName || *schemaName == '\0' ||
(AH->currSchema && strcmp(AH->currSchema, schemaName) == 0))
return; /* no need to do anything */
qry = createPQExpBuffer();
appendPQExpBuffer(qry, "SET search_path = %s",
fmtId(schemaName));
if (strcmp(schemaName, "pg_catalog") != 0)
appendPQExpBufferStr(qry, ", pg_catalog");
if (RestoringToDB(AH))
{
PGresult *res;
res = PQexec(AH->connection, qry->data);
if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
warn_or_exit_horribly(AH,
"could not set search_path to \"%s\": %s",
schemaName, PQerrorMessage(AH->connection));
PQclear(res);
}
else
ahprintf(AH, "%s;\n\n", qry->data);
free(AH->currSchema);
AH->currSchema = pg_strdup(schemaName);
destroyPQExpBuffer(qry);
}
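/*
 * Illustrative output (schema name hypothetical):
 *	SET search_path = myschema, pg_catalog;
 * Selecting pg_catalog itself omits the extra list entry:
 *	SET search_path = pg_catalog;
 */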
/*
* Issue the commands to select the specified tablespace as the current one
* in the target database.
*/
static void
_selectTablespace(ArchiveHandle *AH, const char *tablespace)
{
RestoreOptions *ropt = AH->public.ropt;
PQExpBuffer qry;
const char *want,
*have;
/* do nothing in --no-tablespaces mode */
if (ropt->noTablespace)
return;
have = AH->currTablespace;
want = tablespace;
/* no need to do anything for non-tablespace object */
if (!want)
return;
if (have && strcmp(want, have) == 0)
return; /* no need to do anything */
qry = createPQExpBuffer();
if (strcmp(want, "") == 0)
{
/* We want the tablespace to be the database's default */
appendPQExpBufferStr(qry, "SET default_tablespace = ''");
}
else
{
/* We want an explicit tablespace */
appendPQExpBuffer(qry, "SET default_tablespace = %s", fmtId(want));
}
if (RestoringToDB(AH))
{
PGresult *res;
res = PQexec(AH->connection, qry->data);
if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
warn_or_exit_horribly(AH,
"could not set default_tablespace to %s: %s",
fmtId(want), PQerrorMessage(AH->connection));
PQclear(res);
}
else
ahprintf(AH, "%s;\n\n", qry->data);
free(AH->currTablespace);
AH->currTablespace = pg_strdup(want);
destroyPQExpBuffer(qry);
}
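/*
 * Illustrative output: an empty tablespace name means "database default",
 *	SET default_tablespace = '';
 * while a named tablespace (name hypothetical) gives
 *	SET default_tablespace = fastdisk;
 */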
/*
* Set the proper default_table_access_method value for the table.
*/
static void
_selectTableAccessMethod(ArchiveHandle *AH, const char *tableam)
{
RestoreOptions *ropt = AH->public.ropt;
PQExpBuffer cmd;
const char *want,
*have;
/* do nothing in --no-table-access-method mode */
if (ropt->noTableAm)
return;
have = AH->currTableAm;
want = tableam;
if (!want)
return;
if (have && strcmp(want, have) == 0)
return;
cmd = createPQExpBuffer();
appendPQExpBuffer(cmd, "SET default_table_access_method = %s;", fmtId(want));
if (RestoringToDB(AH))
{
PGresult *res;
res = PQexec(AH->connection, cmd->data);
if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
warn_or_exit_horribly(AH,
"could not set default_table_access_method: %s",
PQerrorMessage(AH->connection));
PQclear(res);
}
else
ahprintf(AH, "%s\n\n", cmd->data);
destroyPQExpBuffer(cmd);
free(AH->currTableAm);
AH->currTableAm = pg_strdup(want);
}
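/*
 * Illustrative output for the stock heap AM:
 *	SET default_table_access_method = heap;
 */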
/*
* Extract an object description for a TOC entry, and append it to buf.
*
* This is used for ALTER ... OWNER TO.
*
* If the object type has no owner, do nothing.
*/
static void
_getObjectDescription(PQExpBuffer buf, const TocEntry *te)
{
const char *type = te->desc;
/* objects that don't require special decoration */
if (strcmp(type, "COLLATION") == 0 ||
strcmp(type, "CONVERSION") == 0 ||
strcmp(type, "DOMAIN") == 0 ||
strcmp(type, "FOREIGN TABLE") == 0 ||
strcmp(type, "MATERIALIZED VIEW") == 0 ||
strcmp(type, "SEQUENCE") == 0 ||
strcmp(type, "STATISTICS") == 0 ||
strcmp(type, "TABLE") == 0 ||
strcmp(type, "TEXT SEARCH DICTIONARY") == 0 ||
strcmp(type, "TEXT SEARCH CONFIGURATION") == 0 ||
strcmp(type, "TYPE") == 0 ||
strcmp(type, "VIEW") == 0 ||
/* non-schema-specified objects */
strcmp(type, "DATABASE") == 0 ||
strcmp(type, "PROCEDURAL LANGUAGE") == 0 ||
strcmp(type, "SCHEMA") == 0 ||
strcmp(type, "EVENT TRIGGER") == 0 ||
strcmp(type, "FOREIGN DATA WRAPPER") == 0 ||
strcmp(type, "SERVER") == 0 ||
strcmp(type, "PUBLICATION") == 0 ||
strcmp(type, "SUBSCRIPTION") == 0)
{
appendPQExpBuffer(buf, "%s ", type);
if (te->namespace && *te->namespace)
appendPQExpBuffer(buf, "%s.", fmtId(te->namespace));
appendPQExpBufferStr(buf, fmtId(te->tag));
}
/* LOs just have a name, but it's numeric so must not use fmtId */
else if (strcmp(type, "BLOB") == 0)
{
appendPQExpBuffer(buf, "LARGE OBJECT %s", te->tag);
}
/*
* These object types require additional decoration. Fortunately, the
* information needed is exactly what's in the DROP command.
*/
else if (strcmp(type, "AGGREGATE") == 0 ||
strcmp(type, "FUNCTION") == 0 ||
strcmp(type, "OPERATOR") == 0 ||
strcmp(type, "OPERATOR CLASS") == 0 ||
strcmp(type, "OPERATOR FAMILY") == 0 ||
strcmp(type, "PROCEDURE") == 0)
{
/* Chop "DROP " off the front and make a modifiable copy */
char *first = pg_strdup(te->dropStmt + 5);
char *last;
/* point to last character in string */
last = first + strlen(first) - 1;
/* Strip off any ';' or '\n' at the end */
while (last >= first && (*last == '\n' || *last == ';'))
last--;
*(last + 1) = '\0';
appendPQExpBufferStr(buf, first);
free(first);
return;
}
/* these object types don't have separate owners */
else if (strcmp(type, "CAST") == 0 ||
strcmp(type, "CHECK CONSTRAINT") == 0 ||
strcmp(type, "CONSTRAINT") == 0 ||
strcmp(type, "DATABASE PROPERTIES") == 0 ||
strcmp(type, "DEFAULT") == 0 ||
strcmp(type, "FK CONSTRAINT") == 0 ||
strcmp(type, "INDEX") == 0 ||
strcmp(type, "RULE") == 0 ||
strcmp(type, "TRIGGER") == 0 ||
strcmp(type, "ROW SECURITY") == 0 ||
strcmp(type, "POLICY") == 0 ||
strcmp(type, "USER MAPPING") == 0)
{
/* do nothing */
}
else
pg_fatal("don't know how to set owner for object type \"%s\"", type);
}
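/*
 * Illustrative results (object names hypothetical): a TABLE entry with
 * namespace "public" and tag "t1" appends "TABLE public.t1"; a BLOB entry
 * tagged "1234" appends "LARGE OBJECT 1234"; a FUNCTION entry reuses its
 * drop statement, so "DROP FUNCTION public.f(integer);" becomes
 * "FUNCTION public.f(integer)"; a CAST entry appends nothing at all, which
 * the caller takes to mean "no separate owner".
 */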
/*
* Emit the SQL commands to create the object represented by a TOC entry
*
* This now also includes issuing an ALTER OWNER command to restore the
* object's ownership, if wanted. But note that the object's permissions
* will remain at default, until the matching ACL TOC entry is restored.
*/
static void
_printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData)
{
RestoreOptions *ropt = AH->public.ropt;
/* Select owner, schema, tablespace and default AM as necessary */
_becomeOwner(AH, te);
_selectOutputSchema(AH, te->namespace);
_selectTablespace(AH, te->tablespace);
_selectTableAccessMethod(AH, te->tableam);
/* Emit header comment for item */
if (!AH->noTocComments)
{
const char *pfx;
char *sanitized_name;
char *sanitized_schema;
char *sanitized_owner;
if (isData)
pfx = "Data for ";
else
pfx = "";
ahprintf(AH, "--\n");
if (AH->public.verbose)
{
ahprintf(AH, "-- TOC entry %d (class %u OID %u)\n",
te->dumpId, te->catalogId.tableoid, te->catalogId.oid);
if (te->nDeps > 0)
{
int i;
ahprintf(AH, "-- Dependencies:");
for (i = 0; i < te->nDeps; i++)
ahprintf(AH, " %d", te->dependencies[i]);
ahprintf(AH, "\n");
}
}
sanitized_name = sanitize_line(te->tag, false);
sanitized_schema = sanitize_line(te->namespace, true);
sanitized_owner = sanitize_line(ropt->noOwner ? NULL : te->owner, true);
ahprintf(AH, "-- %sName: %s; Type: %s; Schema: %s; Owner: %s",
pfx, sanitized_name, te->desc, sanitized_schema,
sanitized_owner);
free(sanitized_name);
free(sanitized_schema);
free(sanitized_owner);
if (te->tablespace && strlen(te->tablespace) > 0 && !ropt->noTablespace)
{
char *sanitized_tablespace;
sanitized_tablespace = sanitize_line(te->tablespace, false);
ahprintf(AH, "; Tablespace: %s", sanitized_tablespace);
free(sanitized_tablespace);
}
ahprintf(AH, "\n");
if (AH->PrintExtraTocPtr != NULL)
AH->PrintExtraTocPtr(AH, te);
ahprintf(AH, "--\n\n");
}
/*
* Actually print the definition.
*
* Really crude hack for suppressing AUTHORIZATION clause that old pg_dump
* versions put into CREATE SCHEMA. Don't mutate the variant for schema
* "public" that is a comment. We have to do this when --no-owner mode is
* selected. This is ugly, but I see no other good way ...
*/
if (ropt->noOwner &&
strcmp(te->desc, "SCHEMA") == 0 && strncmp(te->defn, "--", 2) != 0)
{
ahprintf(AH, "CREATE SCHEMA %s;\n\n\n", fmtId(te->tag));
}
else
{
if (te->defn && strlen(te->defn) > 0)
ahprintf(AH, "%s\n\n", te->defn);
}
/*
* If we aren't using SET SESSION AUTH to determine ownership, we must
* instead issue an ALTER OWNER command. Schema "public" is special; when
* a dump emits a comment in lieu of creating it, we use ALTER OWNER even
* when using SET SESSION for all other objects. We assume that anything
* without a DROP command is not a separately ownable object.
*/
if (!ropt->noOwner &&
(!ropt->use_setsessauth ||
(strcmp(te->desc, "SCHEMA") == 0 &&
strncmp(te->defn, "--", 2) == 0)) &&
te->owner && strlen(te->owner) > 0 &&
te->dropStmt && strlen(te->dropStmt) > 0)
{
PQExpBufferData temp;
initPQExpBuffer(&temp);
_getObjectDescription(&temp, te);
/*
* If _getObjectDescription() didn't fill the buffer, then there is no
* owner.
*/
if (temp.data[0])
ahprintf(AH, "ALTER %s OWNER TO %s;\n\n", temp.data, fmtId(te->owner));
termPQExpBuffer(&temp);
}
/*
* If it's an ACL entry, it might contain SET SESSION AUTHORIZATION
* commands, so we can no longer assume we know the current auth setting.
*/
if (_tocEntryIsACL(te))
{
free(AH->currUser);
AH->currUser = NULL;
}
}
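/*
 * The header comment emitted above typically reads (names hypothetical):
 *
 *	--
 *	-- Name: t1; Type: TABLE; Schema: public; Owner: alice
 *	--
 *
 * followed by the object's definition and, when ownership is being
 * restored without SET SESSION AUTHORIZATION, an ALTER ... OWNER TO
 * command.
 */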
/*
* Sanitize a string to be included in an SQL comment or TOC listing, by
* replacing any newlines with spaces. This ensures each logical output line
* is in fact one physical output line, to prevent corruption of the dump
* (which could, in the worst case, present an SQL injection vulnerability
* if someone were to incautiously load a dump containing objects with
* maliciously crafted names).
*
* The result is a freshly malloc'd string. If the input string is NULL,
* return a malloc'ed empty string, unless want_hyphen, in which case return a
* malloc'ed hyphen.
*
* Note that we currently don't bother to quote names, meaning that the name
* fields aren't automatically parseable. "pg_restore -L" doesn't care because
* it only examines the dumpId field, but someday we might want to try harder.
*/
static char *
sanitize_line(const char *str, bool want_hyphen)
{
char *result;
char *s;
if (!str)
return pg_strdup(want_hyphen ? "-" : "");
result = pg_strdup(str);
for (s = result; *s != '\0'; s++)
{
if (*s == '\n' || *s == '\r')
*s = ' ';
}
return result;
}
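/*
 * For example, a tag containing an embedded newline, say "t1\n--evil",
 * comes back as the single physical line "t1 --evil"; a NULL input yields
 * "" or "-" depending on want_hyphen.
 */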
/*
* Write the file header for a custom-format archive
*/
void
WriteHead(ArchiveHandle *AH)
{
struct tm crtm;
AH->WriteBufPtr(AH, "PGDMP", 5); /* Magic code */
AH->WriteBytePtr(AH, ARCHIVE_MAJOR(AH->version));
AH->WriteBytePtr(AH, ARCHIVE_MINOR(AH->version));
AH->WriteBytePtr(AH, ARCHIVE_REV(AH->version));
AH->WriteBytePtr(AH, AH->intSize);
AH->WriteBytePtr(AH, AH->offSize);
AH->WriteBytePtr(AH, AH->format);
AH->WriteBytePtr(AH, AH->compression_spec.algorithm);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
WriteInt(AH, crtm.tm_hour);
WriteInt(AH, crtm.tm_mday);
WriteInt(AH, crtm.tm_mon);
WriteInt(AH, crtm.tm_year);
WriteInt(AH, crtm.tm_isdst);
WriteStr(AH, PQdb(AH->connection));
WriteStr(AH, AH->public.remoteVersionStr);
WriteStr(AH, PG_VERSION);
}
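/*
 * The resulting header layout is, in order: the 5-byte magic "PGDMP"; one
 * byte each for the archive major/minor/rev version, intSize, offSize,
 * format, and compression algorithm; seven WriteInt fields holding the
 * creation timestamp; then three strings: the database name, the server
 * version, and the pg_dump version (PG_VERSION).
 */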
void
ReadHead(ArchiveHandle *AH)
{
char *errmsg;
char vmaj,
vmin,
vrev;
int fmt;
/*
* If we haven't already read the header, do so.
*
* NB: this code must agree with _discoverArchiveFormat(). Maybe find a
* way to unify the cases?
*/
if (!AH->readHeader)
{
char tmpMag[7];
AH->ReadBufPtr(AH, tmpMag, 5);
2000-07-04 16:25:28 +02:00
if (strncmp(tmpMag, "PGDMP", 5) != 0)
pg_fatal("did not find magic string in file header");
    }

    vmaj = AH->ReadBytePtr(AH);
    vmin = AH->ReadBytePtr(AH);

    if (vmaj > 1 || (vmaj == 1 && vmin > 0))    /* Version > 1.0 */
        vrev = AH->ReadBytePtr(AH);
    else
        vrev = 0;

    AH->version = MAKE_ARCHIVE_VERSION(vmaj, vmin, vrev);

    if (AH->version < K_VERS_1_0 || AH->version > K_VERS_MAX)
        pg_fatal("unsupported version (%d.%d) in file header",
                 vmaj, vmin);
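
    /*
     * What follows are the dumping machine's sizeof(int) and, for archives
     * of format 1.7 or later, its off_t size.  ReadInt() and ReadOffset()
     * rely on these to decode values that were written on a machine with
     * different integer or offset widths.
     */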
    AH->intSize = AH->ReadBytePtr(AH);
    if (AH->intSize > 32)
        pg_fatal("sanity check on integer size (%lu) failed",
                 (unsigned long) AH->intSize);

    if (AH->intSize > sizeof(int))
        pg_log_warning("archive was made on a machine with larger integers, some operations might fail");

    if (AH->version >= K_VERS_1_7)
        AH->offSize = AH->ReadBytePtr(AH);
    else
        AH->offSize = AH->intSize;
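
    /*
     * The caller has already settled on a format (explicitly, or via
     * _discoverArchiveFormat()), so a mismatch in the format byte is
     * reported as an error rather than triggering a format switch here.
     */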
    fmt = AH->ReadBytePtr(AH);

    if (AH->format != fmt)
        pg_fatal("expected format (%d) differs from format found in file (%d)",
                 AH->format, fmt);
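
    /*
     * Compression bookkeeping: from format 1.15 on, the algorithm is stored
     * explicitly.  Formats 1.2 through 1.14 stored only a level (a single
     * byte before 1.4, an int afterwards), with any nonzero level implying
     * gzip; archives older than 1.2 are assumed to be gzip-compressed.
     */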
    if (AH->version >= K_VERS_1_15)
        AH->compression_spec.algorithm = AH->ReadBytePtr(AH);
    else if (AH->version >= K_VERS_1_2)
    {
        /* Guess the compression method based on the level */
        if (AH->version < K_VERS_1_4)
            AH->compression_spec.level = AH->ReadBytePtr(AH);
        else
            AH->compression_spec.level = ReadInt(AH);

        if (AH->compression_spec.level != 0)
            AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;
    }
    else
        AH->compression_spec.algorithm = PG_COMPRESSION_GZIP;

    errmsg = supports_compression(AH->compression_spec);
    if (errmsg)
    {
        pg_log_warning("archive is compressed, but this installation does not support compression (%s) -- no data will be available",
                       errmsg);
        pg_free(errmsg);
    }
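
    /*
     * The remaining header fields are version-dependent; when a field is
     * absent, the corresponding ArchiveHandle member simply keeps the
     * default it was given when the handle was created.
     */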
    if (AH->version >= K_VERS_1_4)
    {
        struct tm   crtm;

        crtm.tm_sec = ReadInt(AH);
        crtm.tm_min = ReadInt(AH);
        crtm.tm_hour = ReadInt(AH);
        crtm.tm_mday = ReadInt(AH);
        crtm.tm_mon = ReadInt(AH);
        crtm.tm_year = ReadInt(AH);
        crtm.tm_isdst = ReadInt(AH);

        /*
         * Newer versions of glibc have mktime() report failure if tm_isdst
         * is inconsistent with the prevailing timezone, e.g. tm_isdst = 1
         * when TZ=UTC.  This is problematic when restoring an archive under
         * a different timezone setting.  If we get a failure, try again with
         * tm_isdst set to -1 ("don't know").
         *
         * XXX with or without this hack, we reconstruct createDate
         * incorrectly when the prevailing timezone is different from
         * pg_dump's.  Next time we bump the archive version, we should flush
         * this representation and store a plain seconds-since-the-Epoch
         * timestamp instead.
         */
        AH->createDate = mktime(&crtm);
        if (AH->createDate == (time_t) -1)
        {
            crtm.tm_isdst = -1;
            AH->createDate = mktime(&crtm);
            if (AH->createDate == (time_t) -1)
                pg_log_warning("invalid creation date in header");
        }
    }

    if (AH->version >= K_VERS_1_4)
    {
        AH->archdbname = ReadStr(AH);
    }

    if (AH->version >= K_VERS_1_10)
    {
        AH->archiveRemoteVersion = ReadStr(AH);
        AH->archiveDumpVersion = ReadStr(AH);
    }
}

/*
 * checkSeek
 *    check to see if ftell/fseek can be performed.
 */
bool
checkSeek(FILE *fp)
{
    pgoff_t     tpos;

    /* Check that ftello works on this file */
    tpos = ftello(fp);
    if (tpos < 0)
        return false;

    /*
     * Check that fseeko(SEEK_SET) works, too.  NB: we used to try to test
     * this with fseeko(fp, 0, SEEK_CUR).  But some platforms treat that as a
     * successful no-op even on files that are otherwise unseekable.
     */
    if (fseeko(fp, tpos, SEEK_SET) != 0)
        return false;

    return true;
}
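
/*
 * A typical call pattern (a sketch, not a quote of the callers): the
 * seekable formats record the result once at open time, e.g.
 *
 *      ctx->hasSeek = checkSeek(AH->FH);
 *
 * in the custom-format code, and fall back to sequential scanning of the
 * file when it returns false.
 */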

/*
 * dumpTimestamp
 */
static void
dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim)
{
    char        buf[64];

    if (strftime(buf, sizeof(buf), PGDUMP_STRFTIME_FMT, localtime(&tim)) != 0)
        ahprintf(AH, "-- %s %s\n\n", msg, buf);
}
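
/*
 * Example output (a sketch; the exact text depends on PGDUMP_STRFTIME_FMT
 * and the prevailing locale):
 *
 *      -- Started on 2024-05-01 12:34:56
 */

/*
 * The three phases mentioned below are implemented by
 * restore_toc_entries_prefork(), restore_toc_entries_parallel(), and
 * restore_toc_entries_postfork().
 */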

/*
 * Main engine for parallel restore.
 *
 * Parallel restore is done in three phases.  In this first phase,
 * we'll process all SECTION_PRE_DATA TOC entries that are allowed to be
 * processed in the RESTORE_PASS_MAIN pass.  (In practice, that's all
 * PRE_DATA items other than ACLs.)  Entries we can't process now are
 * added to the pending_list for later phases to deal with.
 */
static void
restore_toc_entries_prefork(ArchiveHandle *AH, TocEntry *pending_list)
{
    bool        skipped_some;
    TocEntry   *next_work_item;

    pg_log_debug("entering restore_toc_entries_prefork");

    /* Adjust dependency information */
    fix_dependencies(AH);

    /*
     * Do all the early stuff in a single connection in the parent. There's no
     * great point in running it in parallel, in fact it will actually run
     * faster in a single connection because we avoid all the connection and
     * setup overhead.  Also, pre-9.2 pg_dump versions were not very good
     * about showing all the dependencies of SECTION_PRE_DATA items, so we do
     * not risk trying to process them out-of-order.
     *
     * Stuff that we can't do immediately gets added to the pending_list.
     * Note: we don't yet filter out entries that aren't going to be restored.
     * They might participate in dependency chains connecting entries that
     * should be restored, so we treat them as live until we actually process
     * them.
     *
     * Note: as of 9.2, it should be guaranteed that all PRE_DATA items appear
     * before DATA items, and all DATA items before POST_DATA items.  That is
     * not certain to be true in older archives, though, and in any case use
     * of a list file would destroy that ordering (cf. SortTocFromFile).  So
     * this loop cannot assume that it holds.
     */
Fix pg_dump/pg_restore to emit REFRESH MATERIALIZED VIEW commands last. Because we push all ACL (i.e. GRANT/REVOKE) restore steps to the end, materialized view refreshes were occurring while the permissions on referenced objects were still at defaults. This led to failures if, say, an MV owned by user A reads from a table owned by user B, even if B had granted the necessary privileges to A. We've had multiple complaints about that type of restore failure, most recently from Jordan Gigov. The ideal fix for this would be to start treating ACLs as dependency- sortable objects, rather than hard-wiring anything about their dump order (the existing approach is a messy kluge dating to commit dc0e76ca3). But that's going to be a rather major change, and it certainly wouldn't lead to a back-patchable fix. As a short-term solution, convert the existing two-pass hack (ie, normal objects then ACLs) to a three-pass hack, ie, normal objects then ACLs then matview refreshes. Because this happens in RestoreArchive(), it will also fix the problem when restoring from an existing archive-format dump. (Note this means that if a matview refresh would have failed under the permissions prevailing at dump time, it'll fail during restore as well. We'll define that as user error rather than something we should try to work around.) To avoid performance loss in parallel restore, we need the matview refreshes to still be parallelizable. Hence, clean things up enough so that both ACLs and matviews are handled by the parallel restore infrastructure, instead of reverting back to serial restore for ACLs. There is still a final serial step, but it shouldn't normally have to do anything; it's only there to try to recover if we get stuck due to some problem like unresolved circular dependencies. Patch by me, but it owes something to an earlier attempt by Kevin Grittner. Back-patch to 9.3 where materialized views were introduced. Discussion: https://postgr.es/m/28572.1500912583@sss.pgh.pa.us
2017-08-03 23:36:23 +02:00
AH->restorePass = RESTORE_PASS_MAIN;
skipped_some = false;
for (next_work_item = AH->toc->next; next_work_item != AH->toc; next_work_item = next_work_item->next)
{
Fix pg_dump/pg_restore to emit REFRESH MATERIALIZED VIEW commands last. Because we push all ACL (i.e. GRANT/REVOKE) restore steps to the end, materialized view refreshes were occurring while the permissions on referenced objects were still at defaults. This led to failures if, say, an MV owned by user A reads from a table owned by user B, even if B had granted the necessary privileges to A. We've had multiple complaints about that type of restore failure, most recently from Jordan Gigov. The ideal fix for this would be to start treating ACLs as dependency- sortable objects, rather than hard-wiring anything about their dump order (the existing approach is a messy kluge dating to commit dc0e76ca3). But that's going to be a rather major change, and it certainly wouldn't lead to a back-patchable fix. As a short-term solution, convert the existing two-pass hack (ie, normal objects then ACLs) to a three-pass hack, ie, normal objects then ACLs then matview refreshes. Because this happens in RestoreArchive(), it will also fix the problem when restoring from an existing archive-format dump. (Note this means that if a matview refresh would have failed under the permissions prevailing at dump time, it'll fail during restore as well. We'll define that as user error rather than something we should try to work around.) To avoid performance loss in parallel restore, we need the matview refreshes to still be parallelizable. Hence, clean things up enough so that both ACLs and matviews are handled by the parallel restore infrastructure, instead of reverting back to serial restore for ACLs. There is still a final serial step, but it shouldn't normally have to do anything; it's only there to try to recover if we get stuck due to some problem like unresolved circular dependencies. Patch by me, but it owes something to an earlier attempt by Kevin Grittner. Back-patch to 9.3 where materialized views were introduced. Discussion: https://postgr.es/m/28572.1500912583@sss.pgh.pa.us
2017-08-03 23:36:23 +02:00
bool do_now = true;
if (next_work_item->section != SECTION_PRE_DATA)
{
/* DATA and POST_DATA items are just ignored for now */
if (next_work_item->section == SECTION_DATA ||
next_work_item->section == SECTION_POST_DATA)
{
Fix pg_dump/pg_restore to emit REFRESH MATERIALIZED VIEW commands last. Because we push all ACL (i.e. GRANT/REVOKE) restore steps to the end, materialized view refreshes were occurring while the permissions on referenced objects were still at defaults. This led to failures if, say, an MV owned by user A reads from a table owned by user B, even if B had granted the necessary privileges to A. We've had multiple complaints about that type of restore failure, most recently from Jordan Gigov. The ideal fix for this would be to start treating ACLs as dependency- sortable objects, rather than hard-wiring anything about their dump order (the existing approach is a messy kluge dating to commit dc0e76ca3). But that's going to be a rather major change, and it certainly wouldn't lead to a back-patchable fix. As a short-term solution, convert the existing two-pass hack (ie, normal objects then ACLs) to a three-pass hack, ie, normal objects then ACLs then matview refreshes. Because this happens in RestoreArchive(), it will also fix the problem when restoring from an existing archive-format dump. (Note this means that if a matview refresh would have failed under the permissions prevailing at dump time, it'll fail during restore as well. We'll define that as user error rather than something we should try to work around.) To avoid performance loss in parallel restore, we need the matview refreshes to still be parallelizable. Hence, clean things up enough so that both ACLs and matviews are handled by the parallel restore infrastructure, instead of reverting back to serial restore for ACLs. There is still a final serial step, but it shouldn't normally have to do anything; it's only there to try to recover if we get stuck due to some problem like unresolved circular dependencies. Patch by me, but it owes something to an earlier attempt by Kevin Grittner. Back-patch to 9.3 where materialized views were introduced. Discussion: https://postgr.es/m/28572.1500912583@sss.pgh.pa.us
2017-08-03 23:36:23 +02:00
do_now = false;
skipped_some = true;
}
else
{
/*
* SECTION_NONE items, such as comments, can be processed now
* if we are still in the PRE_DATA part of the archive. Once
* we've skipped any items, we have to consider whether the
* comment's dependencies are satisfied, so skip it for now.
*/
if (skipped_some)
Fix pg_dump/pg_restore to emit REFRESH MATERIALIZED VIEW commands last. Because we push all ACL (i.e. GRANT/REVOKE) restore steps to the end, materialized view refreshes were occurring while the permissions on referenced objects were still at defaults. This led to failures if, say, an MV owned by user A reads from a table owned by user B, even if B had granted the necessary privileges to A. We've had multiple complaints about that type of restore failure, most recently from Jordan Gigov. The ideal fix for this would be to start treating ACLs as dependency- sortable objects, rather than hard-wiring anything about their dump order (the existing approach is a messy kluge dating to commit dc0e76ca3). But that's going to be a rather major change, and it certainly wouldn't lead to a back-patchable fix. As a short-term solution, convert the existing two-pass hack (ie, normal objects then ACLs) to a three-pass hack, ie, normal objects then ACLs then matview refreshes. Because this happens in RestoreArchive(), it will also fix the problem when restoring from an existing archive-format dump. (Note this means that if a matview refresh would have failed under the permissions prevailing at dump time, it'll fail during restore as well. We'll define that as user error rather than something we should try to work around.) To avoid performance loss in parallel restore, we need the matview refreshes to still be parallelizable. Hence, clean things up enough so that both ACLs and matviews are handled by the parallel restore infrastructure, instead of reverting back to serial restore for ACLs. There is still a final serial step, but it shouldn't normally have to do anything; it's only there to try to recover if we get stuck due to some problem like unresolved circular dependencies. Patch by me, but it owes something to an earlier attempt by Kevin Grittner. Back-patch to 9.3 where materialized views were introduced. Discussion: https://postgr.es/m/28572.1500912583@sss.pgh.pa.us
2017-08-03 23:36:23 +02:00
do_now = false;
}
}
Fix pg_dump/pg_restore to emit REFRESH MATERIALIZED VIEW commands last. Because we push all ACL (i.e. GRANT/REVOKE) restore steps to the end, materialized view refreshes were occurring while the permissions on referenced objects were still at defaults. This led to failures if, say, an MV owned by user A reads from a table owned by user B, even if B had granted the necessary privileges to A. We've had multiple complaints about that type of restore failure, most recently from Jordan Gigov. The ideal fix for this would be to start treating ACLs as dependency- sortable objects, rather than hard-wiring anything about their dump order (the existing approach is a messy kluge dating to commit dc0e76ca3). But that's going to be a rather major change, and it certainly wouldn't lead to a back-patchable fix. As a short-term solution, convert the existing two-pass hack (ie, normal objects then ACLs) to a three-pass hack, ie, normal objects then ACLs then matview refreshes. Because this happens in RestoreArchive(), it will also fix the problem when restoring from an existing archive-format dump. (Note this means that if a matview refresh would have failed under the permissions prevailing at dump time, it'll fail during restore as well. We'll define that as user error rather than something we should try to work around.) To avoid performance loss in parallel restore, we need the matview refreshes to still be parallelizable. Hence, clean things up enough so that both ACLs and matviews are handled by the parallel restore infrastructure, instead of reverting back to serial restore for ACLs. There is still a final serial step, but it shouldn't normally have to do anything; it's only there to try to recover if we get stuck due to some problem like unresolved circular dependencies. Patch by me, but it owes something to an earlier attempt by Kevin Grittner. Back-patch to 9.3 where materialized views were introduced. Discussion: https://postgr.es/m/28572.1500912583@sss.pgh.pa.us
2017-08-03 23:36:23 +02:00
/*
* Also skip items that need to be forced into later passes. We need
* not set skipped_some in this case, since by assumption no main-pass
* items could depend on these.
*/
if (_tocEntryRestorePass(next_work_item) != RESTORE_PASS_MAIN)
do_now = false;
if (do_now)
{
/* OK, restore the item and update its dependencies */
Unified logging system for command-line programs This unifies the various ad hoc logging (message printing, error printing) systems used throughout the command-line programs. Features: - Program name is automatically prefixed. - Message string does not end with newline. This removes a common source of inconsistencies and omissions. - Additionally, a final newline is automatically stripped, simplifying use of PQerrorMessage() etc., another common source of mistakes. - I converted error message strings to use %m where possible. - As a result of the above several points, more translatable message strings can be shared between different components and between frontends and backend, without gratuitous punctuation or whitespace differences. - There is support for setting a "log level". This is not meant to be user-facing, but can be used internally to implement debug or verbose modes. - Lazy argument evaluation, so no significant overhead if logging at some level is disabled. - Some color in the messages, similar to gcc and clang. Set PG_COLOR=auto to try it out. Some colors are predefined, but can be customized by setting PG_COLORS. - Common files (common/, fe_utils/, etc.) can handle logging much more simply by just using one API without worrying too much about the context of the calling program, requiring callbacks, or having to pass "progname" around everywhere. - Some programs called setvbuf() to make sure that stderr is unbuffered, even on Windows. But not all programs did that. This is now done centrally. Soft goals: - Reduces vertical space use and visual complexity of error reporting in the source code. - Encourages more deliberate classification of messages. For example, in some cases it wasn't clear without analyzing the surrounding code whether a message was meant as an error or just an info. - Concepts and terms are vaguely aligned with popular logging frameworks such as log4j and Python logging. This is all just about printing stuff out. Nothing affects program flow (e.g., fatal exits). The uses are just too varied to do that. Some existing code had wrappers that do some kind of print-and-exit, and I adapted those. I tried to keep the output mostly the same, but there is a lot of historical baggage to unwind and special cases to consider, and I might not always have succeeded. One significant change is that pg_rewind used to write all error messages to stdout. That is now changed to stderr. Reviewed-by: Donald Dong <xdong@csumb.edu> Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru> Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com
2019-04-01 14:24:37 +02:00
pg_log_info("processing item %d %s %s",
Fix pg_dump/pg_restore to emit REFRESH MATERIALIZED VIEW commands last. Because we push all ACL (i.e. GRANT/REVOKE) restore steps to the end, materialized view refreshes were occurring while the permissions on referenced objects were still at defaults. This led to failures if, say, an MV owned by user A reads from a table owned by user B, even if B had granted the necessary privileges to A. We've had multiple complaints about that type of restore failure, most recently from Jordan Gigov. The ideal fix for this would be to start treating ACLs as dependency- sortable objects, rather than hard-wiring anything about their dump order (the existing approach is a messy kluge dating to commit dc0e76ca3). But that's going to be a rather major change, and it certainly wouldn't lead to a back-patchable fix. As a short-term solution, convert the existing two-pass hack (ie, normal objects then ACLs) to a three-pass hack, ie, normal objects then ACLs then matview refreshes. Because this happens in RestoreArchive(), it will also fix the problem when restoring from an existing archive-format dump. (Note this means that if a matview refresh would have failed under the permissions prevailing at dump time, it'll fail during restore as well. We'll define that as user error rather than something we should try to work around.) To avoid performance loss in parallel restore, we need the matview refreshes to still be parallelizable. Hence, clean things up enough so that both ACLs and matviews are handled by the parallel restore infrastructure, instead of reverting back to serial restore for ACLs. There is still a final serial step, but it shouldn't normally have to do anything; it's only there to try to recover if we get stuck due to some problem like unresolved circular dependencies. Patch by me, but it owes something to an earlier attempt by Kevin Grittner. Back-patch to 9.3 where materialized views were introduced. Discussion: https://postgr.es/m/28572.1500912583@sss.pgh.pa.us
2017-08-03 23:36:23 +02:00
next_work_item->dumpId,
next_work_item->desc, next_work_item->tag);
            (void) restore_toc_entry(AH, next_work_item, false);

            /* Reduce dependencies, but don't move anything to ready_list */
            reduce_dependencies(AH, next_work_item, NULL);
        }
        else
        {
            /* Nope, so add it to pending_list */
            pending_list_append(pending_list, next_work_item);
        }
    }
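
    /*
     * At this point, each TOC entry has either been restored serially
     * (above) or been added to pending_list for the parallel phase to
     * pick up.
     */
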
    /*
     * Now close parent connection in prep for parallel steps.  We do this
     * mainly to ensure that we don't exceed the specified number of parallel
     * connections.
     */
    DisconnectDatabase(&AH->public);

    /* blow away any transient state from the old connection */
    free(AH->currUser);
    AH->currUser = NULL;
    free(AH->currSchema);
    AH->currSchema = NULL;
    free(AH->currTablespace);
    AH->currTablespace = NULL;
    free(AH->currTableAm);
    AH->currTableAm = NULL;
}

/*
 * Main engine for parallel restore.
 *
 * Parallel restore is done in three phases.  In this second phase,
 * we process entries by dispatching them to parallel worker children
 * (processes on Unix, threads on Windows), each of which connects
 * separately to the database.  Inter-entry dependencies are respected,
 * and so is the RestorePass multi-pass structure.  When we can no longer
 * make any entries ready to process, we exit.  Normally, there will be
 * nothing left to do; but if there is, the third phase will mop up.
 */
static void
restore_toc_entries_parallel(ArchiveHandle *AH, ParallelState *pstate,
                             TocEntry *pending_list)
{
    ParallelReadyList ready_list;
    TocEntry   *next_work_item;

    pg_log_debug("entering restore_toc_entries_parallel");

    /* Set up ready_list with enough room for all known TocEntrys */
    ready_list_init(&ready_list, AH->tocCount);
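
    /*
     * (pop_next_work_item keeps this array-based list prioritized by
     * estimated task size, so the largest jobs tend to be dispatched first.)
     */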

    /*
     * The pending_list contains all items that we need to restore.  Move all
     * items that are available to process immediately into the ready_list.
     * After this setup, the pending list is everything that needs to be done
     * but is blocked by one or more dependencies, while the ready list
     * contains items that have no remaining dependencies and are OK to
     * process in the current restore pass.
     */
    AH->restorePass = RESTORE_PASS_MAIN;
    move_to_ready_list(pending_list, &ready_list, AH->restorePass);
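
    /*
     * (Restore passes run in a fixed order: normal objects, then ACLs,
     * then matview refreshes.  move_to_ready_list admits only the items
     * belonging to the current pass.)
     */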

    /*
     * main parent loop
     *
     * Keep going until there is no worker still running AND there is no work
     * left to be done.  Note invariant: at top of loop, there should always
     * be at least one worker available to dispatch a job to.
     */
    pg_log_info("entering main parallel loop");

    for (;;)
    {
        /* Look for an item ready to be dispatched to a worker */
        next_work_item = pop_next_work_item(&ready_list, pstate);
        if (next_work_item != NULL)
        {
            /* If not to be restored, don't waste time launching a worker */
            if ((next_work_item->reqs & (REQ_SCHEMA | REQ_DATA)) == 0)
            {
                pg_log_info("skipping item %d %s %s",
                            next_work_item->dumpId,
                            next_work_item->desc, next_work_item->tag);
                /* Update its dependencies as though we'd completed it */
                reduce_dependencies(AH, next_work_item, &ready_list);
                /* Loop around to see if anything else can be dispatched */
                continue;
            }

            pg_log_info("launching item %d %s %s",
                        next_work_item->dumpId,
                        next_work_item->desc, next_work_item->tag);

            /* Dispatch to some worker */
            DispatchJobForTocEntry(AH, pstate, next_work_item, ACT_RESTORE,
                                   mark_restore_job_done, &ready_list);
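
            /*
             * (mark_restore_job_done runs back in the leader once the worker
             * reports its status, reducing dependencies and moving any newly
             * unblocked entries into ready_list.)
             */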
        }
        else if (IsEveryWorkerIdle(pstate))
        {
            /*
             * Nothing is ready and no worker is running, so we're done with
             * the current pass or maybe with the whole process.
             */
            if (AH->restorePass == RESTORE_PASS_LAST)
                break;          /* No more parallel processing is possible */

            /* Advance to next restore pass */
            AH->restorePass++;

            /* That probably allows some stuff to be made ready */
            move_to_ready_list(pending_list, &ready_list, AH->restorePass);

            /* Loop around to see if anything's now ready */
            continue;
        }
        else
        {
            /*
             * We have nothing ready, but at least one child is working, so
             * wait for some subjob to finish.
             */
        }

        /*
         * Before dispatching another job, check to see if anything has
         * finished.  We should check every time through the loop so as to
         * reduce dependencies as soon as possible.  If we were unable to
         * dispatch any job this time through, wait until some worker finishes
         * (and, hopefully, unblocks some pending item).  If we did dispatch
         * something, continue as soon as there's at least one idle worker.
         * Note that in either case, there's guaranteed to be at least one
         * idle worker when we return to the top of the loop.  This ensures we
         * won't block inside DispatchJobForTocEntry, which would be
         * undesirable: we'd rather postpone dispatching until we see what's
         * been unblocked by finished jobs.
         */
        WaitForWorkers(AH, pstate,
                       next_work_item ? WFW_ONE_IDLE : WFW_GOT_STATUS);
    }

    /* There should now be nothing in ready_list. */
    Assert(ready_list.first_te > ready_list.last_te);
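    /* (first_te > last_te is how this array-based list represents empty.) */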

    ready_list_free(&ready_list);
pg_log_info("finished main parallel loop");
}
/*
* Main engine for parallel restore.
*
* Parallel restore is done in three phases. In this third phase,
* we mop up any remaining TOC entries by processing them serially.
* This phase normally should have nothing to do, but if we've somehow
* gotten stuck due to circular dependencies or some such, this provides
* at least some chance of completing the restore successfully.
*/
static void
restore_toc_entries_postfork(ArchiveHandle *AH, TocEntry *pending_list)
{
RestoreOptions *ropt = AH->public.ropt;
TocEntry *te;
pg_log_debug("entering restore_toc_entries_postfork");
/*
* Now reconnect the single parent connection.
*/
ConnectDatabase((Archive *) AH, &ropt->cparams, true);

/* re-establish fixed state */
_doSetFixedOutputState(AH);

/*
* Make sure there is no work left due to, say, circular dependencies, or
* some other pathological condition. If so, do it in the single parent
* connection. We don't sweat about RestorePass ordering; it's likely we
* already violated that.
*/
for (te = pending_list->pending_next; te != pending_list; te = te->pending_next)
{
pg_log_info("processing missed item %d %s %s",
te->dumpId, te->desc, te->tag);
(void) restore_toc_entry(AH, te, false);
}
}

/*
* Check if te1 has an exclusive lock requirement for an item that te2 also
* requires, whether or not te2's requirement is for an exclusive lock.
*/
static bool
has_lock_conflicts(TocEntry *te1, TocEntry *te2)
{
int j,
k;

for (j = 0; j < te1->nLockDeps; j++)
{
for (k = 0; k < te2->nDeps; k++)
{
if (te1->lockDeps[j] == te2->dependencies[k])
return true;
}
}
return false;
}
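
/*
 * A worked example with made-up dump IDs: if te1->lockDeps is {17}
 * (te1 must take an exclusive lock on item 17) and te2->dependencies
 * is {12, 17}, the scan above reports a conflict.  Note the asymmetry:
 * te2 merely has to depend on item 17; only te1's requirement needs to
 * be exclusive.
 */
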
/*
* Initialize the header of the pending-items list.
*
* This is a circular list with a dummy TocEntry as header, just like the
* main TOC list; but we use separate list links so that an entry can be in
* the main TOC list as well as in the pending list.
*/
static void
pending_list_header_init(TocEntry *l)
{
l->pending_prev = l->pending_next = l;
}

/* Append te to the end of the pending-list headed by l */
static void
pending_list_append(TocEntry *l, TocEntry *te)
{
te->pending_prev = l->pending_prev;
l->pending_prev->pending_next = te;
l->pending_prev = te;
te->pending_next = l;
}

/* Remove te from the pending-list */
static void
pending_list_remove(TocEntry *te)
{
te->pending_prev->pending_next = te->pending_next;
te->pending_next->pending_prev = te->pending_prev;
te->pending_prev = NULL;
te->pending_next = NULL;
}
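
/*
 * A minimal usage sketch for the three primitives above, assuming a
 * caller-owned dummy header (variable names here are illustrative only):
 *
 *		TocEntry	pending_list;
 *
 *		pending_list_header_init(&pending_list);	(now empty: links point to self)
 *		pending_list_append(&pending_list, te);		(te becomes the last element)
 *		pending_list_remove(te);					(te unlinked, its links set to NULL)
 */
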
/*
* Initialize the ready_list with enough room for up to tocCount entries.
*/
static void
ready_list_init(ParallelReadyList *ready_list, int tocCount)
{
ready_list->tes = (TocEntry **)
pg_malloc(tocCount * sizeof(TocEntry *));
ready_list->first_te = 0;
ready_list->last_te = -1;
ready_list->sorted = false;
}
/*
* Free storage for a ready_list.
*/
static void
ready_list_free(ParallelReadyList *ready_list)
{
pg_free(ready_list->tes);
}

/* Add te to the ready_list */
static void
ready_list_insert(ParallelReadyList *ready_list, TocEntry *te)
{
ready_list->tes[++ready_list->last_te] = te;
/* List is (probably) not sorted anymore. */
ready_list->sorted = false;
}

/* Remove the i'th entry in the ready_list */
static void
ready_list_remove(ParallelReadyList *ready_list, int i)
{
int f = ready_list->first_te;

Assert(i >= f && i <= ready_list->last_te);

/*
* In the typical case where the item to be removed is the first ready
* entry, we need only increment first_te to remove it. Otherwise, move
* the entries before it to compact the list. (This preserves sortedness,
* if any.) We could alternatively move the entries after i, but there
* are typically many more of those.
*/
if (i > f)
{
TocEntry **first_te_ptr = &ready_list->tes[f];

memmove(first_te_ptr + 1, first_te_ptr, (i - f) * sizeof(TocEntry *));
}
ready_list->first_te++;
}
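
/*
 * A worked example with illustrative values: given first_te = 2 and
 * last_te = 5, removing i = 3 memmove's one pointer, shifting old tes[2]
 * into slot 3 over the removed entry, then advances first_te to 3.  The
 * survivors (old tes[2], tes[4], tes[5]) keep their relative order, so
 * any existing sortedness is preserved.
 */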
/* Sort the ready_list into the desired order */
static void
ready_list_sort(ParallelReadyList *ready_list)
{
if (!ready_list->sorted)
{
int n = ready_list->last_te - ready_list->first_te + 1;

if (n > 1)
qsort(ready_list->tes + ready_list->first_te, n,
sizeof(TocEntry *),
TocEntrySizeCompare);
ready_list->sorted = true;
}
}

/* qsort comparator for sorting TocEntries by dataLength */
static int
TocEntrySizeCompare(const void *p1, const void *p2)
{
const TocEntry *te1 = *(const TocEntry *const *) p1;
const TocEntry *te2 = *(const TocEntry *const *) p2;

/* Sort by decreasing dataLength */
if (te1->dataLength > te2->dataLength)
return -1;
if (te1->dataLength < te2->dataLength)
return 1;

/* For equal dataLengths, sort by dumpId, just to be stable */
if (te1->dumpId < te2->dumpId)
return -1;
if (te1->dumpId > te2->dumpId)
return 1;
return 0;
}
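
/*
 * A minimal lifecycle sketch tying the ready_list primitives together
 * (an illustration only; the calls are not copied from any one caller):
 *
 *		ParallelReadyList ready_list;
 *
 *		ready_list_init(&ready_list, AH->tocCount);
 *		ready_list_insert(&ready_list, te);	(O(1) append; clears "sorted")
 *		ready_list_sort(&ready_list);		(qsort by decreasing dataLength)
 *		ready_list_remove(&ready_list, ready_list.first_te);
 *		ready_list_free(&ready_list);
 */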
/*
* Move all immediately-ready items from pending_list to ready_list.
*
* Items are considered ready if they have no remaining dependencies and
* they belong in the current restore pass. (See also reduce_dependencies,
* which applies the same logic one-at-a-time.)
*/
static void
move_to_ready_list(TocEntry *pending_list,
ParallelReadyList *ready_list,
RestorePass pass)
{
TocEntry *te;
TocEntry *next_te;
for (te = pending_list->pending_next; te != pending_list; te = next_te)
{
/* must save list link before possibly removing te from list */
next_te = te->pending_next;
if (te->depCount == 0 &&
_tocEntryRestorePass(te) == pass)
{
/* Remove it from pending_list ... */
pending_list_remove(te);
			/* ... and add to ready_list */
			ready_list_insert(ready_list, te);
		}
	}
}

/*
 * Find the next work item (if any) that is capable of being run now,
 * and remove it from the ready_list.
 *
 * Returns the item, or NULL if nothing is runnable.
 *
 * To qualify, the item must have no remaining dependencies
 * and no requirements for locks that are incompatible with
 * items currently running.  Items in the ready_list are known to have
 * no remaining dependencies, but we have to check for lock conflicts.
 */
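/*
 * Example: if the sorted ready_list holds an FK CONSTRAINT on table "a"
 * followed by an INDEX on table "b", and some worker is still running an
 * item that needs exclusive lock on "a", the constraint is passed over
 * because of the lock conflict and the INDEX is returned instead.  (The
 * table names here are hypothetical.)
 */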
static TocEntry *
pop_next_work_item(ParallelReadyList *ready_list,
				   ParallelState *pstate)
{
	/*
	 * Sort the ready_list so that we'll tackle larger jobs first.
	 */
	ready_list_sort(ready_list);

	/*
	 * Search the ready_list until we find a suitable item.
	 */
	for (int i = ready_list->first_te; i <= ready_list->last_te; i++)
	{
		TocEntry   *te = ready_list->tes[i];
		bool		conflicts = false;

		/*
		 * Check to see if the item would need exclusive lock on something
		 * that a currently running item also needs lock on, or vice versa.
		 * If so, we don't want to schedule them together.
		 */
		for (int k = 0; k < pstate->numWorkers; k++)
		{
			TocEntry   *running_te = pstate->te[k];

			if (running_te == NULL)
				continue;
			if (has_lock_conflicts(te, running_te) ||
				has_lock_conflicts(running_te, te))
			{
				conflicts = true;
				break;
			}
		}

		if (conflicts)
			continue;

		/* passed all tests, so this item can run */
		ready_list_remove(ready_list, i);
		return te;
	}
pg_log_debug("no item ready");
return NULL;
}
/*
 * Restore a single TOC item in parallel with others
 *
 * this is run in the worker, i.e. in a thread (Windows) or a separate process
 * (everything else). A worker process executes several such work items during
 * a parallel backup or restore.  Once we terminate here and report back that
 * our work is finished, the leader process will assign us a new work item.
 */
int
parallel_restore(ArchiveHandle *AH, TocEntry *te)
{
	int			status;

	Assert(AH->connection != NULL);

	/* Count only errors associated with this TOC entry */
	AH->public.n_errors = 0;

	/* Restore the TOC item */
	status = restore_toc_entry(AH, te, true);

	return status;
}

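/*
 * Leader-side sketch (simplified from the dispatch loop in
 * restore_toc_entries_parallel): ready items are popped and handed off to
 * idle workers, with mark_restore_job_done as the completion callback:
 *
 *		TocEntry   *next = pop_next_work_item(&ready_list, pstate);
 *
 *		if (next != NULL)
 *			DispatchJobForTocEntry(AH, next, ACT_RESTORE,
 *								   mark_restore_job_done, &ready_list);
 */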
/*
 * Callback function that's invoked in the leader process after a step has
 * been parallel restored.
 *
 * Update status and reduce the dependency count of any dependent items.
 */
static void
mark_restore_job_done(ArchiveHandle *AH,
					  TocEntry *te,
					  int status,
					  void *callback_data)
{
	ParallelReadyList *ready_list = (ParallelReadyList *) callback_data;

	pg_log_info("finished item %d %s %s",
				te->dumpId, te->desc, te->tag);

	if (status == WORKER_CREATE_DONE)
		mark_create_done(AH, te);
	else if (status == WORKER_INHIBIT_DATA)
	{
		inhibit_data_for_failed_table(AH, te);
		AH->public.n_errors++;
	}
	else if (status == WORKER_IGNORED_ERRORS)
		AH->public.n_errors++;
	else if (status != 0)
		pg_fatal("worker process failed: exit code %d",
				 status);

	reduce_dependencies(AH, te, ready_list);
}

/*
 * Process the dependency information into a form useful for parallel restore.
 *
 * This function takes care of fixing up some missing or badly designed
 * dependencies, and then prepares subsidiary data structures that will be
 * used in the main parallel-restore logic, including:
 * 1. We build the revDeps[] arrays of incoming dependency dumpIds.
 * 2. We set up depCount fields that are the number of as-yet-unprocessed
 * dependencies for each TOC entry.
 *
 * We also identify locking dependencies so that we can avoid trying to
 * schedule conflicting items at the same time.
 */
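/*
 * Worked example: if entry 7 lists dependencies {3, 5} and entry 8 lists
 * {5}, then after this function runs, revDeps of entry 3 is {7}, revDeps
 * of entry 5 is {7, 8}, depCount of entry 7 is 2, and depCount of entry 8
 * is 1.  (The dumpIds are made up for illustration.)
 */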
static void
fix_dependencies(ArchiveHandle *AH)
{
	TocEntry   *te;
	int			i;

	/*
	 * Initialize the depCount/revDeps/nRevDeps fields, and make sure the TOC
	 * items are marked as not being in any parallel-processing list.
	 */
	for (te = AH->toc->next; te != AH->toc; te = te->next)
	{
		te->depCount = te->nDeps;
		te->revDeps = NULL;
		te->nRevDeps = 0;
		te->pending_prev = NULL;
		te->pending_next = NULL;
	}

	/*
	 * POST_DATA items that are shown as depending on a table need to be
	 * re-pointed to depend on that table's data, instead.  This ensures they
	 * won't get scheduled until the data has been loaded.
	 */
	repoint_table_dependencies(AH);

	/*
	 * Pre-8.4 versions of pg_dump neglected to set up a dependency from BLOB
	 * COMMENTS to BLOBS.  Cope.  (We assume there's only one BLOBS and only
	 * one BLOB COMMENTS in such files.)
	 */
	if (AH->version < K_VERS_1_11)
	{
		for (te = AH->toc->next; te != AH->toc; te = te->next)
		{
			if (strcmp(te->desc, "BLOB COMMENTS") == 0 && te->nDeps == 0)
			{
				TocEntry   *te2;

				for (te2 = AH->toc->next; te2 != AH->toc; te2 = te2->next)
				{
					if (strcmp(te2->desc, "BLOBS") == 0)
					{
						te->dependencies = (DumpId *) pg_malloc(sizeof(DumpId));
						te->dependencies[0] = te2->dumpId;
						te->nDeps++;
						te->depCount++;
						break;
					}
				}
				break;
			}
		}
	}

	/*
	 * At this point we start to build the revDeps reverse-dependency arrays,
	 * so all changes of dependencies must be complete.
	 */

	/*
	 * Count the incoming dependencies for each item.  Also, it is possible
	 * that the dependencies list items that are not in the archive at all
	 * (that should not happen in 9.2 and later, but is highly likely in older
	 * archives).  Subtract such items from the depCounts.
	 */
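	/*
	 * E.g. an entry whose dependencies are {3, 5, 999}, where dumpId 999 is
	 * not present in this archive, ends up with depCount == 2: only the
	 * dependencies that can actually be tracked are counted.  (The dumpIds
	 * are made up for illustration.)
	 */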
	for (te = AH->toc->next; te != AH->toc; te = te->next)
	{
		for (i = 0; i < te->nDeps; i++)
		{
			DumpId		depid = te->dependencies[i];

			if (depid <= AH->maxDumpId && AH->tocsByDumpId[depid] != NULL)
				AH->tocsByDumpId[depid]->nRevDeps++;
			else
				te->depCount--;
		}
	}

	/*
	 * Allocate space for revDeps[] arrays, and reset nRevDeps so we can use
	 * it as a counter below.
	 */
	for (te = AH->toc->next; te != AH->toc; te = te->next)
	{
		if (te->nRevDeps > 0)
			te->revDeps = (DumpId *) pg_malloc(te->nRevDeps * sizeof(DumpId));
		te->nRevDeps = 0;
	}

	/*
	 * Build the revDeps[] arrays of incoming-dependency dumpIds.  This had
	 * better agree with the loops above.
	 */
	for (te = AH->toc->next; te != AH->toc; te = te->next)
	{
		for (i = 0; i < te->nDeps; i++)
		{
			DumpId		depid = te->dependencies[i];

			if (depid <= AH->maxDumpId && AH->tocsByDumpId[depid] != NULL)
			{
				TocEntry   *otherte = AH->tocsByDumpId[depid];

				otherte->revDeps[otherte->nRevDeps++] = te->dumpId;
			}
		}
	}

	/*
	 * Lastly, work out the locking dependencies.
	 */
	for (te = AH->toc->next; te != AH->toc; te = te->next)
	{
		te->lockDeps = NULL;
		te->nLockDeps = 0;
		identify_locking_dependencies(AH, te);
	}
}

/*
 * Change dependencies on table items to depend on table data items instead,
 * but only in POST_DATA items.
 *
 * Also, for any item having such dependency(s), set its dataLength to the
 * largest dataLength of the table data items it depends on.  This ensures
 * that parallel restore will prioritize larger jobs (index builds, FK
 * constraint checks, etc) over smaller ones, avoiding situations where we
 * end a restore with only one active job working on a large table.
 */
static void
repoint_table_dependencies(ArchiveHandle *AH)
{
	TocEntry   *te;
	int			i;
	DumpId		olddep;

	for (te = AH->toc->next; te != AH->toc; te = te->next)
	{
		if (te->section != SECTION_POST_DATA)
			continue;
		for (i = 0; i < te->nDeps; i++)
		{
			olddep = te->dependencies[i];
			if (olddep <= AH->maxDumpId &&
				AH->tableDataId[olddep] != 0)
			{
				DumpId		tabledataid = AH->tableDataId[olddep];
				TocEntry   *tabledatate = AH->tocsByDumpId[tabledataid];

				te->dependencies[i] = tabledataid;
				te->dataLength = Max(te->dataLength, tabledatate->dataLength);
pg_log_debug("transferring dependency %d -> %d to %d",
							 te->dumpId, olddep, tabledataid);
			}
		}
	}
}

/*
 * Identify which objects we'll need exclusive lock on in order to restore
 * the given TOC entry (*other* than the one identified by the TOC entry
 * itself).  Record their dump IDs in the entry's lockDeps[] array.
 */
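/*
 * For instance, an FK CONSTRAINT item typically depends on the TABLE DATA
 * items of both the constrained table and the referenced table; both dump
 * IDs then land in lockDeps[], so the constraint won't be scheduled while
 * any other item that locks either table is running.
 */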
static void
identify_locking_dependencies(ArchiveHandle *AH, TocEntry *te)
{
	DumpId	   *lockids;
	int			nlockids;
	int			i;

	/*
	 * We only care about this for POST_DATA items.  PRE_DATA items are not
	 * run in parallel, and DATA items are all independent by assumption.
	 */
	if (te->section != SECTION_POST_DATA)
		return;

	/* Quick exit if no dependencies at all */
	if (te->nDeps == 0)
		return;

	/*
	 * Most POST_DATA items are ALTER TABLEs or some moral equivalent of that,
	 * and hence require exclusive lock.  However, we know that CREATE INDEX
	 * does not.  (Maybe someday index-creating CONSTRAINTs will fall in that
	 * category too ... but today is not that day.)
	 */
	if (strcmp(te->desc, "INDEX") == 0)
		return;

	/*
	 * We assume the entry requires exclusive lock on each TABLE or TABLE DATA
	 * item listed among its dependencies.  Originally all of these would have
	 * been TABLE items, but repoint_table_dependencies would have repointed
	 * them to the TABLE DATA items if those are present (which they might not
	 * be, eg in a schema-only dump).  Note that all of the entries we are
	 * processing here are POST_DATA; otherwise there might be a significant
	 * difference between a dependency on a table and a dependency on its
	 * data, so that closer analysis would be needed here.
	 */
	lockids = (DumpId *) pg_malloc(te->nDeps * sizeof(DumpId));
	nlockids = 0;
	for (i = 0; i < te->nDeps; i++)
	{
		DumpId		depid = te->dependencies[i];

		if (depid <= AH->maxDumpId && AH->tocsByDumpId[depid] != NULL &&
			((strcmp(AH->tocsByDumpId[depid]->desc, "TABLE DATA") == 0) ||
			 strcmp(AH->tocsByDumpId[depid]->desc, "TABLE") == 0))
			lockids[nlockids++] = depid;
	}

	if (nlockids == 0)
	{
		free(lockids);
		return;
	}

	te->lockDeps = pg_realloc(lockids, nlockids * sizeof(DumpId));
	te->nLockDeps = nlockids;
}

/*
 * Remove the specified TOC entry from the depCounts of items that depend on
 * it, thereby possibly making them ready-to-run.  Any pending item that
 * becomes ready should be moved to the ready_list, if that's provided.
 */
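/*
 * Example: when the TABLE DATA item for some table finishes, each POST_DATA
 * item whose last remaining dependency was that data item drops to a
 * depCount of zero and, if it belongs to the current restore pass and is
 * still on the pending list, gets moved into the ready_list.
 */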
static void
reduce_dependencies(ArchiveHandle *AH, TocEntry *te,
					ParallelReadyList *ready_list)
{
	int			i;

pg_log_debug("reducing dependencies for %d", te->dumpId);
for (i = 0; i < te->nRevDeps; i++)
{
TocEntry *otherte = AH->tocsByDumpId[te->revDeps[i]];
		Assert(otherte->depCount > 0);
		otherte->depCount--;

		/*
		 * It's ready if it has no remaining dependencies, and it belongs in
		 * the current restore pass, and it is currently a member of the
		 * pending list (that check is needed to prevent double restore in
		 * some cases where a list-file forces out-of-order restoring).
		 * However, if ready_list == NULL then caller doesn't want any list
		 * memberships changed.
*/
if (otherte->depCount == 0 &&
_tocEntryRestorePass(otherte) == AH->restorePass &&
otherte->pending_prev != NULL &&
ready_list != NULL)
{
/* Remove it from pending list ... */
pending_list_remove(otherte);
/* ... and add to ready_list */
ready_list_insert(ready_list, otherte);
}
}
}
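/*
 * Illustrative sketch, not part of the original file: membership in the
 * pending list is tracked by threading a doubly-linked list through the
 * TocEntrys themselves (hence the otherte->pending_prev != NULL test
 * above), so unlinking an entry is O(1).  Under that assumption, the
 * removal done by pending_list_remove() amounts to:
 *
 *		te->pending_prev->pending_next = te->pending_next;
 *		te->pending_next->pending_prev = te->pending_prev;
 *		te->pending_prev = te->pending_next = NULL;
 */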
/*
* Set the created flag on the DATA member corresponding to the given
* TABLE member
*/
static void
mark_create_done(ArchiveHandle *AH, TocEntry *te)
{
if (AH->tableDataId[te->dumpId] != 0)
{
TocEntry   *ted = AH->tocsByDumpId[AH->tableDataId[te->dumpId]];

ted->created = true;
}
}
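/*
 * Worked example (dump IDs are hypothetical, for illustration only): if
 * the TABLE entry for "foo" has dumpId 10 and its TABLE DATA entry has
 * dumpId 11, then AH->tableDataId[10] == 11 and AH->tocsByDumpId[11] is
 * the data entry whose created flag gets set here.  A zero in
 * tableDataId[] means the table has no data member in this archive, so
 * there is nothing to mark.
 */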
/*
* Mark the DATA member corresponding to the given TABLE member
* as not wanted
*/
static void
inhibit_data_for_failed_table(ArchiveHandle *AH, TocEntry *te)
{
pg_log_info("table \"%s\" could not be created, will not restore its data",
te->tag);
if (AH->tableDataId[te->dumpId] != 0)
{
TocEntry   *ted = AH->tocsByDumpId[AH->tableDataId[te->dumpId]];

ted->reqs = 0;
}
}
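/*
 * Added commentary: reqs is the bitmask recording what the restore still
 * wants from an entry (schema and/or data).  Zeroing it means later
 * "is this item required?" tests find nothing to do, so the data load is
 * silently skipped rather than failing against a table that was never
 * created.
 */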
/*
* Clone and de-clone routines used in parallel restoration.
*
* Enough of the structure is cloned to ensure that there is no
* conflict between different threads each with their own clone.
*/
ArchiveHandle *
CloneArchive(ArchiveHandle *AH)
{
ArchiveHandle *clone;

/* Make a "flat" copy */
clone = (ArchiveHandle *) pg_malloc(sizeof(ArchiveHandle));
memcpy(clone, AH, sizeof(ArchiveHandle));
/* Handle format-independent fields */
memset(&(clone->sqlparse), 0, sizeof(clone->sqlparse));
/* The clone will have its own connection, so disregard connection state */
clone->connection = NULL;
clone->connCancel = NULL;
clone->currUser = NULL;
clone->currSchema = NULL;
clone->currTableAm = NULL;
clone->currTablespace = NULL;
/* savedPassword must be local in case we change it while connecting */
if (clone->savedPassword)
clone->savedPassword = pg_strdup(clone->savedPassword);
/* clone has its own error count, too */
clone->public.n_errors = 0;
/*
* Connect our new clone object to the database, using the same connection
* parameters used for the original connection.
*/
ConnectDatabase((Archive *) clone, &clone->public.ropt->cparams, true);
/* re-establish fixed state */
if (AH->mode == archModeRead)
_doSetFixedOutputState(clone);
/* in write case, setupDumpWorker will fix up connection state */
/* Let the format-specific code have a chance too */
clone->ClonePtr(clone);
Assert(clone->connection != NULL);
return clone;
}
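/*
 * Hypothetical usage sketch for a parallel worker; masterAH/myAH are
 * placeholder names, while DisconnectDatabase() is the existing helper
 * from pg_backup_db.c:
 *
 *		ArchiveHandle *myAH = CloneArchive(masterAH);
 *		... process assigned TOC entries over myAH's private connection ...
 *		DisconnectDatabase(&myAH->public);
 *		DeCloneArchive(myAH);
 *
 * The clone must be disconnected before DeCloneArchive() is called; see
 * the note below.
 */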
/*
* Release clone-local storage.
*
* Note: we assume any clone-local connection was already closed.
*/
void
DeCloneArchive(ArchiveHandle *AH)
{
/* Should not have an open database connection */
Assert(AH->connection == NULL);
/* Clear format-specific state */
AH->DeClonePtr(AH);
/* Clear state allocated by CloneArchive */
if (AH->sqlparse.curCmd)
destroyPQExpBuffer(AH->sqlparse.curCmd);
/* Clear any connection-local state */
free(AH->currUser);
free(AH->currSchema);
free(AH->currTablespace);
free(AH->currTableAm);
free(AH->savedPassword);
free(AH);
}
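/*
 * Added note: only the state that CloneArchive() duplicated, or that the
 * clone accumulated on its own (the SQL-parser buffer, connection-local
 * strings, savedPassword), is freed here before the struct itself; fields
 * that are shared with the original ArchiveHandle are left untouched.
 */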