Commit Graph

376 Commits

Author SHA1 Message Date
Tomas Vondra e9960732a9 Introduce a generic pg_dump compression API
Switch pg_dump to use the Compression API, implemented by bf9aa490db.

The CompressFileHandle replaces the cfp* family of functions with a
struct of callbacks for accessing (compressed) files. This allows adding
new compression methods simply by introducing a new struct instance with
appropriate implementation of the callbacks.

Archives compressed using custom compression methods store an identifier
of the compression algorithm in their header instead of the compression
level. The header version is bumped.

Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Rachel Heaton, Justin Pryzby, Tomas Vondra
Discussion: https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
2023-02-23 18:33:40 +01:00
Tomas Vondra 03d02f54a6 Prepare pg_dump internals for additional compression methods
Commit bf9aa490db introduced a compression API in compress_io.{c,h} to
make reuse easier, and allow adding more compression algorithms.
However, pg_backup_archiver.c was not switched to this API and continued
to call the compression directly.

This commit teaches pg_backup_archiver.c about the compression API, so
that it can benefit from bf9aa490db (simpler code, easier addition of
new compression methods).

Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Rachel Heaton, Justin Pryzby, Tomas Vondra
Discussion: https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
2023-02-23 15:38:40 +01:00
Peter Eisentraut aa69541046 Remove useless casts to (void *) in arguments of some system functions
The affected functions are: bsearch, memcmp, memcpy, memset, memmove,
qsort, repalloc

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/fd9adf5d-b1aa-e82f-e4c7-263c30145807%40enterprisedb.com
2023-02-07 06:57:59 +01:00
Michael Paquier 783d8abc3b Fix behavior with pg_restore -l and compressed dumps
pg_restore -l has always been able to read the TOC data of a dump even
if its binary has no support for compression, for both compressed and
uncompressed dumps.  5e73a60 has introduced a backward-incompatible
behavior by switching a warning to a hard error in the code path reading
the header data of a dump, preventing the TOC items to be listed even if
pg_restore -l, with no support for compression, is used on a compressed
dump.  Most modern systems should have support for zlib, but it can be
also possible that somebody relies on the past behavior when copying
over a dump where binaries are not built with zlib support (most likely
some WIN32 flavors these days, though most environments should provide
that).

There is no easy way to have a regression test for this pattern, as it
requires a mix of dump/restore commands with different compilation
options, with and without compression.  One possibility I see here would
be to have a command-line option that enforces a non-compression check
for a build that supports compression, but that does not seem worth the
cost, either.

Reported-by: Justin Pryzby
Author: Georgios Kokolatos
Discussion: https://postgr.es/m/20230125180020.GF22427@telsasoft.com
2023-01-27 10:19:50 +09:00
Tom Lane 74739d1d3f Avoid harmless warning from pg_dump --if-exists mode.
If the public schema has a non-default owner (perhaps due to
dropping and recreating it) then use of pg_dump's "--if-exists"
option results in a warning message:

warning: could not find where to insert IF EXISTS in statement "-- *not* dropping schema, since initdb creates it"

This is harmless since the dump output is the same either way,
but nonetheless it's undesirable.  It's the fault of commit
a7a7be1f2, which created situations where a TOC entry's "defn"
or "dropStmt" fields could be just comments.  Although that
commit fixed up the kluges in pg_backup_archiver.c that munge defn
strings, it missed doing so for the one that munges dropStmts.

Per bug# 17753 from Justin Zhang.

Discussion: https://postgr.es/m/17753-9c8773631747ee1c@postgresql.org
2023-01-19 19:32:50 -05:00
Peter Eisentraut 35ce24c333 pg_dump: Remove "blob" terminology
For historical reasons, pg_dump refers to large objects as "BLOBs".
This term is not used anywhere else in PostgreSQL, and it also means
something different in the SQL standard and other SQL systems.

This patch renames internal functions, code comments, documentation,
etc. to use the "large object" or "LO" terminology instead.  There is
no functionality change, so the archive format still uses the name
"BLOB" for the archive entry.  Additional long command-line options
are added with the new naming.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/868a381f-4650-9460-1726-1ffd39a270b4%40enterprisedb.com
2022-12-05 08:52:55 +01:00
Michael Paquier 5e73a60488 Switch pg_dump to use compression specifications
Compression specifications are currently used by pg_basebackup and
pg_receivewal, and are able to let the user control in an extended way
the method and level of compression used.  As an effect of this commit,
pg_dump's -Z/--compress is now able to use more than just an integer, as
of the grammar "method[:detail]".

The method can be either "none" or "gzip", and can optionally take a
detail string.  If the detail string is only an integer, it defines the
compression level.  A comma-separated list of keywords can also be used
method allows for more options, the only keyword supported now is
"level".

The change is backward-compatible, hence specifying only an integer
leads to no compression for a level of 0 and gzip compression when the
level is greater than 0.

Most of the code changes are straight-forward, as pg_dump was relying on
an integer tracking the compression level to check for gzip or no
compression.  These are changed to use a compression specification and
the algorithm stored in it.

As of this change, note that the dump format is not bumped because there
is no need yet to track the compression algorithm in the TOC entries.
Hence, we still rely on the compression level to make the difference
when reading them.  This will be mandatory once a new compression method
is added, though.

In order to keep the code simpler when parsing the compression
specification, the code is changed so as pg_dump now fails hard when
using gzip on -Z/--compress without its support compiled, rather than
enforcing no compression without the user knowing about it except
through a warning.  Like before this commit, archive and custom formats
are compressed by default when the code is compiled with gzip, and left
uncompressed without gzip.

Author: Georgios Kokolatos
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/O4mutIrCES8ZhlXJiMvzsivT7ztAMja2lkdL1LJx6O5f22I2W8PBIeLKz7mDLwxHoibcnRAYJXm1pH4tyUNC4a8eDzLn22a6Pb1S74Niexg=@pm.me
2022-12-02 10:45:02 +09:00
Peter Eisentraut 3655b46aa0 pg_dump: Refactor code that constructs ALTER ... OWNER TO commands
Avoid having to list all the possible object types twice.  Instead,
only _getObjectDescription() needs to know about specific object
types.  It communicates back to _printTocEntry() whether an owner is
to be set.

In passing, remove the logic to use ALTER TABLE to set the owner of
views and sequences.  This is no longer necessary.  Furthermore, if
pg_dump doesn't recognize the object type, this is now a fatal error,
not a warning.

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0a00f923-599a-381b-923f-0d802a727715@enterprisedb.com
2022-11-02 17:24:38 -04:00
Peter Eisentraut 02c408e21a Remove redundant null pointer checks before free()
Per applicable standards, free() with a null pointer is a no-op.
Systems that don't observe that are ancient and no longer relevant.
Some PostgreSQL code already required this behavior, so this change
does not introduce any new requirements, just makes the code more
consistent.

Discussion: https://www.postgresql.org/message-id/flat/dac5d2d0-98f5-94d9-8e69-46da2413593d%40enterprisedb.com
2022-07-03 11:47:15 +02:00
Tom Lane 23e7b38bfe Pre-beta mechanical code beautification.
Run pgindent, pgperltidy, and reformat-dat-files.
I manually fixed a couple of comments that pgindent uglified.
2022-05-12 15:17:30 -04:00
Tom Lane 9a374b77fb Improve frontend error logging style.
Get rid of the separate "FATAL" log level, as it was applied
so inconsistently as to be meaningless.  This mostly involves
s/pg_log_fatal/pg_log_error/g.

Create a macro pg_fatal() to handle the common use-case of
pg_log_error() immediately followed by exit(1).  Various
modules had already invented either this or equivalent macros;
standardize on pg_fatal() and apply it where possible.

Invent the ability to add "detail" and "hint" messages to a
frontend message, much as we have long had in the backend.

Except where rewording was needed to convert existing coding
to detail/hint style, I have (mostly) resisted the temptation
to change existing message wording.

Patch by me.  Design and patch reviewed at various stages by
Robert Haas, Kyotaro Horiguchi, Peter Eisentraut and
Daniel Gustafsson.

Discussion: https://postgr.es/m/1363732.1636496441@sss.pgh.pa.us
2022-04-08 14:55:14 -04:00
Daniel Gustafsson 1a29217a00 Free temporary memory when reading TOC
ReadStr returns allocated memory which the caller is responsible for
freeing when done with the string. This commit ensures that memory is
freed in one case which used ReadStr in a conditional. While the leak
might not be too concerning, this makes the code consistent across all
ReadStr callsites in ReadToc. Due to the lack of complaints of issues
in production from this, no backpatch is performed at this point.

Author: Bharath Rupireddy, Georgios Kokolatos
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/oZwKiUxFsVaetG2xOJp7Hwao8F1AKIdfFDQLNJrnwoaxmjyB-45r_aYmhgXHKLcMI3GT24m9L6HafSi2ns7WFxXe0mw2_tIJpD-Z3vb_eyI=@pm.me
2022-02-09 14:12:55 +01:00
Michael Paquier 2158628864 Add support for --no-table-access-method in pg_{dump,dumpall,restore}
The logic is similar to default_tablespace in some ways, so as no SET
queries on default_table_access_method are generated before dumping or
restoring an object (table or materialized view support table AMs) when
specifying this new option.

This option is useful to enforce the use of a default access method even
if some tables included in a dump use an AM different than the system's
default.

There are already two cases in the TAP tests of pg_dump with a table and
a materialized view that use a non-default table AM, and these are
extended that the new option does not generate SET clauses on
default_table_access_method.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20211207153930.GR17618@telsasoft.com
2022-01-17 14:51:46 +09:00
Tom Lane 30e7c175b8 Remove pg_dump/pg_dumpall support for dumping from pre-9.2 servers.
Per discussion, we'll limit support for old servers to those branches
that can still be built easily on modern platforms, which as of now
is 9.2 and up.  Remove over a thousand lines of code dedicated to
dumping from older server versions.  (As in previous changes of
this sort, we aren't removing pg_restore's ability to read older
archive files ... though it's fair to wonder how that might be
tested nowadays.)  This cleans up some dead code left behind by
commit 989596152.

Discussion: https://postgr.es/m/2923349.1634942313@sss.pgh.pa.us
2021-12-14 17:09:07 -05:00
Tom Lane 3cac2c8caa Handle close() failures more robustly in pg_dump and pg_basebackup.
Coverity complained that applying get_gz_error after a failed gzclose,
as we did in one place in pg_basebackup, is unsafe.  I think it's
right: it's entirely likely that the call is touching freed memory.
Change that to inspect errno, as we do for other gzclose calls.

Also, be careful to initialize errno to zero immediately before any
gzclose() call where we care about the error status.  (There are
some calls where we don't, because we already failed at some previous
step.)  This ensures that we don't get a misleadingly irrelevant
error code if gzclose() fails in a way that doesn't set errno.
We could work harder at that, but it looks to me like all such cases
are basically can't-happen if we're not misusing zlib, so it's
not worth the extra notational cruft that would be required.

Also, fix several places that simply failed to check for close-time
errors at all, mostly at some remove from the close or gzclose itself;
and one place that did check but didn't bother to report the errno.

Back-patch to v12.  These mistakes are older than that, but between
the frontend logging API changes that happened in v12 and the fact
that frontend code can't rely on %m before that, the patch would need
substantial revision to work in older branches.  It doesn't quite
seem worth the trouble given the lack of related field complaints.

Patch by me; thanks to Michael Paquier for review.

Discussion: https://postgr.es/m/1343113.1636489231@sss.pgh.pa.us
2021-11-17 13:08:25 -05:00
Amit Kapila 5a2832465f Allow publishing the tables of schema.
A new option "FOR ALL TABLES IN SCHEMA" in Create/Alter Publication allows
one or more schemas to be specified, whose tables are selected by the
publisher for sending the data to the subscriber.

The new syntax allows specifying both the tables and schemas. For example:
CREATE PUBLICATION pub1 FOR TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2;
OR
ALTER PUBLICATION pub1 ADD TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2;

A new system table "pg_publication_namespace" has been added, to maintain
the schemas that the user wants to publish through the publication.
Modified the output plugin (pgoutput) to publish the changes if the
relation is part of schema publication.

Updates pg_dump to identify and dump schema publications. Updates the \d
family of commands to display schema publications and \dRp+ variant will
now display associated schemas if any.

Author: Vignesh C, Hou Zhijie, Amit Kapila
Syntax-Suggested-by: Tom Lane, Alvaro Herrera
Reviewed-by: Greg Nancarrow, Masahiko Sawada, Hou Zhijie, Amit Kapila, Haiying Tang, Ajin Cherian, Rahila Syed, Bharath Rupireddy, Mark Dilger
Tested-by: Haiying Tang
Discussion: https://www.postgresql.org/message-id/CALDaNm0OANxuJ6RXqwZsM1MSY4s19nuH3734j4a72etDwvBETQ@mail.gmail.com
2021-10-27 07:44:52 +05:30
Tom Lane 70bef49400 Fix minor memory leaks in pg_dump.
I found these by running pg_dump under "valgrind --leak-check=full".

The changes in flagInhIndexes() and getIndexes() replace allocation of
an array of which we use only some elements by individual allocations
of just the actually-needed objects.  The previous coding wasted some
memory, but more importantly it confused valgrind's leak tracking.

collectComments() and collectSecLabels() remain major blots on
the valgrind report, because they don't PQclear their query
results, in order to avoid a lot of strdup's.  That's a dubious
tradeoff, but I'll leave it alone here; an upcoming patch will
modify those functions enough to justify changing the tradeoff.
2021-10-24 12:38:26 -04:00
Noah Misch a7a7be1f2f Dump public schema ownership and security labels.
As a side effect, this corrects dumps of public schema ACLs in databases
where the DBA dropped and recreated that schema.

Reviewed by Asif Rehman.

Discussion: https://postgr.es/m/20201229134924.GA1431748@rfd.leadboat.com
2021-06-28 18:34:55 -07:00
Tom Lane f807e3410f Work around portability issue with newer versions of mktime().
Recent glibc versions have made mktime() fail if tm_isdst is
inconsistent with the prevailing timezone; in particular it fails for
tm_isdst = 1 when the zone is UTC.  (This seems wildly inconsistent
with the POSIX-mandated treatment of "incorrect" values for the other
fields of struct tm, so if you ask me it's a bug, but I bet they'll
say it's intentional.)  This has been observed to cause cosmetic
problems when pg_restore'ing an archive created in a different
timezone.

To fix, do mktime() using the field values from the archive, and if
that fails try again with tm_isdst = -1.  This will give a result
that's off by the UTC-offset difference from the original zone, but
that was true before, too.  It's not terribly critical since we don't
do anything with the result except possibly print it.  (Someday we
should flush this entire bit of logic and record a standard-format
timestamp in the archive instead.  That's not okay for a back-patched
bug fix, though.)

Also, guard our only other use of mktime() by having initdb's
build_time_t() set tm_isdst = -1 not 0.  This case could only have
an issue in zones that are DST year-round; but I think some do exist,
or could in future.

Per report from Wells Oliver.  Back-patch to all supported
versions, since any of them might need to run with a newer glibc.

Discussion: https://postgr.es/m/CAOC+FBWDhDHO7G-i1_n_hjRzCnUeFO+H-Czi1y10mFhRWpBrew@mail.gmail.com
2021-06-13 14:32:42 -04:00
Tom Lane e6241d8e03 Rethink definition of pg_attribute.attcompression.
Redefine '\0' (InvalidCompressionMethod) as meaning "if we need to
compress, use the current setting of default_toast_compression".
This allows '\0' to be a suitable default choice regardless of
datatype, greatly simplifying code paths that initialize tupledescs
and the like.  It seems like a more user-friendly approach as well,
because now the default compression choice doesn't migrate into table
definitions, meaning that changing default_toast_compression is
usually sufficient to flip an installation's behavior; one needn't
tediously issue per-column ALTER SET COMPRESSION commands.

Along the way, fix a few minor bugs and documentation issues
with the per-column-compression feature.  Adopt more robust
APIs for SetIndexStorageProperties and GetAttributeCompression.

Bump catversion because typical contents of attcompression will now
be different.  We could get away without doing that, but it seems
better to ensure v14 installations all agree on this.  (We already
forced initdb for beta2, anyway.)

Discussion: https://postgr.es/m/626613.1621787110@sss.pgh.pa.us
2021-05-27 13:24:27 -04:00
Tom Lane ec03f2df17 Fix pg_restore's misdesigned code for detecting archive file format.
Despite the clear comments pointing out that the duplicative code
segments in ReadHead() and _discoverArchiveFormat() needed to be
in sync, they were not: the latter did not bother to apply any of
the sanity checks in the former.  We'd missed noticing this partly
because none of those checks would fail in scenarios we customarily
test, and partly because the oversight would be masked if both
segments execute, which they would in cases other than needing to
autodetect the format of a non-seekable stdin source.  However,
in a case meeting all these requirements --- for example, trying
to read a newer-than-supported archive format from non-seekable
stdin --- pg_restore missed applying the version check and would
likely dump core or otherwise misbehave.

The whole thing is silly anyway, because there seems little reason
to duplicate the logic beyond the one-line verification that the
file starts with "PGDMP".  There seems to have been an undocumented
assumption that multiple major formats (major enough to require
separate reader modules) would nonetheless share the first half-dozen
fields of the custom-format header.  This seems unlikely, so let's
fix it by just nuking the duplicate logic in _discoverArchiveFormat().

Also get rid of the pointless attempt to seek back to the start of
the file after successful autodetection.  That wastes cycles and
it means we have four behaviors to verify not two.

Per bug #16951 from Sergey Koposov.  This has been broken for
decades, so back-patch to all supported versions.

Discussion: https://postgr.es/m/16951-a4dd68cf0de23048@postgresql.org
2021-04-01 13:34:16 -04:00
Tom Lane aa25d1089a Fix up pg_dump's handling of per-attribute compression options.
The approach used in commit bbe0a81db would've been disastrous for
portability of dumps.  Instead handle non-default compression options
in separate ALTER TABLE commands.  This reduces chatter for the
common case where most columns are compressed the same way, and it
makes it possible to restore the dump to a server that lacks any
knowledge of per-attribute compression options (so long as you're
willing to ignore syntax errors from the ALTER TABLE commands).

There's a whole lot left to do to mop up after bbe0a81db, but
I'm fast-tracking this part because we need to see if it's
enough to make the buildfarm's cross-version-upgrade tests happy.

Justin Pryzby and Tom Lane

Discussion: https://postgr.es/m/20210119190720.GL8560@telsasoft.com
2021-03-20 15:01:10 -04:00
Peter Eisentraut d2a2808eb4 pg_dump: Don't use enums for defining bit mask values
This usage would mean that values of the enum type are potentially not
one of the enum values.  Use macros instead, like everywhere else.

Discussion: https://www.postgresql.org/message-id/14dde730-1d34-260e-fa9d-7664df2d6313@enterprisedb.com
2020-12-11 19:15:30 +01:00
Tom Lane 929c69aa19 In pg_restore's dump_lo_buf(), work a little harder on error handling.
Failure to write data to a large object during restore led to an ugly
and uninformative error message.  To add insult to injury, it then
fatal'd out, where other SQL-level errors usually result in pressing on.

Report the underlying error condition, rather than just giving not-very-
useful byte counts, and use warn_or_exit_horribly() so as to adhere to
pg_restore's general policy about whether to continue or not.

Also recognize that lo_write() returns int not size_t.

Per report from Justin Pryzby, though I didn't use his patch.
Given the lack of comparable complaints, I'm not sure this is
worth back-patching.

Discussion: https://postgr.es/m/20201018010232.GF9241@telsasoft.com
2020-10-18 12:26:02 -04:00
Tom Lane a45bc8a4f6 Fix handling of -d "connection string" in pg_dump/pg_restore.
Parallel pg_dump failed if its -d parameter was a connection string
containing any essential information other than host, port, or username.
The same was true for pg_restore with --create.

The reason is that these scenarios failed to preserve the connection
string from the command line; the code felt free to replace that with
just the database name when reconnecting from a pg_dump parallel worker
or after creating the target database.  By chance, parallel pg_restore
did not suffer this defect, as long as you didn't say --create.

In practice it seems that the error would be obvious only if the
connstring included essential, non-default SSL or GSS parameters.
This may explain why it took us so long to notice.  (It also makes
it very difficult to craft a regression test case illustrating the
problem, since the test would fail in builds without those options.)

Fix by refactoring so that ConnectDatabase always receives all the
relevant options directly from the command line, rather than
reconstructed values.  Inject a different database name, when necessary,
by relying on libpq's rules for handling multiple "dbname" parameters.

While here, let's get rid of the essentially duplicate _connectDB
function, as well as some obsolete nearby cruft.

Per bug #16604 from Zsolt Ero.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org
2020-09-24 18:19:38 -04:00
Tom Lane 2e3c19462d Simplify SortTocFromFile() by removing fixed buffer-size limit.
pg_restore previously coped with overlength TOC-file lines using some
complicated logic to ignore additional bufferloads.  While this isn't
wrong, since we don't expect that the interesting part of a line would
run to more than a dozen or so bytes, it's more complex than it needs
to be.  Use a StringInfo instead of a fixed-size buffer so that we can
process long lines as single entities and thus not need the extra
logic.

Daniel Gustafsson

Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
2020-09-22 16:03:32 -04:00
Peter Eisentraut 8354e7b27e Remove unused parameters
Remove various unused parameters in pg_dump code.  These have all
become unused over time or were never used.

Discussion: https://www.postgresql.org/message-id/flat/511bb100-f829-ba21-2f10-9f952ec06ead%402ndquadrant.com
2020-09-19 13:29:54 +02:00
Tom Lane 99175141c9 Improve common/logging.c's support for multiple verbosity levels.
Instead of hard-wiring specific verbosity levels into the option
processing of client applications, invent pg_logging_increase_verbosity()
and encourage clients to implement --verbose by calling that.  Then,
the common convention that more -v's gets you more verbosity just works.

In particular, this allows resurrection of the debug-grade messages that
have long existed in pg_dump and its siblings.  They were unreachable
before this commit due to lack of a way to select PG_LOG_DEBUG logging
level.  (It appears that they may have been unreachable for some time
before common/logging.c was introduced, too, so I'm not specifically
blaming cc8d41511 for the oversight.  One reason for thinking that is
that it's now apparent that _allocAH()'s message needs a null-pointer
guard.  Testing might have failed to reveal that before 96bf88d52.)

Discussion: https://postgr.es/m/1173106.1600116625@sss.pgh.pa.us
2020-09-17 12:52:18 -04:00
Andres Freund e07633646a code: replace 'master' with 'leader' where appropriate.
Leader already is the more widely used terminology, but a few places
didn't get the message.

Author: Andres Freund
Reviewed-By: David Steele
Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de
2020-07-08 12:58:32 -07:00
Michael Paquier aaf8c99050 Fix typos and some format mistakes in comments
Author: Justin Pryzby
Discussion: https://postgr.es/m/20200612023709.GC14879@telsasoft.com
2020-06-12 21:05:10 +09:00
Tom Lane a9d70c1087 Fix pg_dump/pg_restore to restore event trigger comments later.
Repair an oversight in commit 8728b2c70: if we're postponing restore
of event triggers to the end, we must also postpone restoring any
comments on them, since of course we cannot create the comments first.
(This opens yet another opportunity for an event trigger to bollix
the restore, but there's no help for that.)

Per bug #16346 from Alexander Lakhin.

Like the previous commit, back-patch to all supported branches.

Hamid Akhtar and Tom Lane

Discussion: https://postgr.es/m/16346-6210ad7a0ea81be1@postgresql.org
2020-04-08 11:23:39 -04:00
Tom Lane 8728b2c703 Fix pg_dump/pg_restore to restore event triggers later.
Previously, event triggers were restored just after regular triggers
(and FK constraints, which are basically triggers).  This is risky
since an event trigger, once installed, could interfere with subsequent
restore commands.  Worse, because event triggers don't have any
particular dependencies on any post-data objects, a parallel restore
would consider them eligible to be restored the moment the post-data
phase starts, allowing them to also interfere with restoration of a
whole bunch of objects that would have been restored before them in
a serial restore.  There's no way to completely remove the risk of a
misguided event trigger breaking the restore, since if nothing else
it could break other event triggers.  But we can certainly push them
to later in the process to minimize the hazard.

To fix, tweak the RestorePass mechanism introduced by commit 3eb9a5e7c
so that event triggers are handled as part of the post-ACL processing
pass (renaming the "REFRESH" pass to "POST_ACL" to reflect its more
general use).  This will cause them to restore after everything except
matview refreshes, which seems OK since matview refreshes really ought
to run in the post-restore state of the database.  In a parallel
restore, event triggers and matview refreshes might be intermixed,
but that seems all right as well.

Also update the code and comments in pg_dump_sort.c so that its idea
of how things are sorted agrees with what actually happens due to
the RestorePass mechanism.  This is mostly cosmetic: it'll affect the
order of objects in a dump's TOC, but not the actual restore order.
But not changing that would be quite confusing to somebody reading
the code.

Back-patch to all supported branches.

Fabrízio de Royes Mello, tweaked a bit by me

Discussion: https://postgr.es/m/CAFcNs+ow1hmFox8P--3GSdtwz-S3Binb6ZmoP6Vk+Xg=K6eZNA@mail.gmail.com
2020-03-09 14:58:26 -04:00
Tom Lane 799d22461a Assume that we have functional, 64-bit fseeko()/ftello().
Windows has this, and so do all other live platforms according to the
buildfarm, so remove the configure probe and src/port/ substitution.

Keep the probe that detects whether _LARGEFILE_SOURCE has to be
defined to get that, though ... that seems to be still relevant in
some places.

This is part of a series of commits to get rid of no-longer-relevant
configure checks and dead src/port/ code.  I'm committing them separately
to make it easier to back out individual changes if they prove less
portable than I expect.

Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
2020-02-21 14:30:47 -05:00
Alvaro Herrera 3974c4a724 Remove useless "return;" lines
Discussion: https://postgr.es/m/20191128144653.GA27883@alvherre.pgsql
2019-11-28 16:48:37 -03:00
Amit Kapila dddf4cdc33 Make the order of the header file includes consistent in non-backend modules.
Similar to commit 7e735035f2, this commit makes the order of header file
inclusion consistent for non-backend modules.

In passing, fix the case where we were using angle brackets (<>) for the
local module includes instead of quotes ("").

Author: Vignesh C
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
2019-10-25 07:41:52 +05:30
Michael Paquier 66bde49d96 Fix inconsistencies and typos in the tree, take 10
This addresses some issues with unnecessary code comments, fixes various
typos in docs and comments, and removes some orphaned structures and
definitions.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com
2019-08-13 13:53:41 +09:00
David Rowley 8abc13a889 Use appendStringInfoString and appendPQExpBufferStr where possible
This changes various places where appendPQExpBuffer was used in places
where it was possible to use appendPQExpBufferStr, and likewise for
appendStringInfo and appendStringInfoString.  This is really just a
stylistic improvement, but there are also small performance gains to be
had from doing this.

Discussion: http://postgr.es/m/CAKJS1f9P=M-3ULmPvr8iCno8yvfDViHibJjpriHU8+SXUgeZ=w@mail.gmail.com
2019-07-04 13:01:13 +12:00
Tom Lane 8255c7a5ee Phase 2 pgindent run for v12.
Switch to 2.1 version of pg_bsd_indent.  This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.

Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
2019-05-22 13:04:48 -04:00
Tom Lane be76af171c Initial pgindent run for v12.
This is still using the 2.0 version of pg_bsd_indent.
I thought it would be good to commit this separately,
so as to document the differences between 2.0 and 2.1 behavior.

Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
2019-05-22 12:55:34 -04:00
Tom Lane fc9a62af3f Move logging.h and logging.c from src/fe_utils/ to src/common/.
The original placement of this module in src/fe_utils/ is ill-considered,
because several src/common/ modules have dependencies on it, meaning that
libpgcommon and libpgfeutils now have mutual dependencies.  That makes it
pointless to have distinct libraries at all.  The intended design is that
libpgcommon is lower-level than libpgfeutils, so only dependencies from
the latter to the former are acceptable.

We already have the precedent that fe_memutils and a couple of other
modules in src/common/ are frontend-only, so it's not stretching anything
out of whack to treat logging.c as a frontend-only module in src/common/.
To the extent that such modules help provide a common frontend/backend
environment for the rest of common/ to use, it's a reasonable design.
(logging.c does not yet provide an ereport() emulation, but one can
dream.)

Hence, move these files over, and revert basically all of the build-system
changes made by commit cc8d41511.  There are no places that need to grow
new dependencies on libpgcommon, further reinforcing the idea that this
is the right solution.

Discussion: https://postgr.es/m/a912ffff-f6e4-778a-c86a-cf5c47a12933@2ndquadrant.com
2019-05-14 14:20:10 -04:00
Alvaro Herrera 7fcdb5e002 pg_dump: store unused attribs as NULL instead of '\0'
Commit f831d4accd changed pg_dump to emit (and pg_restore to
understand) NULLs for unused members in ArchiveEntry structs, as a side
effect of some code beautification.  That broke pg_restore of dumps
generated with older pg_dump, however, so it was reverted in
19455c9f56.  Since the archiver version number has been bumped in
3b925e905d, we can put it back.

Author: Dmitry Dolgov
Discussion: https://postgr.es/m/CA+q6zcXx0XHqLsFJLaUU2j5BDiBAHig=YRoBC_YVq7VJGvzBEA@mail.gmail.com
2019-04-26 12:05:17 -04:00
Alvaro Herrera 413ccaa74d pg_restore: Require "-f -" to mean stdout
The previous convention that stdout was selected by default when nothing
is specified was just too error-prone.

After a suggestion from Andrew Gierth.
Author: Euler Taveira
Reviewed-by: Yoshikazu Imai, José Arthur Benetasso Villanova
Discussion: https://postgr.es/m/87sgwrmhdv.fsf@news-spur.riddles.org.uk
2019-04-04 16:54:15 -03:00
Peter Eisentraut cc8d415117 Unified logging system for command-line programs
This unifies the various ad hoc logging (message printing, error
printing) systems used throughout the command-line programs.

Features:

- Program name is automatically prefixed.

- Message string does not end with newline.  This removes a common
  source of inconsistencies and omissions.

- Additionally, a final newline is automatically stripped, simplifying
  use of PQerrorMessage() etc., another common source of mistakes.

- I converted error message strings to use %m where possible.

- As a result of the above several points, more translatable message
  strings can be shared between different components and between
  frontends and backend, without gratuitous punctuation or whitespace
  differences.

- There is support for setting a "log level".  This is not meant to be
  user-facing, but can be used internally to implement debug or
  verbose modes.

- Lazy argument evaluation, so no significant overhead if logging at
  some level is disabled.

- Some color in the messages, similar to gcc and clang.  Set
  PG_COLOR=auto to try it out.  Some colors are predefined, but can be
  customized by setting PG_COLORS.

- Common files (common/, fe_utils/, etc.) can handle logging much more
  simply by just using one API without worrying too much about the
  context of the calling program, requiring callbacks, or having to
  pass "progname" around everywhere.

- Some programs called setvbuf() to make sure that stderr is
  unbuffered, even on Windows.  But not all programs did that.  This
  is now done centrally.

Soft goals:

- Reduces vertical space use and visual complexity of error reporting
  in the source code.

- Encourages more deliberate classification of messages.  For example,
  in some cases it wasn't clear without analyzing the surrounding code
  whether a message was meant as an error or just an info.

- Concepts and terms are vaguely aligned with popular logging
  frameworks such as log4j and Python logging.

This is all just about printing stuff out.  Nothing affects program
flow (e.g., fatal exits).  The uses are just too varied to do that.
Some existing code had wrappers that do some kind of print-and-exit,
and I adapted those.

I tried to keep the output mostly the same, but there is a lot of
historical baggage to unwind and special cases to consider, and I
might not always have succeeded.  One significant change is that
pg_rewind used to write all error messages to stdout.  That is now
changed to stderr.

Reviewed-by: Donald Dong <xdong@csumb.edu>
Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com
2019-04-01 20:01:35 +02:00
Michael Paquier 1983af8e89 Switch some palloc/memset calls to palloc0
Some code paths have been doing some allocations followed by an
immediate memset() to initialize the allocated area with zeros, this is
a bit overkill as there are already interfaces to do both things in one
call.

Author: Daniel Gustafsson
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/vN0OodBPkKs7g2Z1uyk3CUEmhdtspHgYCImhlmSxv1Xn6nY1ZnaaGHL8EWUIQ-NEv36tyc4G5-uA3UXUF2l4sFXtK_EQgLN1hcgunlFVKhA=@yesql.se
2019-03-27 12:02:50 +09:00
Tom Lane 4870dce37f Ensure xmloption = content while restoring pg_dump output.
In combination with the previous commit, this ensures that valid XML
data can always be dumped and reloaded, whether it is "document"
or "content".

Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com
2019-03-23 16:51:37 -04:00
Andres Freund 3b925e905d tableam: Add pg_dump support.
This adds pg_dump support for table AMs in a similar manner to how
tablespaces are handled. That is, instead of specifying the AM for
every CREATE TABLE etc, emit SET default_table_access_method
statements. That makes it easier to change the AM for all/most tables
in a dump, and allows restore to succeed even if some AM is not
available.

This increases the dump archive version, as a tables/matview's AM
needs to be tracked therein.

Author: Dimitri Dolgov, Andres Freund
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 09:54:38 -08:00
Alvaro Herrera 19455c9f56 pg_dump: Fix ArchiveEntry handling of some empty values
Commit f831d4acc changed what pg_dump emits for some empty fields: they
were output as empty strings before, NULL pointer afterwards.  That
makes old pg_restore unable to work (crash) with such files, which is
unacceptable.  Return to the original representation by explicitly
setting those struct members to "" where needed; remove some no longer
needed checks for NULL input.

We can declutter the code a little by returning to NULLs when we next
update the archive version, so add a note to remind us later.

Discussion: https://postgr.es/m/20190225074539.az6j3u464cvsoxh6@depesz.com
Reported-by: hubert depesz lubaczewski
Author: Dmitry Dolgov
2019-02-28 17:19:44 -03:00
Tom Lane 4dbe196907 Repair unsafe/unportable snprintf usage in pg_restore.
warn_or_exit_horribly() was blithely passing a potentially-NULL
string pointer to a %s format specifier.  That works (at least
to the extent of not crashing) on some platforms, but not all,
and since we switched to our own snprintf.c it doesn't work
for us anywhere.

Of the three string fields being handled this way here, I think
that only "owner" is supposed to be nullable ... but considering
that this is error-reporting code, it has very little business
assuming anything, so put in defenses for all three.

Per a crash observed on buildfarm member crake and then
reproduced here.  Because of the portability aspect,
back-patch to all supported versions.
2019-02-09 19:45:38 -05:00
Alvaro Herrera f831d4accd Add ArchiveOpts to pass options to ArchiveEntry
The ArchiveEntry function has a number of arguments that can be
considered optional.  Split them out into a separate struct, to make the
API more flexible for changes.

Author: Dmitry Dolgov
Discussion: https://postgr.es/m/CA+q6zcXRxPE+qp6oerQWJ3zS061WPOhdxeMrdc-Yf-2V5vsrEw@mail.gmail.com
2019-02-01 11:29:42 -03:00
Peter Eisentraut 5c82830797 pg_dump: Add missing newline to error message 2018-12-27 10:03:28 +01:00