Commit Graph

224 Commits

Author SHA1 Message Date
Alvaro Herrera
11b335ac4c pg_dump: Fix verbosity level in LO progress messages
In passing, reword another instance of the same message that was
gratuitously different.

Author: Josh Kupershmidt
after a bug report by Bosco Rama
2012-06-19 17:20:23 -04:00
Peter Eisentraut
e1e97e9313 pg_dump: Add missing newlines at end of messages 2012-06-18 23:57:00 +03:00
Bruce Momjian
927d61eeff Run pgindent on 9.2 source tree in preparation for first 9.3
commit-fest.
2012-06-10 15:20:04 -04:00
Tom Lane
4317e0246c Rewrite --section option to decouple it from --schema-only/--data-only.
The initial implementation of pg_dump's --section option supposed that the
existing --schema-only and --data-only options could be made equivalent to
--section settings.  This is wrong, though, due to dubious but long since
set-in-stone decisions about where to dump SEQUENCE SET items, as seen in
bug report from Martin Pitt.  (And I'm not totally convinced there weren't
other bugs, either.)  Undo that coupling and instead drive --section
filtering off current-section state tracked as we scan through the TOC
list to call _tocEntryRequired().

To make sure those decisions don't shift around and hopefully save a few
cycles, run _tocEntryRequired() only once per TOC entry and save the result
in a new TOC field.  This required minor rejiggering of ACL handling but
also allows a far cleaner implementation of inhibit_data_for_failed_table.

Also, to ensure that pg_dump and pg_restore have the same behavior with
respect to the --section switches, add _tocEntryRequired() filtering to
WriteToc() and WriteDataChunks(), rather than trying to implement section
filtering in an entirely orthogonal way in dumpDumpableObject().  This
required adjusting the handling of the special ENCODING and STDSTRINGS
items, but they were pretty weird before anyway.

Minor other code review for the patch, too.
2012-05-29 23:22:14 -04:00
Tom Lane
c89bdf7690 Eliminate some more O(N^2) behaviors in pg_dump/pg_restore.
This patch fixes three places (which AFAICT is all of them) where runtime
was O(N^2) in the number of TOC entries, by using an index array to replace
linear searches of the TOC list.  This performance issue is a bit less bad
than those recently fixed, because it depends on the number of items dumped
not the number in the source database, so the problem can be dodged by
doing partial dumps.

The previous coding already had an instance of one of the two index arrays
needed, but it was only calculated in parallel-restore cases; now we need
it all the time.  I also chose to move the arrays into the ArchiveHandle
data structure, to make this code a bit more ready for the day that we
try to sling multiple ArchiveHandles around in pg_dump or pg_restore.

Since we still need some server-side work before pg_dump can really cope
nicely with tens of thousands of tables, there's probably little point in
back-patching.
2012-05-28 20:38:28 -04:00
Alvaro Herrera
9d23a70d51 pg_dump: get rid of die_horribly
The old code was using exit_horribly or die_horribly other depending on
whether it had an ArchiveHandle on which to close the connection or not;
but there were places that were passing a NULL ArchiveHandle to
die_horribly, and other places that used exit_horribly while having an
AH available.  So there wasn't all that much consistency.

Improve the situation by keeping only one of the routines, and instead
of having to pass the AH down from the caller, arrange for it to be
present for an on_exit_nicely callback to operate on.

Author: Joachim Wieland
Some tweaks by me

Per a suggestion from Robert Haas, in the ongoing "parallel pg_dump"
saga.
2012-03-20 18:58:00 -03:00
Peter Eisentraut
19f45565f5 pg_dump: Remove undocumented "files" output format
This was for demonstration only, and now it was creating compiler
warnings from zlib without an obvious fix (see also
d923125b77), let's just remove it.  The
"directory" format is presumably similar enough anyway.
2012-03-20 20:39:59 +02:00
Tom Lane
89e0bac86d Convert newlines to spaces in names written in pg_dump comments.
pg_dump was incautious about sanitizing object names that are emitted
within SQL comments in its output script.  A name containing a newline
would at least render the script syntactically incorrect.  Maliciously
crafted object names could present a SQL injection risk when the script
is reloaded.

Reported by Heikki Linnakangas, patch by Robert Haas

Security: CVE-2012-0868
2012-02-23 15:53:09 -05:00
Robert Haas
e9a22259c4 Invent on_exit_nicely for pg_dump.
Per recent discussions on pgsql-hackers regarding parallel pg_dump.
2012-02-16 11:49:20 -05:00
Peter Eisentraut
e09509bd33 pg_dump: Add some const qualifiers 2012-02-07 23:20:29 +02:00
Robert Haas
1631598ea2 pg_dump: Further reduce reliance on global variables.
This is another round of refactoring to make things simpler for parallel
pg_dump.  pg_dump.c now issues SQL queries through the relevant Archive
object, rather than relying on the global variable g_conn.  This commit
isn't quite enough to get rid of g_conn entirely, but it makes a big
dent in its utilization and, along the way, manages to be slightly less
code than before.
2012-02-07 10:07:02 -05:00
Peter Eisentraut
88a6ac9f93 pg_dump: Add GCC noreturn attribute to appropriate functions
This is a small help to the compiler and static analyzers.
2012-01-31 20:49:10 +02:00
Tom Lane
f3316a05b5 Fix pg_restore's direct-to-database mode for INSERT-style table data.
In commit 6545a901aa, I removed the mini SQL
lexer that was in pg_backup_db.c, thinking that it had no real purpose
beyond separating COPY data from SQL commands, which purpose had been
obsoleted by long-ago fixes in pg_dump's archive file format.
Unfortunately this was in error: that code was also used to identify
command boundaries in INSERT-style table data, which is run together as a
single string in the archive file for better compressibility.  As a result,
direct-to-database restores from archive files made with --inserts or
--column-inserts fail in our latest releases, as reported by Dick Visser.

To fix, restore the mini SQL lexer, but simplify it by adjusting the
calling logic so that it's only required to cope with INSERT-style table
data, not arbitrary SQL commands.  This allows us to not have to deal with
SQL comments, E'' strings, or dollar-quoted strings, none of which have
ever been emitted by dumpTableData_insert.

Also, fix the lexer to cope with standard-conforming strings, which was the
actual bug that the previous patch was meant to solve.

Back-patch to all supported branches.  The previous patch went back to 8.2,
which unfortunately means that the EOL release of 8.2 contains this bug,
but I don't think we're doing another 8.2 release just because of that.
2012-01-06 13:04:09 -05:00
Andrew Dunstan
54a622cadf Suggest use of psql when pg_restore gets a text dump. 2012-01-03 16:02:49 -05:00
Andrew Dunstan
a4cd6abcc9 Add --section option to pg_dump and pg_restore.
Valid values are --pre-data, data and post-data. The option can be
given more than once. --schema-only is equivalent to
--section=pre-data --section=post-data. --data-only is equivalent
to --section=data.

Andrew Dunstan, reviewed by Joachim Wieland and Josh Berkus.
2011-12-16 19:09:38 -05:00
Tom Lane
0195e5c4ab Clean up after recent pg_dump patches.
Fix entirely broken handling of va_list printing routines, update some
out-of-date comments, fix some bogus inclusion orders, fix NLS declarations,
fix missed realloc calls.
2011-11-29 20:41:54 -05:00
Bruce Momjian
8b08deb0d1 Simplify the pg_dump/pg_restore error reporting macros, and allow
pg_dumpall to use the same memory allocation functions as the others.
2011-11-29 16:34:45 -05:00
Bruce Momjian
9a7d49d1fb Move pg_dump memory routines into pg_dumpmem.c/h and restore common.c
with its original functions.  The previous function migration would
cause too many difficulties in back-patching.
2011-11-26 22:34:36 -05:00
Bruce Momjian
3c0afde11a Modify pg_dump to use error-free memory allocation macros. This avoids
ignoring errors and call-site error checking.
2011-11-25 15:40:51 -05:00
Peter Eisentraut
1b81c2fe6e Remove many -Wcast-qual warnings
This addresses only those cases that are easy to fix by adding or
moving a const qualifier or removing an unnecessary cast.  There are
many more complicated cases remaining.
2011-09-11 21:54:32 +03:00
Peter Eisentraut
52ce20589a Add missing format attributes
Add __attribute__ decorations for printf format checking to the places that
were missing them.  Fix the resulting warnings.  Add
-Wmissing-format-attribute to the standard set of warnings for GCC, so these
don't happen again.

The warning fixes here are relatively harmless.  The one serious problem
discovered by this was already committed earlier in
cf15fb5cab.
2011-09-10 23:12:46 +03:00
Tom Lane
6e1f1fee97 Actually, all of parallel restore's limitations should be tested earlier.
On closer inspection, whining in restore_toc_entries_parallel is really
much too late for any user-facing error case.  The right place to do it
is at the start of RestoreArchive(), before we've done anything interesting
(suh as trying to DROP all the targets ...)

Back-patch to 8.4, where parallel restore was introduced.
2011-08-28 22:27:48 -04:00
Tom Lane
d6e7abe45a Be more user-friendly about unsupported cases for parallel pg_restore.
If we are unable to do a parallel restore because the input file is stdin
or is otherwise unseekable, we should complain and fail immediately, not
after having done some of the restore.  Complaining once per thread isn't
so cool either, and the messages should be worded to make it clear this is
an unsupported case not some weird race-condition bug.  Per complaint from
Lonni Friedman.

Back-patch to 8.4, where parallel restore was introduced.
2011-08-28 21:48:58 -04:00
Tom Lane
6545a901aa Fix pg_restore's direct-to-database mode for standard_conforming_strings.
pg_backup_db.c contained a mini SQL lexer with which it tried to identify
boundaries between SQL commands, but that code was not designed to cope
with standard_conforming_strings, and would get the wrong answer if a
backslash immediately precedes a closing single quote in such a string,
as per report from Julian Mehnle.  The bug only affects direct-to-database
restores from archive files made with standard_conforming_strings = on.

Rather than complicating the code some more to try to fix that, let's just
rip it all out.  The only reason it was needed was to cope with COPY data
embedded into ordinary archive entries, which was a layout that was used
only for about the first three weeks of the archive format's existence,
and never in any production release of pg_dump.  Instead, just rely on the
archive file layout to tell us whether we're printing COPY data or not.

This bug represents a data corruption hazard in all releases in which
standard_conforming_strings can be turned on, ie 8.2 and later, so
back-patch to all supported branches.
2011-07-28 14:06:57 -04:00
Peter Eisentraut
c8e0c32119 Rename pg_dump --no-security-label to --no-security-labels
Other similar options also use the plural form.
2011-05-19 23:20:11 +03:00
Bruce Momjian
bf50caf105 pgindent run before PG 9.1 beta 1. 2011-04-10 11:42:00 -04:00
Tom Lane
1471a147f0 Fix SortTocFromFile() to cope with lines that are too long for its buffer.
The original coding supposed that a dump TOC file could never contain lines
longer than 1K.  The folly of that was exposed by a recent report from
Per-Olov Esgard.  We only really need to see the first dozen or two bytes
of each line, since we're just trying to read off the numeric ID at the
start of the line; so there's no need for a particularly huge buffer.
What there is a need for is logic to not process continuation bufferloads.

Back-patch to all supported branches, since it's always been like this.
2011-04-07 11:40:23 -04:00
Tom Lane
4cff100d73 Fix parallel pg_restore to handle comments on POST_DATA items correctly.
The previous coding would try to process all SECTION_NONE items in the
initial sequential-restore pass, which failed if they were dependencies of
not-yet-restored items.  Fix by postponing such items into the parallel
processing pass once we have skipped any non-PRE_DATA item.

Back-patch into 9.0; the original parallel-restore coding in 8.4 did not
have this bug, so no need to change it.

Report and diagnosis by Arnd Hannemann.
2011-02-18 13:11:45 -05:00
Peter Eisentraut
b313bca0af DDL support for collations
- collowner field
- CREATE COLLATION
- ALTER COLLATION
- DROP COLLATION
- COMMENT ON COLLATION
- integration with extensions
- pg_dump support for the above
- dependency management
- psql tab completion
- psql \dO command
2011-02-12 15:55:18 +02:00
Heikki Linnakangas
56d77c9e56 Silence compiler warning about uninitialized variable, noted by
Itagaki Takahiro
2011-01-24 08:28:35 +02:00
Heikki Linnakangas
7f508f1c6b Add 'directory' format to pg_dump. The new directory format is compatible
with the 'tar' format, in that untarring a tar format archive produces a
valid directory format archive.

Joachim Wieland and Heikki Linnakangas
2011-01-23 23:10:15 +02:00
Tom Lane
e2627258c3 Suppress possibly-uninitialized-variable warnings from gcc 4.5.
It appears that gcc 4.5 can issue such warnings for whole structs, not
just scalar variables as in the past.  Refactor some pg_dump code slightly
so that the OutputContext local variables are always initialized, even
if they won't be used.  It's cheap enough to not be worth worrying about.
2011-01-22 17:56:42 -05:00
Robert Haas
0d692a0dc9 Basic foreign table support.
Foreign tables are a core component of SQL/MED.  This commit does
not provide a working SQL/MED infrastructure, because foreign tables
cannot yet be queried.  Support for foreign table scans will need to
be added in a future patch.  However, this patch creates the necessary
system catalog structure, syntax support, and support for ancillary
operations such as COMMENT and SECURITY LABEL.

Shigeru Hanada, heavily revised by Robert Haas
2011-01-01 23:48:11 -05:00
Tom Lane
663fc32e26 Eliminate O(N^2) behavior in parallel restore with many blobs.
With hundreds of thousands of TOC entries, the repeated searches in
reduce_dependencies() become the dominant cost.  Get rid of that searching
by constructing reverse-dependency lists, which we can do in O(N) time
during the fix_dependencies() preprocessing.  I chose to store the reverse
dependencies as DumpId arrays for consistency with the forward-dependency
representation, and keep the previously-transient tocsByDumpId[] array
around to locate actual TOC entry structs quickly from dump IDs.

While this fixes the slow case reported by Vlad Arkhipov, there is still
a potential for O(N^2) behavior with sufficiently many tables:
fix_dependencies itself, as well as mark_create_done and
inhibit_data_for_failed_table, are doing repeated searches to deal with
table-to-table-data dependencies.  Possibly this work could be extended
to deal with that, although the latter two functions are also used in
non-parallel restore where we currently don't run fix_dependencies.

Another TODO is that we fail to parallelize restore of multiple blobs
at all.  This appears to require changes in the archive format to fix.

Back-patch to 9.0 where the problem was reported.  8.4 has potential issues
as well; but since it doesn't create a separate TOC entry for each blob,
it's at much less risk of having enough TOC entries to cause real problems.
2010-12-09 13:03:11 -05:00
Robert Haas
4d355a8336 Add a SECURITY LABEL command.
This is intended as infrastructure to support integration with label-based
mandatory access control systems such as SE-Linux. Further changes (mostly
hooks) will be needed, but this is a big chunk of it.

KaiGai Kohei and Robert Haas
2010-09-27 20:55:27 -04:00
Magnus Hagander
9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Tom Lane
c5d6d5bc6d Improve parallel restore's ability to cope with selective restore (-L option).
The original coding tended to break down in the face of modified restore
orders, as shown in bug #5626 from Albert Ullrich, because it would flip over
into parallel-restore operation too soon.  That causes problems because we
don't have sufficient dependency information in dump archives to allow safe
parallel processing of SECTION_PRE_DATA items.  Even if we did, it's probably
undesirable to allow that to override the commanded restore order.

To fix the problem of omitted items causing unexpected changes in restore
order, tweak SortTocFromFile so that omitted items end up at the head of
the list not the tail.  This ensures that they'll be examined and their
dependencies will be marked satisfied before we get to any interesting
items.

In HEAD and 9.0, we can easily change restore_toc_entries_parallel so that
all SECTION_PRE_DATA items are guaranteed to be processed in the initial
serial-restore loop, and hence in commanded order.  Only DATA and POST_DATA
items are candidates for parallel processing.  For them there might be
variations from the commanded order because of parallelism, but we should
do it in a safe order thanks to dependencies.

In 8.4 it's much harder to make such a guarantee.  I settled for not
letting the initial loop break out into parallel processing mode if
it sees a DATA/POST_DATA item that's not to be restored; this at least
prevents a non-restorable item from causing premature exit from the loop.
This means that 8.4 will be more likely to fail given a badly-ordered -L
list than 9.x, but we don't really promise any such thing will work anyway.
2010-08-21 13:59:44 +00:00
Bruce Momjian
239d769e7e pgindent run for 9.0, second run 2010-07-06 19:19:02 +00:00
Tom Lane
04d9f4dab4 Improve pg_dump's checkSeek() function to verify the functioning of ftello
as well as fseeko, and to not assume that fseeko(fp, 0, SEEK_CUR) proves
anything.  Also improve some related comments.  Per my observation that
the SEEK_CUR test didn't actually work on some platforms, and subsequent
discussion with Robert Haas.

Back-patch to 8.4.  In earlier releases it's not that important whether
we get the hasSeek test right, but with parallel restore it matters.
2010-06-28 02:07:02 +00:00
Tom Lane
bd823e11fa Ensure that pg_restore -l will output DATABASE entries whether or not -C
is specified.  Per bug report from Russell Smith and ensuing discussion.
Since this is a corner case behavioral change, I'm going to be conservative
and not back-patch it.

In passing, also rename the RestoreOptions field for the -C switch to
something less generic than "create".
2010-05-15 21:41:16 +00:00
Robert Haas
33980a0640 Fix various instances of "the the".
Two of these were pointed out by Erik Rijkers; the rest I found.
2010-04-23 23:21:44 +00:00
Peter Eisentraut
2827516394 Also print the libpq error message when lo_create or lo_open fails 2010-03-18 20:00:51 +00:00
Bruce Momjian
65e806cba1 pgindent run for 9.0 2010-02-26 02:01:40 +00:00
Tom Lane
6a2e19d96d Fix patch for printing backend and pg_dump versions so that it works in
a desirable fashion in archive-dump cases, ie you should get the pg_dump
version not the pg_restore version.
2010-02-24 02:42:55 +00:00
Bruce Momjian
28cdf5f7ab Have pg_dump (-v) verbose mode output the pg_dump and server versions in
text output mode, like we do in custom output mode.

Jim Cox
2010-02-23 21:48:32 +00:00
Tom Lane
c0d5be5d6a Fix up pg_dump's treatment of large object ownership and ACLs. We now emit
a separate archive entry for each BLOB, and use pg_dump's standard methods
for dealing with its ownership, ACL if any, and comment if any.  This means
that switches like --no-owner and --no-privileges do what they're supposed
to.  Preliminary testing says that performance is still reasonable even
with many blobs, though we'll have to see how that shakes out in the field.

KaiGai Kohei, revised by me
2010-02-18 01:29:10 +00:00
Tom Lane
16f2eadfab When doing a parallel restore, we must guard against out-of-range dependency
dump IDs, because the array we're using is sized according to the highest
dump ID actually defined in the archive file.  In a partial dump there could
be references to higher dump IDs that weren't dumped.  Treat these the same
as references to in-range IDs that weren't dumped.  (The whole thing is a
bit scary because the missing objects might have been part of dependency
chains, which we won't know about.  Not much we can do though --- throwing
an error is probably overreaction.)

Also, reject parallel restore with pre-1.8 archive version (made by pre-8.0
pg_dump).  In these old versions the dependency entries are OIDs, not dump
IDs, and we don't have enough information to interpret them.

Per bug #5288 from Jon Erdman.
2010-01-19 18:39:19 +00:00
Itagaki Takahiro
84f910a707 Additional fixes for large object access control.
Use pg_largeobject_metadata.oid instead of pg_largeobject.loid
to enumerate existing large objects in pg_dump, pg_restore, and
contrib modules.
2009-12-14 00:39:11 +00:00
Tom Lane
249724cb01 Create an ALTER DEFAULT PRIVILEGES command, which allows users to adjust
the privileges that will be applied to subsequently-created objects.

Such adjustments are always per owning role, and can be restricted to objects
created in particular schemas too.  A notable benefit is that users can
override the traditional default privilege settings, eg, the PUBLIC EXECUTE
privilege traditionally granted by default for functions.

Petr Jelinek
2009-10-05 19:24:49 +00:00
Tom Lane
f033f6d28b Modify parallel pg_restore to track pending and ready items by means of
two new lists, rather than repeatedly rescanning the main TOC list.
This avoids a potential O(N^2) slowdown, although you'd need a *lot*
of tables to make that really significant; and it might simplify future
improvements in the scheduling algorithm by making the set of ready
items more easily inspectable.  The original thought that it would
in itself result in a more efficient job dispatch order doesn't seem
to have been borne out in testing, but it seems worth doing anyway.
2009-08-07 22:48:34 +00:00