It appears that gcc 4.5 can issue such warnings for whole structs, not
just scalar variables as in the past. Refactor some pg_dump code slightly
so that the OutputContext local variables are always initialized, even
if they won't be used. It's cheap enough to not be worth worrying about.
With hundreds of thousands of TOC entries, the repeated searches in
reduce_dependencies() become the dominant cost. Get rid of that searching
by constructing reverse-dependency lists, which we can do in O(N) time
during the fix_dependencies() preprocessing. I chose to store the reverse
dependencies as DumpId arrays for consistency with the forward-dependency
representation, and keep the previously-transient tocsByDumpId[] array
around to locate actual TOC entry structs quickly from dump IDs.
While this fixes the slow case reported by Vlad Arkhipov, there is still
a potential for O(N^2) behavior with sufficiently many tables:
fix_dependencies itself, as well as mark_create_done and
inhibit_data_for_failed_table, are doing repeated searches to deal with
table-to-table-data dependencies. Possibly this work could be extended
to deal with that, although the latter two functions are also used in
non-parallel restore where we currently don't run fix_dependencies.
Another TODO is that we fail to parallelize restore of multiple blobs
at all. This appears to require changes in the archive format to fix.
Back-patch to 9.0 where the problem was reported. 8.4 has potential issues
as well; but since it doesn't create a separate TOC entry for each blob,
it's at much less risk of having enough TOC entries to cause real problems.
to make it easier to reuse that code. There is no user-visible changes.
This is in preparation for the patch to add a new archive format, a directory,
to perform a custom-like dump but with each table being dumped to a separate
file (that in turn is a prerequisite for parallel pg_dump). This also makes it
easier to add new compression methods in the future, and makes the
pg_backup_custom.c code easier to read, when the compression-related code is
factored out.
Joachim Wieland, with heavy editorialization by me.
a separate archive entry for each BLOB, and use pg_dump's standard methods
for dealing with its ownership, ACL if any, and comment if any. This means
that switches like --no-owner and --no-privileges do what they're supposed
to. Preliminary testing says that performance is still reasonable even
with many blobs, though we'll have to see how that shakes out in the field.
KaiGai Kohei, revised by me
two new lists, rather than repeatedly rescanning the main TOC list.
This avoids a potential O(N^2) slowdown, although you'd need a *lot*
of tables to make that really significant; and it might simplify future
improvements in the scheduling algorithm by making the set of ready
items more easily inspectable. The original thought that it would
in itself result in a more efficient job dispatch order doesn't seem
to have been borne out in testing, but it seems worth doing anyway.
The previous implementation got it right in most cases but failed in one:
if you pg_dump into an archive with standard_conforming_strings enabled, then
pg_restore to a script file (not directly to a database), the script will set
standard_conforming_strings = on but then emit large object data as
nonstandardly-escaped strings.
At the moment the code is made to emit hex-format bytea strings when dumping
to a script file. We might want to change to old-style escaping for backwards
compatibility, but that would be slower and bulkier. If we do, it's just a
matter of reimplementing appendByteaLiteral().
This has been broken for a long time, but given the lack of field complaints
I'm not going to worry about back-patching.
post-data step is run in a separate worker child (a thread on Windows, a child
process elsewhere) up to the concurrent number specified by the new pg_restore
command-line --multi-thread | -m switch.
Andrew Dunstan, with some editing by Tom Lane.
and standard_conforming_strings; likewise for the other client programs
that need it. As per previous discussion, a pg_dump dump now conforms
to the standard_conforming_strings setting of the source database.
We don't use E'' syntax in the dump, thereby improving portability of
the SQL. I added a SET escape_strings_warning = off command to keep
the dumps from getting a lot of back-chatter from that.
> found in a pg_dump archive. It had problems with dollar-quote tags
broken across bufferload boundaries (this may explain bug report from
Rod Taylor), also with dollar-quote literals of the form $a$a$...,
and was also confused about the rules for backslash in double quoted
identifiers (ie, they're not special). Also put in placeholder support
for E'...' literals --- this will need more work later.
using the recently added lo_create() function. The restore logic in
pg_restore is greatly simplified as well, since there's no need anymore
to try to adjust database references to match a new set of blob OIDs.
multiline command or to rerun the command easily later.
Whereas displaying the failed SQL command is a matter of fixing the
error
messages.
The latter is complicated by failed COPY commands which, with
die-on-errors
off, results in the data being processed as a command, so dumping the
command will dump all of the data.
In the case of long commands, should the whole command be dumped? eg.
(eg.
several pages of function definition).
In the case of the COPY command, I'm not sure what to do. Obviously, it
would be best to avoid sending the data, but the data and command are
combined (from memory). Also, the 'data' may be in the form of INSERT
statements.
Attached patch produces the first 125 chars of the command:
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC Entry 26; 1255 16449270
FUNCTION
plpgsql_call_handler() pjw
pg_restore: [archiver (db)] could not execute query: ERROR: function
"plpgsql_call_handler" already exists with same argument types
Command was: CREATE FUNCTION plpgsql_call_handler() RETURNS
language_handler
AS '/var/lib/pgsql-8.0b1/lib/plpgsql', 'plpgsql_call_han...
pg_restore: [archiver (db)] Error from TOC Entry 27; 1255 16449271
FUNCTION
plpgsql_validator(oid) pjw
pg_restore: [archiver (db)] could not execute query: ERROR: function
"plpgsql_validator" already exists with same argument types
Command was: CREATE FUNCTION plpgsql_validator(oid) RETURNS void
AS '/var/lib/pgsql-8.0b1/lib/plpgsql', 'plpgsql_validator'
LANGU...
Philip Warner
will treat any unquoted string that starts with a $ and has no preceding
identifier chars as a potential $-quote tag, it then makes sure that the
tag chars are valid. If so, it processes the $-quote.
Philip Warner
errors. This is the second submission, which integrates Tom comments about
localisation and exit code. I also added some comments about one sql
command which is not ignored.
Fabien COELHO
WITH/WITHOUT OIDS in dump files. This makes dump files more portable.
I have updated the pg_dump version so old binary dumps will load fine.
Pre-7.5 dumps use WITHOUT OIDS in SQL were needed, so they should be
fine.
any restore operation, thereby ensuring that dumped data is interpreted
the same way it was dumped even if the target database has a different
encoding. Per suggestions from Pavel Stehule and others. Also,
simplify scheme for handling check_function_bodies ... we may as well
just set that at the head of the script.
pg_depend to determine a safe dump order. Defaults and check constraints
can be emitted either as part of a table or domain definition, or
separately if that's needed to break a dependency loop. Lots of old
half-baked code for controlling dump order removed.
- Fix handling of {data/schema}-only restores when using a full
backup file; prior version was restoring schema in data-only
restores. Added enum to make code easier to understand.