This allows recovery to notice certain incorrect recovery scenarios.
If a server has recovered to point X on timeline 5, and you restart
recovery, it better be on timeline 5 when it reaches point X again, not on
some timeline with a higher ID. This can happen e.g if you a standby server
is shut down, a new timeline appears in the WAL archive, and the standby
server is restarted. It will try to follow the new timeline, which is wrong
because some WAL on the old timeline was already replayed before shutdown.
Requires an initdb (or at least pg_resetxlog), because this adds a field to
the control file.
storage.
Have pg_upgrade use it, and enable server options fsync=off and
full_page_writes=off.
Document that users turning fsync from off to on should run initdb
--sync-only.
[ Previous commit was incorrectly applied as a git merge. ]
binary-upgrade mode; instead only skip dumping the current user.
This bug was introduced in during the removal of split_old_dump(). Bug
discovered during local testing.
client encoding and the client encoding is not *safe* one. Such an
example is, file encoding is UTF-8 and client encoding SJIS. Patch
contributed by Jiang Guiqing.
--single-transaction to restore each database schema. This yields
performance improvements for databases with many tables. Also, remove
split_old_dump() as it is no longer needed.
Since we've already chdir'd into the data directory, the file should
be referenced as just "postmaster.pid", without prefixing the directory
path. This is harmless in the normal case where an absolute PGDATA path
is used, but quite dangerous if a relative path is specified, since the
program might then fail to notice an active postmaster.
Reported by Hari Babu. This got broken in my commit
eb5949d190, so patch all active versions.
Without this, the connection will be killed after timeout if
wal_sender_timeout is set in the server.
Original patch by Amit Kapila, modified by me to fit recent changes in the
code.
We used to send structs wrapped in CopyData messages, which works as long as
the client and server agree on things like endianess, timestamp format and
alignment. That's good enough for running a standby server, which has to run
on the same platform anyway, but it's useful for tools like pg_receivexlog
to work across platforms.
This breaks protocol compatibility of streaming replication, but we never
promised that to be compatible across versions, anyway.
Represent a sequence's current value as a separate TableDataInfo dumpable
object, so that it can be dumped within the data section of the archive
rather than in pre-data. This fixes an undesirable inconsistency between
the meanings of "--data-only" and "--section=data", and also fixes dumping
of sequences that are marked as extension configuration tables, as per a
report from Marko Kreen back in July. The main cost is that we do one more
SQL query per sequence, but that's probably not very meaningful in most
databases.
Back-patch to 9.1, since it has the extension configuration issue even
though not the --section switch.
In commit 4317e0246c, I accidentally broke
this behavior while rearranging code to ensure that --create wouldn't
affect whether a DATABASE entry gets put into archive-format output.
Thus, 9.2 would issue a DROP DATABASE command in --clean mode, which is
either useless or dangerous depending on the usage scenario.
It should not do that, and no longer does.
A bright spot is that this refactoring makes it easy to allow the
combination of --clean and --create to work sensibly, ie, emit DROP
DATABASE then CREATE DATABASE before reconnecting. Ordinarily we'd
consider that a feature addition and not back-patch it, but it seems
silly to not include the extra couple of lines required in the 9.2
version of the code.
Per report from Guillaume Lelarge, though this is slightly more extensive
than his proposed patch.
Don't leak a file descriptor if the file is empty or we can't read its size.
Expect there to be a newline at the end of the last line, too. If there
isn't, ignore anything after the last newline. This makes it a tiny bit
more robust in case the file is appended to concurrently, so that we don't
return the last line if it hasn't been fully written yet. And this makes
the code a bit less obscure, anyway. Per Tom Lane's suggestion.
Backpatch to all supported branches.
If postmaster changed postmaster.pid while pg_ctl was reading it, pg_ctl
could overrun the buffer it allocated for the file. Fix by reading the
whole file to memory with one read() call.
initdb contains an identical copy of the readfile() function, but the files
that initdb reads are static, not modified concurrently. Nevertheless, add
a simple bounds-check there, if only to silence static analysis tools.
Per report from Dave Vitek. Backpatch to all supported branches.
Numerous flex and bison make rules have appeared in the source tree
over time, and they are all virtually identical, so we can replace
them by pattern rules with some variables for customization.
Users of pgxs will also be able to benefit from this.
On some platforms these functions return NULL, rather than the more common
practice of returning a pointer to a zero-sized block of memory. Hack our
various wrapper functions to hide the difference by substituting a size
request of 1. This is probably not so important for the callers, who
should never touch the block anyway if they asked for size 0 --- but it's
important for the wrapper functions themselves, which mistakenly treated
the NULL result as an out-of-memory failure. This broke at least pg_dump
for the case of no user-defined aggregates, as per report from
Matthew Carrington.
Back-patch to 9.2 to fix the pg_dump issue. Given the lack of previous
complaints, it seems likely that there is no live bug in previous releases,
even though some of these functions were in place before that.
We had a number of variants on the theme of "malloc or die", with the
majority named like "pg_malloc", but by no means all. Standardize on the
names pg_malloc, pg_malloc0, pg_realloc, pg_strdup. Get rid of pg_calloc
entirely in favor of using pg_malloc0.
This is an essentially cosmetic change, so no back-patch. (I did find
a couple of places where psql and pg_dump were using plain malloc or
strdup instead of the pg_ versions, but they don't look significant
enough to bother back-patching.)
timeval.t_sec is of type time_t, which is not always compatible with long.
I'm not sure if this was just harmless warning or a real bug, but this
fixes it, anyway.
The tar output module did some very ugly and ultimately incorrect hacking
on COPY commands to try to get them to work in the context of restoring a
deconstructed tar archive. In particular, it would fail altogether for
table names containing any upper-case characters, since it smashed the
command string to lower-case before modifying it (and, just to add insult
to injury, did that in a way that would fail in multibyte encodings).
I don't see any particular value in being flexible about the case of the
command keywords, since the string will just have been created by
dumpTableData, so let's get rid of the whole case-folding thing.
Also, it doesn't seem to meet the POLA for the script to restore data only
in COPY mode, so add \i commands to make it have comparable behavior in
--inserts mode.
Noted while looking at the tar-output code in connection with Brian
Weaver's patch.
Both programs got the "magic" string wrong, causing standard-conforming tar
implementations to believe the output was just legacy tar format without
any POSIX extensions. This doesn't actually matter that much, especially
since pg_dump failed to fill the POSIX fields anyway, but still there is
little point in emitting tar format if we can't be compliant with the
standard. In addition, pg_dump failed to write the EOF marker correctly
(there should be 2 blocks of zeroes not just one), pg_basebackup put the
numeric group ID in the wrong place, and both programs had a pretty
brain-dead idea of how to compute the checksum. Fix all that and improve
the comments a bit.
pg_restore is modified to accept either the correct POSIX-compliant "magic"
string or the previous value. This part of the change will need to be
back-patched to avoid an unnecessary compatibility break when a previous
version tries to read tar-format output from 9.3 pg_dump.
Brian Weaver and Tom Lane
The same message is used in both pg_restore and pg_dump, and it's
confusing to output "restoring data for table xyz" when the user
is just doing a pg_dump.
Formerly it would only show them for relkinds 'r' and 'f' (plain tables
and foreign tables). However, as of 9.2, views can also have reloptions,
namely security_barrier. The relkind restriction seems pointless and
not at all future-proof, so just print reloptions whenever there are any.
In passing, make some cosmetic improvements to the code that pulls the
"tableinfo" fields out of the PGresult.
Noted and patched by Dean Rasheed, with adjustment for all relkinds by me.
Only warn when connecting to a newer server, since connecting to older
servers works pretty well nowadays. Also update the documentation a
little about current psql/server compatibility expectations.
If a view has circular dependencies, pg_dump splits it into a CREATE TABLE
and a CREATE RULE command to break the dependency loop. However, if the
view has reloptions, those options cannot be applied in the CREATE TABLE
command, because views and tables have different allowed reloptions so
CREATE TABLE would reject them. Instead apply the reloptions after the
CREATE RULE, using ALTER VIEW SET.
Replace unix_socket_directory with unix_socket_directories, which is a list
of socket directories, and adjust postmaster's code to allow zero or more
Unix-domain sockets to be created.
This is mostly a straightforward change, but since the Unix sockets ought
to be created after the TCP/IP sockets for safety reasons (better chance
of detecting a port number conflict), AddToDataDirLockFile needs to be
fixed to support out-of-order updates of data directory lockfile lines.
That's a change that had been foreseen to be necessary someday anyway.
Honza Horak, reviewed and revised by Tom Lane
Previously, the -1 option was silently ignored.
Also, emit an error if -1 is used in a context where it won't be
respected, to avoid user confusion.
Original patch by Fabien COELHO, but this version is quite different
from the original submission.
In particular, with a controlled shutdown of the master, pg_basebackup
with streaming log could terminate without an error message, even though
the backup is not consistent.
In passing, fix a few cases where walfile wasn't properly set to -1 after
closing.
Fujii Masao
The most user-visible part of this is to change the long options
--statusint and --noloop to --status-interval and --no-loop,
respectively, per discussion.
Also, consistently enclose file names in double quotes, per our
conventions; and consistently use the term "transaction log file" to
talk about WAL segments. (Someday we may need to go over this
terminology and make it consistent across the whole source code.)
Finally, reflow the code to better fit in 80 columns, and have pgindent
fix it up some more.