This patch makes libpq check the server's hostname against DNS names listed
in the X509 subjectAltName extension field in the server certificate. This
allows the same certificate to be used for multiple domain names. If there
are no SANs in the certificate, the Common Name field is used, like before
this patch. If both are given, the Common Name is ignored. That is a bit
surprising, but that's the behavior mandated by the relevant RFCs, and it's
also what the common web browsers do.
This also adds a libpq_ngettext helper macro to allow plural messages to be
translated in libpq. Apparently this happened to be the first plural message
in libpq, so it was not needed before.
Alexey Klyukin, with some kibitzing by me.
The code that tried to split a page at 75/25 ratio, when appending to the
end of an index, was buggy in two ways. First, there was a silly typo that
caused it to just fill the left page as full as possible. But the logic as
it was intended wasn't correct either, and would actually have given a ratio
closer to 60/40 than 75/25.
Gaetano Mendola spotted the typo. Backpatch to 9.4, where this code was added.
The code for raising a NUMERIC value to an integer power wasn't very
careful about large powers. It got an outright wrong answer for an
exponent of INT_MIN, due to failure to consider overflow of the Abs(exp)
operation; which is fixable by using an unsigned rather than signed
exponent value after that point. Also, even though the number of
iterations of the power-computation loop is pretty limited, it's easy for
the repeated squarings to result in ridiculously enormous intermediate
values, which can take unreasonable amounts of time/memory to process,
or even overflow the internal "weight" field and so produce a wrong answer.
We can forestall misbehaviors of that sort by bailing out as soon as the
weight value exceeds what will fit in int16, since then the final answer
must overflow (if exp > 0) or underflow (if exp < 0) the packed numeric
format.
Per off-list report from Pavel Stehule. Back-patch to all supported
branches.
When running vacuumdb --analyze-in-stages --all, it needs to run the
first stage across all databases before the second one, instead of
running all stages in a database before processing the next one.
Also respect the --quiet option with --analyze-in-stages.
Provide an option to skip NULL values in a row when generating a JSON
object from that row with row_to_json. This can reduce the size of the
JSON object in cases where columns are NULL without really reducing the
information in the JSON object.
This also makes row_to_json into a single function with default values,
rather than having multiple functions. In passing, change array_to_json
to also be a single function with default values (we don't add an
'ignore_nulls' option yet- it's not clear that there is a sensible
use-case there, and it hasn't been asked for in any case).
Pavel Stehule
Apparently, older versions of "prove" (couldn't identify the exact
version from the changelog) don't look into the t/ directory for tests
by default, so specify it explicitly.
Instead of palloc'ing each HashJoinTuple individually, allocate 32kB chunks
and pack the tuples densely in the chunks. This avoids the AllocChunk
header overhead, and the space wasted by standard allocator's habit of
rounding sizes up to the nearest power of two.
This doesn't contain any planner changes, because the planner's estimate of
memory usage ignores the palloc overhead. Now that the overhead is smaller,
the planner's estimates are in fact more accurate.
Tomas Vondra, reviewed by Robert Haas.
07c8651dd9 currently causes compilation errors on mscv (and
probably some other) compilers because our getopt_long()
implementation doesn't have support for optional_argument.
Thus implement optional_argument in our fallback implemenation. It's
quite possibly also useful in other cases.
Arguably this needs a configure check for optional_argument, but it
has existed pretty much since getopt_long() was introduced and thus
doesn't seem worth the configure runtime.
Normally I'd would not push a patch this fast, but this allows msvc to
build again and has low risk as only optional_argument behaviour has
changed.
Author: Michael Paquier and Andres Freund
Discussion: CAB7nPqS5VeedSCxrK=QouokbawgGKLpyc1Q++RRFCa_sjcSVrg@mail.gmail.com
The code I added in commit f343a880d5 was
careless about preserving AND/OR flatness: it could create a structure with
an OR node directly underneath another one. That breaks an assumption
that's fairly important for planning efficiency, not to mention triggering
various Asserts (as reported by Benjamin Smith). Add a trifle more logic
to handle the case properly.
Add --help=<topic> for the commandline, and \? <topic> as a backslash
command, to show more help than the invocations without parameters
do. "commands", "variables" and "options" currently exist as help
topics describing, respectively, backslash commands, psql variables,
and commandline switches. Without parameters the help commands show
their previous topic.
Some further wordsmithing or extending of the added help content might
be needed; but there seems little benefit delaying the overall feature
further.
Author: Pavel Stehule, editorialized by many
Reviewed-By: Andres Freund, Petr Jelinek, Fujii Masao, MauMau, Abhijit
Menon-Sen and Erik Rijkers.
Discussion: CAFj8pRDVGuC-nXBfe2CK8vpyzd2Dsr9GVpbrATAnZO=2YQ0s2Q@mail.gmail.com,
CAFj8pRA54AbTv2RXDTRxiAd8hy8wxmoVLqhJDRCwEnhdd7OUkw@mail.gmail.com
Previously, they functioned as barriers against CPU reordering but not
compiler reordering, an odd API that required extensive use of volatile
everywhere that spinlocks are used. That's error-prone and has negative
implications for performance, so change it.
In theory, this makes it safe to remove many of the uses of volatile
that we currently have in our code base, but we may find that there are
some bugs in this effort when we do. In the long run, though, this
should make for much more maintainable code.
Patch by me. Review by Andres Freund.
This provides a convenient method of classifying input values into buckets
that are not necessarily equal-width. It works on any sortable data type.
The choice of function name is a bit debatable, perhaps, but showing that
there's a relationship to the SQL standard's width_bucket() function seems
more attractive than the other proposals.
Petr Jelinek, reviewed by Pavel Stehule
The xml type previously rejected "content" that is empty or consists
only of spaces. But the SQL/XML standard allows that, so change that.
The accepted values for XML "documents" are not changed.
Reviewed-by: Ali Akbar <the.apaan@gmail.com>
Now that ALTER TABLE .. ALL IN TABLESPACE has replaced the previous
ALTER TABLESPACE approach, it makes sense to move the calls down in
to ProcessUtilitySlow where the rest of ALTER TABLE is handled.
This also means that event triggers will support ALTER TABLE .. ALL
(which was the impetus for the original change, though it has other
good qualities also).
Álvaro Herrera
Back-patch to 9.4 as the original rework was.
Some Sparc CPUs can be run in various coherence models, ranging from
RMO (relaxed) over PSO (partial) to TSO (total). Solaris has always
run CPUs in TSO mode while in userland, but linux didn't use to and
the various *BSDs still don't. Unfortunately the sparc TAS/S_UNLOCK
were only correct under TSO. Fix that by adding the necessary memory
barrier instructions. On sparcv8+, which should be all relevant CPUs,
these are treated as NOPs if the current consistency model doesn't
require the barriers.
Discussion: 20140630222854.GW26930@awork2.anarazel.de
Will be backpatched to all released branches once a few buildfarm
cycles haven't shown up problems. As I've no access to sparc, this is
blindly written.
psql's \s (print command history) doesn't work at all with recent libedit
versions when printing to the terminal, because libedit tries to do an
fchmod() on the target file which will fail if the target is /dev/tty.
(We'd already noted this in the context of the target being /dev/null.)
Even before that, it didn't work pleasantly, because libedit likes to
encode the command history file (to ensure successful reloading), which
renders it nigh unreadable, not to mention significantly different-looking
depending on exactly which libedit version you have. So let's forget using
write_history() for this purpose, and instead print the data ourselves,
using logic similar to that used to iterate over the history for newline
encoding/decoding purposes.
While we're at it, insert the ability to use the pager when \s is printing
to the terminal. This has been an acknowledged shortcoming of \s for many
years, so while you could argue it's not exactly a back-patchable bug fix
it still seems like a good improvement. Anyone who's seriously annoyed
at this can use "\s /dev/tty" or local equivalent to get the old behavior.
Experimentation with this showed that the history iteration logic was
actually rather broken when used with libedit. It turns out that with
libedit you have to use previous_history() not next_history() to advance
to more recent history entries. The easiest and most robust fix for this
seems to be to make a run-time test to verify which function to call.
We had not noticed this because libedit doesn't really need the newline
encoding logic: its own encoding ensures that command entries containing
newlines are reloaded correctly (unlike libreadline). So the effective
behavior with recent libedits was that only the oldest history entry got
newline-encoded or newline-decoded. However, because of yet other bugs in
history_set_pos(), some old versions of libedit allowed the existing loop
logic to reach entries besides the oldest, which means there may be libedit
~/.psql_history files out there containing encoded newlines in more than
just the oldest entry. To ensure we can reload such files, it seems
appropriate to back-patch this fix, even though that will result in some
incompatibility with older psql versions (ie, multiline history entries
written by a psql with this fix will look corrupted to a psql without it,
if its libedit is reasonably up to date).
Stepan Rutz and Tom Lane
Update the tab completion for the changes made in
3c4cf08087, which rework 'MOVE ALL' to be
'ALTER .. ALL IN TABLESPACE'.
Fujii Masao
Back-patch to 9.4, as the original change was.
Previously \watch could not display the query execution time even
when \timing was enabled because it used PSQLexec instead of
SendQuery and that function didn't support \timing. This patch
introduces PSQLexecWatch and changes \watch so as to use it, instead.
PSQLexecWatch is the function to run the query, print its results and
display how long it took (only when \timing is enabled).
This patch also changes --echo-hidden so that it doesn't print
the query that \watch executes. Since \watch cannot execute
backslash command queries, they should not be printed even
when --echo-hidden is set.
Patch by me, review by Heikki Linnakangas and Michael Paquier
The number of % parameter markers in RAISE statement should match the number
of parameters given. We used to check that at execution time, but we have
all the information needed at compile time, so let's check it at compile
time instead. It's generally better to find mistakes earlier.
Marko Tiikkaja, reviewed by Fabien Coelho
Every redo routine uses the same idiom to determine what to do to a page:
check if there's a backup block for it, and if not read, the buffer if the
block exists, and check its LSN. Refactor that into a common function,
XLogReadBufferForRedo, making all the redo routines shorter and more
readable.
This has no user-visible effect, and makes no changes to the WAL format.
Reviewed by Andres Freund, Alvaro Herrera, Michael Paquier.
The new %l substitution shows the line number inside a (potentially
multi-line) statement starting from one.
Author: Sawada Masahiko, heavily editorialized by me.
Reviewed-By: Jeevan Chalke, Alvaro Herrera
This patch allows us to execute ALTER SYSTEM RESET command to
remove the configuration entry from postgresql.auto.conf.
Vik Fearing, reviewed by Amit Kapila and me.
68a2e52bba has introduced LWLockAcquireCommon() containing the
previous contents of LWLockAcquire() plus added functionality. The
latter then calls it, just like LWLockAcquireWithVar(). Because the
majority of callers don't need the added functionality, declare the
common code as inline. The compiler then can optimize away the unused
code. Doing so is also useful when looking at profiles, to
differentiate the users.
Backpatch to 9.4, the first branch to contain LWLockAcquireCommon().
Since the dawn of time (aka Postgres95) multiple pins of the same
buffer by one backend have been optimized not to modify the shared
refcount more than once. This optimization has always used a NBuffer
sized array in each backend keeping track of a backend's pins.
That array (PrivateRefCount) was one of the biggest per-backend memory
allocations, depending on the shared_buffers setting. Besides the
waste of memory it also has proven to be a performance bottleneck when
assertions are enabled as we make sure that there's no remaining pins
left at the end of transactions. Also, on servers with lots of memory
and a correspondingly high shared_buffers setting the amount of random
memory accesses can also lead to poor cpu cache efficiency.
Because of these reasons a backend's buffers pins are now kept track
of in a small statically sized array that overflows into a hash table
when necessary. Benchmarks have shown neutral to positive performance
results with considerably lower memory usage.
Patch by me, review by Robert Haas.
Discussion: 20140321182231.GA17111@alap3.anarazel.de
The list of posting lists it's dealing with can contain placeholders for
deleted posting lists. The placeholders are kept around so that they can
be WAL-logged, but we must be careful to not try to access them.
This fixes bug #11280, reported by Mårten Svantesson. Backpatch to 9.4,
where the compressed data leaf page code was added.
This reverts commit e23014f3d4.
As the side effect of the reverted commit, when the unit is
specified, the reloption was stored in the catalog with the unit.
This broke pg_dump (specifically, it prevented pg_dump from
outputting restorable backup regarding the reloption) and
turned the buildfarm red. Revert the commit until the fixed
version is ready.
This is useful to allow to set GUCs to values that include spaces;
something that wasn't previously possible. The primary case motivating
this is the desire to set default_transaction_isolation to 'repeatable
read' on a per connection basis, but other usecases like seach_path do
also exist.
This introduces a slight backward incompatibility: Previously a \ in
an option value would have been passed on literally, now it'll be
taken as an escape.
The relevant mailing list discussion starts with
20140204125823.GJ12016@awork2.anarazel.de.
This introduces an infrastructure which allows us to specify the units
like ms (milliseconds) in integer relation option, like GUC parameter.
Currently only autovacuum_vacuum_cost_delay reloption can accept
the units.
Reviewed by Michael Paquier
Previously, only a single-byte character was allowed as an
escape. This patch allows it to be a multi-byte character, though it
still must be a single character.
Reviewed by Heikki Linnakangas and Tom Lane.
If SELECT FOR UPDATE NOWAIT tries to lock a tuple that is concurrently
being updated, it might fail to honor its NOWAIT specification and block
instead of raising an error.
Fix by adding a no-wait flag to EvalPlanQualFetch which it can pass down
to heap_lock_tuple; also use it in EvalPlanQualFetch itself to avoid
blocking while waiting for a concurrent transaction.
Authors: Craig Ringer and Thomas Munro, tweaked by Álvaro
http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com
Per Thomas Munro in the course of his SKIP LOCKED feature submission,
who also provided one of the isolation test specs.
Backpatch to 9.4, because that's as far back as it applies without
conflicts (although the bug goes all the way back). To that branch also
backpatch Thomas Munro's new NOWAIT test cases, committed in master by
Heikki as commit 9ee16b49f0 .
In some cases, not all Vars were being correctly marked as having been
modified for updatable security barrier views, which resulted in invalid
plans (eg: when security barrier views were created over top of
inheiritance structures).
In passing, be sure to update both varattno and varonattno, as _equalVar
won't consider the Vars identical otherwise. This isn't known to cause
any issues with updatable security barrier views, but was noticed as
missing while working on RLS and makes sense to get fixed.
Back-patch to 9.4 where updatable security barrier views were
introduced.
Use SECURITY_LOCAL_USERID_CHANGE while building temporary tables;
only escalate to SECURITY_RESTRICTED_OPERATION while potentially
running user-supplied code. The more secure mode was preventing
temp table creation. Add regression tests to cover this problem.
This fixes Bug #11208 reported by Bruno Emanuel de Andrade Silva.
Backpatch to 9.4, where the bug was introduced.
Prevent automatic oid assignment when in binary upgrade mode. Also
throw an error when contrib/pg_upgrade_support functions are called when
not in binary upgrade mode.
This prevent automatically-assigned oids from conflicting with later
pre-assigned oids coming from the old cluster. It also makes sure oids
are preserved in call important cases.
There's no point in setting up a context error callback when doing
conditional lock acquisition, because we never actually wait and so the
user wouldn't be able to see the context message anywhere. In fact,
this is more in line with what ConditionalXactLockTableWait is doing.
Backpatch to 9.4, where this was added.
The symbol was added by 71901ab6d; the original code was introduced by
6868ed749. Development of both overlapped which is why we apparently
failed to notice.
This is a (very slight) behavior change, so I'm not backpatching this to
9.4 for now, even though the symbol does exist there.
Event triggers want to know the OID of the interesting object created,
which is the main type. The array created as part of the operation is
just a subsidiary object which is not of much interest.
Other DDL commands are already returning the OID, which is required for
future additional event trigger work. This is merely making these
commands in line with the rest of utility command support.
Add a succint comment explaining why it's correct to change the
persistence in this way. Also s/loggedness/persistence/ because native
speakers didn't like the latter term.
Fabrízio and Álvaro
CheckConstraintFetch() leaked a cstring in the caller's context for each
CHECK constraint expression it copied into the relcache. Ordinarily that
isn't problematic, but it can be during CLOBBER_CACHE testing because so
many reloads can happen during a single query; so complicate the code
slightly to allow freeing the cstring after use. Per testing on buildfarm
member barnacle.
This is exactly like the leak fixed in AttrDefaultFetch() by commit
078b2ed291. (Yes, this time I did look for
other instances of the same coding pattern :-(.) Like that patch, no
back-patch, since it seems unlikely that there's any problem except under
very artificial test conditions.
BTW, it strikes me that both of these places would require further work
comparable to commit ab8c84db2f, if we ever
supported defaults or check constraints on system catalogs: they both
assume they are copying into an empty relcache data structure, and that
conceivably wouldn't be the case during recursive reloading of a system
catalog. This does not seem worth worrying about for the moment, since
there is no near-term prospect of supporting any such thing. So I'll
just note the possibility for the archives' sake.
Add a note that some options can be specified multiple times to select
multiple objects to restore. This replaces the somewhat confusing use
of plurals in the option descriptions themselves.
This enables changing permanent (logged) tables to unlogged and
vice-versa.
(Docs for ALTER TABLE / SET TABLESPACE got shuffled in an order that
hopefully makes more sense than the original.)
Author: Fabrízio de Royes Mello
Reviewed by: Christoph Berg, Andres Freund, Thom Brown
Some tweaking by Álvaro Herrera
Cause the path extraction operators to return their lefthand input,
not NULL, if the path array has no elements. This seems more consistent
since the case ought to correspond to applying the simple extraction
operator (->) zero times.
Cause other corner cases in field/element/path extraction to return NULL
rather than failing. This behavior is arguably more useful than throwing
an error, since it allows an expression index using these operators to be
built even when not all values in the column are suitable for the
extraction being indexed. Moreover, we already had multiple
inconsistencies between the path extraction operators and the simple
extraction operators, as well as inconsistencies between the JSON and
JSONB code paths. Adopt a uniform rule of returning NULL rather than
throwing an error when the JSON input does not have a structure that
permits the request to be satisfied.
Back-patch to 9.4. Update the release notes to list this as a behavior
change since 9.3.
Previously, we would first create the symlinks the way they are in the
original system, and at the end replace them with the mapped symlinks.
That never really made much sense, so now we create the symlink pointing
to the correct location to begin with, so that there's no need to fix
them at the end.
The old coding didn't work correctly on Windows, because Windows junction
points look more like directories than files, and ought to be removed with
rmdir rather than unlink. Also, it incorrectly used "%d" rather than "%u"
to print an Oid, but that's gone now.
Report and patch by Amit Kapila, with minor changes by me. Reviewed by
MauMau. Backpatch to 9.4, where the --tablespace feature was added.
As 'ALTER TABLESPACE .. MOVE ALL' really didn't change the tablespace
but instead changed objects inside tablespaces, it made sense to
rework the syntax and supporting functions to operate under the
'ALTER (TABLE|INDEX|MATERIALIZED VIEW)' syntax and to be in
tablecmds.c.
Pointed out by Alvaro, who also suggested the new syntax.
Back-patch to 9.4.
We have had INT64_FORMAT and UINT64_FORMAT for a long time, but that's not
good enough if you want something more exotic, like "%20lld".
Abhijit Menon-Sen, per Andres Freund's suggestion.
Cover some cases I omitted before, such as null and empty-string
elements in the path array. This exposes another inconsistency:
json_extract_path complains about empty path elements but
jsonb_extract_path does not.
jsonb's #> operator segfaulted (dereferencing a null pointer) if the RHS
was a zero-length array, as reported in bug #11207 from Justin Van Winkle.
json's #> operator returns NULL in such cases, so for the moment let's
make jsonb act likewise.
Also add a bunch of regression test queries memorializing the -> and #>
operators' behavior for this and other corner cases.
There is a good argument for changing some of these behaviors, as they
are not very consistent with each other, and throwing an error isn't
necessarily a desirable behavior for operators that are likely to be
used in indexes. However, everybody can agree that a core dump is the
Wrong Thing, and we need test cases even if we decide to change their
expected output later.
While the space is optional, it seems nicer to be consistent with what
you get if you do "SET search_path=...". SET always normalizes the
separator to be comma+space.
Christoph Martin
This reverts commit 083d29c65b.
The commit changed the code so that it causes an errors when
IDENTIFY_SYSTEM returns three columns. But which prevents us
from using the replication-related utilities against the server
with older version. This is not what we want. For that
compatibility, we allow the utilities to receive three columns
as the result of IDENTIFY_SYSTEM eventhough it actually returns
four columns in 9.4 or later.
Pointed out by Andres Freund.
5a991ef869 added new column into
the result of IDENTIFY_SYSTEM command. But it was not reflected into
several codes checking that result. Specifically though the number of
columns in the result was increased to 4, it was still compared with 3
in some replication codes.
Back-patch to 9.4 where the number of columns in IDENTIFY_SYSTEM
result was increased.
Report from Michael Paquier
Programs need execute permission on a DLL file to load it. MSYS
"install" ignores the mode argument, and our Cygwin build statically
links libpq into programs. That explains the lack of buildfarm trouble.
Back-patch to 9.0 (all supported versions).
In support of this, have the MSVC build follow GNU make in preferring
GNUmakefile over Makefile when a directory contains both.
Michael Paquier, reviewed by MauMau.
strncmp() is a specialized API unsuited for routine copying into
fixed-size buffers. On a system where the length of a single filename
can exceed MAXPGPATH, the pg_archivecleanup change prevents a simple
crash in the subsequent strlen(). Few filesystems support names that
long, and calling pg_archivecleanup with untrusted input is still not a
credible use case. Therefore, no back-patch.
David Rowley
Move the functions within the file so that public interface functions come
first, followed by internal functions. Previously, be_tls_write was first,
then internal stuff, and finally the rest of the public interface, which
clearly didn't make much sense.
Per Andres Freund's complaint.
Commit f30015b6d7 made this happen for
timestamp and timestamptz, but it seems pretty inconsistent to not
do it for simple dates as well.
(In passing, I re-pgindent'd json.c.)
PG_RETURN_BOOL() should only be used in functions following the V1 SQL
function API. This coding accidentally fails to fail since letting the
compiler coerce the Datum representation of bool back to plain bool
does give the right answer; but that doesn't make it a good idea.
Back-patch to older branches just to avoid unnecessary code divergence.
Make lists of the names of all operators that are claimed to be commutator
pairs or negator pairs. This is analogous to the existing queries that
make lists of all operator names appearing in particular opclass strategy
slots. Unexpected additions to these lists are likely to be mistakes; had
we had these queries in place before, bug #11178 might've been prevented.
<@ and @> are each other's commutators, but they were incorrectly marked
as being each other's negators instead. (This was actually questioned
in a comment in the original commit, but nobody followed through :-(.)
Per bug #11178 from Christian Pronovost.
In passing, fix some JSONB operator descriptions that were randomly
different from the phrasing of every other similar description.
catversion bump for pg_catalog contents change.
When the TAP tests are run in-tree (make check), set the shared library
path using the appropriate environment variable, using a logic similar
to pg_regress, so that the right libraries are used.
This provides a small but worthwhile speedup when sorting text, at least
in cases to which the sortsupport machinery applies.
Robert Haas and Peter Geoghegan
Previously the help message described that -m is an option for
"stop", "restart" and "promote" commands in pg_ctl. But actually
that's not an option for "promote". So this commit fixes that
incorrect description in the help message.
Back-patch to 9.3 where the incorrect description was added.
parseRelOptions() tended to leak memory in the caller's context. Most
of the time this doesn't really matter since the caller's context is
at most query-lifespan, and the function won't be invoked very many times.
However, when testing with CLOBBER_CACHE_RECURSIVELY, the same relcache
entry can get rebuilt a *lot* of times in one query, leading to significant
intraquery memory bloat if it has any reloptions. Noted while
investigating a related report from Tomas Vondra.
In passing, get rid of some Asserts that are redundant with the one
done by deconstruct_array().
As with other patches to avoid leaks in CLOBBER_CACHE testing, it doesn't
really seem worth back-patching this.
When replacing rd_indexlist, rd_indexattr, etc, we neglected to pfree any
old value of these fields. Under ordinary circumstances, the old value
would always be NULL, so this seemed reasonable enough. However, in cases
where we're rebuilding a system catalog's relcache entry and another cache
flush occurs on that same catalog meanwhile, it's possible for the field to
not be NULL when we return to the outer level, because we already refilled
it while recovering from the inner flush. This leads to a fairly small
session-lifespan leak in CacheMemoryContext. In real-world usage the leak
would be too small to notice; but in testing with CLOBBER_CACHE_RECURSIVELY
the leakage can add up to the point of causing OOM failures, as reported by
Tomas Vondra.
The issue has been there a long time, but it only seems worth fixing in
HEAD, like the previous fix in this area (commit 078b2ed291).
This option is equivalent to --slot option which pg_receivexlog has
already supported, which specifies the replication slot to use for
WAL streaming. pg_recvlogical has already supported both options,
and this commit makes pg_receivexlog consistent with pg_recvlogical
regarding the slot option.
Back-patch to 9.4 where the slot option was added.
Michael Paquier
Some error messages complained about --init and --stop being used
whereas the --create and --drop are the correct verbs. Fix that.
Also a XLogRecPtr was tested in a boolean fashion instead of being
compared to InvalidXLogRecPtr.
Backpatch to 9.4 where pg_recvlogical was introduced.
Michael Paquier
When doing logical decoding using START_LOGICAL_REPLICATION in a
walsender process the walsender sometimes was sending out keepalive
messages too frequently. Asking for feedback every time.
WalSndWaitForWal() sends out keepalive messages when it's waiting for
new WAL to be generated locally when it sees that the remote side
hasn't yet flushed WAL up to the local position. That generally is
good but causes problems if the remote side only writes but doesn't
flush changes yet. So check for both remote write and flush position.
Additionally we've asked for feedback to the keepalive message which
isn't warranted when waiting for WAL in contrast to preventing
timeouts because of wal_sender_timeout.
Complaint and patch by Steve Singer.
When both postgresql.conf and postgresql.auto.conf have their own entry of
the same parameter, PostgreSQL uses the entry in postgresql.auto.conf because
it appears last in the configuration scan. IOW, the other entries which appear
earlier are ignored. But, previously, ProcessConfigFile() detected the invalid
settings of even those unused entries and emitted the error messages
complaining about them, at postmaster startup. Complaining about the entries
to ignore is basically useless.
This problem happened because ProcessConfigFile() was called twice at
postmaster startup and the first call read only postgresql.conf. That is, the
first call could check the entry which might be ignored eventually by
the second call which read both postgresql.conf and postgresql.auto.conf.
To work around the problem, this commit changes ProcessConfigFile so that
its first call processes only data_directory and the second one does all the
entries. It's OK to process data_directory in the first call because it's
ensured that data_directory doesn't exist in postgresql.auto.conf.
Back-patch to 9.4 where postgresql.auto.conf was added.
Patch by me. Review by Amit Kapila
This commit also changes tab-completion for \set so that it displays
all the special variables like COMP_KEYWORD_CASE. Previously it displayed
only variables having the set values. Which was not user-friendly for
those who want to set the unset variables.
This commit also changes tab-completion for :variable so that only the
variables having the set values are displayed. Previously even unset
variables were displayed.
Pavel Stehule, modified by me.
This refactoring is in preparation for adding support for other SSL
implementations, with no user-visible effects. There are now two #defines,
USE_OPENSSL which is defined when building with OpenSSL, and USE_SSL which
is defined when building with any SSL implementation. Currently, OpenSSL is
the only implementation so the two #defines go together, but USE_SSL is
supposed to be used for implementation-independent code.
The libpq SSL code is changed to use a custom BIO, which does all the raw
I/O, like we've been doing in the backend for a long time. That makes it
possible to use MSG_NOSIGNAL to block SIGPIPE when using SSL, which avoids
a couple of syscall for each send(). Probably doesn't make much performance
difference in practice - the SSL encryption is expensive enough to mask the
effect - but it was a natural result of this refactoring.
Based on a patch by Martijn van Oosterhout from 2006. Briefly reviewed by
Alvaro Herrera, Andreas Karlsson, Jeff Janes.
There's actually no need for any special case for unknown-type literals,
since we only need to push the value through its output function and
unknownout() works fine. The code that was here was completely bizarre
anyway, and would fail outright in cases that should work, not to mention
suffering from some copy-and-paste bugs.
Fix an obvious typo in json_build_object()'s complaint about invalid
number of arguments, and make the errhint a bit more sensible too.
Per discussion about how to word the improved hint, change the few places
in the documentation that refer to JSON object field names as "names" to
say "keys" instead, since that's what we've said in the vast majority of
places in the docs. Arguably "name" is more correct, since that's the
terminology used in RFC 7159; but we're stuck with "key" in view of the
naming of json_object_keys() so let's at least be self-consistent.
I adjusted a few code comments to match this as well, and failed to
resist the temptation to clean up some odd whitespace choices in the
same area, as well as a useless duplicate PG_ARGISNULL() check. There's
still quite a bit of code that uses the phrase "field name" in non-user-
visible ways, so I left those usages alone.
Such cases are disallowed by the SQL spec, and even if we wanted to allow
them, the semantics seem ambiguous: how should the FK columns be matched up
with the columns of a unique index? (The matching could be significant in
the presence of opclasses with different notions of equality, so this issue
isn't just academic.) However, our code did not previously reject such
cases, but instead would either fail to match to any unique index, or
generate a bizarre opclass-lookup error because of sloppy thinking in the
index-matching code.
David Rowley
This allows us to specify the maximum time to issue fsync to ensure
the received WAL file is safely flushed to disk. Without this,
pg_receivexlog always flushes WAL file only when it's closed and
which can cause WAL data to be lost at the event of a crash.
Furuya Osamu, heavily modified by me.
Previously, TOAST tables only required in the new cluster could cause
oid conflicts if they were auto-numbered and a later conflicting oid had
to be assigned.
Backpatch through 9.3
Based on the old comment, it took me a while to figure out what the
problem was. The importnat detail is that SSL_read() can return WANT_READ
even though some raw data was received from the socket.
This could be useful for datatypes like text, where we might want
to optimize for some collations but not others. However, this patch
doesn't introduce any new sortsupport functions that work this way;
it merely revises the code so that future patches may do so.
Patch by me. Review by Peter Geoghegan.
Previously the source codes for processing the received data and handling
the end of stream were included in pg_receivexlog main loop. This commit
splits out them as separate functions. This is useful for improving the
readability of main loop code and making the future pg_receivexlog-related
patch simpler.
When more than one setting entries of same parameter exist in the
configuration file, PostgreSQL uses only entry appearing last in
configuration file scan. Since the other entries are not used,
ParseConfigFp() doesn't need to process them, but previously it did
that. This problematic behavior caused the configuration file scan
to detect invalid settings of unused entries (e.g., existence of
multiple entries of PGC_POSTMASTER parameter) and log the messages
complaining about them.
This commit changes the configuration file scan so that it processes
only last entry of each parameter.
Note that when multiple entries of same parameter exist both in
postgresql.conf and postgresql.auto.conf, unused entries in
postgresql.conf are still processed only at postmaster startup.
The problem has existed since old version, but a user is more likely
to encounter it since 9.4 where ALTER SYSTEM command was introduced.
So back-patch to 9.4.
Amit Kapila, slightly modified by me. Per report from Christoph Berg.
In 9.2, pg_receivexlog with verbose option has emitted the messages
at the end of each WAL file. But the commit 0b63291 suppressed such
messages by mistake. This commit fixes the bug so that pg_receivexlog
--verbose outputs such messages again.
Back-patch to 9.3 where the bug was added.
log_newpage is used by many indexams, in addition to heap, but for
historical reasons it's always been part of the heapam rmgr. Starting with
9.3, we have another WAL record type for logging an image of a page,
XLOG_FPI. Simplify things by moving log_newpage and log_newpage_buffer to
xlog.c, and switch to using the XLOG_FPI record type.
Bump the WAL version number because the code to replay the old HEAP_NEWPAGE
records is removed.
When autovacuum is nominally off, we will still launch autovac workers
to vacuum tables that are at risk of XID wraparound. But after we'd done
that, an autovac worker would proceed to autovacuum every table in the
targeted database, if they meet the usual thresholds for autovacuuming.
This is at best pretty unexpected; at worst it delays response to the
wraparound threat. Fix it so that if autovacuum is nominally off, we
*only* do forced vacuums and not any other work.
Per gripe from Andrey Zhidenkov. This has been like this all along,
so back-patch to all supported branches.
InitProcess() relies on IsBackgroundWorker to decide whether the PGPROC
for a new backend should be taken from ProcGlobal's freeProcs or from
bgworkerFreeProcs. In EXEC_BACKEND builds, InitProcess() is called
sooner than in non-EXEC_BACKEND builds, and IsBackgroundWorker wasn't
getting initialized soon enough.
Report by Noah Misch. Diagnosis and fix by me.
Commit 0ac5ad5134 removed an optimization in multixact.c that skipped
fetching members of MultiXactId that were older than our
OldestVisibleMXactId value. The reason this was removed is that it is
possible for multixacts that contain updates to be older than that
value. However, if the caller is certain that the multi does not
contain an update (because the infomask bits say so), it can pass this
info down to GetMultiXactIdMembers, enabling it to use the old
optimization.
Pointed out by Andres Freund in 20131121200517.GM7240@alap2.anarazel.de
Testing for abortedness of a multixact member that's being frozen is
unnecessary: we only need to know whether the transaction is still in
progress or committed to determine whether it must be kept or not. This
let us simplify the code a bit and avoid a useless TransactionIdDidAbort
test.
Suggested by Andres Freund awhile back.
There were several oversights in recovery code where COMMIT/ABORT PREPARED
records were ignored:
* pg_last_xact_replay_timestamp() (wasn't updated for 2PC commits)
* recovery_min_apply_delay (2PC commits were applied immediately)
* recovery_target_xid (recovery would not stop if the XID used 2PC)
The first of those was reported by Sergiy Zuban in bug #11032, analyzed by
Tom Lane and Andres Freund. The bug was always there, but was masked before
commit d19bd29f07, because COMMIT PREPARED
always created an extra regular transaction that was WAL-logged.
Backpatch to all supported versions (older versions didn't have all the
features and therefore didn't have all of the above bugs).
findDependencyLoops() was not bright about cases where there are multiple
dependency paths between the same two dumpable objects. In most scenarios
this did not hurt us too badly; but since the introduction of section
boundary pseudo-objects in commit a1ef01fe16,
it was possible for this code to take unreasonable amounts of time (tens
of seconds on a database with a couple thousand objects), as reported in
bug #11033 from Joe Van Dyk. Joe's particular problem scenario involved
"pg_dump -a" mode with long chains of foreign key constraints, but I think
that similar problems could arise with other situations as long as there
were enough objects. To fix, add a flag array that lets us notice when we
arrive at the same object again while searching from a given start object.
This simple change seems to be enough to eliminate the performance problem.
Back-patch to 9.1, like the patch that introduced section boundary objects.
This return code is possible wherever we pass bAlertable = TRUE; it
arises when Windows caused the current thread to run an "I/O completion
routine" or an "asynchronous procedure call". PostgreSQL does not
provoke either of those Windows facilities, hence this bug remaining
largely unnoticed, but other local code might do so. Due to a shortage
of complaints, no back-patch for now.
Per report from Shiv Shivaraju Gowda, this bug can cause
PGSemaphoreLock() to PANIC. The bug can also cause select() to report
timeout expiration too early, which might confuse pgstat_init() and
CheckRADIUSAuth().
shm_mq_send_bytes didn't invariably initialize *bytes_written before
returning, which would cause shm_mq_send to read from uninitialized
memory and add the value it found there to mqh->mqh_partial_bytes.
This could cause the next attempt to send a message via the queue to
fail an assertion (if the queue was detached) or copy data from a
garbage pointer value into the queue (if non-blocking mode was in use).
Nothing in the checkpointer calls InitXLOGAccess(), so WALInsertLocks
never got initialized there. Without EXEC_BACKEND, it works anyway
because the correct value is inherited from the postmaster, but
with EXEC_BACKEND we've got a problem. The problem appears to have
been introduced by commit 68a2e52bba.
To fix, move the relevant initialization steps from InitXLOGAccess()
to XLOGShmemInit(), making this more parallel to what we do
elsewhere.
Amit Kapila
Ephemeral slots - slots that shouldn't survive database restarts -
weren't properly cleaned up after a immediate/crash restart. They were
ignored in the sense that they weren't restored into memory and thus
didn't cause unwanted resource retention; but they prevented a new
slot with the same name from being created.
Now ephemeral slots are fully removed during startup.
Backpatch to 9.4 where replication slots where added.
The problem is that pg_receivexlog calls select(2) with timeout=0 and
goes into busy loop when --status-interval option is set to 0. This bug
was introduced by the commit,
74cbe966fe.
Per report from Sawada Masahiko
This is consistent with the POSIX verdict that kill() shall not report
ESRCH for a zombie process. Back-patch to 9.0 (all supported versions).
Test code from commit d7cdf6ee36 depends
on it, and log messages about kill() reporting "Invalid argument" will
cease to appear for this not-unexpected condition.