The setting values of some parameters including max_worker_processes
must be equal to or higher than the values on the master. However,
previously max_worker_processes was not listed as such parameter
in the document. So this commit adds it to that list.
Back-patch to 9.4 where max_worker_processes was added.
The default argument, if given, has to be of exactly the same datatype
as the first argument; but this was not stated in so many words, and
the error message you get about it might not lead your thought in the
right direction. Per bug #13587 from Robert McGehee.
A quick scan says that these are the only two built-in functions with two
anyelement arguments and no other polymorphic arguments. There are plenty
of cases of, eg, anyarray and anyelement, but those seem less likely to
confuse. For instance this doesn't seem terribly hard to figure out:
"function array_remove(integer[], numeric) does not exist". So I've
contented myself with fixing these two cases.
Although commit 79af9a1d2 was initially applied to HEAD only, we later
back-patched the change into all branches (commits 6bbf75192 et al).
So it's not a new behavior in 9.5 and should not be release-noted here.
This behavior wasn't documented, but it should be because it's user-visible
in triggers and other functions executed on the remote server.
Per question from Adam Fuchs.
Back-patch to 9.3 where postgres_fdw was added.
The table-rewriting forms of ALTER TABLE are MVCC-unsafe, in much the same
way as TRUNCATE, because they replace all rows of the table with newly-made
rows with a new xmin. (Ideally, concurrent transactions with old snapshots
would continue to see the old table contents, but the data is not there
anymore --- and if it were there, it would be inconsistent with the table's
updated rowtype, so there would be serious implementation problems to fix.)
This was nowhere documented though, and the problem was only documented for
TRUNCATE in a note in the TRUNCATE reference page. Create a new "Caveats"
section in the MVCC chapter that can be home to this and other limitations
on serializable consistency.
In passing, fix a mistaken statement that VACUUM and CLUSTER would reclaim
space occupied by a dropped column. They don't reconstruct existing tuples
so they couldn't do that.
Back-patch to all supported branches.
Reduce lock levels down to ShareUpdateExclusiveLock for all autovacuum-related
relation options when setting them using ALTER TABLE.
Add infrastructure to allow varying lock levels for relation options in later
patches. Setting multiple options together uses the highest lock level required
for any option. Works for both main and toast tables.
Fabrízio Mello, reviewed by Michael Paquier, mild edit and additional regression
tests from myself
Fix docs build failure introduced by commit 6fcd88511f.
I failed to resist the temptation to rearrange the description of
pg_create_physical_replication_slot(), too.
When creating a physical slot it's often useful to immediately reserve
the current WAL position instead of only doing after the first feedback
message arrives. That e.g. allows slots to guarantee that all the WAL
for a base backup will be available afterwards.
Logical slots already have to reserve WAL during creation, so generalize
that logic into being usable for both physical and logical slots.
Catversion bump because of the new parameter.
Author: Gurjeet Singh
Reviewed-By: Andres Freund
Discussion: CABwTF4Wh_dBCzTU=49pFXR6coR4NW1ynb+vBqT+Po=7fuq5iCw@mail.gmail.com
There's no reason not to expose both restart_lsn and confirmed_flush
since they have rather distinct meanings. The former is the oldest WAL
still required and valid for both physical and logical slots, whereas
the latter is the location up to which a logical slot's consumer has
confirmed receiving data. Most of the time a slot will require older
WAL (i.e. restart_lsn) than the confirmed
position (i.e. confirmed_flush_lsn).
Author: Marko Tiikkaja, editorialized by me
Discussion: 559D110B.1020109@joh.to
Immediately starting to stream after --create-slot is inconvenient in a
number of situations (e.g. when configuring a slot for use in
recovery.conf) and it's easy to just call pg_receivexlog twice in the
rest of the cases.
Author: Michael Paquier
Discussion: CAB7nPqQ9qEtuDiKY3OpNzHcz5iUA+DUX9FcN9K8GUkCZvG7+Ew@mail.gmail.com
Backpatch: 9.5, where the option was introduced
The allowed syntax for OVERLAPS, viz "row OVERLAPS row", is sufficiently
constrained that we don't actually need a precedence declaration for
OVERLAPS; indeed removing this declaration does not change the generated
gram.c file at all. Let's remove it to avoid confusion about whether
OVERLAPS has precedence or not. If we ever generalize what we allow for
OVERLAPS, we might need to put back a precedence declaration for it,
but we might want some other level than what it has today --- and leaving
the declaration there would just risk confusion about whether that would
be an incompatible change.
Likewise, remove OVERLAPS from the documentation's precedence table.
Per discussion with Noah Misch. Back-patch to 9.5 where we hacked up some
nearby precedence decisions.
Amit reviewed the replication origins patch and made some good
points. Address them. This fixes typos in error messages, docs and
comments and adds a missing error check (although in a
should-never-happen scenario).
Discussion: CAA4eK1JqUBVeWWKwUmBPryFaje4190ug0y-OAUHWQ6tD83V4xg@mail.gmail.com
Backpatch: 9.5, where replication origins were introduced.
Make it more clear that bgw_main is usually not what you want. Put the
background worker flags in a variablelist rather than having them as
part of a paragraph. Explain important limits on how bgw_main_arg can
be used.
Craig Ringer, substantially revised by me.
On Windows, use listen_address=127.0.0.1 to allow TCP connections. We were
already using "pg_regress --config-auth" to set up HBA appropriately. The
standard_initdb helper function now sets up the server's
unix_socket_directories or listen_addresses in the config file, so that
they don't need to be specified in the pg_ctl command line anymore. That
way, the pg_ctl invocations in test programs don't need to differ between
Windows and Unix.
Add another helper function to configure the server's pg_hba.conf to allow
replication connections. The configuration is done similarly to "pg_regress
--config-auth": trust on domain sockets on Unix, and SSPI authentication on
Windows.
Replace calls to "cat" and "touch" programs with built-in perl code, as
those programs don't normally exist on Windows.
Add instructions in the docs on how to install IPC::Run on Windows. Adjust
vcregress.pl to not replace PERL5LIB completely in vcregress.pl, because
otherwise cannot install IPC::Run in a non-standard location easily.
Michael Paquier, reviewed by Noah Misch, some additional tweaking by me.
This option specifies a replication slot for WAL streaming (-X stream),
so that there can be continuous replication slot use between WAL
streaming during the base backup and the start of regular streaming
replication.
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
CreatePolicy() and AlterPolicy() omit to create a pg_shdepend entry for
each role in the TO clause. Fix this by creating a new shared dependency
type called SHARED_DEPENDENCY_POLICY and assigning it to each role.
Reported by Noah Misch. Patch by me, reviewed by Alvaro Herrera.
Back-patch to 9.5 where RLS was introduced.
Although initdb has long discouraged use of a filesystem mount-point
directory as a PG data directory, this point was covered nowhere in the
user-facing documentation. Also, with the popularity of pg_upgrade,
we really need to recommend that the PG user own not only the data
directory but its parent directory too. (Without a writable parent
directory, operations such as "mv data data.old" fail immediately.
pg_upgrade itself doesn't do that, but wrapper scripts for it often do.)
Hence, adjust the "Creating a Database Cluster" section to address
these points. I also took the liberty of wordsmithing the discussion
of NFS a bit.
These considerations aren't by any means new, so back-patch to all
supported branches.
The pg_stats view is supposed to be restricted to only show rows
about tables the user can read. However, it sometimes can leak
information which could not otherwise be seen when row level security
is enabled. Fix that by not showing pg_stats rows to users that would
be subject to RLS on the table the row is related to. This is done
by creating/using the newly introduced SQL visible function,
row_security_active().
Along the way, clean up three call sites of check_enable_rls(). The second
argument of that function should only be specified as other than
InvalidOid when we are checking as a different user than the current one,
as in when querying through a view. These sites were passing GetUserId()
instead of InvalidOid, which can cause the function to return incorrect
results if the current user has the BYPASSRLS privilege and row_security
has been set to OFF.
Additionally fix a bug causing RI Trigger error messages to unintentionally
leak information when RLS is enabled, and other minor cleanup and
improvements. Also add WITH (security_barrier) to the definition of pg_stats.
Bumped CATVERSION due to new SQL functions and pg_stats view definition.
Back-patch to 9.5 where RLS was introduced. Reported by Yaroslav.
Patch by Joe Conway and Dean Rasheed with review and input by
Michael Paquier and Stephen Frost.
While postgres' use of SSL renegotiation is a good idea in theory, it
turned out to not work well in practice. The specification and openssl's
implementation of it have lead to several security issues. Postgres' use
of renegotiation also had its share of bugs.
Additionally OpenSSL has a bunch of bugs around renegotiation, reported
and open for years, that regularly lead to connections breaking with
obscure error messages. We tried increasingly complex workarounds to get
around these bugs, but we didn't find anything complete.
Since these connection breakages often lead to hard to debug problems,
e.g. spuriously failing base backups and significant latency spikes when
synchronous replication is used, we have decided to change the default
setting for ssl renegotiation to 0 (disabled) in the released
backbranches and remove it entirely in 9.5 and master.
Author: Andres Freund
Discussion: 20150624144148.GQ4797@alap3.anarazel.de
Backpatch: 9.5 and master, 9.0-9.4 get a different patch
The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM. (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.) Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type. This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.
Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.
Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.
Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).
Improve EXPLAIN's output of tablesample plans, too.
Back-patch to 9.5 so that we don't have to support the original API
in production.
This removes some info about support procedures being used, which was
obsoleted by commit db5f98ab4f, as well as add some more documentation
on how to create new opclasses using the Minmax infrastructure.
(Hopefully we can get something similar for Inclusion as well.)
In passing, fix some obsolete mentions of "mmtuples" in source code
comments.
Backpatch to 9.5, where BRIN was introduced.
Previously, there was an inconsistency across json/jsonb operators that
operate on datums containing JSON arrays -- only some operators
supported negative array count-from-the-end subscripting. Specifically,
only a new-to-9.5 jsonb deletion operator had support (the new "jsonb -
integer" operator). This inconsistency seemed likely to be
counter-intuitive to users. To fix, allow all places where the user can
supply an integer subscript to accept a negative subscript value,
including path-orientated operators and functions, as well as other
extraction operators. This will need to be called out as an
incompatibility in the 9.5 release notes, since it's possible that users
are relying on certain established extraction operators changed here
yielding NULL in the event of a negative subscript.
For the json type, this requires adding a way of cheaply getting the
total JSON array element count ahead of time when parsing arrays with a
negative subscript involved, necessitating an ad-hoc lex and parse.
This is followed by a "conversion" from a negative subscript to its
equivalent positive-wise value using the count. From there on, it's as
if a positive-wise value was originally provided.
Note that there is still a minor inconsistency here across jsonb
deletion operators. Unlike the aforementioned new "-" deletion operator
that accepts an integer on its right hand side, the new "#-" path
orientated deletion variant does not throw an error when it appears like
an array subscript (input that could be recognized by as an integer
literal) is being used on an object, which is wrong-headed. The reason
for not being stricter is that it could be the case that an object pair
happens to have a key value that looks like an integer; in general,
these two possibilities are impossible to differentiate with rhs path
text[] argument elements. However, we still don't allow the "#-"
path-orientated deletion operator to perform array-style subscripting.
Rather, we just return the original left operand value in the event of a
negative subscript (which seems analogous to how the established
"jsonb/json #> text[]" path-orientated operator may yield NULL in the
event of an invalid subscript).
In passing, make SetArrayPath() stricter about not accepting cases where
there is trailing non-numeric garbage bytes rather than a clean NUL
byte. This means, for example, that strings like "10e10" are now not
accepted as an array subscript of 10 by some new-to-9.5 path-orientated
jsonb operators (e.g. the new #- operator). Finally, remove dead code
for jsonb subscript deletion; arguably, this should have been done in
commit b81c7b409.
Peter Geoghegan and Andrew Dunstan
This tells you what fraction of NOTIFY's queue is currently filled.
Brendan Jurd, reviewed by Merlin Moncure and Gurjeet Singh. A few
further tweaks by me.
Other options cannot be changed, as it's not totally clear if cached plans
would need to be invalidated if one of the other options change. Selectivity
estimator functions only change plan costs, not correctness of plans, so
those should be safe.
Original patch by Uriy Zhuravlev, heavily edited by me.
pg_receivexlog and pg_recvlogical error out when --create-slot is
specified and a slot with the same name already exists. In some cases,
especially with pg_receivexlog, that's rather annoying and requires
additional scripting.
Backpatch to 9.5 as slot control functions have newly been added to
pg_receivexlog, and there doesn't seem much point leaving it in a less
useful state.
Discussion: 20150619144755.GG29350@alap3.anarazel.de
Commit 31eae602 added new syntax to many DDL commands to use CURRENT_USER
or SESSION_USER instead of role name in ALTER ... OWNER TO, but because
of a misplaced '{', the syntax in the docs implied that the syntax was
"ALTER ... CURRENT_USER", instead of "ALTER ... OWNER TO CURRENT_USER".
Fix that, and also the funny indentation in some of the modified syntax
blurps.
The documentation implied that there was seldom any reason to use the
array_append, array_prepend, and array_cat functions directly. But that's
not really true, because they can help make it clear which case is meant,
which the || operator can't do since it's overloaded to represent all three
cases. Add some discussion and examples illustrating the potentially
confusing behavior that can ensue if the parser misinterprets what was
meant.
Per a complaint from Michael Herold. Back-patch to 9.2, which is where ||
started to behave this way.
When enabling wal_compression, there is a risk to leak data similarly to
the BREACH and CRIME attacks on SSL where the compression ratio of
a full page image gives a hint of what is the existing data of this page.
This vulnerability is quite cumbersome to exploit in practice, but doable.
So this patch makes wal_compression PGC_SUSET in order to prevent
non-superusers from enabling it and exploiting the vulnerability while
DBA thinks the risk very seriously and disables it in postgresql.conf.
Back-patch to 9.5 where wal_compression was introduced.
Tom fixed another one of these in commit 7f32dbcd, but there was another
almost identical one in libpq docs. Per his comment:
HP's web server has apparently become case-sensitive sometime recently.
Per bug #13479 from Daniel Abraham. Corrected link identified by Alvaro.
The .backup file name can be passed to pg_archivecleanup even if
it includes the extension which is specified in -x option.
However, previously the document incorrectly warned a user
not to do that.
Back-patch to 9.2 where pg_archivecleanup's -x option and
the warning were added.
Commit de76884 changed an archive recovery so that the last WAL
segment with old timeline was renamed with suffix .partial. It should
have updated WAL-related utilities so that they can handle such
.paritial WAL files, but we forgot that.
This patch changes pg_archivecleanup so that it can clean up even
archived WAL files with .partial suffix. Also it allows us to specify
.partial WAL file name as the command-line argument "oldestkeptwalfile".
This patch also changes pg_resetxlog so that it can remove .partial
WAL files in pg_xlog directory.
pg_xlogdump cannot handle .partial WAL files. Per discussion,
we decided only to document that limitation instead of adding the fix.
Because a user can easily work around the limitation (i.e., just remove
.partial suffix from the file name) and the fix seems complicated for
very narrow use case.
Back-patch to 9.5 where the problem existed.
Review by Michael Paquier.
Discussion: http://www.postgresql.org/message-id/CAHGQGwGxMKnVHGgTfiig2Bt_2djec0in3-DLJmtg7+nEiidFdQ@mail.gmail.com
-t will now match views, foreign tables, materialized views, and sequences,
not only plain tables. This is more useful, and also more consistent with
the behavior of pg_dump's -t switch, which has always matched all relation
types.
We're still not there on matching pg_dump's behavior entirely, so mention
that in the docs.
Craig Ringer, reviewed by Pavel Stehule
This allows convenient checking for existence of a GUC from SQL, which is
particularly useful when dealing with custom variables.
David Christensen, reviewed by Jeevan Chalke
1) Add sgml comments referencing commits. This is useful to search for
missing items etc.
The comments containing the commit notes are an excerpt from:
git log --date=short \
--pretty='format:%cd [%h] %<(8,trunc)%cN: %<(48,trunc)%s%n%n%w(,4,4)%b%n' \
$(git merge-base origin/master upstream/REL9_4_STABLE)..origin/master
2) Improve a handful of existing notes
3) Add missing entries about a couple features.
4) Add a bunch of straight-forward FIXMEs
Minor corrections and clarifications. Notably, for stuff that got moved
out of contrib, make sure it's documented somewhere other than "Additional
Modules".
I'm sure these need more work, but that's all I have time for today.
Avoid memory leak from incorrect choice of how to free a StringInfo
(resetStringInfo doesn't do it). Now that pg_split_opts doesn't scribble
on the optstr, mark that as "const" for clarity. Attach the commentary in
protocol.sgml to the right place, and add documentation about the
user-visible effects of this change on postgres' -o option and libpq's
PGOPTIONS option.
As first committed, this view reported on the file contents as they were
at the last SIGHUP event. That's not as useful as reporting on the current
contents, and what's more, it didn't work right on Windows unless the
current session had serviced at least one SIGHUP. Therefore, arrange to
re-read the files when pg_show_all_settings() is called. This requires
only minor refactoring so that we can pass changeVal = false to
set_config_option() so that it won't actually apply any changes locally.
In addition, add error reporting so that errors that would prevent the
configuration files from being loaded, or would prevent individual settings
from being applied, are visible directly in the view. This makes the view
usable for pre-testing whether edits made in the config files will have the
desired effect, before one actually issues a SIGHUP.
I also added an "applied" column so that it's easy to identify entries that
are superseded by later entries; this was the main use-case for the original
design, but it seemed unnecessarily hard to use for that.
Also fix a 9.4.1 regression that allowed multiple entries for a
PGC_POSTMASTER variable to cause bogus complaints in the postmaster log.
(The issue here was that commit bf007a27ac unintentionally reverted
3e3f65973a, which suppressed any duplicate entries within
ParseConfigFp. However, since the original coding of the pg_file_settings
view depended on such suppression *not* happening, we couldn't have fixed
this issue now without first doing something with pg_file_settings.
Now we suppress duplicates by marking them "ignored" within
ProcessConfigFileInternal, which doesn't hide them in the view.)
Lesser changes include:
Drive the view directly off the ConfigVariable list, instead of making a
basically-equivalent second copy of the data. There's no longer any need
to hang onto the data permanently, anyway.
Convert show_all_file_settings() to do its work in one call and return a
tuplestore; this avoids risks associated with assuming that the GUC state
will hold still over the course of query execution. (I think there were
probably latent bugs here, though you might need something like a cursor
on the view to expose them.)
Arrange to run SIGHUP processing in a short-lived memory context, to
forestall process-lifespan memory leaks. (There is one known leak in this
code, in ProcessConfigDirectory; it seems minor enough to not be worth
back-patching a specific fix for.)
Remove mistaken assignment to ConfigFileLineno that caused line counting
after an include_dir directive to be completely wrong.
Add missed failure check in AlterSystemSetConfigFile(). We don't really
expect ParseConfigFp() to fail, but that's not an excuse for not checking.
When archive recovery and restartpoints were initially introduced,
checkpoint_segments was ignored on the grounds that the files restored from
archive don't consume any space in the recovery server. That was changed in
later releases, but even then it was arguably a feature rather than a bug,
as performing restartpoints as often as checkpoints during normal operation
might be excessive, but you might nevertheless not want to waste a lot of
space for pre-allocated WAL by setting checkpoint_segments to a high value.
But now that we have separate min_wal_size and max_wal_size settings, you
can bound WAL usage with max_wal_size, and still avoid consuming excessive
space usage by setting min_wal_size to a lower value, so that argument is
moot.
There are still some issues with actually limiting the space usage to
max_wal_size: restartpoints in recovery can only start after seeing the
checkpoint record, while a checkpoint starts flushing buffers as soon as
the redo-pointer is set. Restartpoint is paced to happen at the same
leisurily speed, determined by checkpoint_completion_target, as checkpoints,
but because they are started later, max_wal_size can be exceeded by upto
one checkpoint cycle's worth of WAL, depending on
checkpoint_completion_target. But that seems better than not trying at all,
and max_wal_size is a soft limit anyway.
The documentation already claimed that max_wal_size is obeyed in recovery,
so this just fixes the behaviour to match the docs. However, add some
weasel-words there to mention that max_wal_size may well be exceeded by
some amount in recovery.
This makes it possible to use the functions without getting errors, if there
is a chance that the file might be removed or renamed concurrently.
pg_rewind needs to do just that, although this could be useful for other
purposes too. (The changes to pg_rewind to use these functions will come in
a separate commit.)
The read_binary_file() function isn't very well-suited for extensions.c's
purposes anymore, if it ever was. So bite the bullet and make a copy of it
in extension.c, tailored for that use case. This seems better than the
accidental code reuse, even if it's a some more lines of code.
Michael Paquier, with plenty of kibitzing by me.
Allow CustomPath to have a list of paths, CustomPlan a list of plans,
and CustomPlanState a list of planstates known to the core system, so
that custom path/plan providers can more reasonably use this
infrastructure for nodes with multiple children.
KaiGai Kohei, per a design suggestion from Tom Lane, with some
further kibitzing by me.
Some of the entries in the inclusion opclasses where missing operators,
and we had an entry for inet_inclusion_ops instead of
network_inclusion_ops. Sort the operators within each opclass by
strategy number, just to make it easier to spot mistakes.
Also sort the rows by data type name, rather than OID.
This adjusts commit 82233ce7ea so that the
postmaster does not exit until all its child processes have exited, even
if the 5-second timeout elapses and we have to send SIGKILL. There is no
great value in having the postmaster process quit sooner, and doing so can
mislead onlookers into thinking that the cluster is fully terminated when
actually some child processes still survive.
This effect might explain recent test failures on buildfarm member hamster,
wherein we failed to restart a cluster just after shutting it down with
"pg_ctl stop -m immediate".
I also did a bit of code review/beautification, including fixing a faulty
use of the Max() macro on a volatile expression.
Back-patch to 9.4. In older branches, the postmaster never waited for
children to exit during immediate shutdowns, and changing that would be
too much of a behavioral change.