Revert 59ab4ac32, as well as the followup fix 33862cb9c, in all
branches. We need to think a bit harder about what the behavior
of LOCK TABLE on views should be, and there's no time for that
before next week's releases. We'll take another crack at this
later.
Discussion: https://postgr.es/m/16703-e348f58aab3cf6cc@postgresql.org
The intention in commit 491c029db was to require superuserness to
change the BYPASSRLS property, but the actual effect of the coding
in AlterRole() was to require superuserness to change anything at all
about a BYPASSRLS role. Other properties of a BYPASSRLS role should
be changeable under the same rules as for a normal role, though.
Fix that, and also take care of some documentation omissions related
to BYPASSRLS and REPLICATION role properties.
Tom Lane and Stephen Frost, per bug report from Wolfgang Walther.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/a5548a9f-89ee-3167-129d-162b5985fcf8@technowledgy.de
The restriction that only tables and views can be locked by LOCK TABLE
is quite arbitrary, since the underlying mechanism can lock any relation
type. Drop the restriction so that programs such as pg_dump can lock
all relations they're interested in, preventing schema changes that
could cause a dump to fail after expending much effort.
Backpatch to 9.5.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Wells Oliver <wells.oliver@gmail.com>
Discussion: https://postgr.es/m/20201021200659.GA32358@alvherre.pgsql
psql's \connect claims to be able to re-use previous connection
parameters, but in fact it only re-uses the database name, user name,
host name (and possibly hostaddr, depending on version), and port.
This is problematic for assorted use cases. Notably, pg_dump[all]
emits "\connect databasename" commands which we would like to have
re-use all other parameters. If such a script is loaded in a psql run
that initially had "-d connstring" with some non-default parameters,
those other parameters would be lost, potentially causing connection
failure. (Thus, this is the same kind of bug addressed in commits
a45bc8a4f and 8e5793ab6, although the details are much different.)
To fix, redesign do_connect() so that it pulls out all properties
of the old PGconn using PQconninfo(), and then replaces individual
properties in that array. In the case where we don't wish to re-use
anything, get libpq's default settings using PQconndefaults() and
replace entries in that, so that we don't need different code paths
for the two cases.
This does result in an additional behavioral change for cases where
the original connection parameters allowed multiple hosts, say
"psql -h host1,host2", and the \connect request allows re-use of the
host setting. Because the previous coding relied on PQhost(), it
would only permit reconnection to the same host originally selected.
Although one can think of scenarios where that's a good thing, there
are others where it is not. Moreover, that behavior doesn't seem to
meet the principle of least surprise, nor was it documented; nor is
it even clear it was intended, since that coding long pre-dates the
addition of multi-host support to libpq. Hence, this patch is content
to drop it and re-use the host list as given.
Per Peter Eisentraut's comments on bug #16604. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org
When told to process all databases, clusterdb, reindexdb, and vacuumdb
would reconnect by replacing their --maintenance-db parameter with the
name of the target database. If that parameter is a connstring (which
has been allowed for a long time, though we failed to document that
before this patch), we'd lose any other options it might specify, for
example SSL or GSS parameters, possibly resulting in failure to connect.
Thus, this is the same bug as commit a45bc8a4f fixed in pg_dump and
pg_restore. We can fix it in the same way, by using libpq's rules for
handling multiple "dbname" parameters to add the target database name
separately. I chose to apply the same refactoring approach as in that
patch, with a struct to handle the command line parameters that need to
be passed through to connectDatabase. (Maybe someday we can unify the
very similar functions here and in pg_dump/pg_restore.)
Per Peter Eisentraut's comments on bug #16604. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org
- Misc grammar and punctuation fixes.
- Stylistic cleanup: use spaces between function arguments and JSON fields
in examples. For example "foo(a,b)" -> "foo(a, b)". Add semicolon after
last END in a few PL/pgSQL examples that were missing them.
- Make sentence that talked about "..." and ".." operators more clear,
by avoiding to end the sentence with "..". That makes it look the same
as "..."
- Fix syntax description for HAVING: HAVING conditions cannot be repeated
Patch by Justin Pryzby, per Yaroslav Schekin's report. Backpatch to all
supported versions, to the extent that the patch applies easily.
Discussion: https://www.postgresql.org/message-id/20201005191922.GE17626%40telsasoft.com
The Windows documentation insists that every WSAStartup call should
have a matching WSACleanup call. However, if that ever had actual
relevance, it wasn't in this century. Every remotely-modern Windows
kernel is capable of cleaning up when a process exits without doing
that, and must be so to avoid resource leaks in case of a process
crash. Moreover, Postgres backends have done WSAStartup without
WSACleanup since commit 4cdf51e64 in 2004, and we've never seen any
indication of a problem with that.
libpq's habit of doing WSAStartup during connection start and
WSACleanup during shutdown is also rather inefficient, since a
series of non-overlapping connection requests leads to repeated,
quite expensive DLL unload/reload cycles. We document a workaround
for that (having the application call WSAStartup for itself), but
that's just a kluge. It's also worth noting that it's far from
uncommon for applications to exit without doing PQfinish, and
we've not heard reports of trouble from that either.
However, the real reason for acting on this is that recent
experiments by Alexander Lakhin show that calling WSACleanup
during PQfinish is triggering the symptom we occasionally see
that a process using libpq fails to emit expected stdio output.
Therefore, let's change libpq so that it calls WSAStartup only
once per process, during the first connection attempt, and never
calls WSACleanup at all.
While at it, get rid of the only other WSACleanup call in our code
tree, in pg_dump/parallel.c; that presumably is equally useless.
Back-patch of HEAD commit 7d00a6b2d.
Discussion: https://postgr.es/m/ac976d8c-03df-d6b8-025c-15a2de8d9af1@postgrespro.ru
The rules to choose the number of parallel workers to perform parallel
vacuum operation were not clearly specified.
Reported-by: Peter Eisentraut
Author: Amit Kapila
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/36aa8aea-61b7-eb3c-263b-648e0cb117b7@2ndquadrant.com
Section 8.5.1.4, which defines these literals, made only a vague
reference to the fact that they might be evaluated too soon to be
safe in non-interactive contexts. Provide a more explicit caution
against misuse. Also, generalize the wording in the related tip in
section 9.9.4: while it clearly described this problem, it implied
(or really, stated outright) that the problem only applies to table
DEFAULT clauses.
Per gripe from Tijs van Dam. Back-patch to all supported branches.
Discussion: https://postgr.es/m/c2LuRv9BiRT3bqIo5mMQiVraEXey_25B4vUn0kDqVqilwOEu_iVF1tbtvLnyQK7yDG3PFaz_GxLLPil2SDkj1MCObNRVaac-7j1dVdFERk8=@thalex.com
Commit a97e85f2b caused "exceed the available area" warnings in PDF
builds. Fine-tune colwidth values to avoid that.
Back-patch to 9.6, like the prior patch. (This is of dubious value
before v13, since we were far from free of such warnings in older
branches. But we might as well keep the SGML looking the same in all
branches.)
Per buildfarm.
Previously it was documented that toast_tuple_target affected column
marked as only External or Extended. But this description is not correct
and toast_tuple_target affects also column marked as Main.
Back-patch to v11 where toast_tuple_target reloption was introduced.
Author: Shinya Okano
Reviewed-by: Tatsuhito Kasahara, Fujii Masao
Discussion: https://postgr.es/m/93f46e311a67422e89e770d236059817@oss.nttdata.com
I removed the duplicate command tags for START_REPLICATION inadvertently
in commit 07082b08cc, but the replication protocol requires them. The
fact that the replication protocol was broken was not noticed because
all our test cases use an optimized code path that exits early, failing
to verify that the behavior is correct for non-optimized cases. Put
them back.
Also document this protocol quirk.
Add a test case that shows the failure. It might still succeed even
without the patch when run on a fast enough server, but it suffices to
show the bug in enough cases that it would be noticed in buildfarm.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Henry Hinze <henry.hinze@gmail.com>
Reviewed-by: Petr Jelínek <petr.jelinek@2ndquadrant.com>
Discussion: https://postgr.es/m/16643-eaadeb2a1a58d28c@postgresql.org
The descriptions of make_interval() and pg_options_to_table()
were randomly different from the reality embedded in pg_proc.
(These are not all the discrepancies I found in a quick search,
but the others perhaps require more discussion, since there's
at least a case to be made for changing pg_proc not the docs.)
make_interval issue noted by Thomas Kellerer.
Discussion: https://postgr.es/m/7b154ef0-9f22-90b9-7734-4bf23686695b@gmx.net
Previously, a conversion such as
to_date('-44-02-01','YYYY-MM-DD')
would result in '0045-02-01 BC', as the code attempted to interpret
the negative year as BC, but failed to apply the correction needed
for our internal handling of BC years. Fix the off-by-one problem.
Also, arrange for the combination of a negative year and an
explicit "BC" marker to cancel out and produce AD. This is how
the negative-century case works, so it seems sane to do likewise.
Continue to read "year 0000" as 1 BC. Oracle would throw an error,
but we've accepted that case for a long time so I'm hesitant to
change it in a back-patch.
Per bug #16419 from Saeed Hubaishan. Back-patch to all supported
branches.
Dar Alathar-Yemen and Tom Lane
Discussion: https://postgr.es/m/16419-d8d9db0a7553f01b@postgresql.org
Explicitly mention that primary key constraints are also included in the
limitation that the constraint columns must be a superset of the partition key
columns.
Wording suggestion from Tom Lane.
Discussion: https://postgr.es/m/64062533.78364.1601415362244@mail.yahoo.com
Backpatch-through: 11, where unique constraints on partitioned tables were added
Previously the standby server didn't archive timeline history files
streamed from the primary even when archive_mode is set to "always",
while it archives the streamed WAL files. This could cause the PITR to
fail because there was no required timeline history file in the archive.
The cause of this issue was that walreceiver didn't mark those files as
ready for archiving.
This commit makes walreceiver mark those streamed timeline history
files as ready for archiving if archive_mode=always. Then the archiver
process archives the marked timeline history files.
Back-patch to all supported versions.
Reported-by: Grigory Smolkin
Author: Grigory Smolkin, Fujii Masao
Reviewed-by: David Zhang, Anastasia Lubennikova
Discussion: https://postgr.es/m/54b059d4-2b48-13a4-6f43-95a087c92367@postgrespro.ru
We have had multiple reports that point to the
'@colReorder=latn-digit' collation customization being buggy. We have
reported this to ICU and are waiting for a fix. In the meantime,
remove references to this from the documentation and replace it by
another reordering example. Apparently, many users have been picking
up this example specifically from the documentation.
Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://www.postgresql.org/message-id/flat/153201618542.1404.3611626898935613264%40wrigleys.postgresql.org
Commit 15cb2bd27 neglected to make the running text match the
tables, leaving the reader with the strong impression that
we cannot count. Also, don't drop an unrelated para between
a table and the para describing it.
The previous version of the docs mentioned that files are rewritten,
implying that a second copy of each file gets created, but each file is
updated in-place.
Author: Michael Banck
Reviewed-by: Daniel Gustafsson, Michael Paquier
Discussion: https://postgr.es/m/858086b6a42fb7d17995b6175856f7e7ec44d0a2.camel@credativ.de
Backpatch-through: 12
The majority of our audience is probably using a pre-packaged Postgres
build rather than raw sources. For them, much of runtime.sgml is not
too relevant, and they should be reading the packager's docs instead.
Add some notes pointing that way in appropriate places.
Text by me; thanks to Daniel Gustafsson for review and discussion,
and to Laurenz Albe for an earlier version.
Discussion: https://postgr.es/m/159430831443.16535.11360317280100947016@wrigleys.postgresql.org
We were already raising an error for DROP INDEX CONCURRENTLY on a
partitioned table, albeit a different and confusing one:
ERROR: DROP INDEX CONCURRENTLY must be first action in transaction
Change that to throw a more comprehensible error:
ERROR: cannot drop partitioned index \"%s\" concurrently
Michael Paquier authored the test case for indexes on temporary
partitioned tables.
Backpatch to 11, where indexes on partitioned tables were added.
Reported-by: Jan Mussler <jan.mussler@zalando.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/16594-d2956ca909585067@postgresql.org
Per discussion, we're planning to remove parser support for postfix
operators in order to simplify the grammar. So it behooves us to
put out a deprecation notice at least one release before that.
There is only one built-in postfix operator, ! for factorial.
Label it deprecated in the docs and in pg_description, and adjust
some examples that formerly relied on it. (The sister prefix
operator !! is also deprecated. We don't really have to remove
that one, but since we're suggesting that people use factorial()
instead, it seems better to remove both operators.)
Also state in the CREATE OPERATOR ref page that postfix operators
in general are going away.
Although this changes the initial contents of pg_description,
I did not force a catversion bump; it doesn't seem essential.
In v13, also back-patch 4c5cf5431, so that there's someplace for
the <link>s to point to.
Mark Dilger and John Naylor, with some adjustments by me
Discussion: https://postgr.es/m/BE2DF53D-251A-4E26-972F-930E523580E9@enterprisedb.com
Commit ce77abe63c allowed EXPLAIN (BUFFERS) to report the information
on buffer usage during planning phase. However three issues were
reported regarding this feature.
(1) Previously, EXPLAIN option BUFFERS required ANALYZE. So the query
had to be actually executed by specifying ANALYZE even when we
want to see only the planner's buffer usage. This was inconvenient
especially when the query was write one like DELETE.
(2) EXPLAIN included the planner's buffer usage in summary
information. So SUMMARY option had to be enabled to report that.
Also this format was confusing.
(3) The output structure for planning information was not consistent
between TEXT format and the others. For example, "Planning" tag
was output in JSON format, but not in TEXT format.
For (1), this commit allows us to perform EXPLAIN (BUFFERS) without
ANALYZE to report the planner's buffer usage.
For (2), this commit changed EXPLAIN output so that the planner's
buffer usage is reported before summary information.
For (3), this commit made the output structure for planning
information more consistent between the formats.
Back-patch to v13 where the planner's buffer usage was allowed to
be reported in EXPLAIN.
Reported-by: Pierre Giraud, David Rowley
Author: Fujii Masao
Reviewed-by: David Rowley, Julien Rouhaud, Pierre Giraud
Discussion: https://postgr.es/m/07b226e6-fa49-687f-b110-b7c37572f69e@dalibo.com
When the leftover invalid index is "ccold", there's no need to re-run
the command. Reword the instructions to make that explicit.
Backpatch to 12, where REINDEX CONCURRENTLY appeared.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://postgr.es/m/20200819211312.GA15497@alvherre.pgsql
Put the -r option in the right section (it certainly isn't an
option controlling "the location and format of the output").
Clarify the behavior of the tablespace and waldir options
(that part per gripe from robert@interactive.co.uk).
Make a large number of small copy-editing fixes in text that
visibly wasn't written by native speakers, and try to avoid
grammatical inconsistencies between the descriptions of
the different options.
Back-patch to v13, since HEAD hasn't meaningfully diverged yet.
Discussion: https://postgr.es/m/159749418850.14322.216503677134569752@wrigleys.postgresql.org
The SimpleLruTruncate() header comment states the new coding rule. To
achieve this, add locktype "frozenid" and two LWLocks. This closes a
rare opportunity for data loss, which manifested as "apparent
wraparound" or "could not access status of transaction" errors. Data
loss is more likely in pg_multixact, due to released branches' thin
margin between multiStopLimit and multiWrapLimit. If a user's physical
replication primary logged ": apparent wraparound" messages, the user
should rebuild standbys of that primary regardless of symptoms. At less
risk is a cluster having emitted "not accepting commands" errors or
"must be vacuumed" warnings at some point. One can test a cluster for
this data loss by running VACUUM FREEZE in every database. Back-patch
to 9.5 (all supported versions).
Discussion: https://postgr.es/m/20190218073103.GA1434723@rfd.leadboat.com
Up to now, upon receipt of a SIGTERM ("smart shutdown" command), the
postmaster has immediately killed all "optional" background processes,
and subsequently refused to launch new ones while it's waiting for
foreground client processes to exit. No doubt this seemed like an OK
policy at some point; but it's a pretty bad one now, because it makes
for a seriously degraded environment for the remaining clients:
* Parallel queries are killed, and new ones fail to launch. (And our
parallel-query infrastructure utterly fails to deal with the case
in a reasonable way --- it just hangs waiting for workers that are
not going to arrive. There is more work needed in that area IMO.)
* Autovacuum ceases to function. We can tolerate that for awhile,
but if bulk-update queries continue to run in the surviving client
sessions, there's eventually going to be a mess. In the worst case
the system could reach a forced shutdown to prevent XID wraparound.
* The bgwriter and walwriter are also stopped immediately, likely
resulting in performance degradation.
Hence, let's rearrange things so that the only immediate change in
behavior is refusing to let in new normal connections. Once the last
normal connection is gone, shut everything down as though we'd received
a "fast" shutdown. To implement this, remove the PM_WAIT_BACKUP and
PM_WAIT_READONLY states, instead staying in PM_RUN or PM_HOT_STANDBY
while normal connections remain. A subsidiary state variable tracks
whether or not we're letting in new connections in those states.
This also allows having just one copy of the logic for killing child
processes in smart and fast shutdown modes. I moved that logic into
PostmasterStateMachine() by inventing a new state PM_STOP_BACKENDS.
Back-patch to 9.6 where parallel query was added. In principle
this'd be a good idea in 9.5 as well, but the risk/reward ratio
is not as good there, since lack of autovacuum is not a problem
during typical uses of smart shutdown.
Per report from Bharath Rupireddy.
Patch by me, reviewed by Thomas Munro
Discussion: https://postgr.es/m/CALj2ACXAZ5vKxT9P7P89D87i3MDO9bfS+_bjMHgnWJs8uwUOOw@mail.gmail.com
Make these examples self-contained by providing declarations of the
user-defined row types they rely on. There wasn't room to do this
in the old doc format, but now there is, and I think it makes the
examples a good bit less confusing.
Hostile objects located within the installation-time search_path could
capture references in an extension's installation or upgrade script.
If the extension is being installed with superuser privileges, this
opens the door to privilege escalation. While such hazards have existed
all along, their urgency increases with the v13 "trusted extensions"
feature, because that lets a non-superuser control the installation path
for a superuser-privileged script. Therefore, make a number of changes
to make such situations more secure:
* Tweak the construction of the installation-time search_path to ensure
that references to objects in pg_catalog can't be subverted; and
explicitly add pg_temp to the end of the path to prevent attacks using
temporary objects.
* Disable check_function_bodies within installation/upgrade scripts,
so that any security gaps in SQL-language or PL-language function bodies
cannot create a risk of unwanted installation-time code execution.
* Adjust lookup of type input/receive functions and join estimator
functions to complain if there are multiple candidate functions. This
prevents capture of references to functions whose signature is not the
first one checked; and it's arguably more user-friendly anyway.
* Modify various contrib upgrade scripts to ensure that catalog
modification queries are executed with secure search paths. (These
are in-place modifications with no extension version changes, since
it is the update process itself that is at issue, not the end result.)
Extensions that depend on other extensions cannot be made fully secure
by these methods alone; therefore, revert the "trusted" marking that
commit eb67623c9 applied to earthdistance and hstore_plperl, pending
some better solution to that set of issues.
Also add documentation around these issues, to help extension authors
write secure installation scripts.
Patch by me, following an observation by Andres Freund; thanks
to Noah Misch for review.
Security: CVE-2020-14350
This ought to work much like C's "#elif defined(name)"; but the code
implemented it in a way equivalent to endif followed by ifdef, so that
it didn't matter whether any previous branch of the IF construct had
succeeded. Fix that; add some test cases covering elif and nested IFs;
and improve the documentation, which also seemed a bit confused.
AFAICS the code has been like this since the feature was added in 1999
(commit b57b0e044). So while it's surely wrong, there might be code
out there relying on the current behavior. Hence, don't back-patch
into stable branches. It seems all right to fix it in v13 though.
Per report from Ashutosh Sharma. Reviewed by Ashutosh Sharma and
Michael Meskes.
Discussion: https://postgr.es/m/CAE9k0P=dQk9X0cU2tN49S7a9tv733-e1pVdpB1P-pWJ5PdTktg@mail.gmail.com
The type-name pattern in \dAc and \dAf was matched only to the actual
pg_type.typname string, which is fairly user-unfriendly in cases where
that is not what's shown to the user by format_type (compare "_int4"
and "integer[]"). Make this code match what \dT does, i.e. match the
pattern against either typname or format_type() output. Also fix its
broken handling of schema-name restrictions. (IOW, make these
processSQLNamePattern calls match \dT's.) While here, adjust
whitespace to make the query a little prettier in -E output, too.
Also improve some inaccuracies and shaky grammar in the related
documentation.
Noted while working on a patch for intarray's opclasses; I wondered
why I couldn't get a match to "integer*" for the input type name.
In "High Availability, Load Balancing, and Replication" chapter,
certain descriptions of Pgpool-II were not correct at this point. It
does not need conflict resolution. Also "Multiple-Server Parallel
Query Execution" is not supported anymore.
Discussion: https://postgr.es/m/20200726.230128.53842489850344110.t-ishii%40sraoss.co.jp
Author: Tatsuo Ishii
Reviewed-by: Bruce Momjian
Backpatch-through: 9.5
Partitioned indexes are also registered in pg_inherits, but the
description of this catalog did not reflect that.
Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/87k0ynj35y.fsf@wibble.ilmari.org
Backpatch-through: 11
Add a GUC that acts as a multiplier on work_mem. It gets applied when
sizing executor node hash tables that were previously size constrained
using work_mem alone.
The new GUC can be used to preferentially give hash-based nodes more
memory than the generic work_mem limit. It is intended to enable admin
tuning of the executor's memory usage. Overall system throughput and
system responsiveness can be improved by giving hash-based executor
nodes more memory (especially over sort-based alternatives, which are
often much less sensitive to being memory constrained).
The default value for hash_mem_multiplier is 1.0, which is also the
minimum valid value. This means that hash-based nodes continue to apply
work_mem in the traditional way by default.
hash_mem_multiplier is generally useful. However, it is being added now
due to concerns about hash aggregate performance stability for users
that upgrade to Postgres 13 (which added disk-based hash aggregation in
commit 1f39bce0). While the old hash aggregate behavior risked
out-of-memory errors, it is nevertheless likely that many users actually
benefited. Hash agg's previous indifference to work_mem during query
execution was not just faster; it also accidentally made aggregation
resilient to grouping estimate problems (at least in cases where this
didn't create destabilizing memory pressure).
hash_mem_multiplier can provide a certain kind of continuity with the
behavior of Postgres 12 hash aggregates in cases where the planner
incorrectly estimates that all groups (plus related allocations) will
fit in work_mem/hash_mem. This seems necessary because hash-based
aggregation is usually much slower when only a small fraction of all
groups can fit. Even when it isn't possible to totally avoid hash
aggregates that spill, giving hash aggregation more memory will reliably
improve performance (the same cannot be said for external sort
operations, which appear to be almost unaffected by memory availability
provided it's at least possible to get a single merge pass).
The PostgreSQL 13 release notes should advise users that increasing
hash_mem_multiplier can help with performance regressions associated
with hash aggregation. That can be taken care of by a later commit.
Author: Peter Geoghegan
Reviewed-By: Álvaro Herrera, Jeff Davis
Discussion: https://postgr.es/m/20200625203629.7m6yvut7eqblgmfo@alap3.anarazel.de
Discussion: https://postgr.es/m/CAH2-WzmD%2Bi1pG6rc1%2BCjc4V6EaFJ_qSuKCCHVnH%3DoruqD-zqow%40mail.gmail.com
Backpatch: 13-, where disk-based hash aggregation was introduced.
The planner is in fact willing to use hash aggregation when work_mem is
not set high enough for everything to fit in memory. This has been the
case since commit 1f39bce0, which added disk-based hash aggregation.
There are a few remaining cases in which hash aggregation is avoided as
a matter of policy when the planner surmises that spilling will be
necessary. For example, callers of choose_hashed_setop() still
conservatively avoid hash aggregation when spilling is anticipated.
That doesn't seem like a good enough reason to mention hash aggregation
in this context.
Backpatch: 13-, where disk-based hash aggregation was introduced.
Note: This GUC was originally named enable_hashagg_disk when it appeared
in commit 1f39bce0, which added disk-based hash aggregation. It was
subsequently renamed in commit 92c58fd9.
Author: Peter Geoghegan
Reviewed-By: Jeff Davis, Álvaro Herrera
Discussion: https://postgr.es/m/9d9d1e1252a52ea1bad84ea40dbebfd54e672a0f.camel%40j-davis.com
Backpatch: 13-, where disk-based hash aggregation was introduced.
The initial implementation of leader_pid in pg_stat_activity added by
b025f32 took the approach to strictly print what a PGPROC entry
includes. In short, if a backend has been involved in parallel query at
least once, leader_pid would remain set as long as the backend is alive.
For a parallel group leader, this means that the field would always be
set after it participated at least once in parallel query, and after
more discussions this could be confusing if using for example a
connection pooler.
This commit changes the data printed so as leader_pid becomes always
NULL for a parallel group leader, showing up a non-NULL value only for
the parallel workers, and actually as long as a parallel query is
running as workers are shut down once the query has completed.
This does not change the definition of any catalog, so no catalog bump
is needed. Per discussion with Justin Pryzby, Álvaro Herrera, Julien
Rouhaud and me.
Discussion: https://postgr.es/m/20200721035145.GB17300@paquier.xyz
Backpatch-through: 13
TLS 1.3 uses a different way of specifying ciphers and a different
OpenSSL API. PostgreSQL currently does not support setting those
ciphers. For now, just document this. In the future, support for
this might be added somehow.
Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
The executor checks for this error, and so does the bootstrap catalog
loader, but we never checked for it in retail catalog manipulations.
The folly of that has now been exposed, so let's add assertions
checking it. Checking in CatalogTupleInsert[WithInfo] and
CatalogTupleUpdate[WithInfo] should be enough to cover this.
Back-patch to v10; the aforesaid functions didn't exist before that,
and it didn't seem worth adapting the patch to the oldest branches.
But given the risk of JIT crashes, I think we certainly need this
as far back as v11.
Pre-v13, we have to explicitly exclude pg_subscription.subslotname
and pg_subscription_rel.srsublsn from the checks, since they are
mismarked. (Even if we change our mind about applying BKI_FORCE_NULL
in the branch tips, it doesn't seem wise to have assertions that
would fire in existing databases.)
Discussion: https://postgr.es/m/298837.1595196283@sss.pgh.pa.us
The code has always set this column to NULL when it's not valid,
but the catalog header's description failed to reflect that,
as did the SGML docs, as did some of the code. To prevent future
coding errors of the same ilk, let's hide the field from C code
as though it were variable-length (which, in a sense, it is).
As with commit 72eab84a5, we can only fix this cleanly in HEAD
and v13; the problem extends further back but we'll need some
klugery in the released branches.
Discussion: https://postgr.es/m/367660.1595202498@sss.pgh.pa.us
max_slot_wal_keep_size that was added in v13 and wal_keep_segments are
the GUC parameters to specify how much WAL files to retain for
the standby servers. While max_slot_wal_keep_size accepts the number of
bytes of WAL files, wal_keep_segments accepts the number of WAL files.
This difference of setting units between those similar parameters could
be confusing to users.
To alleviate this situation, this commit renames wal_keep_segments to
wal_keep_size, and make users specify the WAL size in it instead of
the number of WAL files.
There was also the idea to rename max_slot_wal_keep_size to
max_slot_wal_keep_segments, in the discussion. But we have been moving
away from measuring in segments, for example, checkpoint_segments was
replaced by max_wal_size. So we concluded to rename wal_keep_segments
to wal_keep_size.
Back-patch to v13 where max_slot_wal_keep_size was added.
Author: Fujii Masao
Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/574b4ea3-e0f9-b175-ead2-ebea7faea855@oss.nttdata.com
Due to the layout of this catalog, subslotname has to be explicitly
marked BKI_FORCE_NULL, else initdb will default to the assumption
that it's non-nullable. Since, in fact, CREATE/ALTER SUBSCRIPTION
will store null values there, the existing marking is just wrong,
and has been since this catalog was invented.
We haven't noticed because not much in the system actually depends
on attnotnull being truthful. However, JIT'ed tuple deconstruction
does depend on that in some cases, allowing crashes or wrong answers
in queries that inspect pg_subscription. Commit 9de77b545 quite
accidentally exposed this on the buildfarm members that force JIT
activation.
Back-patch to v13. The problem goes further back, but we cannot
force initdb in released branches, so some klugier solution will
be needed there. Before working on that, push this simple fix
to try to get the buildfarm back to green.
Discussion: https://postgr.es/m/4118109.1595096139@sss.pgh.pa.us
pg_dump produces custom-format archive files that lack data offsets
when it is unable to seek its output. Up to now that's been a hazard
for pg_restore. But if pg_restore is able to seek in the archive
file, there is no reason to throw up our hands when asked to restore
data blocks out of order. Instead, whenever we are searching for a
data block, record the locations of the blocks we passed over (that
is, fill in the missing data-offset fields in our in-memory copy of
the TOC data). Then, when we hit a case that requires going
backwards, we can just seek back.
Also track the furthest point that we've searched to, and seek back
to there when beginning a search for a new data block. This avoids
possible O(N^2) time consumption, by ensuring that each data block
is examined at most twice. (On Unix systems, that's at most twice
per parallel-restore job; but since Windows uses threads here, the
threads can share block location knowledge, reducing the amount of
duplicated work.)
We can also improve the code a bit by using fseeko() to skip over
data blocks during the search.
This is all of some use even in simple restores, but it's really
significant for parallel pg_restore. In that case, we require
seekability of the input already, and we will very probably need
to do out-of-order restores.
Back-patch to v12, as this fixes a regression introduced by commit
548e50976. Before that, parallel restore avoided requesting
out-of-order restores, so it would work on a data-offset-less
archive. Now it will again.
Ideally this patch would include some test coverage, but there are
other open bugs that need to be fixed before we can extend our
coverage of parallel restore very much. Plan to revisit that later.
David Gilman and Tom Lane; reviewed by Justin Pryzby
Discussion: https://postgr.es/m/CALBH9DDuJ+scZc4MEvw5uO-=vRyR2=QF9+Yh=3hPEnKHWfS81A@mail.gmail.com
The stats with this commit was available only for WALSenders, however,
users might want to see for backends doing logical decoding via SQL API.
Then, users might want to reset and access these stats across server
restart which was not possible with the current patch.
List of commits reverted:
caa3c4242c Don't call elog() while holding spinlock.
e641b2a995 Doc: Update the documentation for spilled transaction
statistics.
5883f5fe27 Fix unportable printf format introduced in commit 9290ad198.
9290ad198b Track statistics for spilling of changes from ReorderBuffer.
Additionaly, remove the release notes entry for this feature.
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com
Re-point comp.ai.genetic FAQ link to a more stable address.
Remove stale links to AIX documentation; we don't really need to
tell AIX users how to use their systems.
Remove stale links to HP documentation about SSL. We've had to
update those twice before, making it increasingly obvious that
HP does not intend them to be stable landing points. They're
not particularly authoritative, either. (This change effectively
reverts bbd3bdba3.)
Daniel Gustafsson and Álvaro Herrera, per a gripe from
Kyotaro Horiguchi. Back-patch, since these links are
just as dead in the back branches.
Discussion: https://postgr.es/m/20200709.161226.204639179120026914.horikyota.ntt@gmail.com
pg_stat_activity.query text is truncated at 1024 bytes. But previously
the document described that it's truncated at 1024 characters.
This was not accurate when considering multibyte characters.
Back-patch to v10 where this inaccurate description was added.
Author: Atsushi Torikoshi
Reviewed-by: Daniel Gustafsson, Fujii Masao
Discussion: https://postgr.es/m/cd5b49a5a14e887542f5f569c1c6bde2@oss.nttdata.com
The previous definition of the column was almost universally disliked,
so provide this updated definition which is more useful for monitoring
purposes: a large positive value is good, while zero or a negative value
means danger. This should be operationally more convenient.
Backpatch to 13, where the new column to pg_replication_slots (and the
feature it represents) were added.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/9ddfbf8c-2f67-904d-44ed-cf8bc5916228@oss.nttdata.com
Enabling pg_stat_statements.track_plaanning may incur a noticeable
performance penalty, especially when a fewer kinds of queries are executed
on many concurrent connections. This commit documents this note.
Back-patch to v13 where pg_stat_statements.track_plaanning was added.
Suggested-by: Pavel Stehule
Author: Fujii Masao
Reviewed-by: Pavel Stehule
Discussion: https://postgr.es/m/CAFj8pRC9Jxa8r5i0TNBWLb8mzuaYzEoLq3QOvip0jVpHPOLbVA@mail.gmail.com
After running GetForeignRelSize for a foreign table, adjust rel->tuples
to be at least as large as rel->rows. This prevents bizarre behavior
in estimate_num_groups() and perhaps other places, especially in the
scenario where rel->tuples is zero because pg_class.reltuples is
(suggesting that ANALYZE has never been run for the table). As things
stood, we'd end up estimating one group out of any GROUP BY on such a
table, whereas the default group-count estimate is more likely to result
in a sane plan.
Also, clarify in the documentation that GetForeignRelSize has the option
to override the rel->tuples value if it has a better idea of what to use
than what is in pg_class.reltuples.
Per report from Jeff Janes. Back-patch to all supported branches.
Patch by me; thanks to Etsuro Fujita for review
Discussion: https://postgr.es/m/CAMkU=1xNo9cnan+Npxgz0eK7394xmjmKg-QEm8wYG9P5-CcaqQ@mail.gmail.com
Previously the document explained that restart_lsn indicates the LSN of
oldest WAL won't be automatically removed during checkpoints. But
since v13 this was no longer true thanks to max_slot_wal_keep_size.
Back-patch to v13 where max_slot_wal_keep_size was added.
Author: Fujii Masao
Discussion: https://postgr.es/m/6497f1e9-3148-c5da-7e49-b2fddad9a42f@oss.nttdata.com
Since v13 pg_stat_statements is allowed to track the planning time of
statements when track_planning option is enabled. Its default was on.
But this feature could cause more terrible spinlock contentions in
pg_stat_statements. As a result of this, Robins Tharakan reported that
v13 beta1 showed ~45% performance drop at high DB connection counts
(when compared with v12.3) during fully-cached SELECT-only test using
pgbench.
To avoid this performance regression by the default setting,
this commit changes default of pg_stat_statements.track_planning to off.
Back-patch to v13 where pg_stat_statements.track_planning was introduced.
Reported-by: Robins Tharakan
Author: Fujii Masao
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/2895b53b033c47ccb22972b589050dd9@EX13D05UWC001.ant.amazon.com
The IANA tzcode library has a feature to read a time zone file named
"posixrules" and apply the daylight-savings transition dates and times
therein, when it is given a POSIX-style time zone specification that
lacks an explicit transition rule. However, there's a problem with
that code: it doesn't work for dates past the Y2038 time_t rollover.
(Effectively, all times beyond that point are treated as standard
time.) The IANA crew regard this feature as legacy, so their plan is
to remove it not fix it. The time frame in which that will happen
is unclear, but presumably it'll happen well before 2038.
Moreover, effective with the next IANA data update (probably this
fall), the recommended default will be to not install a "posixrules"
file in the first place. The time frame in which tzdata packagers
might adopt that suggestion is likewise unclear, but at least some
platforms will probably do it in the next year or so. While we could
ignore that recommendation so far as PG-supplied tzdata trees are
concerned, builds using --with-system-tzdata will be subject to
whatever the platform's tzdata packager decides to do.
Thus, whether or not we do anything, some increasing fraction of
Postgres users will be exposed to the behavior observed when there
is no "posixrules" file; and if we do nothing, we'll have essentially
no control over the timing of that change.
The best thing to do to ameliorate the uncertainty seems to be to
proactively remove the posixrules-reading feature. If we do that in
a scheduled release then at least we can release-note the behavioral
change, rather than having users be surprised by it after a routine
tzdata update.
The change in question is fairly minor anyway: to be affected,
you have to be using a POSIX-style timezone spec, it has to not
have an explicit rule, and it has to not be one of the four traditional
continental-USA zone names (EST5EDT, CST6CDT, MST7MDT, or PST8PDT),
as those are special-cased. Since the default "posixrules" file
provides USA DST rules, the number of people who are likely to find
such a zone spec useful is probably quite small. Moreover, the
fallback behavior with no explicit rule and no "posixrules" file is to
apply current USA rules, so the only thing that really breaks is the
DST transitions in years before 2007 (and you get the countervailing
fix that transitions after 2038 will be applied).
Now, some installations might have replaced the "posixrules" file,
allowing e.g. EU rules to be applied to a POSIX-style timezone spec.
That won't work anymore. But it's not exactly clear why this solution
would be preferable to using a regular named zone. In any case, given
the Y2038 issue, we need to be pushing users to stop depending on this.
Back-patch into v13; it hasn't been released yet, so it seems OK to
change its behavior. (Personally I think we ought to back-patch
further, but I've been outvoted.)
Discussion: https://postgr.es/m/1390.1562258309@sss.pgh.pa.us
Discussion: https://postgr.es/m/20200621211855.6211-1-eggert@cs.ucla.edu
Warnings start 10M transactions before xidStopLimit, which is 11M
transactions before wraparound. The sample WARNING output showed a
value greater than 11M, and its HINT message predated commit
25ec228ef7. Hence, the sample was
impossible. Back-patch to 9.5 (all supported versions).
When we initially created this parameter, in commit ff8ca5fad, we left
the default as "allow any protocol version" on grounds of backwards
compatibility. However, that's inconsistent with the backend's default
since b1abfec82; protocol versions prior to 1.2 are not considered very
secure; and OpenSSL has had TLSv1.2 support since 2012, so the number
of PG servers that need a lesser minimum is probably quite small.
On top of those things, it emerges that some popular distros (including
Debian and RHEL) set MinProtocol=TLSv1.2 in openssl.cnf. Thus, far
from having "allow any protocol version" behavior in practice, what
we actually have as things stand is a platform-dependent lower limit.
So, change our minds and set the min version to TLSv1.2. Anybody
wanting to connect with a new libpq to a pre-2012 server can either
set ssl_min_protocol_version=TLSv1 or accept the fallback to non-SSL.
Back-patch to v13 where the aforementioned patches appeared.
Patch by me, reviewed by Daniel Gustafsson
Discussion: https://postgr.es/m/a9408304-4381-a5af-d259-e55d349ae4ce@2ndquadrant.com
In pg_replication_slot, change output from normal/reserved/lost to
reserved/extended/unreserved/ lost, which better expresses the possible
states particularly near the time where segments are no longer safe but
checkpoint has not run yet.
Under the new definition, reserved means the slot is consuming WAL
that's still under the normal WAL size constraints; extended means it's
consuming WAL that's being protected by wal_keep_segments or the slot
itself, whose size is below max_slot_wal_keep_size; unreserved means the
WAL is no longer safe, but checkpoint has not yet removed those files.
Such as slot is in imminent danger, but can still continue for a little
while and may catch up to the reserved WAL space.
Also, there were some bugs in the calculations used to report the
status; fixed those.
Backpatch to 13.
Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200616.120236.1809496990963386593.horikyota.ntt@gmail.com
Defining duplicates as "close by" to each other was unclear. Simplify
the definition.
Backpatch: 13-, where deduplication was introduced (by commit 0d861bbb)
911e702077 added opclass options and adjusted documentation for each
particular affected opclass. However, documentation for extendability was
not adjusted. This commit adjusts documentation for interfaces of index AMs
and opclasses.
Discussion: https://postgr.es/m/CAH2-WzmQnW6%2Bz5F9AW%2BSz%2BzEcEvXofTwh_A9J3%3D_WA-FBP0wYg%40mail.gmail.com
Author: Alexander Korotkov
Reported-by: Peter Geoghegan
Reviewed-by: Peter Geoghegan
The IANA time zone folk have deprecated use of a "posixrules" file in
the tz database. While for now it's our choice whether to keep
supplying one in our own builds, installations built with
--with-system-tzdata will soon be needing to cope with that file not
being present, at least on some platforms.
This causes a problem for the horology test, which expected the
nonstandard POSIX zone spec "CST7CDT" to apply pre-2007 US daylight
savings rules. That does happen if the posixrules file supplies such
information, but otherwise the test produces undesired results.
To fix, add an explicit transition date rule that matches 2005 practice.
(We could alternatively have switched the test to use some real time
zone, but it seems useful to have coverage of this type of zone spec.)
While at it, update a documentation example that also relied on
"CST7CDT"; use a real-world zone name instead. Also, document why
the zone names EST5EDT, CST6CDT, MST7MDT, PST8PDT aren't subject to
similar failures when "posixrules" is missing.
Back-patch to all supported branches, since the hazard is the same
for all.
Discussion: https://postgr.es/m/1665379.1592581287@sss.pgh.pa.us
Mostly in response to Jürgen Purtz critique of previous definitions,
though I added many other changes.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Jürgen Purtz <juergen@purtz.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Erik Rijkers <er@xs4all.nl>
Discussion: https://postgr.es/m/c1e06008-2132-30f4-9b38-877e8683d418@purtz.de
We'd glossed over most of this complexity for years, but it's hard
to avoid writing it all down now, so that we can explain what happens
when there's no "posixrules" file in the IANA time zone database.
That was at best a tiny minority situation till now, but it's likely
to become quite common in the future, so we'd better explain it.
Nonetheless, we don't really encourage people to use POSIX zone specs;
picking a named zone is almost always what you really want, unless
perhaps you're stuck with an out-of-date zone database. Therefore,
let's shove all this detail into an appendix.
Patch by me; thanks to Robert Haas for help with some awkward wording.
Discussion: https://postgr.es/m/1390.1562258309@sss.pgh.pa.us
Our documentation failed to point out that REPEATABLE READ is really
snapshot isolation, which might be important to some users. Point to
the standard reference paper for this complicated topic.
Likewise, add a reference to the VLDB paper about PostgreSQL SSI, for
technical information about our SSI implementation and how it compares
to S2PL.
While here, add a note about catalog access using a lower isolation
level, per recent user complaint.
Back-patch to all releases.
Reported-by: Kyle Kingsbury <aphyr@jepsen.io>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Tatsuo Ishii <ishii@sraoss.co.jp>
Discussion: https://postgr.es/m/db7b729d-0226-d162-a126-8a8ab2dc4443%40jepsen.io
Discussion: https://postgr.es/m/16454-9408996bb1750faf%40postgresql.org
Eliminate enable_groupingsets_hash_disk, which was primarily useful
for testing grouping sets that use HashAgg and spill. Instead, hack
the table stats to convince the planner to choose hashed aggregation
for grouping sets that will spill to disk. Suggested by Melanie
Plageman.
Rename enable_hashagg_disk to hashagg_avoid_disk_plan, and invert the
meaning of on/off. The new name indicates more strongly that it only
affects the planner. Also, the word "avoid" is less definite, which
should avoid surprises when HashAgg still needs to use the
disk. Change suggested by Justin Pryzby, though I chose a different
GUC name.
Discussion: https://postgr.es/m/CAAKRu_aisiENMsPM2gC4oUY1hHG3yrCwY-fXUg22C6_MJUwQdA%40mail.gmail.com
Discussion: https://postgr.es/m/20200610021544.GA14879@telsasoft.com
Backpatch-through: 13
In PostgreSQL 10, we stopped using System V semaphores on Linux
systems. Update the example we give of an error message from a
misconfigured system to show what people are most likely to see these
days.
Back-patch to 10, where PREFERRED_SEMAPHORES=UNNAMED_POSIX arrived.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGLmJUSwybaPQv39rB8ABpqJq84im2UjZvyUY4feYhpWMw%40mail.gmail.com
Whitespace between tags is significant, and in some cases it creates
extra vertical space in man pages. The fix is either to remove some
newlines or in some cases to reword slightly to avoid the awkward
markup layout.