Add:
- "Restartpoint"
- "Log sequence number"
"LSN" was already listed in the Acronyms appendix, but it is more
suitable as a glossary entry, so move it there and have the acronyms
entry link into the glossary.
Also turn on DocBook parameter glossentry.show.acronym to show
acronyms for glossary entries, which is being used here.
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/60915312-62cd-9c94-0d94-556023ece45f%40enterprisedb.com
plperl, plpython, and pltcl all provide query-execution functions
that are thin wrappers around SPI_execute() or its variants.
The SPI functions document their row-count limit arguments clearly,
as "maximum number of rows to return, or 0 for no limit". However
the PLs' documentation failed to explain this special behavior of
zero, so that a reader might well assume it means "fetch zero
rows". Improve that.
Daniel Gustafsson and Tom Lane, per report from Kieran McCusker
Discussion: https://postgr.es/m/CAGgUQ6H6qYScctOhktQ9HLFDDoafBKHyUgJbZ6q_dOApnzNTXg@mail.gmail.com
vacuum_defer_cleanup_age was introduced before hot_standby_feedback and
replication slots existed. It is hard to use reasonably - commonly it will
either be set too low (not preventing recovery conflicts, while still causing
some bloat), or too high (causing a lot of bloat). The alternatives do not
have that issue.
That on its own might not be sufficient reason to remove
vacuum_defer_cleanup_age, but it also complicates computation of xid
horizons. See e.g. the bug fixed in be504a3e97. It also is untested.
This commit removes TransactionIdRetreatSafely(), as there are no users
anymore. There might be potential future users, hence noting that here.
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230317230930.nhsgk3qfk7f4axls@awork3.anarazel.de
Unaligned siglen could lead to an unaligned access to subsequent key fields.
Backpatch to 13, where opclass options were introduced.
Reported-by: Alexander Lakhin
Bug: 17847
Discussion: https://postgr.es/m/17847-171232970bea406b%40postgresql.org
Reviewed-by: Tom Lane, Pavel Borisov, Alexander Lakhin
Backpatch-through: 13
a9c70b46 added the statistics view pg_stat_io which contained columns
"io_context" and "io_object". Given that the columns are in the
pg_stat_io view, the "io" prefix is somewhat redundant, so remove it.
The code variables referring to these fields are kept unchanged so as
they can keep their context about I/O.
Bump catalog version.
Author: Melanie Plageman
Reviewed-by: Kyotaro Horiguchi, Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CAAKRu_aAQoJWrvT2BYYQvJChFKra_O-5ra3jhzKJZqWsTR1CPQ@mail.gmail.com
indexes-unique.html mentioned nothing about the availability of NULLS NOT
DISTINCT to modify the NULLs-are-not-equal behavior of unique indexes.
Add this to the synopsis and clarify what it does regarding NULLs.
Author: David Gilman, David Rowley
Reviewed-by: Corey Huinker
Discussion: https://postgr.es/m/CALBH9DDr3NLqzWop1z5uZE-M5G_GYUuAeHFHQeyzFbNd8W0d=Q@mail.gmail.com
Backpatch-through: 15, where NULLS NOT DISTINCT was added
This issue can be reproduced by running `make dist` from the root of the
tree. Error introduced in fcb21b3, where additions of links in
installation.sgml require custom rules in standalone-profile.xsl to make
sure that ./INSTALL is generated correctly for the distribution tarball,
where links are replaced by equivalent terms from the profile file
changed by this commit.
Per buildfarm member guaibasaurus.
Discussion: https://postgr.es/m/ZD859FmcMRCNtz0W@paquier.xyz
We have two existing conventions for long options: either alphabetical
among short options, or all long options after all the short options.
But the convention apparently used here, next to a functionally
related option, is not one of them.
Here we remove the notes which mention which version the given vacuumdb
option is available from. There are now 11 of these notes and they're
both quite untidy and take up far more space than they seem to be worth.
On running a print preview of the compiled HTML, removing these notes
saves about 1 A4 page (~20% less space).
If people need to see which options are available on older versions, then
consulting the documents for that version seems like a good idea. In any
case, when using newer vacuumdb versions on older servers, the user will
receive an error if they try to use an unsupported option.
Additionally, 3 of the notes are warning about the option only being
available from PostgreSQL 9.6 and later. That version's support ended 2.5
years ago. So, it's quite clear that the value of these notes diminishes
over time.
Discussion: https://postgr.es/m/CAApHDvrCQn6tupx2R67VL9RP1Qy4dDuWKRvt4jaB0vk2akQchw@mail.gmail.com
Other vacuumdb options seem to have notes about which version they're
available from, so let's follow this trend for the newly added
--buffer-usage-limit option.
This addresses various deficiencies in the documentation for VACUUM and
ANALYZE's BUFFER_USEAGE_LIMIT docs.
Here we declare "size" in the syntax synopsis for VACUUM and ANALYZE's
BUFFER_USAGE_LIMIT option and then define exactly what values can be
specified for it in the section for that below.
Also, fix the incorrect ordering of vacuumdb options both in the documents
and in vacuumdb's --help output. These should be in alphabetical order.
In passing also add the minimum/maximum range for the BUFFER_USAGE_LIMIT
option. These will also serve as example values that can be modified and
used.
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/16845cb1-b228-e157-f293-5892bced9253@enterprisedb.com
Starting with OpenSSL 1.1.0 there is no need to call PQinitOpenSSL
or PQinitSSL to avoid duplicate initialization of OpenSSL. Add a
note to the documentation to explain this.
Backpatch to all supported versions as older OpenSSL versions are
equally likely to be used for all branches.
Reported-by: Sebastien Flaesch <sebastien.flaesch@4js.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/DBAP191MB12895BFFEC4B5FE0460D0F2FB0459@DBAP191MB1289.EURP191.PROD.OUTLOOK.COM
Backpatch-through: 11, all supported versions
WHen building with GSSAPI support, explicitly require MIT Kerberos and
check for gssapi_ext.h in configure.ac and meson.build. Also add
documentation explicitly stating that we now require MIT Kerberos when
building with GSSAPI support.
Reveiwed by: Johnathan Katz
Discussion: https://postgr.es/m/abcc73d0-acf7-6896-e0dc-f5bc12a61bb1@postgresql.org
This reverts commit 3d03b24c3 (Revert Add support for Kerberos
credential delegation) which was committed on the grounds of concern
about portability, but on further review and discussion, it's clear that
we are better off explicitly requiring MIT Kerberos as that appears to
be the only GSSAPI library currently that's under proper maintenance
and ongoing development. The API used for storing credentials was added
to MIT Kerberos over a decade ago while for the other libraries which
appear to be mainly based on Heimdal, which exists explicitly to be a
re-implementation of MIT Kerberos, the API never made it to a released
version (even though it was added to the Heimdal git repo over 5 years
ago..).
This post-feature-freeze change was approved by the RMT.
Discussion: https://postgr.es/m/ZDDO6jaESKaBgej0%40tamriel.snowman.net
In the HTML output, this decorates section headers and variable list
terms with a marker ("#") that is a link to the same section/term.
That way, links inside a page can be discovered for easier sharing.
The marker only appears when hovering.
This now requires that all elements that are candidates for such a
link have an id attribute. Otherwise, an error will be generated.
All previously missing ids have been added prior to this patch.
Author: Brar Piening <brar@gmx.de>
Reviewed-by: Karl O. Pinc <kop@karlpinc.com>
Discussion: https://www.postgresql.org/message-id/flat/CAB8KJ=jpuQU9QJe4+RgWENrK5g9jhoysMw2nvTN_esoOU0=a_w@mail.gmail.com
This reverts commit e056c557ae and minor later fixes thereof.
There's a few problems in this new feature -- most notably regarding
pg_upgrade behavior, but others as well. This new feature is not in any
way critical on its own, so instead of scrambling to fix it we revert it
and try again in early 17 with these issues in mind.
Discussion: https://postgr.es/m/3801207.1681057430@sss.pgh.pa.us
The previous wording used MVF to indicate the Most Common Values'
Frequencies, but the abbreviation was never explained or defined.
Reword to mcv_freqs to make the use clearer.
Also add MCF and MCV as acronyms as they were using <acronym>
markup but were missing from the acronyms page.
Reported-by: Eric Mutta <eric.mutta@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/166112292492.654.5377188452604176150@wrigleys.postgresql.org
The tables in "71.3. Extensibility" listing the support functions
for bloom and minmax-multi opclasses should include the associated
options function. While this isn't quite as required as the rest,
you need it for full functionality of the opclass.
Back-patch to v14 where these functions were added.
We cannot use the generic array_desc approach with per-tuple nbtree
posting list update metadata because array_desc can only deal with fixed
width elements (e.g., page offset numbers). Using array_desc led to
incorrect rmgr descriptions for updates from nbtree DELETE/VACUUM WAL
records.
To fix, add specialized code to describe the update metadata as array
elements in desc output. We now iterate over the update metadata using
an approach that matches related REDO routines.
Also stop showing the updates offset number array separately in nbtree
DELETE/VACUUM desc output. It's redundant information, since the same
page offset numbers appear in the description of each individual update
element. Also make some small tweaks to the way that we format arrays
in all desc routines (not just nbtree desc routines) to make arrays a
little less verbose.
Oversight in commit 1c453cfd, which enhanced the nbtree rmgr desc
routines.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkbYuvwYKm-Y-72QEh6SPMQcAo9uONv+mR3bMGcu9E_Cg@mail.gmail.com
EXTRACT(EPOCH), EXTRACT(SECOND), and some related cases print more
trailing zeroes than they used to. This behavior change happened
with commit a2da77cdb (Change return type of EXTRACT to numeric),
and it was intentional according to the commit log:
- Return values when extracting fields with possibly fractional
values, such as second and epoch, now have the full scale that the
value has internally (so, for example, '1.000000' instead of just
'1').
It's been like that for two releases now, so while I suggested
changing this back, it's probably better to adjust the documentation
examples.
Per bug #17866 from Евгений Жужнев. Back-patch to v14 where the
change came in.
Discussion: https://postgr.es/m/17866-18eb70095b1594e2@postgresql.org
This reverts commit 3d4fa227bc.
Per discussion and buildfarm, this depends on APIs that seem to not
be available on at least one platform (NetBSD). Should be certainly
possible to rework to be optional on that platform if necessary but bit
late for that at this point.
Discussion: https://postgr.es/m/3286097.1680922218@sss.pgh.pa.us
Unsurprisingly, this requires wal_level = logical to be set on the primary and
standby. The infrastructure added in 26669757b6 ensures that slots are
invalidated if the primary's wal_level is lowered.
Creating a slot on a standby waits for a xl_running_xact record to be
processed. If the primary is idle (and thus not emitting xl_running_xact
records), that can take a while. To make that faster, this commit also
introduces the pg_log_standby_snapshot() function. By executing it on the
primary, completion of slot creation on the standby can be accelerated.
Note that logical decoding on a standby does not itself enforce that required
catalog rows are not removed. The user has to use physical replication slots +
hot_standby_feedback or other measures to prevent that. If catalog rows
required for a slot are removed, the slot is invalidated.
See 6af1793954 for an overall design of logical decoding on a standby.
Bumps catversion, for the addition of the pg_log_standby_snapshot() function.
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de> (in an older version)
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: FabrÌzio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
During WAL replay on the standby, when a conflict with a logical slot is
identified, invalidate such slots. There are two sources of conflicts:
1) Using the information added in 6af1793954, logical slots are invalidated if
required rows are removed
2) wal_level on the primary server is reduced to below logical
Uses the infrastructure introduced in the prior commit. FIXME: add commit
reference.
Change InvalidatePossiblyObsoleteSlot() to use a recovery conflict to
interrupt use of a slot, if called in the startup process. The new recovery
conflict is added to pg_stat_database_conflicts, as confl_active_logicalslot.
See 6af1793954 for an overall design of logical decoding on a standby.
Bumps catversion for the addition of the pg_stat_database_conflicts column.
Bumps PGSTAT_FILE_FORMAT_ID for the same reason.
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
Needed for logical decoding on a standby. Slots need to be invalidated because
of the horizon if rows required for logical decoding are removed. If the
primary's wal_level is lowered from 'logical', logical slots on the standby
need to be invalidated.
The new invalidation methods will be used in a subsequent commit.
Logical slots that have been invalidated can be identified via the new
pg_replication_slots.conflicting column.
See 6af1793954 for an overall design of logical decoding on a standby.
Bumps catversion for the addition of the new pg_replication_slots column.
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
Provide a way to ask the kernel to use O_DIRECT (or local equivalent)
where available for data and WAL files, to avoid or minimize kernel
caching. This hurts performance currently and is not intended for end
users yet. Later proposed work would introduce our own I/O clustering,
read-ahead, etc to replace the facilities the kernel disables with this
option.
The only user-visible change, if the developer-only GUC is not used, is
that this commit also removes the obscure logic that would activate
O_DIRECT for the WAL when wal_sync_method=open_[data]sync and
wal_level=minimal (which also requires max_wal_senders=0). Those are
non-default and unlikely settings, and this behavior wasn't (correctly)
documented. The same effect can be achieved with io_direct=wal.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg%40mail.gmail.com
Support GSSAPI/Kerberos credentials being delegated to the server by a
client. With this, a user authenticating to PostgreSQL using Kerberos
(GSSAPI) credentials can choose to delegate their credentials to the
PostgreSQL server (which can choose to accept them, or not), allowing
the server to then use those delegated credentials to connect to
another service, such as with postgres_fdw or dblink or theoretically
any other service which is able to be authenticated using Kerberos.
Both postgres_fdw and dblink are changed to allow non-superuser
password-less connections but only when GSSAPI credentials have been
delegated to the server by the client and GSSAPI is used to
authenticate to the remote system.
Authors: Stephen Frost, Peifeng Qiu
Reviewed-By: David Christensen
Discussion: https://postgr.es/m/CO1PR05MB8023CC2CB575E0FAAD7DF4F8A8E29@CO1PR05MB8023.namprd05.prod.outlook.com
a9c70b46db and 8aaa04b32S added counting of IO operations to a new view,
pg_stat_io. Now, add IO timing for reads, writes, extends, and fsyncs to
pg_stat_io as well.
This combines the tracking for pgBufferUsage with the tracking for pg_stat_io
into a new function pgstat_count_io_op_time(). This should make it a bit
easier to avoid the somewhat costly instr_time conversion done for
pgBufferUsage.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ay5iKmnbXZ3DsauViF3eMxu4m1oNnJXqV_HyqYeg55Ww%40mail.gmail.com
Add helper functions that output arrays in a standard format, and use
the functions inside heapdesc routines. This allows tools like
pg_walinspect to show a detailed description of the page offset number
arrays for records like PRUNE and VACUUM (unless there was an FPI).
Also document the conventions that desc routines should follow. Only
the heapdesc routines follow the conventions for now, so they're just
guidelines for the time being.
Based on a suggestion from Andres Freund.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/flat/20230109215842.fktuhesvayno6o4g%40awork3.anarazel.de
This modernized version of Soundex works significantly better than
the original, particularly for non-English names.
Dag Lem, reviewed by quite a few people along the way
Discussion: https://postgr.es/m/yger1atbgfy.fsf@sid.nimrod.no
It was pointed out that pg_buffercache_summary()'s report of
the overall average usage count isn't that useful, and what
would be more helpful in many cases is to report totals for
each possible usage count. Add a new function to do it like
that. Since pg_buffercache 1.4 is already new for v16,
we don't need to create a new extension version; we'll just
define this as part of 1.4.
Nathan Bossart
Discussion: https://postgr.es/m/20230130233040.GA2800702@nathanxps13
We now create pg_constaint rows for NOT NULL constraints with
contype='n'.
We propagate these constraints during operations such as adding
inheritance relationships, creating and attaching partitions, creating
tables LIKE other tables. We mostly follow the well-known rules of
conislocal and coninhcount that we have for CHECK constraints, with some
adaptations; for example, as opposed to CHECK constraints, we don't
match NOT NULL ones by name when descending a hierarchy to alter it;
instead we match by column number. This means we don't require the
constraint names to be identical across a hierarchy.
For now, we omit them from system catalogs. Maybe this is worth
reconsidering. We don't support NOT VALID nor DEFERRABLE clauses
either; these can be added as separate features later (this patch is
already large and complicated enough.)
This has been very long in the making. The first patch was written by
Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'),
which I (Álvaro) then hijacked in 2011 and 2012, until that one was
killed by the realization that we ought to use contype='c' instead:
manufactured CHECK constraints. However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again.
In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring additional pg_attribute columns to
track the OID of the NOT NULL constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/CACA0E642A0267EDA387AF2B%40%5B172.26.14.62%5D
Discussion: https://postgr.es/m/AANLkTinLXMOEMz+0J29tf1POokKi4XDkWJ6-DDR9BKgU@mail.gmail.com
Discussion: https://postgr.es/m/20110707213401.GA27098@alvh.no-ip.org
Discussion: https://postgr.es/m/1343682669-sup-2532@alvh.no-ip.org
Discussion: https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Discussion: https://postgr.es/m/20220817181249.q7qvj3okywctra3c@alvherre.pgsql
The old wording described these as being multiplied by max_connections
plus max_prepared_transactions, which hasn't been exactly right for
some time thanks to the addition of various auxiliary processes.
Moreover, exactness here is a bit pointless given that the lock tables
can expand into the initially-unallocated "slop" space in shared
memory. Rather than trying to track exactly what the code is doing,
let's just use the term "server processes".
Likewise adjust these GUCs' description strings in guc_tables.c.
Wang Wei, reviewed by Nathan Bossart and myself
Discussion: https://postgr.es/m/OS3PR01MB6275BDD09C9B875C65FCC5AB9EA39@OS3PR01MB6275.jpnprd01.prod.outlook.com
1cbbee033 added BUFFER_USAGE_LIMIT to the VACUUM and ANALYZE commands, so
here we permit that option to be specified in vacuumdb.
In passing, adjust the documents for vacuum_buffer_usage_limit and the
BUFFER_USAGE_LIMIT VACUUM option to mention "kB" rather than "KB". Do the
same for the ERROR message in ExecVacuum() and
check_vacuum_buffer_usage_limit(). Without that we might tell a user that
the valid minimum value is 128 KB only to reject that because we accept
only "kB" and not "KB".
Also, add a small reminder comment in vacuum.h to try to trigger the
memory of anyone adding new fields to VacuumParams that they might want to
consider if vacuumdb needs to grow a new option too.
Author: Melanie Plageman
Reviewed-by: Justin Pryzby
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/ZAzTg3iEnubscvbf@telsasoft.com
Add new options to the VACUUM and ANALYZE commands called
BUFFER_USAGE_LIMIT to allow users more control over how large to make the
buffer access strategy that is used to limit the usage of buffers in
shared buffers. Larger rings can allow VACUUM to run more quickly but
have the drawback of VACUUM possibly evicting more buffers from shared
buffers that might be useful for other queries running on the database.
Here we also add a new GUC named vacuum_buffer_usage_limit which controls
how large to make the access strategy when it's not specified in the
VACUUM/ANALYZE command. This defaults to 256KB, which is the same size as
the access strategy was prior to this change. This setting also
controls how large to make the buffer access strategy for autovacuum.
Per idea by Andres Freund.
Author: Melanie Plageman
Reviewed-by: David Rowley
Reviewed-by: Andres Freund
Reviewed-by: Justin Pryzby
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb@awork3.anarazel.de
Make the \g, \o, \w, and \copy commands set these variables
when closing a pipe. We missed doing this in commit b0d8f2d98,
but it seems like a good idea.
There are some remaining places in psql that intentionally don't
update these variables after running a child program:
* pager invocations
* backtick evaluation within a prompt
* \e (edit query buffer)
Corey Huinker and Tom Lane
Discussion: https://postgr.es/m/CADkLM=eSKwRGF-rnRqhtBORRtL49QsjcVUCa-kLxKTqxypsakw@mail.gmail.com
\watch can now be told to stop after N executions of the query.
With the idea that we might want to add more options to \watch
in future, this patch generalizes the command's syntax to a list
of name=value options, with the interval allowed to omit the name
for backwards compatibility.
Andrey Borodin, reviewed by Kyotaro Horiguchi, Nathan Bossart,
Michael Paquier, Yugo Nagata, and myself
Discussion: https://postgr.es/m/CAAhFRxiZ2-n_L1ErMm9AZjgmUK=qS6VHb+0SaMn8sqqbhF7How@mail.gmail.com
zstd compression supports a special mode for finding matched in distant
past, which may result in better compression ratio, at the expense of
using more memory (the window size is 128MB).
To enable this optional mode, use the "long" keyword when specifying the
compression method (--compress=zstd:long).
Author: Justin Pryzby
Reviewed-by: Tomas Vondra, Jacob Champion
Discussion: https://postgr.es/m/20230224191840.GD1653@telsasoft.com
Discussion: https://postgr.es/m/20220327205020.GM28503@telsasoft.com
postgres_fdw aborts remote (sub)transactions opened on remote server(s)
in a local (sub)transaction one by one when the local (sub)transaction
aborts. This patch allows it to abort the remote (sub)transactions in
parallel to improve performance. This is enabled by the server option
"parallel_abort". The default is false.
Etsuro Fujita, reviewed by David Zhang.
Discussion: http://postgr.es/m/CAPmGK15FuPVGx3TGHKShsbPKKtF1y58-ZLcKoxfN-nqLj1dZ%3Dg%40mail.gmail.com
The primary bottlenecks for relation extension are:
1) The extension lock is held while acquiring a victim buffer for the new
page. Acquiring a victim buffer can require writing out the old page
contents including possibly needing to flush WAL.
2) When extending via ReadBuffer() et al, we write a zero page during the
extension, and then later write out the actual page contents. This can
nearly double the write rate.
3) The existing bulk relation extension infrastructure in hio.c just amortized
the cost of acquiring the relation extension lock, but none of the other
costs.
Unfortunately 1) cannot currently be addressed in a central manner as the
callers to ReadBuffer() need to acquire the extension lock. To address that,
this this commit moves the responsibility for acquiring the extension lock
into bufmgr.c functions. That allows to acquire the relation extension lock
for just the required time. This will also allow us to improve relation
extension further, without changing callers.
The reason we write all-zeroes pages during relation extension is that we hope
to get ENOSPC errors earlier that way (largely works, except for CoW
filesystems). It is easier to handle out-of-space errors gracefully if the
page doesn't yet contain actual tuples. This commit addresses 2), by using the
recently introduced smgrzeroextend(), which extends the relation, without
dirtying the kernel page cache for all the extended pages.
To address 3), this commit introduces a function to extend a relation by
multiple blocks at a time.
There are three new exposed functions: ExtendBufferedRel() for extending the
relation by a single block, ExtendBufferedRelBy() to extend a relation by
multiple blocks at once, and ExtendBufferedRelTo() for extending a relation up
to a certain size.
To avoid duplicating code between ReadBuffer(P_NEW) and the new functions,
ReadBuffer(P_NEW) now implements relation extension with
ExtendBufferedRel(), using a flag to tell ExtendBufferedRel() that the
relation lock is already held.
Note that this commit does not yet lead to a meaningful performance or
scalability improvement - for that uses of ReadBuffer(P_NEW) will need to be
converted to ExtendBuffered*(), which will be done in subsequent commits.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
This adds a new option to libpq's sslrootcert, "system", which will load
the system trusted CA roots for certificate verification. This is a more
convenient way to achieve this than pointing to the system CA roots
manually since the location can differ by installation and be locally
adjusted by env vars in OpenSSL.
When sslrootcert is set to system, sslmode is forced to be verify-full
as weaker modes aren't providing much security for public CAs.
Changing the location of the system roots by setting environment vars is
not supported by LibreSSL so the tests will use a heuristic to determine
if the system being tested is LibreSSL or OpenSSL.
The workaround in .cirrus.yml is required to handle a strange interaction
between homebrew and the openssl@3 formula; hopefully this can be removed
in the near future.
The original patch was written by Thomas Habets, which was later revived
by Jacob Champion.
Author: Jacob Champion <jchampion@timescale.com>
Author: Thomas Habets <thomas@habets.se>
Reviewed-by: Jelte Fennema <postgres@jeltef.nl>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Discussion: https://www.postgresql.org/message-id/flat/CA%2BkHd%2BcJwCUxVb-Gj_0ptr3_KZPwi3%2B67vK6HnLFBK9MzuYrLA%40mail.gmail.com
Allow pg_dump to use the zstd compression, in addition to gzip/lz4. Bulk
of the new compression method is implemented in compress_zstd.{c,h},
covering the pg_dump compression APIs. The rest of the patch adds test
and makes various places aware of the new compression method.
The zstd library (which this patch relies on) supports multithreaded
compression since version 1.5. We however disallow that feature for now,
as it might interfere with parallel backups on platforms that rely on
threads (e.g. Windows). This can be improved / relaxed in the future.
This also fixes a minor issue in InitDiscoverCompressFileHandle(), which
was not updated to check if the file already has the .lz4 extension.
Adding zstd compression was originally proposed in 2020 (see the second
thread), but then was reworked to use the new compression API introduced
in e9960732a9. I've considered both threads when compiling the list of
reviewers.
Author: Justin Pryzby
Reviewed-by: Tomas Vondra, Jacob Champion, Andreas Karlsson
Discussion: https://postgr.es/m/20230224191840.GD1653@telsasoft.com
Discussion: https://postgr.es/m/20201221194924.GI30237@telsasoft.com
Since 8b9e9644d, the messages for failed permissions checks report
"table" where appropriate, rather than "relation".
Backpatch to all supported branches
The meson docs generation hardcoded using the website style so far. Make it
configurable via a meson option.
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reported-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/3fc3bb9b-f7f8-d442-35c1-ec82280c564a@enterprisedb.com
Until now the meson built docs did not have a working reference to the css
stylesheet, it was copied in the make target. Instead of duplicating that for
meson, use the docbook-xsl parameter custom.css.source to reference it. An
additional benefit of that approach is that the stylesheet is now included in
the single-file HTML documentation.
Reported-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/3fc3bb9b-f7f8-d442-35c1-ec82280c564a@enterprisedb.com
Until now the meson built HTML docs had non-working references to images. They
were copied in the make target. Instead of duplicating that for meson, copy
them as part of the xslt stylesheet.
Reported-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/3fc3bb9b-f7f8-d442-35c1-ec82280c564a@enterprisedb.com
Detect and report if the tools necessary to build documentation are available
during configure. This is represented as two new options 'docs' and
'docs_pdf', both defaulting to 'auto'.
This should also fix a meson error about the installdocs target, when none of
the doc tools are found.
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20230325201414.sh7c6xlut2fpunnv@awork3.anarazel.de
Discussion: https://postgr.es/m/ZB8331v5IhUA/pNu@telsasoft.com
We had partial support for generating documentation suitable for .chm
files. However, we only had wired up generating the input files using
docbook-xsl, not generating an actual .chm file. Nor did we document how to do
so. Additionally, it was very slow to generate htmlhelp, as we never applied
the docbook-xsl stylesheet performance improvements to htmlhelp.
It doesn't look like there's any interest in the htmlhelp output, so remove
it, instead of spending cycles to finish the support.
Discussion: https://postgr.es/m/20230324165822.wcrj3akllbqquy7u@awork3.anarazel.de
The explanation describing the dependency to system read() calls for
these two functions has been removed in ddfc2d9. And after more
discussion about d69c404, we have concluded that adding more details
makes them easier to understand.
While on it, use the term "block read requests" (maybe found in cache)
rather than "buffers fetched" and "buffer hits".
Per discussion with Melanie Plageman, Kyotaro Horiguchi, Bertrand
Drouvot and myself.
Discussion: https://postgr.es/m/CAAKRu_ZmdiScT4q83OAbfmR5AH-L5zWya3SXjaxiJvhCob-e2A@mail.gmail.com
Backpatch-through: 11
Convert to BCP47 language tags before storing in the catalog, except
during binary upgrade or when the locale comes from an existing
collation or template database.
The resulting language tags can vary slightly between ICU
versions. For instance, "@colBackwards=yes" is converted to
"und-u-kb-true" in older versions of ICU, and to the simpler (but
equivalent) "und-u-kb" in newer versions.
The process of canonicalizing to a language tag also understands more
input locale string formats than ucol_open(). For instance,
"fr_CA.UTF-8" is misinterpreted by ucol_open() and the region is
ignored; effectively treating it the same as the locale "fr" and
opening the wrong collator. Canonicalization properly interprets the
language and region, resulting in the language tag "fr-CA", which can
then be understood by ucol_open().
This commit fixes a problem in prior versions due to ucol_open()
misinterpreting locale strings as described above. For instance,
creating an ICU collation with locale "fr_CA.UTF-8" would store that
string directly in the catalog, which would later be passed to (and
misinterpreted by) ucol_open(). After this commit, the locale string
will be canonicalized to language tag "fr-CA" in the catalog, which
will be properly understood by ucol_open(). Because this fix affects
the resulting collator, we cannot change the locale string stored in
the catalog for existing databases or collations; otherwise we'd risk
corrupting indexes. Therefore, only canonicalize locales for
newly-created (not upgraded) collations/databases. For similar
reasons, do not backport.
Discussion: https://postgr.es/m/8c7af6820aed94dc7bc259d2aa7f9663518e6137.camel@j-davis.com
Reviewed-by: Peter Eisentraut
Invent "GET DIAGNOSTICS oid_variable = PG_ROUTINE_OID".
This is useful for avoiding the maintenance nuisances that come
with embedding a function's name in its body, as one might do
for logging purposes for example. Typically users would cast the
result to regproc or regprocedure to get something human-readable,
but we won't pre-judge whether that's appropriate.
Pavel Stehule, reviewed by Kirk Wolak and myself
Discussion: https://postgr.es/m/CAFj8pRA4zMd5pY-B89Gm64bDLRt-L+akOd34aD1j4PEstHHSVQ@mail.gmail.com
This option is normally false, but can be set to true to obtain
the legacy behavior where the subscription runs with the permissions
of the subscription owner rather than the permissions of the
table owner. The advantages of this mode are (1) it doesn't require
that the subscription owner have permission to SET ROLE to each
table owner and (2) since no role switching occurs, the
SECURITY_RESTRICTED_OPERATION restrictions do not apply.
On the downside, it allows any table owner to easily usurp
the privileges of the subscription owner - basically, to take
over their account. Because that's generally quite undesirable,
we don't make this mode the default, but we do make it available,
just in case the new behavior causes too many problems for someone.
Discussion: http://postgr.es/m/CA+TgmoZ-WEeG6Z14AfH7KhmpX2eFh+tZ0z+vf0=eMDdbda269g@mail.gmail.com
Up until now, logical replication actions have been performed as the
subscription owner, who will generally be a superuser. Commit
cec57b1a0f documented hazards
associated with that situation, namely, that any user who owns a
table on the subscriber side could assume the privileges of the
subscription owner by attaching a trigger, expression index, or
some other kind of executable code to it. As a remedy, it suggested
not creating configurations where users who are not fully trusted
own tables on the subscriber.
Although that will work, it basically precludes using logical
replication in the way that people typically want to use it,
namely, to replicate a database from one node to another
without necessarily having any restrictions on which database
users can own tables. So, instead, change logical replication to
execute INSERT, UPDATE, DELETE, and TRUNCATE operations as the
table owner when they are replicated.
Since this involves switching the active user frequently within
a session that is authenticated as the subscription user, also
impose SECURITY_RESTRICTED_OPERATION restrictions on logical
replication code. As an exception, if the table owner can SET
ROLE to the subscription owner, these restrictions have no
security value, so don't impose them in that case.
Subscription owners are now required to have the ability to
SET ROLE to every role that owns a table that the subscription
is replicating. If they don't, replication will fail. Superusers,
who normally own subscriptions, satisfy this property by default.
Non-superusers users who own subscriptions will need to be
granted the roles that own relevant tables.
Patch by me, reviewed (but not necessarily in its entirety) by
Jelte Fennema, Jeff Davis, and Noah Misch.
Discussion: http://postgr.es/m/CA+TgmoaSCkg9ww9oppPqqs+9RVqCexYCE6Aq=UsYPfnOoDeFkw@mail.gmail.com
The trace point was using the relfileno / fork / block for the to-be-read-in
buffer. Some upcoming work would make that more expensive to provide. We still
have buffer-flush-start/done, which can serve most tracing needs that
buffer-write-dirty could serve.
Discussion: https://postgr.es/m/f5164e7a-eef6-8972-75a3-8ac622ed0c6e@iki.fi
Traditionally, vacuum always makes use of a buffer access strategy 32
buffers in size. This means that running vacuums tend not to cause too
many shared buffers to become dirty, however, this can cause vacuums to
run much more slowly than they otherwise could as WAL flushes will occur
more frequently due to having to flush WAL out to the LSN of the dirty
page before that page can be written to disk.
When we are performing failsafe VACUUMs (as added in 1e55e7d17), we really
want to make the vacuum work go as quickly as possible, so here we disable
the buffer access strategy when entering failsafe mode while vacuuming a
relation.
Per idea and analyis from Andres Freund.
In passing, also include some changes I had intended for 32fbe0239.
Author: Melanie Plageman
Reviewed-by: Justin Pryzby, David Rowley
Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb%40awork3.anarazel.de
It seems useful to add this to the glossary as there's discussion around
adding an option to VACUUM to disable and adjust the size of the buffer
access strategy that VACUUM uses.
Author: Melanie Plageman
Reviewed-by: Justin Pryzby, David Rowley
Discussion: https://postgr.es/m/ZBYDTrD1kyGg%2BHkS%40telsasoft.com
Allow users to opt out of returning FPI data and block data from
pg_get_wal_block_info as an optimization. Testing has shown that this
can make function execution over twice as fast in some cases.
When pg_get_wal_block_info is called with "show_data := false", it
always returns NULL values for its block_data and block_fpi_data bytea
output parameters. Nothing else changes. In particular, the function
will still return the usual per-block summary of block data/FPI space
overhead. Use of "show_data := false" is therefore feasible with all
queries that don't specifically require these raw binary strings.
Follow-up to recent work in commit 122376f0. There still hasn't been a
stable release with the pg_get_wal_block_info function, so no bump in
the pg_walinspect extension version.
Per suggestion from Melanie Plageman.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAAKRu_bJvbcYBRj2cN6G2xV7B7-Ja+pjTO1nEnEhRR8OXYiABA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-Wzm9shOkEDM10_+qOZkRSQhKVxwBFiehH6EHWQQRd_rDPw@mail.gmail.com
This patch introduces the SQL standard IS JSON predicate. It operates
on text and bytea values representing JSON, as well as on the json and
jsonb types. Each test has IS and IS NOT variants and supports a WITH
UNIQUE KEYS flag. The tests are:
IS JSON [VALUE]
IS JSON ARRAY
IS JSON OBJECT
IS JSON SCALAR
These should be self-explanatory.
The WITH UNIQUE KEYS flag makes these return false when duplicate keys
exist in any object within the value, not necessarily directly contained
in the outermost object.
Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.
Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com
Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
This converts pg_regress output format to emit TAP compliant output
while keeping it as human readable as possible for use without TAP
test harnesses. As verbose harness related information isn't really
supported by TAP this also reduces the verbosity of pg_regress runs
which makes scrolling through log output in buildfarm/CI runs a bit
easier as well.
As the meson TAP parser conumes whitespace, the leading indentation
for differentiating parallel tests from sequential tests has been
changed to a single character prefix.
This patch has been around for an extended period of time, reviewers
listed below may have been involved in reviewing a version quite
different from the version in this commit. The original idea for
this patch was a hacking session with Jinbao Chen.
TAP format testing is also enabled in meson as of this.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nikolay Shaplov <dhyan@nataraj.su>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/BD4B107D-7E53-4794-ACBA-275BEB4327C9@yesql.se
Discussion: https://postgr.es/m/20220221164736.rq3ornzjdkmwk2wo@alap3.anarazel.de
Among other things, this should make it easier to calculate a useful cache hit
ratio by excluding buffer reads via buffer access strategies. As buffer access
strategies reuse buffers (and thus evict the prior buffer contents), it is
normal to see reads on repeated scans of the same data.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAAKRu_beMa9Hzih40%3DXPYqhDVz6tsgUGTrhZXRo%3Dunp%2Bszb%3DUA%40mail.gmail.com
Expand the output parameters in pg_walinspect's pg_get_wal_block_info
function to return additional information that was previously only
available from pg_walinspect's pg_get_wal_records_info function. Some
of the details are attributed to individual block references, rather
than aggregated into whole-record values, since the function returns one
row per block reference per WAL record (unlike pg_get_wal_records_info,
which always returns one row per WAL record).
This structure is much easier to work with when writing queries that
track how individual blocks changed over time, or when attributing costs
to individual blocks (not WAL records) is useful.
This is the second time that pg_get_wal_block_info has been enhanced in
recent weeks. Commit 9ecb134a expanded on the original version of the
function added in commit c31cf1c0 (where it first appeared under the
name pg_get_wal_fpi_info). There still hasn't been a stable release
since commit c31cf1c0, so no bump in the pg_walinspect extension
version.
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Kyotaro HORIGUCHI <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/CALj2ACVRK5=Z+2ZVsjgTTSkfEnQzCuwny7iigpG7g1btk4Ws2A@mail.gmail.com
This role can be granted to non-superusers to allow them to issue
CREATE SUBSCRIPTION. The non-superuser must additionally have CREATE
permissions on the database in which the subscription is to be
created.
Most forms of ALTER SUBSCRIPTION, including ALTER SUBSCRIPTION .. SKIP,
now require only that the role performing the operation own the
subscription, or inherit the privileges of the owner. However, to
use ALTER SUBSCRIPTION ... RENAME or ALTER SUBSCRIPTION ... OWNER TO,
you also need CREATE permission on the database. This is similar to
what we do for schemas. To change the owner of a schema, you must also
have permission to SET ROLE to the new owner, similar to what we do
for other object types.
Non-superusers are required to specify a password for authentication
and the remote side must use the password, similar to what is required
for postgres_fdw and dblink. A superuser who wants a non-superuser to
own a subscription that does not rely on password authentication may
set the new password_required=false property on that subscription. A
non-superuser may not set password_required=false and may not modify a
subscription that already has password_required=false.
This new password_required subscription property works much like the
eponymous postgres_fdw property. In both cases, the actual semantics
are that a password is not required if either (1) the property is set
to false or (2) the relevant user is the superuser.
Patch by me, reviewed by Andres Freund, Jeff Davis, Mark Dilger,
and Stephen Frost (but some of those people did not fully endorse
all of the decisions that the patch makes).
Discussion: http://postgr.es/m/CA+TgmoaDH=0Xj7OBiQnsHTKcF2c4L+=gzPBUKSJLh8zed2_+Dg@mail.gmail.com
This adds support for load balancing connections with libpq using a
connection parameter: load_balance_hosts=<string>. When setting the
param to random, hosts and addresses will be connected to in random
order. This then results in load balancing across these addresses and
hosts when multiple clients or frequent connection setups are used.
The randomization employed performs two levels of shuffling:
1. The given hosts are randomly shuffled, before resolving them
one-by-one.
2. Once a host its addresses get resolved, the returned addresses
are shuffled, before trying to connect to them one-by-one.
Author: Jelte Fennema <postgres@jeltef.nl>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Reviewed-by: Andrey Borodin <amborodin86@gmail.com>
Discussion: https://postgr.es/m/PR3PR83MB04768E2FF04818EEB2179949F7A69@PR3PR83MB0476.EURPRD83.prod.outlook.
This commit introduces the SQL/JSON standard-conforming constructors for
JSON types:
JSON_ARRAY()
JSON_ARRAYAGG()
JSON_OBJECT()
JSON_OBJECTAGG()
Most of the functionality was already present in PostgreSQL-specific
functions, but these include some new functionality such as the ability
to skip or include NULL values, and to allow duplicate keys or throw
error when they are found, as well as the standard specified syntax to
specify output type and format.
Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.
Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com
Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
When there are multiple publications for a subscription and one of those
publishes via the parent table by using publish_via_partition_root and the
other one directly publishes the child table, we end up copying the same
data twice during initial synchronization. The reason for this was that we
get both the parent and child tables from the publisher and try to copy
the data for both of them.
This patch extends the function pg_get_publication_tables() to take a
publication list as its input parameter. This allows us to exclude a
partition table whose ancestor is published by the same publication list.
This problem does exist in back-branches but we decide to fix it there in
a separate commit if required. The fix for back-branches requires quite
complicated changes to fetch the required table information from the
publisher as we can't update the function pg_get_publication_tables() in
back-branches. We are not sure whether we want to deviate and complicate
the code in back-branches for this problem as there are no field reports
yet.
Author: Wang wei
Reviewed-by: Peter Smith, Jacob Champion, Kuroda Hayato, Vignesh C, Osumi Takamichi, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB57167F45D481F78CDC5986F794B99@OS0PR01MB5716.jpnprd01.prod.outlook.com
Commit ecb696527c added an XML ID attribute to one varlistentry in
create_subscription.sgml. Following 78ee60ed84, this commit adds XML ID
attributes to all varlistentries in create_subscription.sgml.
Additionally, links are added to refer to the subscription options,
enhancing the readability of documents.
Author: Kuroda Hayato
Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/TYAPR01MB58667AE04D291924671E2051F5879@TYAPR01MB5866.jpnprd01.prod.outlook.com
For ICU collations, ensure that the locale's language exists in ICU,
and that the locale can be opened.
Basic validation helps avoid minor mistakes and misspellings, which
often fall back to the root locale instead of the intended
locale. It's even more important to avoid such mistakes in ICU
versions 54 and earlier, where the same (misspelled) locale string
could fall back to different locales depending on the environment.
Discussion: https://postgr.es/m/11b1eeb7e7667fdd4178497aeb796c48d26e69b9.camel@j-davis.com
Discussion: https://postgr.es/m/df2efad0cae7c65180df8e5ebb709e5eb4f2a82b.camel@j-davis.com
Reviewed-by: Peter Eisentraut
Change the columns attndims, attstattarget, and attinhcount from int32
to int16, and reorder a bit. This saves some space (currently 4
bytes) in pg_attribute and tuple descriptors, which translates into
small performance benefits and/or room for new columns in pg_attribute
needed by future features.
attndims and attinhcount are never realistically used with values
larger than int16. Just to be sure, add some overflow checks.
attstattarget is currently limited explicitly to 10000.
For consistency, pg_constraint.coninhcount is also changed like
attinhcount.
Discussion: https://www.postgresql.org/message-id/flat/d07ffc2b-e0e8-77f7-38fb-be921dff71af%40enterprisedb.com
Commit 4c8d65408 incorrectly stated that Homebrew has changed its
prefix for Apple M1 machines, but the prefix change applies to all
Apple Silicon based machines. Fix by writing Apple Silicon instead
of Apple M1.
Reported-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87mt3ys8ng.fsf@wibble.ilmari.org
Replace the hardcoded value with a GUC such that the iteration
count can be raised in order to increase protection against
brute-force attacks. The hardcoded value for SCRAM iteration
count was defined to be 4096, which is taken from RFC 7677, so
set the default for the GUC to 4096 to match. In RFC 7677 the
recommendation is at least 15000 iterations but 4096 is listed
as a SHOULD requirement given that it's estimated to yield a
0.5s processing time on a mobile handset of the time of RFC
writing (late 2015).
Raising the iteration count of SCRAM will make stored passwords
more resilient to brute-force attacks at a higher computational
cost during connection establishment. Lowering the count will
reduce computational overhead during connections at the tradeoff
of reducing strength against brute-force attacks.
There are however platforms where even a modest iteration count
yields a too high computational overhead, with weaker password
encryption schemes chosen as a result. In these situations,
SCRAM with a very low iteration count still gives benefits over
weaker schemes like md5, so we allow the iteration count to be
set to one at the low end.
The new GUC is intentionally generically named such that it can
be made to support future SCRAM standards should they emerge.
At that point the value can be made into key:value pairs with
an undefined key as a default which will be backwards compatible
with this.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>
Discussion: https://postgr.es/m/F72E7BC7-189F-4B17-BF47-9735EB72C364@yesql.se
Word-smith section 22.1 ("Database Roles") a little bit in hopes
of removing confusion about how the bootstrap superuser's name
is chosen.
While here, I couldn't help noticing that the claim that the bootstrap
superuser is the only initially-existing role has been a lie since
we started to invent predefined roles. We don't want too much detail
in this very introductory text, but it seems worth changing it to say
that it's the only initially-existing login-capable role.
Per documentation comment from Maja Zaloznik.
Discussion: https://postgr.es/m/167931662853.3349090.18217722739345182859@wrigleys.postgresql.org
The "partitions_total" and "partitions_done" fields were updated
as though the current level of partitioning was the only one.
In multi-level cases, not only could partitions_total change
over the course of the command, but partitions_done could go
backwards or exceed the currently-reported partitions_total.
Fix by setting partitions_total to the total number of direct
and indirect children once at command start, and then just
incrementing partitions_done at appropriate points. Invent
a new progress monitoring function "pgstat_progress_incr_param"
to simplify doing the latter. We can avoid adding cost for the
former when doing CREATE INDEX, because ProcessUtility already
enumerates the children and it's pretty easy to pass the count
down to DefineIndex. In principle the same could be done in
ALTER TABLE, but that's structurally difficult; for now, just
eat the cost of an extra find_all_inheritors scan in that case.
Ilya Gladyshev and Justin Pryzby
Discussion: https://postgr.es/m/a15f904a70924ffa4ca25c3c744cff31e0e6e143.camel@gmail.com
These were causing "contents ... exceed the available area"
warnings in PDF builds, and also didn't quite follow our markup
conventions for function examples. To fix the overwidth
problem, reduce the number of fields shown in one example,
and also insert &zwsp; to let the header line be broken in
a reasonable place.
Discussion: https://postgr.es/m/20230324194701.dqkzcdtlcikseo22@awork3.anarazel.de
This provides a very simple way to see the generic plan for a
parameterized query. Without this, it's necessary to define
a prepared statement and temporarily change plan_cache_mode,
which is a bit tedious.
One thing that's a bit of a hack perhaps is that we disable
execution-time partition pruning when the GENERIC_PLAN option
is given. That's because the pruning code may attempt to
fetch the value of one of the parameters, which would fail.
Laurenz Albe, reviewed by Julien Rouhaud, Christoph Berg,
Michel Pelletier, Jim Jones, and myself
Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel@cybertec.at
The sslcertmode option controls whether the server is allowed and/or
required to request a certificate from the client. There are three
modes:
- "allow" is the default and follows the current behavior, where a
configured client certificate is sent if the server requests one
(via one of its default locations or sslcert). With the current
implementation, will happen whenever TLS is negotiated.
- "disable" causes the client to refuse to send a client certificate
even if sslcert is configured or if a client certificate is available in
one of its default locations.
- "require" causes the client to fail if a client certificate is never
sent and the server opens a connection anyway. This doesn't add any
additional security, since there is no guarantee that the server is
validating the certificate correctly, but it may helpful to troubleshoot
more complicated TLS setups.
sslcertmode=require requires SSL_CTX_set_cert_cb(), available since
OpenSSL 1.0.2. Note that LibreSSL does not include it.
Using a connection parameter different than require_auth has come up as
the simplest design because certificate authentication does not rely
directly on any of the AUTH_REQ_* codes, and one may want to require a
certificate to be sent in combination of a given authentication method,
like SCRAM-SHA-256.
TAP tests are added in src/test/ssl/, some of them relying on sslinfo to
check if a certificate has been set. These are compatible across all
the versions of OpenSSL supported on HEAD (currently down to 1.0.1).
Author: Jacob Champion
Reviewed-by: Aleksander Alekseev, Peter Eisentraut, David G. Johnston,
Michael Paquier
Discussion: https://postgr.es/m/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com
Add pgstat counter to track row updates that result in the successor
version going to a new heap page, leaving behind an original version
whose t_ctid points to the new version. The current count is shown by
the n_tup_newpage_upd column of each of the pg_stat_*_tables views.
The new n_tup_newpage_upd column complements the existing n_tup_hot_upd
and n_tup_upd columns. Tables that have high n_tup_newpage_upd values
(relative to n_tup_upd) are good candidates for tuning heap fillfactor.
Corey Huinker, with small tweaks by me.
Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CADkLM=ded21M9iZ36hHm-vj2rE2d=zcKpUQMds__Xm2pxLfHKA@mail.gmail.com
This patch allows copying tables in the binary format during table
synchronization when the binary option for a subscription is enabled.
Previously, tables are copied in text format even if the subscription is
created with the binary option enabled. Copying tables in binary format
may reduce the time spent depending on column types.
A binary copy for initial table synchronization is supported only when
both publisher and subscriber are v16 or later.
Author: Melih Mutlu
Reviewed-by: Peter Smith, Shi yu, Euler Taveira, Vignesh C, Kuroda Hayato, Osumi Takamichi, Bharath Rupireddy, Hou Zhijie
Discussion: https://postgr.es/m/CAGPVpCQvAziCLknEnygY0v1-KBtg%2BOm-9JHJYZOnNPKFJPompw%40mail.gmail.com
* Commit 3048898e dropped -ING from PHJ wait event names. Update the
corresponding barrier phases names to match.
* Rename the "DONE" phases to "FREE". That's symmetrical with
"ALLOCATE", and names the activity that actually happens in that phase
(as we do for the other phases) rather than a state. The bug fixed by
commit 8d578b9b might have been more obvious with this name.
* Rename the batch/bucket growth barriers' "ALLOCATE" phases to
"REALLOCATE", a better description of what they do.
* Update the high level comments about phases to highlight phases
are executed by a single process with an asterisk (mostly memory
management phases).
No behavior change, as this is just improving internal identifiers. The
only user-visible sign of this is that a couple of wait events' display
names change from "...Allocate" to "...Reallocate" in pg_stat_activity,
to stay in sync with the internal names.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BMDpwF2Eo2LAvzd%3DpOh81wUTsrwU1uAwR-v6OGBB6%2B7g%40mail.gmail.com
This option, or its long form --set, sets the GUC "name" to "value".
The setting applies in the bootstrap and standalone servers run by
initdb, and is also written into the generated postgresql.conf.
This can save an extra editing step when creating a new cluster,
but the real use-case is for coping with situations where the
bootstrap server fails to start due to environmental issues;
for example, if it's necessary to force huge_pages to off.
Discussion: https://postgr.es/m/2844176.1674681919@sss.pgh.pa.us
This commit adds some documentation about two monitoring functions:
- pg_stat_get_xact_blocks_fetched()
- pg_stat_get_xact_blocks_hit()
The description of these functions has been removed in ddfc2d9, later
simplified by 5f2b089, assuming that all the functions whose
descriptions were removed are used in system views. Unfortunately, some
of them were are not used in any system views, so they lacked
documentation.
This gap exists in the docs for a long time, so backpatch all the way
down.
Reported-by: Michael Paquier
Author: Bertrand Drouvot
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/ZBeeH5UoNkTPrwHO@paquier.xyz
Backpatch-through: 11
These are set after a \! command or a backtick substitution.
SHELL_ERROR is just "true" for error (nonzero exit status) or "false"
for success, while SHELL_EXIT_CODE records the actual exit status
following standard shell/system(3) conventions.
Corey Huinker, reviewed by Maxim Orlov and myself
Discussion: https://postgr.es/m/CADkLM=cWao2x2f+UDw15W1JkVFr_bsxfstw=NGea7r9m4j-7rQ@mail.gmail.com
This makes it easier to specify values taken directly from WAL file
names.
The option parsing is arranged in the style of option_parse_int() (but
we need to parse unsigned int), to allow future refactoring in the
same manner.
Reviewed-by: Sébastien Lardière <sebastien@lardiere.net>
Discussion: https://www.postgresql.org/message-id/flat/8fef346e-2541-76c3-d768-6536ae052993@lardiere.net
@extschema:name@ extends the existing @extschema@ feature so that
we can also insert the schema name of some required extension,
thus making cross-extension references robust even if they are in
different schemas.
However, this has the same hazard as @extschema@: if the schema
name is embedded literally in an installed object, rather than being
looked up once during extension script execution, then it's no longer
safe to relocate the other extension to another schema. To deal with
that without restricting things unnecessarily, add a "no_relocate"
option to extension control files. This allows an extension to
specify that it cannot handle relocation of some of its required
extensions, even if in themselves those extensions are relocatable.
We detect "no_relocate" requests of dependent extensions during
ALTER EXTENSION SET SCHEMA.
Regina Obe, reviewed by Sandro Santilli and myself
Discussion: https://postgr.es/m/003001d8f4ae$402282c0$c0678840$@pcorp.us
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.
A new type TU_UpdateIndexes provides a signal to the executor to
determine which indexes to update - no indexes, all indexes, or only the
summarizing indexes.
This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.
This was originally committed as 5753d4ee32, but then got reverted by
e3fcca0d0d because of correctness issues.
Original patch by Josef Simanek, various fixes and improvements by Tomas
Vondra and me.
Authors: Matthias van de Meent, Josef Simanek, Tomas Vondra
Reviewed-by: Tomas Vondra, Alvaro Herrera
Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
Add versions of timestamptz + interval, timestamptz - interval, and
generate_series(timestamptz, ...) in which a timezone can be specified
explicitly instead of defaulting to the TimeZone GUC setting.
The new functions for the first two are named date_add and
date_subtract. This might seem too generic, but we could use
overloading to add additional variants if that seems useful.
Along the way, improve the docs' pretty inadequate explanation
of how timestamptz +- interval works.
Przemysław Sztoch and Gurjeet Singh; cosmetic changes and most of
the docs work by me
Discussion: https://postgr.es/m/01a84551-48dd-1359-bf7e-f6b0203a6bd0@sztoch.pl
Hash partitioning on an enum is problematic because the hash codes are
derived from the OIDs assigned to the enum values, which will almost
certainly be different after a dump-and-reload than they were before.
This means that some rows probably end up in different partitions than
before, causing restore to fail because of partition constraint
violations. (pg_upgrade dodges this problem by using hacks to force
the enum values to keep the same OIDs, but that's not possible nor
desirable for pg_dump.)
Users can work around that by specifying --load-via-partition-root,
but since that's a dump-time not restore-time decision, one might
find out the need for it far too late. Instead, teach pg_dump to
apply that option automatically when dealing with a partitioned
table that has hash-on-enum partitioning.
Also deal with a pre-existing issue for --load-via-partition-root
mode: in a parallel restore, we try to TRUNCATE target tables just
before loading them, in order to enable some backend optimizations.
This is bad when using --load-via-partition-root because (a) we're
likely to suffer deadlocks from restore jobs trying to restore rows
into other partitions than they came from, and (b) if we miss getting
a deadlock we might still lose data due to a TRUNCATE removing rows
from some already-completed restore job.
The fix for this is conceptually simple: just don't TRUNCATE if we're
dealing with a --load-via-partition-root case. The tricky bit is for
pg_restore to identify those cases. In dumps using COPY commands we
can inspect each COPY command to see if it targets the nominal target
table or some ancestor. However, in dumps using INSERT commands it's
pretty impractical to examine the INSERTs in advance. To provide a
solution for that going forward, modify pg_dump to mark TABLE DATA
items that are using --load-via-partition-root with a comment.
(This change also responds to a complaint from Robert Haas that
the dump output for --load-via-partition-root is pretty confusing.)
pg_restore checks for the special comment as well as checking the
COPY command if present. This will fail to identify the combination
of --load-via-partition-root and --inserts in pre-existing dump files,
but that should be a pretty rare case in the field. If it does
happen you will probably get a deadlock failure that you can work
around by not using parallel restore, which is the same as before
this bug fix.
Having done this, there seems no remaining reason for the alarmism
in the pg_dump man page about combining --load-via-partition-root
with parallel restore, so remove that warning.
Patch by me; thanks to Julien Rouhaud for review. Back-patch to
v11 where hash partitioning was introduced.
Discussion: https://postgr.es/m/1376149.1675268279@sss.pgh.pa.us
Support for SCM credential authentication has been removed in the
backend in 9.1, and libpq has kept some code to handle it for
compatibility.
Commit be4585b, that did the cleanup of the backend code, has done
so because the code was not really portable originally. And, as there
are likely little chances that this is used these days, this removes the
remaining code from libpq. An error will now be raised by libpq if
attempting to connect to a server that returns AUTH_REQ_SCM_CREDS,
instead.
References to SCM credential authentication are removed from the
protocol documentation. This removes some meson and configure checks.
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/ZBLH8a4otfqgd6Kn@paquier.xyz
Clarify that ATTACH/DETACH PARTITION can be used to perform partition
maintenance with less locking than straight CREATE TABLE/DROP TABLE.
This was already stated in some places, but not emphasized.
Back-patch to v14 where DETACH PARTITION CONCURRENTLY was added.
(We had lower lock levels for ATTACH PARTITION before that, but
this wording wouldn't apply.)
Justin Pryzby, reviewed by Robert Treat and Jakub Wartak;
a little further wordsmithing by me
Discussion: https://postgr.es/m/20220718143304.GC18011@telsasoft.com
This adds the ability to pretty-print XML documents ... according to
libxml's somewhat idiosyncratic notions of what's pretty, anyway.
One notable divergence from a strict reading of the spec is that
libxml is willing to collapse empty nodes "<node></node>" to just
"<node/>", whereas SQL and the underlying XML spec say that this
option should only result in whitespace tweaks. Nonetheless,
it seems close enough to justify using the SQL-standard syntax.
Jim Jones, reviewed by Peter Smith and myself
Discussion: https://postgr.es/m/2f5df461-dad8-6d7d-4568-08e10608a69b@uni-muenster.de
Using REPLICA IDENTITY FULL on the publisher can lead to a full table scan
per tuple change on the subscription when REPLICA IDENTITY or PK index is
not available. This makes REPLICA IDENTITY FULL impractical to use apart
from some small number of use cases.
This patch allows using indexes other than PRIMARY KEY or REPLICA
IDENTITY on the subscriber during apply of update/delete. The index that
can be used must be a btree index, not a partial index, and it must have
at least one column reference (i.e. cannot consist of only expressions).
We can uplift these restrictions in the future. There is no smart
mechanism to pick the index. If there is more than one index that
satisfies these requirements, we just pick the first one. We discussed
using some of the optimizer's low-level APIs for this but ruled it out
as that can be a maintenance burden in the long run.
This patch improves the performance in the vast majority of cases and the
improvement is proportional to the amount of data in the table. However,
there could be some regression in a small number of cases where the indexes
have a lot of duplicate and dead rows. It was discussed that those are
mostly impractical cases but we can provide a table or subscription level
option to disable this feature if required.
Author: Onder Kalaci, Amit Kapila
Reviewed-by: Peter Smith, Shi yu, Hou Zhijie, Vignesh C, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/CACawEhVLqmAAyPXdHEPv1ssU2c=dqOniiGz7G73HfyS7+nGV4w@mail.gmail.com
This patch adds new pg_dump switches
--table-and-children=pattern
--exclude-table-and-children=pattern
--exclude-table-data-and-children=pattern
which act the same as the existing --table, --exclude-table, and
--exclude-table-data switches, except that any partitions or
inheritance child tables of the table(s) matching the pattern
are also included or excluded.
Gilles Darold, reviewed by Stéphane Tachoires
Discussion: https://postgr.es/m/5aa393b5-5f67-8447-b83e-544516990ee2@migops.com
Use PostgreSQL consistently for referring to the productname rather
than Postgres. This also adds <productname> markup.
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: "Jonathan S. Katz" <jkatz@postgresql.org>
Discussion: https://postgr.es/m/9C019644-9EA4-4B79-A52C-5C47A5B6B2DF@yesql.se
This commit reworks a bit the set-returning functions of pg_walinspect,
making them more flexible regarding their end LSN:
- pg_get_wal_records_info()
- pg_get_wal_stats()
- pg_get_wal_block_info()
The end LSNs given to these functions is now handled so as a value
higher than the current LSN of the cluster (insert LSN for a primary, or
replay LSN for a standby) does not raise an error, giving more
flexibility to monitoring queries. Instead, the functions return
results up to the current LSN, as found at the beginning of each
function call.
As an effect of that, pg_get_wal_records_info_till_end_of_wal() and
pg_get_wal_stats_till_end_of_wal() are now removed from 1.1, as the
existing, equivalent functions are able to offer the same
possibilities.
Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACU0_q-o4DSweyaW9NO1KBx-QkN6G_OzYQvpjf3CZVASkg@mail.gmail.com
Expose the standard error functions as SQL-callable functions. These
are expected to be useful to people working with normal distributions,
and we use them here to test the distribution from random_normal().
Since these functions are defined in the POSIX and C99 standards, they
should in theory be available on all supported platforms. If that
turns out not to be the case, more work will be needed.
On all platforms tested so far, using extra_float_digits = -1 in the
regression tests is sufficient to allow for variations between
implementations. However, past experience has shown that there are
almost certainly going to be additional unexpected portability issues,
so these tests may well need further adjustments, based on the
buildfarm results.
Dean Rasheed, reviewed by Nathan Bossart and Thomas Munro.
Discussion: https://postgr.es/m/CAEZATCXv5fi7+Vu-POiyai+ucF95+YMcCMafxV+eZuN1B-=MkQ@mail.gmail.com
The new connection parameter require_auth allows a libpq client to
define a list of comma-separated acceptable authentication types for use
with the server. There is no negotiation: if the server does not
present one of the allowed authentication requests, the connection
attempt done by the client fails.
The following keywords can be defined in the list:
- password, for AUTH_REQ_PASSWORD.
- md5, for AUTH_REQ_MD5.
- gss, for AUTH_REQ_GSS[_CONT].
- sspi, for AUTH_REQ_SSPI and AUTH_REQ_GSS_CONT.
- scram-sha-256, for AUTH_REQ_SASL[_CONT|_FIN].
- creds, for AUTH_REQ_SCM_CREDS (perhaps this should be removed entirely
now).
- none, to control unauthenticated connections.
All the methods that can be defined in the list can be negated, like
"!password", in which case the server must NOT use the listed
authentication type. The special method "none" allows/disallows the use
of unauthenticated connections (but it does not govern transport-level
authentication via TLS or GSSAPI).
Internally, the patch logic is tied to check_expected_areq(), that was
used for channel_binding, ensuring that an incoming request is
compatible with conn->require_auth. It also introduces a new flag,
conn->client_finished_auth, which is set by various authentication
routines when the client side of the handshake is finished. This
signals to check_expected_areq() that an AUTH_REQ_OK from the server is
expected, and allows the client to complain if the server bypasses
authentication entirely, with for example the reception of a too-early
AUTH_REQ_OK message.
Regression tests are added in authentication TAP tests for all the
keywords supported (except "creds", because it is around only for
compatibility reasons). A new TAP script has been added for SSPI, as
there was no script dedicated to it yet. It relies on SSPI being the
default authentication method on Windows, as set by pg_regress.
Author: Jacob Champion
Reviewed-by: Peter Eisentraut, David G. Johnston, Michael Paquier
Discussion: https://postgr.es/m/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com
This allows for a string which if an input field matches causes the
column's default value to be inserted. The advantage of this is that
the default can be inserted in some rows and not others, for which
non-default data is available.
The file_fdw extension is also modified to take allow use of this
option.
Israel Barth Rubio
Discussion: https://postgr.es/m/CAO_rXXAcqesk6DsvioOZ5zmeEmpUN5ktZf-9=9yu+DTr0Xr8Uw@mail.gmail.com
The 'ssl' option is of type 'combo', but we add a choice 'auto' that
simulates the behavior of a feature option. This way, openssl is used
automatically by default if present, but we retain the ability to
potentially select another ssl library.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ad65ffd1-a9a7-fda1-59c6-f7dc763c3051%40enterprisedb.com
Previously, the default encoding was derived from the locale when
using libc; while the default was always UTF-8 when using ICU. That
would throw an error when the locale was not compatible with UTF-8.
This commit causes initdb to derive the default encoding from the
locale for both providers. If --no-locale is specified (or if the
locale is C or POSIX), the default encoding will be UTF-8 for ICU
(because ICU does not support SQL_ASCII) and SQL_ASCII for libc.
Per buildfarm failure on system "hoverfly" related to commit
27b62377b4.
Discussion: https://postgr.es/m/d191d5841347301a8f1238f609471ddd957fc47e.camel%40j-davis.com
This commit reworks pg_get_wal_fpi_info() to become aware of all the
block information that can be attached to a record rather than just its
full-page writes:
- Addition of the block id as assigned by XLogRegisterBuffer(),
XLogRegisterBlock() or XLogRegisterBufData().
- Addition of the block data, as bytea, or NULL if none. The length of
the block data can be guessed with length(), so there is no need to
store its length in a separate field.
- Addition of the full-page image length, as counted without a hole or
even compressed.
- Modification of the handling of the full-page image data. This is
still a bytea, but it could become NULL if none is assigned to a block.
- Addition of the full-page image flags, tracking if a page is stored
with a hole, if it needs to be applied and the type of compression
applied to it, as of all the BKPIMAGE_* values in xlogrecord.h.
The information of each block is returned as one single record, with the
record's ReadRecPtr included to be able to join the block information
with the existing pg_get_wal_records_info(). Note that it is perfectly
possible for a block to hold both data and full-page image.
Thanks also to Kyotaro Horiguchi and Matthias van de Meent for the
discussion.
This commit uses some of the work proposed by Melanie, though it has
been largely redesigned and rewritten by me. Bharath has helped in
refining a bit the whole.
Reported-by: Melanie Plageman
Author: Michael Paquier, Melanie Plageman, Bharath Rupireddy
Discussion: https://postgr.es/m/CAAKRu_bORebdZmcV8V4cZBzU8M_C6tDDdbiPhCZ6i-iuSXW9TA@mail.gmail.com
This exposes the ICU facility to add custom collation rules to a
standard collation.
New options are added to CREATE COLLATION, CREATE DATABASE, createdb,
and initdb to set the rules.
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Discussion: https://www.postgresql.org/message-id/flat/821c71a4-6ef0-d366-9acf-bb8e367f739f@enterprisedb.com
Add description of which one is the default between two complementary
options of --bypassrls and --replication in the help text and docs. In
correspondence let the command always include the tokens corresponding
to every options of that kind in the SQL command sent to server. Tests
are updated accordingly.
Also fix the checks of some trivalue vars which were using literal zero
for checking default value instead of the enum label TRI_DEFAULT. While
not a bug, since TRI_DEFAULT is defined as zero, fixing improves read-
ability improved readability (and avoid bugs if the enum is changed).
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220810.151243.1073197628358749087.horikyota.ntt@gmail.com
Disabling this option is useful to run VACUUM (with or without FULL) on
only the toast table of a relation, bypassing the main relation. This
option is enabled by default.
Running directly VACUUM on a toast table was already possible without
this feature, by using the non-deterministic name of a toast relation
(as of pg_toast.pg_toast_N, where N would be the OID of the parent
relation) in the VACUUM command, and it required a scan of pg_class to
know the name of the toast table. So this feature is basically a
shortcut to be able to run VACUUM or VACUUM FULL on a toast relation,
using only the name of the parent relation.
A new switch called --no-process-main is added to vacuumdb, to work as
an equivalent of PROCESS_MAIN.
Regression tests are added to cover VACUUM and VACUUM FULL, looking at
pg_stat_all_tables.vacuum_count to see how many vacuums have run on
each table, main or toast.
Author: Nathan Bossart
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/20221230000028.GA435655@nathanxps13
Add support for non-decimal integer literals and underscores in
numeric literals to SQL JSON path language. This follows the rules of
ECMAScript, as referred to by the SQL standard.
Internally, all the numeric literal parsing of jsonpath goes through
numeric_in, which already supports all this, so this patch is just a
bit of lexer work and some tests and documentation.
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b11b25bb-6ec1-d42f-cedd-311eae59e1fb@enterprisedb.com
Beginning in v15, if you apply ALTER TABLE ENABLE/DISABLE TRIGGER to
a partitioned table, it also affects the partitions' cloned versions
of the affected trigger(s). The initial implementation of this
located the clones by name, but that fails on foreign-key triggers
which have names incorporating their own OIDs. We can fix that, and
also make the behavior more bulletproof in the face of user-initiated
trigger renames, by identifying the cloned triggers by tgparentid.
Following the lead of earlier commits in this area, I took care not
to break ABI in the v15 branch, even though I rather doubt there
are any external callers of EnableDisableTrigger.
While here, update the documentation, which was not touched when
the semantics were changed.
Per bug #17817 from Alan Hodgson. Back-patch to v15; older versions
do not have this behavior.
Discussion: https://postgr.es/m/17817-31dfb7c2100d9f3d@postgresql.org
Our previous habit of showing the full function body is really
pretty unfriendly for tabular viewing of functions, and now that
we have \sf and \ef commands there seems no good reason why \df+
has to do it. It still seems to make sense to show prosrc for
internal and C-language functions, since in those cases prosrc
is just the C function name; but then let's rename the column to
"Internal name" which is a more accurate descriptor.
Isaac Morland
Discussion: https://postgr.es/m/CAMsGm5eqKc6J1=Lwn=ZONG=6ZDYWRQ4cgZQLqMuZGB1aVt_JBg@mail.gmail.com
The current implementation of query normalization in pg_stat_statements
is optimistic. If an entry is deallocated between the post-analyze hook
and the planner and/or execution hook, it can be possible to find query
strings with literal constant values (like "SELECT 1, 2") rather than
their normalized flavor (like "SELECT $1, $2").
This commit adds in the documentation a paragraph about this limitation,
and that this risk can be reduced by increasing pg_stat_statements.max,
particularly if pg_stat_statements_info reports a high number of
deallocations.
Author: Sami Imseih
Discussion: https://postgr.es/m/9CFF3512-355B-4676-8CCC-6CF622F4DC1A@amazon.com
Since 3db72eb, the calculation of the query ID hash for utilities is not
done based on the textual query strings, but on their internal Query
representation, meaning that there can be an overlap when they use
literal constants. The documentation of pg_stat_statements was missing
a refresh about that.
Extracted from a larger patch by me.
Discussion: https://postgr.es/m/Y+MRdEq9W9XVa2AB@paquier.xyz
pg_input_error_info() is now a SQL function able to return a row with
more than just the error message generated for incorrect data type
inputs when these are able to handle soft failures, returning more
contents of ErrorData, as of:
- The error message (same as before).
- The error detail, if set.
- The error hint, if set.
- SQL error code.
All the regression tests that relied on pg_input_error_message() are
updated to reflect the effects of the rename.
Per discussion with Tom Lane and Andrew Dunstan.
Author: Nathan Bossart
Discussion: https://postgr.es/m/139a68e1-bd1f-a9a7-b5fe-0be9845c6311@dunslane.net
Expand pg_dump's compression streaming and file APIs to support the lz4
algorithm. The newly added compress_lz4.{c,h} files cover all the
functionality of the aforementioned APIs. Minor changes were necessary
in various pg_backup_* files, where code for the 'lz4' file suffix has
been added, as well as pg_dump's compression option parsing.
Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Rachel Heaton, Justin Pryzby, Shi Yu, Tomas Vondra
Discussion: https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
SQL:2023 defines an ANY_VALUE aggregate whose purpose is to emit an
implementation-dependent (i.e. non-deterministic) value from the
aggregated rows.
Author: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5cff866c-10a8-d2df-32cb-e9072e6b04a2@postgresfriends.org
The -Dcassert and -Db_coverage that can be specified to a meson command
require values after an equal sign but the documentation did not
properly reflect that. All the other options specify the argument
values they expect, so close the gap.
Author: Jelte Fennema
Discussion: https://postgr.es/m/CAGECzQRXd1z+AoQ4tC5tqPk1_NQJohf6xwdEL=z9KgxHau2maQ@mail.gmail.com
A new callback named startup_cb, called shortly after a module is
loaded, is added. This makes possible the initialization of any
additional state data required by a module. This initial state data can
be saved in a ArchiveModuleState, that is now passed down to all the
callbacks that can be defined in a module. With this design, it is
possible to have a per-module state, aimed at opening the door to the
support of more than one archive module.
The initialization of the callbacks is changed so as
_PG_archive_module_init() does not anymore give in input a
ArchiveModuleCallbacks that a module has to fill in with callback
definitions. Instead, a module now needs to return a const
ArchiveModuleCallbacks.
All the structure and callback definitions of archive modules are moved
into their own header, named archive_module.h, from pgarch.h.
Command-based archiving follows the same line, with a new set of files
named shell_archive.{c,h}.
There are a few more items that are under discussion to improve the
design of archive modules, like the fact that basic_archive calls
sigsetjmp() by itself to define its own error handling flow. These will
be adjusted later, the changes done here cover already a good portion
of what has been discussed.
Any modules created for v15 will need to be adjusted to this new
design.
Author: Nathan Bossart
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20230130194810.6fztfgbn32e7qarj@awork3.anarazel.de
d9d7fe68d3 made use of an existing wait event when sending data from the
apply worker, but we should have invented a new wait event since this is a
new place to wait.
This patch corrects the mistake by using a new wait event
"LogicalApplySendData".
Author: Hou Zhijie
Reviewed-by: Peter Smith
Discussion: https://postgr.es/m/CA+TgmobWzbr9H3yN3dLVckviEZKemPwd+XyCFKEgyZQZhgP66Q@mail.gmail.com
force_parallel_mode is meant to be used to allow us to exercise the
parallel query infrastructure to ensure that it's working as we expect.
It seems some users think this GUC is for forcing the query planner into
picking a parallel plan regardless of the costs. A quick look at the
documentation would have made them realize that they were wrong, but the
GUC is likely too conveniently named which, evidently, seems to often
result in users expecting that it forces the planner into usefully
parallelizing queries.
Here we rename the GUC to something which casual users are less likely to
mistakenly think is what they need to make their query run more quickly.
For now, the old name can still be used. We'll revisit if the old name
mapping can be removed once the buildfarm configs are all updated.
Reviewed-by: John Naylor
Discussion: https://postgr.es/m/CAApHDvrsOi92_uA7PEaHZMH-S4Xv+MGhQWA+GrP8b1kjpS1HjQ@mail.gmail.com
If the locale provider is not specified, it defaults to be the same as
the template from which it was created. Previously, the documentation
said the default was libc.
Also adjust wording of CREATE DATABASE and CREATE COLLATION docs to be
definite that there are exactly two possible collation providers.
Discussion: https://postgr.es/m/6befdaada61c046b67f3b269f7fa6f069a35803e.camel%40j-davis.com
Reviewed-by: Nathan Bossart
Builds on 28e626bde0 and f30d62c2fc. See the former for motivation.
Rows of the view show IO operations for a particular backend type, IO target
object, IO context combination (e.g. a client backend's operations on
permanent relations in shared buffers) and each column in the view is the
total number of IO Operations done (e.g. writes). So a cell in the view would
be, for example, the number of blocks of relation data written from shared
buffers by client backends since the last stats reset.
In anticipation of tracking WAL IO and non-block-oriented IO (such as
temporary file IO), the "op_bytes" column specifies the unit of the "reads",
"writes", and "extends" columns for a given row.
Rows for combinations of IO operation, backend type, target object and context
that never occur, are ommitted entirely. For example, checkpointer will never
operate on temporary relations.
Similarly, if an IO operation never occurs for such a combination, the IO
operation's cell will be null, to distinguish from 0 observed IO
operations. For example, bgwriter should not perform reads.
Note that some of the cells in the view are redundant with fields in
pg_stat_bgwriter (e.g. buffers_backend). For now, these have been kept for
backwards compatibility.
Bumps catversion.
Author: Melanie Plageman <melanieplageman@gmail.com>
Author: Samay Sharma <smilingsamay@gmail.com>
Reviewed-by: Maciek Sakrejda <m.sakrejda@gmail.com>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
This commit adds the infrastructure for more detailed IO statistics. The calls
to actually count IOs, a system view to access the new statistics,
documentation and tests will be added in subsequent commits, to make review
easier.
While we already had some IO statistics, e.g. in pg_stat_bgwriter and
pg_stat_database, they did not provide sufficient detail to understand what
the main sources of IO are, or whether configuration changes could avoid
IO. E.g., pg_stat_bgwriter.buffers_backend does contain the number of buffers
written out by a backend, but as that includes extending relations (always
done by backends) and writes triggered by the use of buffer access strategies,
it cannot easily be used to tune background writer or checkpointer. Similarly,
pg_stat_database.blks_read cannot easily be used to tune shared_buffers /
compute a cache hit ratio, as the use of buffer access strategies will often
prevent a large fraction of the read blocks to end up in shared_buffers.
The new IO statistics count IO operations (evict, extend, fsync, read, reuse,
and write), and are aggregated for each combination of backend type (backend,
autovacuum worker, bgwriter, etc), target object of the IO (relations, temp
relations) and context of the IO (normal, vacuum, bulkread, bulkwrite).
What is tracked in this series of patches, is sufficient to perform the
aforementioned analyses. Further details, e.g. tracking the number of buffer
hits, would make that even easier, but was left out for now, to keep the scope
of the already large patchset manageable.
Bumps PGSTAT_FILE_FORMAT_ID.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de
It appears no longer possible to build the SGML docs without a local
installation of the DocBook DTD, because sourceforge.net now only
permits HTTPS access, and no common version of xsltproc supports that.
Hence, remove the bits of our documentation suggesting that that's
possible or useful.
In fact, we might as well add the --nonet option to the build recipes
automatically, for a bit of extra security.
Also fix our documentation-tool-installation recipes for macOS to
ensure that xmllint and xsltproc are pulled in from MacPorts or
Homebrew. The previous recipes assumed you could use the
Apple-supplied versions of these tools; which still works, except that
you'd need to set an environment variable to ensure that they would
find DTD files provided by those package managers. Simpler and easier
to just recommend pulling in the additional packages.
In HEAD, also document how to build docs using Meson, and adjust
"ninja docs" to just build the HTML docs, for consistency with the
default behavior of doc/src/sgml/Makefile.
In a fit of neatnik-ism, I also made the ordering of the package
lists match the order in which the tools are described at the head
of the appendix.
Aleksander Alekseev, Peter Eisentraut, Tom Lane
Discussion: https://postgr.es/m/CAJ7c6TO8Aro2nxg=EQsVGiSDe-TstP4EsSvDHd7DSRsP40PgGA@mail.gmail.com
The splitting into parts actually started earlier than the text had
claimed, but that is ancient history anyway by now and does not need
to be mentioned. Update that and tweak the text a bit.
This adds a new option to pg_verifybackup called -P/--progress, showing
every second some information about the progress of the checksum
verification based on the data of a backup manifest.
Similarly to what is done for pg_rewind and pg_basebackup, the
information printed in the progress report consists of the current
amount of data computed and the total amount of data that will be
computed. Note that files found with an incorrect size do not have
their checksum verified, hence their size is not appended to the total
amount of data estimated during the first scan of the manifest data
(such incorrect sizes could be overly high, for one, falsifying the
progress report).
Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoC5+JOgMd4o3z_oxw0f8JDSsCYY7zSbhe-O9x7f33rw_A@mail.gmail.com
This allows underscores to be used in integer and numeric literals,
and their corresponding type input functions, for visual grouping.
For example:
1_500_000_000
3.14159_26535_89793
0xffff_ffff
0b_1001_0001
A single underscore is allowed between any 2 digits, or immediately
after the base prefix indicator of non-decimal integers, per SQL:202x
draft.
Peter Eisentraut and Dean Rasheed
Discussion: https://postgr.es/m/84aae844-dc55-a4be-86d9-4f0fa405cc97%40enterprisedb.com
Breaking <phrase> over two lines is not handled by psql's
create_help.pl. (It creates faulty \help output.)
Undo the formatting change introduced by
9bdad1b515 to fix this for now.
An early release of AF_UNIX in Windows apparently supported Linux-style
"abstract" Unix sockets, but they do not seem to work in current Windows
versions and there is no mention of any of this in the Winsock
documentation. Remove the mention of Windows from the documentation.
Back-patch to 14, where commit c9f0624b landed.
Discussion: https://postgr.es/m/CA%2BhUKGKrYbSZhrk4NGfoQGT_3LQS5pC5KNE1g0tvE_pPBZ7uew%40mail.gmail.com
Extend the existing developer option 'logical_replication_mode' to help
test the parallel apply of large transactions on the subscriber.
When set to 'buffered', the leader sends changes to parallel apply workers
via a shared memory queue. When set to 'immediate', the leader serializes
all changes to files and notifies the parallel apply workers to read and
apply them at the end of the transaction.
This helps in adding tests to cover the serialization code path in
parallel streaming mode.
Author: Hou Zhijie
Reviewed-by: Peter Smith, Kuroda Hayato, Sawada Masahiko, Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
This was only mentioned in the description of the text/label, which
are marked as being in quotes in the synopsis, which can cause
confusion (as witnessed on IRC).
Also separate the literal and NULL cases in the parameter list, per
suggestion from Tom Lane.
Also add an example of dropping a security label.
Dagfinn Ilmari Mannsåker, with some tweaks by me
Discussion: https://postgr.es/m/87sffqk4zp.fsf@wibble.ilmari.org
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
defGetBoolean() allows the "value" part of "option = value"
syntax to be omitted, in which case it's taken as "true".
This is acknowledged in our syntax summaries for relevant commands,
but we don't seem to have documented the actual behavior anywhere.
Do so for CREATE/ALTER PUBLICATION/SUBSCRIPTION. Use generic
boilerplate text for this, with the idea that we can copy-and-paste
it into other relevant reference pages, whenever someone gets
around to that.
Peter Smith, edited a bit by me
Discussion: https://postgr.es/m/CAHut+PvwjZfdGt2R8HTXgSZft=jZKymrS8KUg31pS7zqaaWKKw@mail.gmail.com
Rename the developer option 'logical_decoding_mode' to the more flexible
name 'logical_replication_mode' because doing so will make it easier to
extend this option in the future to help test other areas of logical
replication.
Currently, it is used on the publisher side to allow streaming or
serializing each change in logical decoding. In the upcoming patch, we are
planning to use it on the subscriber. On the subscriber, it will allow
serializing the changes to file and notifies the parallel apply workers to
read and apply them at the end of the transaction.
We discussed exposing this parameter as a subscription option but
it did not seem advisable since it is primarily used for testing/debugging
and there is no other such parameter. We also discussed having separate
GUCs for publisher and subscriber but for current testing/debugging
requirements, one GUC is sufficient.
Author: Hou Zhijie
Reviewed-by: Peter Smith, Kuroda Hayato, Sawada Masahiko, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoAy2c=Mx=FTCs+EwUsf2kQL5MmU3N18X84k0EmCXntK4g@mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
VACUUM VERBOSE/autovacuuming logging have reported on the number of
pages frozen by VACUUM since commit d977ffd9 added that capability.
This information is directly related to relfrozenxid advancement, so
update an older tip from the documentation about how relfrozenxid is
reported on by the same instrumentation code. Now the tip directly
mentions newly frozen pages, too.
Eager freezing strategy avoids large build-ups of all-visible pages. It
makes VACUUM trigger page-level freezing whenever doing so will enable
the page to become all-frozen in the visibility map. This is useful for
tables that experience continual growth, particularly strict append-only
tables such as pgbench's history table. Eager freezing significantly
improves performance stability by spreading out the cost of freezing
over time, rather than doing most freezing during aggressive VACUUMs.
It complements the insert autovacuum mechanism added by commit b07642db.
VACUUM determines its freezing strategy based on the value of the new
vacuum_freeze_strategy_threshold GUC (or reloption) with logged tables.
Tables that exceed the size threshold use the eager freezing strategy.
Unlogged tables and temp tables always use eager freezing strategy,
since the added cost is negligible there. Non-permanent relations won't
incur any extra overhead in WAL written (for the obvious reason), nor in
pages dirtied (since any extra freezing will only take place on pages
whose PD_ALL_VISIBLE bit needed to be set either way).
VACUUM uses lazy freezing strategy for logged tables that fall under the
GUC size threshold. Page-level freezing triggers based on the criteria
established in commit 1de58df4, which added basic page-level freezing.
Eager freezing is strictly more aggressive than lazy freezing. Settings
like vacuum_freeze_min_age still get applied in just the same way in
every VACUUM, independent of the strategy in use. The only mechanical
difference between eager and lazy freezing strategies is that only the
former applies its own additional criteria to trigger freezing pages.
Note that even lazy freezing strategy will trigger freezing whenever a
page happens to have required that an FPI be written during pruning,
provided that the page will thereby become all-frozen in the visibility
map afterwards (due to the FPI optimization from commit 1de58df4).
The vacuum_freeze_strategy_threshold default setting is 4GB. This is a
relatively low setting that prioritizes performance stability. It will
be reviewed at the end of the Postgres 16 beta period.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com
network_ops is an opclass family of SpGiST, and the opclass able to
work on the inet type is named inet_ops.
Oversight in 7a1cd52, that reworked the design of the table listing all
the operators available.
Reported-by: Laurence Parry
Reviewed-by: Tom Lane, David G. Johnston
Discussion: https://postgr.es/m/167458110639.2667300.14741268666497110766@wrigleys.postgresql.org
Backpatch-through: 14
This rename is in preparation for the introduction of recovery modules,
where basic_wal_module will be used as a base template for the set of
callbacks introduced. The former name did not really reflect all that.
Author: Nathan Bossart
Discussion: https://postgr.es/m/20221227192449.GA3672473@nathanxps13
Previously, a CREATEROLE user without SUPERUSER could not alter
REPLICATION users in any way, and could not set the BYPASSRLS
attribute. However, they could manipulate the CREATEDB property
even if they themselves did not possess it.
With this change, a CREATEROLE user without SUPERUSER can set or
clear the REPLICATION, BYPASSRLS, or CREATEDB property on a new
role or a role that they have rights to manage if and only if
that property is set for their own role.
This implements the standard idea that you can't give permissions
you don't have (but you can give the ones you do have). We might
in the future want to provide more powerful ways to constrain
what a CREATEROLE user can do - for example, to limit whether
CONNECTION LIMIT can be set or the values to which it can be set -
but that is left as future work.
Patch by me, reviewed by Nathan Bossart, Tushar Ahuja, and Neha
Sharma.
Discussion: http://postgr.es/m/CA+TgmobX=LHg_J5aT=0pi9gJy=JdtrUVGAu0zhr-i5v5nNbJDg@mail.gmail.com
This function is able to extract the full page images from a range of
records, specified as of input arguments start_lsn and end_lsn. Like
the other functions of this module, an error is returned if using LSNs
that do not reflect real system values. All the FPIs stored in a single
record are extracted.
The module's version is bumped to 1.1.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CALj2ACVCcvzd7WiWvD=6_7NBvVB_r6G0EGSxL4F8vosAi6Se4g@mail.gmail.com
This adds combine, serial and deserial functions for the array_agg() and
string_agg() aggregate functions, thus allowing these aggregates to
partake in partial aggregations. This allows both parallel aggregation to
take place when these aggregates are present and also allows additional
partition-wise aggregation plan shapes to include plans that require
additional aggregation once the partially aggregated results from the
partitions have been combined.
Author: David Rowley
Reviewed-by: Andres Freund, Tomas Vondra, Stephen Frost, Tom Lane
Discussion: https://postgr.es/m/CAKJS1f9sx_6GTcvd6TMuZnNtCh0VhBzhX6FZqw17TgVFH-ga_A@mail.gmail.com
Enforce wal_retrieve_retry_interval on a per-subscription basis,
rather than globally, and arrange to skip that delay in case of
an intentional worker exit. This probably makes little difference
in the field, where apply workers wouldn't be restarted often;
but it has a significant impact on the runtime of our logical
replication regression tests (even though those tests use
artificially-small wal_retrieve_retry_interval settings already).
Nathan Bossart, with mostly-cosmetic editorialization by me
Discussion: https://postgr.es/m/20221122004119.GA132961@nathanxps13
This provides a way to reserve connection slots for non-superusers.
The slots reserved via the new GUC are available only to users who
have the new predefined role pg_use_reserved_connections.
superuser_reserved_connections remains as a final reserve in case
reserved_connections has been exhausted.
Patch by Nathan Bossart. Reviewed by Tushar Ahuja and by me.
Discussion: http://postgr.es/m/20230119194601.GA4105788@nathanxps13
Commit ea92368cd1 made max_wal_senders
a separate pool of backends from max_connections, but the documentation
and error message for superuser_reserved_connections weren't updated
at the time, and as a result are somewhat misleading. Update.
This is arguably a back-patchable bug fix, but because it seems quite
minor, no back-patch.
Patch by Nathan Bossart. Reviewed by Tushar Ahuja and by me.
Discussion: http://postgr.es/m/20230119194601.GA4105788@nathanxps13
The original titles only had the module name, which is not very useful
when scanning the list. By adding a very brief description to each
title, the table of contents becomes friendlier.
Also amend the introduction in the "additional modules" appendix, using
the word "Extension" more extensively. Nowadays, almost all contrib
modules are extensions, so this is also helpful.
Author: Karl O. Pinc <kop@karlpinc.com>
Reviewed-by: Brar Piening <brar@gmx.de>
Discussion: https://postgr.es/m/20230102180015.372995a9@slate.karlpinc.com
While pg_hba.conf has support for non-literal username matches, and
this commit extends the capabilities that are supported for the
PostgreSQL user listed in an ident entry part of pg_ident.conf, with
support for:
1. The "all" keyword, where all the requested users are allowed.
2. Membership checks using the + prefix.
3. Using a regex to match against multiple roles.
1. is a feature that has been requested by Jelte Fennema, 2. something
that has been mentioned independently by Andrew Dunstan, and 3. is
something I came up with while discussing how to extend the first one,
whose implementation is facilitated by 8fea868.
This allows matching certain system users against many different
postgres users with a single line in pg_ident.conf. Without this, one
would need one line for each of the postgres users that a system user
can log in as, which can be cumbersome to maintain.
Tests are added to the TAP test of peer authentication to provide
coverage for all that.
Note that this introduces a set of backward-incompatible changes to be
able to detect the new patterns, for the following cases:
- A role named "all".
- A role prefixed with '+' characters, which is something that would not
have worked in HBA entries anyway.
- A role prefixed by a slash character, similarly to 8fea868.
Any of these can be still be handled by using quotes in the Postgres
role defined in an ident entry.
A huge advantage of this change is that the code applies the same checks
for the Postgres roles in HBA and ident entries, via the common routine
check_role().
**This compatibility change should be mentioned in the release notes.**
Author: Jelte Fennema
Discussion: https://postgr.es/m/DBBPR83MB0507FEC2E8965012990A80D0F7FC9@DBBPR83MB0507.EURPRD83.prod.outlook.com
Add leader_pid to pg_stat_subscription. leader_pid is the process ID of
the leader apply worker if this process is a parallel apply worker. If
this field is NULL, it indicates that the process is a leader apply
worker or a synchronization worker. The new column makes it easier to
distinguish parallel apply workers from other kinds of workers and helps
to identify the leader for the parallel workers corresponding to a
particular subscription.
Additionally, update the leader_pid column in pg_stat_activity as well to
display the PID of the leader apply worker for parallel apply workers.
Author: Hou Zhijie
Reviewed-by: Peter Smith, Sawada Masahiko, Amit Kapila, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
Avoid use of "_" in SGML IDs. Awhile back that was actually
disallowed by the toolchain, as a consequence of which our convention
has been to use "-" instead. Fix a couple of stragglers that are
particularly inconsistent with that convention and with related IDs.
This is just neatnik-ism, so no need for back-patch.
Discussion: https://postgr.es/m/769446.1673478332@sss.pgh.pa.us
Entries of pg-user in pg_ident.conf that are quoted and include '\1'
allow a replacement from a subexpression in a system user regexp. This
commit adds a test to track this behavior and a note in the
documentation, as it could be affected by the use of an AuthToken for
the pg-user in the IdentLines parsed.
This subject has come up in the discussion aimed at extending the
support of pg-user in ident entries for more patterns.
Author: Jelte Fennema
Discussion: https://postgr.es/m/CAGECzQRNow4MwkBjgPxywXdJU_K3a9+Pm78JB7De3yQwwkTDew@mail.gmail.com
Add a cross-reference from the part of the page that introdues SECURITY
INVOKER and SECURITY DEFINER to the part of the page that talks about
writing SECURITY DEFINER functions safely, so that users are less likely
to miss it.
Remove discussion of the pre-8.3 behavior on the theory that it's
probably not very relevant any more, that release having gone out of
support nearly a decade ago.
Add a mention of the new createrole_self_grant GUC, which in
certain cases might need to be set to a safe value to avoid
unexpected consequences.
Possibly this section needs major surgery rather than just these
small tweaks, but hopefully this is at least a small step
forward.
Discussion: http://postgr.es/m/CA+Tgmoauqd1cHQjsNEoxL5O-kEO4iC9dAPyCudSvmNqPJGmy9g@mail.gmail.com
Update the reference pages for various ALTER commands that
mentioned that you must be a member of role that will be the
new owner to instead say that you must be able to SET ROLE
to the new owner. Update ddl.sgml's generate statement on this
topic along similar lines.
Likewise, update CREATE SCHEMA and CREATE DATABASE, which
have options to specify who will own the new objects, to say
that you must be able to SET ROLE to the role that will own
them.
Finally, update the documentation for the GRANT statement
itself with some general principles about how the SET option
works and how it can be used.
Patch by me, reviewed (but not fully endorsed) by Noah Misch.
Discussion: http://postgr.es/m/CA+TgmoZk6VB3DQ83+DO5P_HP=M9PQAh1yj-KgeV30uKefVaWDg@mail.gmail.com
Commit 60684dd8 left loose ends when it came to maintaining toast
tables or partitions.
For toast tables, simply skip the privilege check if the toast table
is an indirect target of the maintenance command, because the main
table privileges have already been checked.
For partitions, allow the maintenance command if the user has the
MAINTAIN privilege on the partition or any parent.
Also make CLUSTER emit "skipping" messages when the user doesn't have
privileges, similar to VACUUM.
Author: Nathan Bossart
Reported-by: Pavel Luzanov
Reviewed-by: Pavel Luzanov, Ted Yu
Discussion: https://postgr.es/m/20230113231339.GA2422750@nathanxps13
The prior behavior was confusing and hard to document. For instance,
if you had UPDATE privileges, you could lock a table in any lock mode
except ACCESS SHARE mode.
Now, if granted a privilege to lock at a given mode, one also has
privileges to lock at a less-conflicting mode. MAINTAIN, UPDATE,
DELETE, and TRUNCATE privileges allow any lock mode. INSERT privileges
allow ROW EXCLUSIVE (or below). SELECT privileges allow ACCESS SHARE.
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/9550c76535404a83156252b25a11babb4792ea1e.camel%40j-davis.com
As introduced in 2258e76, the docs were hard to parse:
- The examples used listed a lot of long records, bloating the output.
These are switched to show less records with the expanded format,
similarly to pageinspect.
- The function descriptions listed all the OUT parameters, producing
long lines. This is updated so as only the input parameters are
documented, clarifying the whole.
- Remove one example on pg_get_wal_stats() when per_record is set to
true, which is not really necessary once we know the output produced,
and the behavior of the parameter is documented.
While on it, fix a few grammar mistakes and simplify a couple of
sentences.
Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVGcUpziGgQrcT-1G3dHWQQfWjYBu1YQ2ypv9y86dgogg@mail.gmail.com
Backpatch-through: 15
In both partitioning and traditional inheritance, require child
columns to be GENERATED if and only if their parent(s) are.
Formerly we allowed the case of an inherited column being
GENERATED when its parent isn't, but that results in inconsistent
behavior: the column can be directly updated through an UPDATE
on the parent table, leading to it containing a user-supplied
value that might not match the generation expression. This also
fixes an oversight that we enforced partition-key-columns-can't-
be-GENERATED against parent tables, but not against child tables
that were dynamically attached to them.
Also, remove the restriction that the child's generation expression
be equivalent to the parent's. In the wake of commit 3f7836ff6,
there doesn't seem to be any reason that we need that restriction,
since generation expressions are always computed per-table anyway.
By removing this, we can also allow a child to merge multiple
inheritance parents with inconsistent generation expressions, by
overriding them with its own expression, much as we've long allowed
for DEFAULT expressions.
Since we're rejecting a case that we used to accept, this doesn't
seem like a back-patchable change. Given the lack of field
complaints about the inconsistent behavior, it's likely that no
one is doing this anyway, but we won't change it in minor releases.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/2793383.1672944799@sss.pgh.pa.us
Can be set to the empty string, or to either or both of "set" or
"inherit". If set to a non-empty value, a non-superuser who creates
a role (necessarily by relying up the CREATEROLE privilege) will
grant that role back to themselves with the specified options.
This isn't a security feature, because the grant that this feature
triggers can also be performed explicitly. Instead, it's a user experience
feature. A superuser would necessarily inherit the privileges of any
created role and be able to access all such roles via SET ROLE;
with this patch, you can configure createrole_self_grant = 'set, inherit'
to provide a similar experience for a user who has CREATEROLE but not
SUPERUSER.
Discussion: https://postgr.es/m/CA+TgmobN59ct+Emmz6ig1Nua2Q-_o=r6DSD98KfU53kctq_kQw@mail.gmail.com
Previously, CREATEROLE users were permitted to make nearly arbitrary
changes to roles that they didn't create, with certain exceptions,
particularly superuser roles. Instead, allow CREATEROLE users to make such
changes to roles for which they possess ADMIN OPTION, and to
grant membership only in roles for which they possess ADMIN OPTION.
When a CREATEROLE user who is not a superuser creates a role, grant
ADMIN OPTION on the newly-created role to the creator, so that they
can administer roles they create or for which they have been given
privileges.
With these changes, CREATEROLE users still have very significant
powers that unprivileged users do not receive: they can alter, rename,
drop, comment on, change the password for, and change security labels
on roles. However, they can now do these things only for roles for
which they possess appropriate privileges, rather than all
non-superuser roles; moreover, they cannot grant a role such as
pg_execute_server_program unless they themselves possess it.
Patch by me, reviewed by Mark Dilger.
Discussion: https://postgr.es/m/CA+TgmobN59ct+Emmz6ig1Nua2Q-_o=r6DSD98KfU53kctq_kQw@mail.gmail.com
Currently, for large transactions, the publisher sends the data in
multiple streams (changes divided into chunks depending upon
logical_decoding_work_mem), and then on the subscriber-side, the apply
worker writes the changes into temporary files and once it receives the
commit, it reads from those files and applies the entire transaction. To
improve the performance of such transactions, we can instead allow them to
be applied via parallel workers.
In this approach, we assign a new parallel apply worker (if available) as
soon as the xact's first stream is received and the leader apply worker
will send changes to this new worker via shared memory. The parallel apply
worker will directly apply the change instead of writing it to temporary
files. However, if the leader apply worker times out while attempting to
send a message to the parallel apply worker, it will switch to
"partial serialize" mode - in this mode, the leader serializes all
remaining changes to a file and notifies the parallel apply workers to
read and apply them at the end of the transaction. We use a non-blocking
way to send the messages from the leader apply worker to the parallel
apply to avoid deadlocks. We keep this parallel apply assigned till the
transaction commit is received and also wait for the worker to finish at
commit. This preserves commit ordering and avoid writing to and reading
from files in most cases. We still need to spill if there is no worker
available.
This patch also extends the SUBSCRIPTION 'streaming' parameter so that the
user can control whether to apply the streaming transaction in a parallel
apply worker or spill the change to disk. The user can set the streaming
parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means
the streaming will be applied via a parallel apply worker, if available.
The parameter value 'on' means the streaming transaction will be spilled
to disk. The default value is 'off' (same as current behaviour).
In addition, the patch extends the logical replication STREAM_ABORT
message so that abort_lsn and abort_time can also be sent which can be
used to update the replication origin in parallel apply worker when the
streaming transaction is aborted. Because this message extension is needed
to support parallel streaming, parallel streaming is not supported for
publications on servers < PG16.
Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko
Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
This allows an optional "S" modifier to be added to \dp and \z, to
have them include system objects in the list.
Note that this also changes the behaviour of a bare \dp or \z without
the "S" modifier to include temp objects in the list, and exclude
information_schema objects, making them consistent with other psql
meta-commands.
Nathan Bossart, reviewed by Maxim Orlov.
Discussion: https://postgr.es/m/20221206193606.GB3078082@nathanxps13
VACUUM normally ends by running vac_update_datfrozenxid(), which
requires a scan of pg_class. Therefore, if one attempts to vacuum a
database one table at a time --- as vacuumdb has done since v12 ---
we will spend O(N^2) time in vac_update_datfrozenxid(). That causes
serious performance problems in databases with tens of thousands of
tables, and indeed the effect is measurable with only a few hundred.
To add insult to injury, only one process can run
vac_update_datfrozenxid at the same time per DB, so this behavior
largely defeats vacuumdb's -j option.
Hence, invent options SKIP_DATABASE_STATS and ONLY_DATABASE_STATS
to allow applications to postpone vac_update_datfrozenxid() until the
end of a series of VACUUM requests, and teach vacuumdb to use them.
Per bug #17717 from Gunnar L. Sadly, this answer doesn't seem
like something we'd consider back-patching, so the performance
problem will remain in v12-v15.
Tom Lane and Nathan Bossart
Discussion: https://postgr.es/m/17717-6c50eb1c7d23a886@postgresql.org
In user-manag.sgml, document precisely what privileges are conveyed
by CREATEROLE. Make particular note of the fact that it allows
changing passwords and granting access to high-privilege roles.
Also remove the suggestion of using a user with CREATEROLE and
CREATEDB instead of a superuser, as there is no real security
advantage to this approach.
Elsewhere in the documentation, adjust text that suggests that
<literal>CREATEROLE</literal> only allows for role creation, and
refer to the documentation in user-manag.sgml as appropriate.
Patch by me, reviewed by Álvaro Herrera
Discussion: http://postgr.es/m/CA+TgmoZBsPL8nPhvYecx7iGo5qpDRqa9k_AcaW1SbOjugAY1Ag@mail.gmail.com
While on it, newlines are removed from the end of two elog() strings.
The others are simple grammar mistakes. One comment in pg_upgrade
referred incorrectly to sequences since a7e5457.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com
Backpatch-through: 11
This is like the existing bt_page_stats() function, but it can
report on a range of pages rather than just one at a time.
I don't have a huge amount of faith in the portability of the
new test cases, but they do pass in a 32-bit FreeBSD VM here.
Further adjustment may be needed depending on buildfarm results.
Hamid Akhtar, reviewed by Naeem Akhter, Bertrand Drouvot,
Bharath Rupireddy, and myself
Discussion: https://postgr.es/m/CANugjht-=oGMRmNJKMqnBC69y7vr+wHDmm0ZK6-1pJsxoBKBbA@mail.gmail.com
A refcursor variable that is bound to a specific query (by declaring
it with "CURSOR FOR") now chooses a portal name in the same way as an
unbound, plain refcursor variable. Its string value starts out as
NULL, and unless that's overridden by manual assignment, it will be
replaced by a unique-within-session portal name during OPEN.
The previous behavior was to initialize such variables to contain
their own name, resulting in that also being the portal name unless
the user overwrote it before OPEN. The trouble with this is that
it causes failures due to conflicting portal names if the same
cursor variable name is used in different functions. It is pretty
non-orthogonal to have bound and unbound refcursor variables behave
differently on this point, too, so let's change it.
This change can cause compatibility problems for applications that
open a bound cursor in a plpgsql function and then use it in the
calling code without explicitly passing back the refcursor value
(portal name). If the calling code simply assumes that the portal
name matches the called function's variable name, it will now fail.
That can be fixed by explicitly assigning a string value to the
refcursor variable before OPEN, e.g.
DECLARE myc CURSOR FOR SELECT ...;
BEGIN
myc := 'myc'; -- add this
OPEN myc;
We have no documentation examples showing the troublesome usage
pattern, so we can hope it's rare in practice.
Patch by me; thanks to Pavel Stehule and Jan Wieck for review.
Discussion: https://postgr.es/m/1465101.1667345983@sss.pgh.pa.us
When collecting ANALYZE sample on foreign tables, postgres_fdw fetched
all rows and performed the sampling locally. For large tables this means
transferring and immediately discarding large amounts of data.
This commit allows the sampling to be performed on the remote server,
transferring only the much smaller sample. The sampling is performed
using the built-in TABLESAMPLE methods (system, bernoulli) or random()
function, depending on the remote server version.
Remote sampling can be enabled by analyze_sampling on the foreign server
and/or foreign table, with supported values 'off', 'auto', 'system',
'bernoulli' and 'random'. The default value is 'auto' which uses either
'bernoulli' (TABLESAMPLE method) or 'random' (for remote servers without
TABLESAMPLE support).
Teach VACUUM to decide on whether or not to trigger freezing at the
level of whole heap pages. Individual XIDs and MXIDs fields from tuple
headers now trigger freezing of whole pages, rather than independently
triggering freezing of each individual tuple header field.
Managing the cost of freezing over time now significantly influences
when and how VACUUM freezes. The overall amount of WAL written is the
single most important freezing related cost, in general. Freezing each
page's tuples together in batch allows VACUUM to take full advantage of
the freeze plan WAL deduplication optimization added by commit 9e540599.
Also teach VACUUM to trigger page-level freezing whenever it detects
that heap pruning generated an FPI. We'll have already written a large
amount of WAL just to do that much, so it's very likely a good idea to
get freezing out of the way for the page early. This only happens in
cases where it will directly lead to marking the page all-frozen in the
visibility map.
In most cases "freezing a page" removes all XIDs < OldestXmin, and all
MXIDs < OldestMxact. It doesn't quite work that way in certain rare
cases involving MultiXacts, though. It is convenient to define "freeze
the page" in a way that gives FreezeMultiXactId the leeway to put off
the work of processing an individual tuple's xmax whenever it happens to
be a MultiXactId that would require an expensive second pass to process
aggressively (allocating a new multi is especially worth avoiding here).
FreezeMultiXactId is eager when processing is cheap (as it usually is),
and lazy in the event of an individual multi that happens to require
expensive second pass processing. This avoids regressions related to
processing of multis that page-level freezing might otherwise cause.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com
Given the soft-input-error feature, we can reduce these functions
to be just thin wrappers around a soft-error call of the
corresponding datatype input function. This means less code and
more certainty that the to_reg* functions match the normal input
behavior.
Notably, it also means that they will accept numeric OID input,
which they didn't before. It's not clear to me if that omission
had more than laziness behind it, but it doesn't seem like
something we need to work hard to preserve.
Discussion: https://postgr.es/m/3910031.1672095600@sss.pgh.pa.us
This option extracts (potentially decompressing) full-page images
included in WAL records into a given target directory. These images are
subject to the same filtering rules as the normal display of the WAL
records, hence with --relation one can for example extract only the FPIs
issued on the relation defined. By default, the records are printed or
their stats computed (--stats), using --quiet would only save the images
without any output generated.
This is a tool aimed mostly for very experienced users, useful for
fixing page-level corruption or just analyzing the past state of a page,
and there were no easy way to do that with the in-core tools up to now
when looking at WAL.
Each block is saved in a separate file, to ease their manipulation, with
the file respecting <lsn>.<ts>.<db>.<rel>.<blk>_<fork> with as format.
For instance, 00000000-010000C0.1663.1.6117.123_main refers to:
- WAL record LSN in hexa format (00000000-010000C0).
- Tablespace OID (1663).
- Database OID (1).
- Relfilenode (6117).
- Block number (123).
- Fork name of the file this block came from (_main).
Author: David Christensen
Reviewed-by: Sho Kato, Justin Pryzby, Bharath Rupireddy, Matthias van de
Meent
Discussion: https://postgr.es/m/CAOxo6XKjQb2bMSBRpePf3ZpzfNTwjQUc4Tafh21=jzjX6bX8CA@mail.gmail.com
This enables streaming or serializing changes immediately in logical
decoding. This parameter is intended to be used to test logical decoding
and replication of large transactions for which otherwise we need to
generate the changes till logical_decoding_work_mem is reached.
This helps in reducing the timing of existing tests related to logical
replication of in-progress transactions and will help in writing tests for
for the upcoming feature for parallelly applying large in-progress
transactions.
Author: Shi yu
Reviewed-by: Sawada Masahiko, Shveta Mallik, Amit Kapila, Dilip Kumar, Kuroda Hayato, Kyotaro Horiguchi
Discussion: https://postgr.es/m/OSZPR01MB63104E7449DBE41932DB19F1FD1B9@OSZPR01MB6310.jpnprd01.prod.outlook.com
After some copy-edit I made in commit 3a06a79cd1, we have a <sect2>
that only contains a warning box. This doesn't look good. Rework by
moving the sect2 title to be the warning's title, and put the 'id' to it
as well, so that the external reference continues to work.
Backpatch to 15.
In branch master, I also take the opportunity to add titles to a couple
of other warning boxes elsewhere in the documentation.
Discussion: https://postgr.es/m/20221219164713.ccnlvtkyj6lmshqq@alvherre.pgsql
Commit 2f9661311b changed command tags from strings to numbers, but
forgot to adjust the code in the event trigger example, which
consequently failed to compile.
While fixing that, improve the indentation to adhere to pgindent style.
Backpatch to v13, where the change was introduced.
Author: Laurenz Albe
Discussion: https://postgr.es/m/81e36ac17dc80489e74dc5b6914afa6ccdb1a99d.camel@cybertec.at
The former name was discussed as being confusing, so use "split", as per
a suggestion from Magnus Hagander.
While on it, one of the output arguments is renamed from "segno" to
"segment_number", as per a suggestion from Kyotaro Horiguchi.
The documentation is updated to reflect all these changes.
Bump catalog version.
Author: Bharath Rupireddy, Michael Paquier
Discussion: https://postgr.es/m/CABUevEytQVaOOhGdoh0D7hGwe3fuKcRF6NthsSW7ww04EmtFgQ@mail.gmail.com
1349d279 added query planner support to allow more efficient execution of
aggregate functions which have an ORDER BY or a DISTINCT clause. Prior to
that commit, the planner would only request that the lower planner produce
a plan with the order required for the GROUP BY clause and it would be
left up to nodeAgg.c to perform the final sort of records within each
group so that the aggregate transition functions were called in the
correct order. Now that the planner requests the lower planner produce a
plan with the GROUP BY and the ORDER BY / DISTINCT aggregates in mind,
there is the possibility that the planner chooses a plan which could be
less efficient than what would have been produced before 1349d279.
While developing 1349d279, I had in mind that Incremental Sort would help
us in cases where an index exists only on the GROUP BY column(s).
Incremental Sort would just replace the implicit tuplesorts which are
being performed in nodeAgg.c. However, because the planner has the
flexibility to instead choose a plan which just performs a full sort on
both the GROUP BY and ORDER BY / DISTINCT aggregate columns, there is
potential for the planner to make a bad choice. The costing for
Incremental Sort is not perfect as it assumes an even distribution of rows
to sort within each sort group.
Here we add an escape hatch in the form of the enable_presorted_aggregate
GUC. This will allow users to get the pre-PG16 behavior in cases where
they have no other means to convince the query planner to produce a plan
which only sorts on the GROUP BY column(s).
Discussion: https://postgr.es/m/CAApHDvr1Sm+g9hbv4REOVuvQKeDWXcKUAhmbK5K+dfun0s9CvA@mail.gmail.com
This function takes in input a WAL segment name and returns a tuple made
of the segment sequence number (dependent on the WAL segment size of the
cluster) and its timeline, as of a thin SQL wrapper around the existing
XLogFromFileName().
This function has multiple usages, like being able to compile a LSN from
a file name and an offset, or finding the timeline of a segment without
having to do to some maths based on the first eight characters of the
segment.
Bump catalog version.
Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart, Kyotaro Horiguchi, Maxim Orlov, Michael
Paquier
Discussion: https://postgr.es/m/CALj2ACWV=FCddsxcGbVOA=cvPyMr75YCFbSQT6g4KDj=gcJK4g@mail.gmail.com
A new function pg_stat_get_backend_subxact() can be used to get
information about the number of subtransactions in the cache of
a particular backend and whether that cache has overflowed. This
can be useful for tracking down performance problems that can
result from overflowed snapshots.
Dilip Kumar, reviewed by Zhihong Yu, Nikolay Samokhvalov,
Justin Pryzby, Nathan Bossart, Ashutosh Sharma, Julien
Rouhaud. Additional design comments from Andres Freund,
Tom Lane, Bruce Momjian, and David G. Johnston.
Discussion: http://postgr.es/m/CAFiTN-ut0uwkRJDQJeDPXpVyTWD46m3gt3JDToE02hTfONEN=Q@mail.gmail.com
This option selects the default transfer mode. Having an explicit
option is handy to make scripts and tests more explicit. It also
makes it easier to talk about a "copy" mode rather than "the default
mode" or something like that, since until now the default mode didn't
have an externally visible name.
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/50a97009-8ff9-ca4d-a0f6-6086a6775a5b%40enterprisedb.com
Add support for hexadecimal, octal, and binary integer literals:
0x42F
0o273
0b100101
per SQL:202x draft.
This adds support in the lexer as well as in the integer type input
functions.
Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
Commits f92944137 et al. made IsInTransactionBlock() set the
XACT_FLAGS_NEEDIMMEDIATECOMMIT flag before returning "false",
on the grounds that that kept its API promises equivalent to those of
PreventInTransactionBlock(). This turns out to be a bad idea though,
because it allows an ANALYZE in a pipelined series of commands to
cause an immediate commit, which is unexpected.
Furthermore, if we return "false" then we have another issue,
which is that ANALYZE will decide it's allowed to do internal
commit-and-start-transaction sequences, thus possibly unexpectedly
committing the effects of previous commands in the pipeline.
To fix the latter situation, invent another transaction state flag
XACT_FLAGS_PIPELINING, which explicitly records the fact that we
have executed some extended-protocol command and not yet seen a
commit for it. Then, require that flag to not be set before allowing
InTransactionBlock() to return "false".
Having done that, we can remove its setting of NEEDIMMEDIATECOMMIT
without fear of causing problems. This means that the API guarantees
of IsInTransactionBlock now diverge from PreventInTransactionBlock,
which is mildly annoying, but it seems OK given the very limited usage
of IsInTransactionBlock. (In any case, a caller preferring the old
behavior could always set NEEDIMMEDIATECOMMIT for itself.)
For consistency also require XACT_FLAGS_PIPELINING to not be set
in PreventInTransactionBlock. This too is meant to prevent commands
such as CREATE DATABASE from silently committing previous commands
in a pipeline.
Per report from Peter Eisentraut. As before, back-patch to all
supported branches (which sadly no longer includes v10).
Discussion: https://postgr.es/m/65a899dd-aebc-f667-1d0a-abb89ff3abf8@enterprisedb.com
Add some cross-links between chapter "20. Server Parameters" and
"31. Logical Replication" regarding the available configuration
parameters, for easier navigation; and some more explanatory text too.
I (Álvaro) chose to duplicate max_replication_slots in Chapter 20,
because it has completely different meanings at each side of the
replication link.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: samay sharma <smilingsamay@gmail.com>
Discussion: https://postgr.es/m/CAHut+PsESqpy7w3Y6cX98c255ZuCjvipkhKjy6hZBjOv4E6iJA@mail.gmail.com
This is straightforward as far as it goes. However, it does not
attempt to trap errors occurring during the execution of domain
CHECK constraints. Since those are general user-defined
expressions, the only way to do that would involve starting up a
subtransaction for each check. Of course the entire point of
the soft-errors feature is to not need subtransactions, so that
would be self-defeating. For now, we'll rely on the assumption
that domain checks are written to avoid throwing errors.
Discussion: https://postgr.es/m/1181028.1670635727@sss.pgh.pa.us
pg_input_is_valid() returns boolean, while pg_input_error_message()
returns the primary error message if the input is bad, or NULL
if the input is OK. The main reason for having two functions is
so that we can test both the details-wanted and the no-details-wanted
code paths.
Although these are primarily designed with testing in mind,
it could well be that they'll be useful to end users as well.
This patch is mostly by me, but it owes very substantial debt to
earlier work by Nikita Glukhov, Andrew Dunstan, and Amul Sul.
Thanks to Andres Freund for review.
Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru