Commit Graph

13890 Commits

Author SHA1 Message Date
Robert Haas c6306db24b Add 'basebackup_to_shell' contrib module.
As a demonstration of the sort of thing that can be done by adding a
custom backup target, this defines a 'shell' target which executes a
command defined by the system administrator. The command is executed
once for each tar archive generate by the backup and once for the
backup manifest, if any. Each time the command is executed, it
receives the contents of th file for which it is executed via standard
input.

The configured command can use %f to refer to the name of the archive
(e.g. base.tar, $TABLESPACE_OID.tar, backup_manifest) and %d to refer
to the target detail (pg_basebackup --target shell:DETAIL). A target
detail is required if %d appears in the configured command and
forbidden if it does not.

Patch by me, reviewed by Abhijit Menon-Sen.

Discussion: http://postgr.es/m/CA+TgmoaqvdT-u3nt+_kkZ7bgDAyqDB0i-+XOMmr5JN2Rd37hxw@mail.gmail.com
2022-03-15 13:24:23 -04:00
Michael Paquier 6bdf1a1400 Fix collection of typos in the code and the documentation
Some words were duplicated while other places were grammatically
incorrect, including one variable name in the code.

Author: Otto Kekalainen, Justin Pryzby
Discussion: https://postgr.es/m/7DDBEFC5-09B6-4325-B942-B563D1A24BDC@amazon.com
2022-03-15 11:29:35 +09:00
Amit Kapila 705e20f855 Optionally disable subscriptions on error.
Logical replication apply workers for a subscription can easily get stuck
in an infinite loop of attempting to apply a change, triggering an error
(such as a constraint violation), exiting with the error written to the
subscription server log, and restarting.

To partially remedy the situation, this patch adds a new subscription
option named 'disable_on_error'. To be consistent with old behavior, this
option defaults to false. When true, both the tablesync worker and apply
worker catch any errors thrown and disable the subscription in order to
break the loop. The error is still also written in the logs.

Once the subscription is disabled, users can either manually resolve the
conflict/error or skip the conflicting transaction by using
pg_replication_origin_advance() function. After resolving the conflict,
users need to enable the subscription to allow apply process to proceed.

Author: Osumi Takamichi and Mark Dilger
Reviewed-by: Greg Nancarrow, Vignesh C, Amit Kapila, Wang wei, Tang Haiying, Peter Smith, Masahiko Sawada, Shi Yu
Discussion : https://postgr.es/m/DB35438F-9356-4841-89A0-412709EBD3AB%40enterprisedb.com
2022-03-14 09:32:40 +05:30
Michael Paquier 8e375ea4a0 Bump XLOG_PAGE_MAGIC due to the addition of wal_compression=zstd
While on it, fix a thinko in the docs, introduced by the same commit.

Oversights in e953732.

Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20220311214900.GN28503@telsasoft.com
2022-03-12 09:39:13 +09:00
Michael Paquier 9198e63996 doc: Standardize capitalization of term "hot standby"/"Hot Standby"
"Hot Standby" was capitalized in a couple of places in the docs, as the
style primarily used when it was introduced, but this has not been much
respected across the years.  Per discussion, it is more natural for the
reader to use "hot standby" (aka lower-case only) when in the middle of
a sentence, and "Hot standby" (aka capitalized) in a title.  This commit
adjusts all the places in the docs to be consistent with this choice,
rather than applying one style or the other midway.

Author: Daniel Westermann
Reviewed-by: Kyotaro Horiguchi, Aleksander Alekseev, Robert Treat
Discussion: https://postgr.es/m/GVAP278MB093160025A779A1A5788D0EAD2039@GVAP278MB0931.CHEP278.PROD.OUTLOOK.COM
2022-03-11 15:16:21 +09:00
Michael Paquier e9537321a7 Add support for zstd with compression of full-page writes in WAL
wal_compression gains a new value, "zstd", to allow the compression of
full-page images using the compression method of the same name.

Compression is done using the default level recommended by the library,
as of ZSTD_CLEVEL_DEFAULT = 3.  Some benchmarking has shown that it
could make sense to use a level lower for the FPI compression, like 1 or
2, as the compression rate did not change much with a bit less CPU
consumed, but any tests done would only cover few scenarios so it is
hard to come to a clear conclusion.  Anyway, there is no reason to not
use the default level instead, which is the level recommended by the
library so it should be fine for most cases.

zstd outclasses easily pglz, and is better than LZ4 where one wants to
have more compression at the cost of extra CPU but both are good enough
in their own scenarios, so the choice between one or the other of these
comes to a study of the workload patterns and the schema involved,
mainly.

This commit relies heavily on 4035cd5, that reshaped the code creating
and restoring full-page writes to be aware of the compression type,
making this integration straight-forward.

This patch borrows some early work from Andrey Borodin, though the patch
got a complete rewrite.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220222231948.GJ9008@telsasoft.com
2022-03-11 12:18:53 +09:00
Michael Paquier e3df32bbc3 doc: Add ALTER/DROP ROUTINE to the event trigger matrix
ALTER ROUTINE triggers the events ddl_command_start and ddl_command_end,
and DROP ROUTINE triggers sql_drop, ddl_command_start and
ddl_command_end, but this was not mention on the matrix table.

Reported-by: Leslie Lemaire
Discussion: https://postgr.es/m/164647533363.646.5802968483136493025@wrigleys.postgresql.org
Backpatch-through: 11
2022-03-09 14:59:08 +09:00
Michael Paquier 7687ca996e doc: Improve references to term "FSM" in pageinspect and pgfreespacemap
Author: Dong Wook Lee
Reviewed-by: Laurenz Albe
Discussion: https://postgr.es/m/CAAcBya+=F=HaHxJ7tGjAM1r=A=+bDbimpsex8Vqrb4GjqFDYsQ@mail.gmail.com
2022-03-09 10:43:25 +09:00
Robert Haas 7cf085f077 Add support for zstd base backup compression.
Both client-side compression and server-side compression are now
supported for zstd. In addition, a backup compressed by the server
using zstd can now be decompressed by the client in order to
accommodate the use of -Fp.

Jeevan Ladhe, with some edits by me.

Discussion: http://postgr.es/m/CA+Tgmobyzfbz=gyze2_LL1ZumZunmaEKbHQxjrFkOR7APZGu-g@mail.gmail.com
2022-03-08 09:52:43 -05:00
Amit Kapila d3e8368c4b Add the additional information to the logical replication worker errcontext.
This commits adds both the finish LSN (commit_lsn in case transaction got
committed, prepare_lsn in case of a prepared transaction, etc.) and
replication origin name to the existing error context message.

This will help users in specifying the origin name and transaction finish
LSN to pg_replication_origin_advance() SQL function to skip a particular
transaction.

Author: Masahiko Sawada
Reviewed-by: Takamichi Osumi, Euler Taveira, and Amit Kapila
Discussion: https://postgr.es/m/CAD21AoBarBf2oTF71ig2g_o=3Z_Dt6_sOpMQma1kFgbnA5OZ_w@mail.gmail.com
2022-03-08 08:08:32 +05:30
Andres Freund 4228cabb72 plpython: Adjust docs after removal of Python 2 support.
Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20211031184548.g4sxfe47n2kyi55r@alap3.anarazel.de
2022-03-07 18:30:57 -08:00
Michael Paquier b3c8aae008 doc: Fix description of pg_stop_backup()
The function was still documented as returning a set of records,
something not true as of 62ce0c7.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/3159823.1646320180@sss.pgh.pa.us
2022-03-04 09:51:12 +09:00
Tom Lane 50f03473ed Doc: update libpq.sgml for root-owned SSL private keys.
My oversight in a59c79564.

Discussion: https://postgr.es/m/f4b7bc55-97ac-9e69-7398-335e212f7743@pgmasters.net
2022-03-02 11:29:11 -05:00
Peter Eisentraut e58791c6ad Add id's to various elements in protocol.sgml
For easier direct linking.

Author: Brar Piening <brar@gmx.de>
Discussion: https://www.postgresql.org/message-id/flat/dbad4f77-4dce-1b05-2b65-831acb5d5b66@gmx.de
2022-03-02 10:33:12 +01:00
Amit Kapila 7a85073290 Reconsider pg_stat_subscription_workers view.
It was decided (refer to the Discussion link below) that the stats
collector is not an appropriate place to store the error information of
subscription workers.

This patch changes the pg_stat_subscription_workers view (introduced by
commit 8d74fc96db) so that it stores only statistics counters:
apply_error_count and sync_error_count, and has one entry for
each subscription. The removed error information such as error-XID and
the error message would be stored in another way in the future which is
more reliable and persistent.

After removing these error details, there is no longer any relation
information, so the subscription statistics are now a cluster-wide
statistics.

The patch also changes the view name to pg_stat_subscription_stats since
the word "worker" is an implementation detail that we use one worker for
one tablesync and one apply.

Author: Masahiko Sawada, based on suggestions by Andres Freund
Reviewed-by: Peter Smith, Haiying Tang, Takamichi Osumi, Amit Kapila
Discussion: https://postgr.es/m/20220125063131.4cmvsxbz2tdg6g65@alap3.anarazel.de
2022-03-01 06:17:52 +05:30
Tom Lane 2e517818f4 Fix SPI's handling of errors during transaction commit.
SPI_commit previously left it up to the caller to recover from any error
occurring during commit.  Since that's complicated and requires use of
low-level xact.c facilities, it's not too surprising that no caller got
it right.  Let's move the responsibility for cleanup into spi.c.  Doing
that requires redefining SPI_commit as starting a new transaction, so
that it becomes equivalent to SPI_commit_and_chain except that you get
default transaction characteristics instead of preserving the prior
transaction's characteristics.  We can make this pretty transparent
API-wise by redefining SPI_start_transaction() as a no-op.  Callers
that expect to do something in between might be surprised, but
available evidence is that no callers do so.

Having made that API redefinition, we can fix this mess by having
SPI_commit[_and_chain] trap errors and start a new, clean transaction
before re-throwing the error.  Likewise for SPI_rollback[_and_chain].
Some cleanup is also needed in AtEOXact_SPI, which was nowhere near
smart enough to deal with SPI contexts nested inside a committing
context.

While plperl and pltcl need no changes beyond removing their now-useless
SPI_start_transaction() calls, plpython needs some more work because it
hadn't gotten the memo about catching commit/rollback errors in the
first place.  Such an error resulted in longjmp'ing out of the Python
interpreter, which leaks Python stack entries at present and is reported
to crash Python 3.11 altogether.  Add the missing logic to catch such
errors and convert them into Python exceptions.

We are probably going to have to back-patch this once Python 3.11 ships,
but it's a sufficiently basic change that I'm a bit nervous about doing
so immediately.  Let's let it bake awhile in HEAD first.

Peter Eisentraut and Tom Lane

Discussion: https://postgr.es/m/3375ffd8-d71c-2565-e348-a597d6e739e3@enterprisedb.com
Discussion: https://postgr.es/m/17416-ed8fe5d7213d6c25@postgresql.org
2022-02-28 12:45:36 -05:00
Etsuro Fujita 04e706d423 postgres_fdw: Add support for parallel commit.
postgres_fdw commits remote (sub)transactions opened on remote server(s)
in a local (sub)transaction one by one when the local (sub)transaction
commits.  This patch allows it to commit the remote (sub)transactions in
parallel to improve performance.  This is enabled by the server option
"parallel_commit".  The default is false.

Etsuro Fujita, reviewed by Fujii Masao and David Zhang.

Discussion: http://postgr.es/m/CAPmGK17dAZCXvwnfpr1eTfknTGdt%3DhYTV9405Gt5SqPOX8K84w%40mail.gmail.com
2022-02-24 14:30:00 +09:00
Amit Kapila 52e4f0cd47 Allow specifying row filters for logical replication of tables.
This feature adds row filtering for publication tables. When a publication
is defined or modified, an optional WHERE clause can be specified. Rows
that don't satisfy this WHERE clause will be filtered out. This allows a
set of tables to be partially replicated. The row filter is per table. A
new row filter can be added simply by specifying a WHERE clause after the
table name. The WHERE clause must be enclosed by parentheses.

The row filter WHERE clause for a table added to a publication that
publishes UPDATE and/or DELETE operations must contain only columns that
are covered by REPLICA IDENTITY. The row filter WHERE clause for a table
added to a publication that publishes INSERT can use any column. If the
row filter evaluates to NULL, it is regarded as "false". The WHERE clause
only allows simple expressions that don't have user-defined functions,
user-defined operators, user-defined types, user-defined collations,
non-immutable built-in functions, or references to system columns. These
restrictions could be addressed in the future.

If you choose to do the initial table synchronization, only data that
satisfies the row filters is copied to the subscriber. If the subscription
has several publications in which a table has been published with
different WHERE clauses, rows that satisfy ANY of the expressions will be
copied. If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher.

The row filters are applied before publishing the changes. If the
subscription has several publications in which the same table has been
published with different filters (for the same publish operation), those
expressions get OR'ed together so that rows satisfying any of the
expressions will be replicated.

This means all the other filters become redundant if (a) one of the
publications have no filter at all, (b) one of the publications was
created using FOR ALL TABLES, (c) one of the publications was created
using FOR ALL TABLES IN SCHEMA and the table belongs to that same schema.

If your publication contains a partitioned table, the publication
parameter publish_via_partition_root determines if it uses the partition's
row filter (if the parameter is false, the default) or the root
partitioned table's row filter.

Psql commands \dRp+ and \d <table-name> will display any row filters.

Author: Hou Zhijie, Euler Taveira, Peter Smith, Ajin Cherian
Reviewed-by: Greg Nancarrow, Haiying Tang, Amit Kapila, Tomas Vondra, Dilip Kumar, Vignesh C, Alvaro Herrera, Andres Freund, Wei Wang
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
2022-02-22 08:11:50 +05:30
Michael Paquier ebf6c5249b Add compute_query_id = regress
"regress" is a new mode added to compute_query_id aimed at facilitating
regression testing when a module computing query IDs is loaded into the
backend, like pg_stat_statements.  It works the same way as "auto",
meaning that query IDs are computed if a module enables it, except that
query IDs are hidden in EXPLAIN outputs to ensure regression output
stability.

Like any GUCs of the kind (force_parallel_mode, etc.), this new
configuration can be added to an instance's postgresql.conf, or just
passed down with PGOPTIONS at command level.  compute_query_id uses an
enum for its set of option values, meaning that this addition ensures
ABI compatibility.

Using this new configuration mode allows installcheck-world to pass when
running the tests on an instance with pg_stat_statements enabled,
stabilizing the test output while checking the paths doing query ID
computations.

Reported-by: Anton Melnikov
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/1634283396.372373993@f75.i.mail.ru
Discussion: https://postgr.es/m/YgHlxgc/OimuPYhH@paquier.xyz
Backpatch-through: 14
2022-02-22 10:22:15 +09:00
Michael Paquier bf4ed12b58 doc: Mention environment variable ZSTD in the TAP tests for MSVC
6c417bb has added the build infrastructure to support ZSTD, but forgot
to update this section of the docs to mention the variable ZSTD, as per
the change done in vcregress.pl.

While on it, reword this section of the docs to describe what happens in
the default case, as per a suggestion from Robert Haas.

Discussion: https://postgr.es/m/YhCL0fKnDv/Zvtuo@paquier.xyz
2022-02-21 09:55:55 +09:00
Michael Paquier d7a978601d doc: Simplify description of --with-lz4
LZ4 is used in much more areas of the system now than just WAL and table
data.  This commit simplifies the installation documentation of Windows
and *nix by removing any details of the areas extended when building
with LZ4.

Author: Jeevan Ladhe
Discussion: https://postgr.es/m/CANm22Cgny8AF76pitomXp603NagwKXbA4dyN2Fac4yHPebqdqg@mail.gmail.com
2022-02-19 15:06:53 +09:00
Robert Haas 6c417bbcc8 Add support for building with ZSTD.
This commit doesn't actually add anything that uses ZSTD; that will be
done separately. It just puts the basic infrastructure into place.

Jeevan Ladhe, Robert Haas, and Michael Paquier. Reviewed by Justin
Pryzby and Andres Freund.

Discussion: http://postgr.es/m/CA+TgmoatQKGd+8SjcV+bzvw4XaoEwminHjU83yG12+NXtQzTTQ@mail.gmail.com
2022-02-18 13:40:31 -05:00
Tom Lane 2e372869aa Don't let libpq PGEVT_CONNRESET callbacks break a PGconn.
As currently implemented, failure of a PGEVT_CONNRESET callback
forces the PGconn into the CONNECTION_BAD state (without closing
the socket, which is inconsistent with other failure paths), and
prevents later callbacks from being called.  This seems highly
questionable, and indeed is questioned by comments in the source.

Instead, let's just ignore the result value of PGEVT_CONNRESET
calls.  Like the preceding commit, this converts event callbacks
into "pure observers" that cannot affect libpq's processing logic.

Discussion: https://postgr.es/m/3185105.1644960083@sss.pgh.pa.us
2022-02-18 11:43:04 -05:00
Tom Lane ce1e7a2f71 Don't let libpq "event" procs break the state of PGresult objects.
As currently implemented, failure of a PGEVT_RESULTCREATE callback
causes the PGresult to be converted to an error result.  This is
intellectually inconsistent (shouldn't a failing callback likewise
prevent creation of the error result? what about side-effects on the
behavior seen by other event procs? why does PQfireResultCreateEvents
act differently from PQgetResult?), but more importantly it destroys
any promises we might wish to make about the behavior of libpq in
nontrivial operating modes, such as pipeline mode.  For example,
it's not possible to promise that PGRES_PIPELINE_SYNC results will
be returned if an event callback fails on those.  With this
definition, expecting applications to behave sanely in the face of
possibly-failing callbacks seems like a very big lift.

Hence, redefine the result of a callback failure as being simply
that that event procedure won't be called any more for this PGresult
(which was true already).  Event procedures can still signal failure
back to the application through out-of-band mechanisms, for example
via their passthrough arguments.

Similarly, don't let failure of a PGEVT_RESULTCOPY callback prevent
PQcopyResult from succeeding.  That definition allowed a misbehaving
event proc to break single-row mode (our sole internal use of
PQcopyResult), and it probably had equally deleterious effects for
outside uses.

Discussion: https://postgr.es/m/3185105.1644960083@sss.pgh.pa.us
2022-02-18 11:37:27 -05:00
Fujii Masao 94c49d5340 postgres_fdw: Make postgres_fdw.application_name support more escape sequences.
Commit 6e0cb3dec1 allowed postgres_fdw.application_name to include
escape sequences %a (application name), %d (database name), %u (user name)
and %p (pid). In addition to them, this commit makes it support
the escape sequences for session ID (%c) and cluster name (%C).
These are helpful to investigate where each remote transactions came from.

Author: Fujii Masao
Reviewed-by: Ryohei Takahashi, Kyotaro Horiguchi
Discussion: https://postgr.es/m/1041dc9a-c976-049f-9f14-e7d94c29c4b2@oss.nttdata.com
2022-02-18 11:38:12 +09:00
Andres Freund 19252e8ec9 plpython: Reject Python 2 during build configuration.
Python 2.7 went EOL 2020-01-01 and the support for Python 2 requires a fair
bit of infrastructure. Therefore we are removing Python 2 support in plpython.

This patch just rejects Python 2 during configure / mkvcbuild.pl. Future
commits will remove the code and infrastructure for Python 2 support and
adjust more of the documentation. This way we can see the buildfarm state
after the removal sooner and we can be sure that failures are due to
desupporting Python 2, rather than caused by infrastructure cleanup.

Reviewed-By: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/20211031184548.g4sxfe47n2kyi55r@alap3.anarazel.de
2022-02-16 22:47:35 -08:00
Peter Geoghegan 8f388f6f55 Increase hash_mem_multiplier default to 2.0.
Double the default setting for hash_mem_multiplier, from 1.0 to 2.0.
This setting makes hash-based executor nodes use twice the usual
work_mem limit.

The PostgreSQL 15 release notes should have a compatibility note about
this change.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wzndc_ROk6CY-bC6p9O53q974Y0Ey4WX8jcPbuTZYM4Q3A@mail.gmail.com
2022-02-16 18:41:52 -08:00
Etsuro Fujita 27d195a578 Doc: Update documentation for modifying postgres_fdw foreign tables.
Document that they can be modified using COPY as well.

Back-patch to v11 where commit 3d956d956 added support for COPY in
postgres_fdw.
2022-02-16 15:15:00 +09:00
Heikki Linnakangas 853c6400bf Fix race condition in 028_pitr_timelines.pl test, add note to docs.
The 028_pitr_timelines.pl test would sometimes hang, waiting for a WAL
segment that was just filled up to be archived. It was because the
test used 'pg_stat_archiver.last_archived_wal' to check if a file was
archived, but the order that WAL files are archived when a standby is
promoted is not fully deterministic, and 'last_archived_wal' tracks
the last segment that was archived, not the highest-numbered WAL
segment. Because of that, if the archiver archived segment 3, and then
2, 'last_archived_wal' say 2, and the test query would think that 3
has not been archived yet.

Normally, WAL files are marked ready for archival in order, and the
archiver process will process them in order, so that issue doesn't
arise.  We have used the same query on 'last_archived_wal' in a few
other tests with no problem. But when a standby is promoted, things
are a bit chaotic. After promotion, the server will try to archive all
the WAL segments from the old timeline that are in pg_wal, as well as
the history file and any new WAL segments on the new timeline. The
end-of-recovery checkpoint will create the .ready files for all the
WAL files on the old timeline, but at the same time, the new timeline
is opened up for business. A file from the new timeline can therefore
be archived before the files from the old timeline have been marked as
ready for archival.

It turns out that we don't really need to wait for the archival in
this particular test, because the standby server is about to be
stopped, and stopping a server will wait for the end-of-recovery
checkpoint and all WAL archivals to finish, anyway. So we can just
remove it from the test.

Add a note to the docs on 'pg_stat_archiver' view that files can be
archived out of order.

Reviewed-by: Tom Lane
Discussion: https://www.postgresql.org/message-id/3186114.1644960507@sss.pgh.pa.us
2022-02-16 01:37:48 +02:00
Andres Freund 1f6e0ce3be docs: Work around bug in the docbook xsl stylesheets.
docbook-xsl's index generation stylesheet (autoidx.xsl) has a small bug: It
doesn't include xlink in exclude-result-prefixes. Normally just leads to a a
single xmlns:xlink in the <div> containing the index, but because our
customization emits that, xmlns:xlink intead gets added to every element
output by autoidx.xsl below the <div>, totalling around 100kB.

Adding the spurious xmlns:xlink to the <div> ourselves isn't great, but avoids
the duplication.

Reviewed-By: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/20220213201618.qz6p6noon3wagr3f%40alap3.anarazel.de
2022-02-15 13:52:40 -08:00
Peter Eisentraut 6538be9e1e Fix XML namespace declarations
The XSL stylesheets used a mix of incorrect or outdated namespace
declarations for XHTML, probably based on ancient advice and examples.
Clean all this up.

Besides improving correctness (although probably no impact in
practice, other than possible validation failures), this removes a
bunch of useless namespace declarations in the HTML output.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/20220213201618.qz6p6noon3wagr3f%40alap3.anarazel.de
2022-02-15 11:13:49 +01:00
John Naylor a59135a81a Spell "startup process" with lower case in the documentation
Most uses were already lower case, so this just makes all user-visible
spellings consistent.

Bharath Rupireddy

The proposed patch also had analagous changes for the code comments,
but I decided that wasn't worth the churn.

Discussion:
https://www.postgresql.org/message-id/flat/CALj2ACW7%2Bv_0QBPoWB%3DqKr67JKC019Htm%3DX8sKewS17bOquefg%40mail.gmail.com
2022-02-15 14:30:57 +07:00
Peter Eisentraut 37851a8b83 Database-level collation version tracking
This adds to database objects the same version tracking that collation
objects have.  There is a new pg_database column datcollversion that
stores the version, a new function
pg_database_collation_actual_version() to get the version from the
operating system, and a new subcommand ALTER DATABASE ... REFRESH
COLLATION VERSION.

This was not originally added together with pg_collation.collversion,
since originally version tracking was only supported for ICU, and ICU
on a database-level is not currently supported.  But we now have
version tracking for glibc (since PG13), FreeBSD (since PG14), and
Windows (since PG13), so this is useful to have now.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/f0ff3190-29a3-5b39-a179-fa32eee57db6%40enterprisedb.com
2022-02-14 08:27:26 +01:00
Thomas Munro cba5b994c9 Use WL_SOCKET_CLOSED for client_connection_check_interval.
Previously we used poll() directly to check for a POLLRDHUP event.
Instead, use the WaitEventSet API to poll the socket for
WL_SOCKET_CLOSED, which knows how to detect this condition on many more
operating systems.

Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Reviewed-by: Maksim Milyutin <milyutinma@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/77def86b27e41f0efcba411460e929ae%40postgrespro.ru
2022-02-14 16:52:23 +13:00
Amit Kapila 5e01001ffb WAL log unchanged toasted replica identity key attributes.
Currently, during UPDATE, the unchanged replica identity key attributes
are not logged separately because they are getting logged as part of the
new tuple. But if they are stored externally then the untoasted values are
not getting logged as part of the new tuple and logical replication won't
be able to replicate such UPDATEs. So we need to log such attributes as
part of the old_key_tuple during UPDATE.

Reported-by: Haiying Tang
Author: Dilip Kumar and Amit Kapila
Reviewed-by: Alvaro Herrera, Haiying Tang, Andres Freund
Backpatch-through: 10
Discussion: https://postgr.es/m/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
2022-02-14 08:55:58 +05:30
Robert Haas 751b8d23b7 pg_basebackup: Allow client-side LZ4 (de)compression.
LZ4 compression can now be performed on the client using
pg_basebackup -Ft --compress client-lz4, and LZ4 decompression of
a backup compressed on the server can be performed on the client
using pg_basebackup -Fp --compress server-lz4.

Dipesh Pandit, reviewed and tested by Jeevan Ladhe and Tushar Ahuja,
with a few corrections - and some documentation - by me.

Discussion: http://postgr.es/m/CAN1g5_FeDmiA9D8wdG2W6Lkq5CpubxOAqTmd2et9hsinTJtsMQ@mail.gmail.com
2022-02-11 09:41:42 -05:00
Robert Haas dab298471f Add suport for server-side LZ4 base backup compression.
LZ4 compression can be a lot faster than gzip compression, so users
may prefer it even if the compression ratio is not as good. We will
want pg_basebackup to support LZ4 compression and decompression on the
client side as well, and there is a pending patch for that, but it's
by a different author, so I am committing this part separately for
that reason.

Jeevan Ladhe, reviewed by Tushar Ahuja and by me.

Discussion: http://postgr.es/m/CANm22Cg9cArXEaYgHVZhCnzPLfqXCZLAzjwTq7Fc0quXRPfbxA@mail.gmail.com
2022-02-11 08:29:38 -05:00
Tomas Vondra 0da92dc530 Logical decoding of sequences
This extends the logical decoding to also decode sequence increments.
We differentiate between sequences created in the current (in-progress)
transaction, and sequences created earlier. This mixed behavior is
necessary because while sequences are not transactional (increments are
not subject to ROLLBACK), relfilenode changes are. So we do this:

* Changes for sequences created in the same top-level transaction are
  treated as transactional, i.e. just like any other change from that
  transaction, and discarded in case of a rollback.

* Changes for sequences created earlier are applied immediately, as if
  performed outside any transaction. This applies also after ALTER
  SEQUENCE, which may create a new relfilenode.

Moreover, if we ever get support for DDL replication, the sequence
won't exist until the transaction gets applied.

Sequences created in the current transaction are tracked in a simple
hash table, identified by a relfilenode. That means a sequence may
already exist, but if a transaction does ALTER SEQUENCE then the
increments for the new relfilenode will be treated as transactional.

For each relfilenode we track the XID of (sub)transaction that created
it, which is needed for cleanup at transaction end. We don't need to
check the XID to decide if an increment is transactional - if we find a
match in the hash table, it has to be the same transaction.

This requires two minor changes to WAL-logging. Firstly, we need to
ensure the sequence record has a valid XID - until now the the increment
might have XID 0 if it was the first change in a subxact. But the
sequence might have been created in the same top-level transaction. So
we ensure the XID is assigned when WAL-logging increments.

The other change is addition of "created" flag, marking increments for
newly created relfilenodes. This makes it easier to maintain the hash
table of sequences that need transactional handling.
Note: This is needed because of subxacts. A XID 0 might still have the
sequence created in a different subxact of the same top-level xact.

This does not include any changes to test_decoding and/or the built-in
replication - those will be committed in separate patches.

A patch adding decoding of sequences was originally submitted by Cary
Huang. This commit reworks various important aspects (e.g. the WAL
logging and transactional/non-transactional handling). However, the
original patch and reviews were very useful.

Author: Tomas Vondra, Cary Huang
Reviewed-by: Peter Eisentraut, Hannu Krosing, Andres Freund
Discussion: https://postgr.es/m/d045f3c2-6cfb-06d3-5540-e63c320df8bc@enterprisedb.com
Discussion: https://postgr.es/m/1710ed7e13b.cd7177461430746.3372264562543607781@highgo.ca
2022-02-10 18:43:51 +01:00
Robert Haas 0d4513b613 Remove server support for the previous base backup protocol.
Commit cc333f3233 added a new COPY
sub-protocol for taking base backups, but retained support for the
previous protocol. For the same reasons articulated in the message
for commit 9cd28c2e5f, remove support
for the previous protocol from the server.

Discussion: http://postgr.es/m/CA+TgmoazKcKUWtqVa0xZqSzbKgTH+X-aw4V7GyLD68EpDLMh8A@mail.gmail.com
2022-02-10 12:12:43 -05:00
Robert Haas 9cd28c2e5f Remove server support for old BASE_BACKUP command syntax.
Commit 0ba281cb4b added a new syntax
for the BASE_BACKUP command, with extensible options, but maintained
support for the legacy syntax. This isn't important for PostgreSQL,
where pg_basebackup works with older server versions but not newer
ones, but it could in theory matter for out-of-core users of the
replication protocol.

Discussion on pgsql-hackers, however, suggests that no one is aware
of any out-of-core use of the BASE_BACKUP command, and the consensus
is in favor of removing support for the old syntax to simplify the
code, so do that.

Discussion: http://postgr.es/m/CA+TgmoazKcKUWtqVa0xZqSzbKgTH+X-aw4V7GyLD68EpDLMh8A@mail.gmail.com
2022-02-10 10:48:33 -05:00
Fujii Masao 400fc6b648 Add min() and max() aggregates for xid8.
Bump catalog version.

Author: Ken Kato
Reviewed-by: Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/47d77b18c44f87f8222c4c7a3e2dee6b@oss.nttdata.com
2022-02-10 12:33:41 +09:00
Daniel Gustafsson f48385c132 Fix typo in archive modules docs
Discussion: https://postgr.es/m/4F8E8D8F-45CA-4833-AB19-CC6105326583@yesql.se
2022-02-09 15:36:46 +01:00
Michael Paquier 38bfae3652 pg_upgrade: Move all the files generated internally to a subdirectory
Historically, the location of any files generated by pg_upgrade, as of
the per-database logs and internal dumps, has been the current working
directory, leaving all those files behind when using --retain or on a
failure.

Putting all those contents in a targeted subdirectory makes the whole
easier to debug, and simplifies the code in charge of cleaning up the
logs.  Note that another reason is that this facilitates the move of
pg_upgrade to TAP with a fixed location for all the logs to grab if the
test fails repeatedly.

Initially, we thought about being able to specify the output directory
with a new option, but we have settled on using a subdirectory located
at the root of the new cluster's data folder, "pg_upgrade_output.d",
instead, as at the end the new data directory is the location of all the
data generated by pg_upgrade.  There is a take with group permissions
here though: if the new data folder has been initialized with this
option, we need to create all the files and paths with the correct
permissions or a base backup taken after a pg_upgrade --retain would
fail, meaning that GetDataDirectoryCreatePerm() has to be called before
creating the log paths, before a couple of sanity checks on the clusters
and before getting the socket directory for the cluster's host settings.
The idea of the new location is based on a suggestion from Peter
Eisentraut.

Also thanks to Andrew Dunstan, Peter Eisentraut, Daniel Gustafsson, Tom
Lane and Bruce Momjian for the discussion (in alphabetical order).

Author: Justin Pryzby
Discussion: https://postgr.es/m/20211212025017.GN17618@telsasoft.com
2022-02-06 12:27:29 +09:00
Tom Lane cbadfc1f8a Doc: be clearer that foreign-table partitions need user-added constraints.
A very well-informed user might deduce this from what we said already,
but I'd bet against it.  Lay it out explicitly.

While here, rewrite the comment about tuple routing to be more
intelligible to an average SQL user.

Per bug #17395 from Alexander Lakhin.  Back-patch to v11.  (The text
in this area is different in v10 and I'm not sufficiently excited
about this point to adapt the patch.)

Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org
2022-02-05 12:55:44 -05:00
Robert Haas 5ef1eefd76 Allow archiving via loadable modules.
Running a shell command for each file to be archived has a lot of
overhead and may not offer as much error checking as you want, or the
exact semantics that you want. So, offer the option to call a loadable
module for each file to be archived, rather than running a shell command.

Also, add a 'basic_archive' contrib module as an example implementation
that archives to a local directory.

Nathan Bossart, with a little bit of kibitzing by me.

Discussion: http://postgr.es/m/20220202224433.GA1036711@nathanxps13
2022-02-03 14:05:02 -05:00
Peter Eisentraut 94aa7cc5f7 Add UNIQUE null treatment option
The SQL standard has been ambiguous about whether null values in
unique constraints should be considered equal or not.  Different
implementations have different behaviors.  In the SQL:202x draft, this
has been formalized by making this implementation-defined and adding
an option on unique constraint definitions UNIQUE [ NULLS [NOT]
DISTINCT ] to choose a behavior explicitly.

This patch adds this option to PostgreSQL.  The default behavior
remains UNIQUE NULLS DISTINCT.  Making this happen in the btree code
is pretty easy; most of the patch is just to carry the flag around to
all the places that need it.

The CREATE UNIQUE INDEX syntax extension is not from the standard,
it's my own invention.

I named all the internal flags, catalog columns, etc. in the negative
("nulls not distinct") so that the default PostgreSQL behavior is the
default if the flag is false.

Reviewed-by: Maxim Orlov <orlovmg@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/84e5ee1b-387e-9a54-c326-9082674bde78@enterprisedb.com
2022-02-03 11:48:21 +01:00
Bruce Momjian 9d179d9c23 doc: clarify syntax notation, particularly parentheses
Also move TCL syntax to the PL/tcl section.

Reported-by: davs2rt@gmail.com

Discussion: https://postgr.es/m/164308146320.12460.3590769444508751574@wrigleys.postgresql.org

Backpatch-through: 10
2022-02-02 21:53:52 -05:00
Peter Eisentraut 87669de72c Some cleanup for change of collate and ctype fields to type text
Some cleanup for commit 54637508f87bd5f07fb9406bac6b08240283be3b:
Reformat pg_database.dat to reflect the new field order.  Also update
the corresponding example in bki.sgml.  Reorder the way the fields are
filled in dbcommands.c to correspond to the new order.
2022-02-02 11:58:55 +01:00
Peter Eisentraut cb2bab14ff doc: Fix mistake in PL/Python documentation
Small thinko introduced by 94aceed317

Reported-by: nassehk@gmail.com
2022-02-02 09:14:26 +01:00
Tom Lane a5a9d77b8b Doc: modernize documentation for lo_create()/lo_creat().
At this point lo_creat() is a legacy function with little if any
real use-case, so describing it first doesn't make much sense.
Describe lo_create() first, and then explain lo_creat() as a
backwards-compatibility alternative.

Discussion: https://postgr.es/m/164353261519.713.8748040527537500758@wrigleys.postgresql.org
2022-02-01 10:57:38 -05:00
Michael Paquier d10e41d423 Introduce pg_settings_get_flags() to find flags associated to a GUC
The most meaningful flags are shown, which are the ones useful for the
user and for automating and extending the set of tests supported
currently by check_guc.

This script may actually be removed in the future, but we are not
completely sure yet if and how we want to support the remaining sanity
checks performed there, that are now integrated in the main regression
test suite as of this commit.

Thanks also to Peter Eisentraut and Kyotaro Horiguchi for the
discussion.

Bump catalog version.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20211129030833.GJ17618@telsasoft.com
2022-01-31 08:56:41 +09:00
Tom Lane 02b8048ba5 psql: improve tab-complete's handling of variant SQL names.
This patch improves tab completion's ability to deal with
valid variant spellings of SQL identifiers.  Notably:

* Unquoted upper-case identifiers are now downcased as the backend
would do, allowing them to be completed correctly.

* Tab completion can now match identifiers that are quoted even
though they don't need to be; for example "f<TAB> now completes
to "foo" if that's the only available name.  Previously, only
names that require quotes would be offered.

* Schema-qualified identifiers are now supported where SQL syntax
allows it; many lesser-used completion rules neglected this.

* Completion operations that refer back to some previously-typed
name (for example, to complete names of columns belonging to a
previously-mentioned table) now allow variant spellings of the
previous name too.

In addition, performance of tab completion queries has been
improved for databases containing many objects, although
you'd only be likely to notice with a heavily-loaded server.

Authors of future tab-completion patches should note that this
commit changes many details about how tab completion queries
must be written:

* Tab completion queries now deal in raw object names; do not
use quote_ident().

* The name-matching restriction in a query must now be written
as "outputcol LIKE '%s'", not "substring(outputcol,1,%d)='%s'".

* The SchemaQuery mechanism has been extended so that it can
handle queries that refer back to a previous name.  Most completion
queries that do that should be converted to SchemaQuery form.
Only consider using a literal query if the previous name can
never be schema-qualified.  Don't use a literal query if the
name-to-be-completed can validly be schema-qualified, either.

* Use set_completion_reference() to specify which word is the previous
name to consider, for either a SchemaQuery or a literal query.

* If you want to offer some keywords in addition to a query result
(for example, offer COLUMN in addition to column names after
"ALTER TABLE t RENAME"), do not use the old hack of tacking the
keywords on with UNION.  Instead use the new QUERY_PLUS macros
to write such keywords separately from the query proper.  The
"addon" macro arguments that used to be used for this purpose
are gone.

* If your query returns something that's not a SQL identifier
(such as an attribute number or enum label), use the new
QUERY_VERBATIM macros to prevent the result from incorrectly
getting double-quoted.  You may still need to use quote_literal
in such a query, too.

Tom Lane and Haiying Tang

Discussion: https://postgr.es/m/a63cbd45e3884cf9b3961c2a6a95dcb7@G08CNEXMBPEKD05.g08.fujitsu.local
2022-01-30 13:33:23 -05:00
Robert Haas 7f6772317b Adjust server-side backup to depend on pg_write_server_files.
I had made it depend on superuser, but that seems clearly inferior.
Also document the permissions requirement in the straming replication
protocol section of the documentation, rather than only in the
section having to do with pg_basebackup.

Idea and patch from Dagfinn Ilmari Mannsåker.

Discussion: http://postgr.es/m/87bkzw160u.fsf@wibble.ilmari.org
2022-01-28 12:31:40 -05:00
Robert Haas d45099425e Allow server-side compression to be used with -Fp.
If you have a low-bandwidth connection between the client and the
server, it's reasonable to want to compress on the server side but
then decompress and extract the backup on the client side. This
commit allows you do to do just that.

Dipesh Pandit, with minor and mostly cosmetic changes by me.

Discussion: http://postgr.es/m/CAN1g5_HiSh8ajUMd4ePtGyCXo89iKZTzaNyzP_qv1eJbi4YHXA@mail.gmail.com
2022-01-28 08:41:25 -05:00
Peter Eisentraut 43f33dc018 Add HEADER support to COPY text format
The COPY CSV format supports the HEADER option to output a header
line.  This patch adds the same option to the default text format.  On
input, the HEADER option causes the first line to be skipped, same as
with CSV.

Author: Rémi Lapeyre <remi.lapeyre@lenstra.fr>
Discussion: https://www.postgresql.org/message-id/flat/CAF1-J-0PtCWMeLtswwGV2M70U26n4g33gpe1rcKQqe6wVQDrFA@mail.gmail.com
2022-01-28 09:44:47 +01:00
Peter Eisentraut 9a50f2e51c doc: Update ALTER COLLATION wording
The original text on refreshing collation versions was written
specifically for ICU, and then information for other providers was
incrementally tacked on at the end.  Reword this to be a bit more
general and less reflective of how it was added.
2022-01-28 08:21:36 +01:00
Peter Eisentraut c9cfc861fc Remove some trailing whitespace in documentation files 2022-01-27 18:31:01 +01:00
Peter Eisentraut 54637508f8 Change collate and ctype fields to type text
This changes the data type of the catalog fields datcollate, datctype,
collcollate, and collctype from name to text.  There wasn't ever a
really good reason for them to be of type name; presumably this was
just carried over from when they were fixed-size fields in pg_control,
first into the corresponding pg_database fields, and then to
pg_collation.  The values are not identifiers or object names, and we
don't ever look them up that way.

Changing to type text saves space in the typical case, since locale
names are typically only a few bytes long.  But it is also possible
that an ICU locale name with several customization options appended
could be longer than 63 bytes, so this also enables that case, which
was previously probably broken.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a@2ndquadrant.com
2022-01-27 08:54:25 +01:00
Tom Lane bd233bdd8d Replace use of deprecated Python module distutils.sysconfig, take 2.
With Python 3.10, configure spits out warnings about the module
distutils.sysconfig being deprecated and scheduled for removal in
Python 3.12.  Change the uses in configure to use the module sysconfig
instead.  The logic stays largely the same, although we have to
rely on INCLUDEPY instead of the deprecated get_python_inc function.

Note that sysconfig exists since Python 2.7, so this moves the minimum
required version up from Python 2.6.  Also, sysconfig didn't exist in
Python 3.1, so the minimum 3.x version is now 3.2.

We should consider back-patching this if it gives no further trouble,
as the no-longer-supported versions are old enough to probably not
be interesting to anyone.

Peter Eisentraut, Tom Lane, Andres Freund

Discussion: https://postgr.es/m/c74add3c-09c4-a9dd-1a03-a846e5b2fc52@enterprisedb.com
2022-01-25 18:52:44 -05:00
Robert Haas e1f860f134 Tidy up a few cosmetic issues related to pg_basebackup.
Commit 0ad8032910 failed to update
the pg_basebackup documentation to mention that "client-" or
"server-" can now be prepended to the compression method name. Fix
it there, and also in the --help output that you get from running
the binary.

Also in the documentation, there's an old issue that the arguments to
--checkpoint shouldn't be marked as parameters, because "fast" and
"spread" are literal strings. Fix that too.

Dagfinn Ilmari Mannsåker, partly as per a report from
Shinoda Noriyoshi.

Discussion: http://postgr.es/m/PH7PR84MB1885C1CF433057807551172BEE5F9@PH7PR84MB1885.NAMPRD84.PROD.OUTLOOK.COM
2022-01-25 14:59:37 -05:00
Michael Paquier 410aa248e5 Fix various typos, grammar and code style in comments and docs
This fixes a set of issues that have accumulated over the past months
(or years) in various code areas.  Most fixes are related to some recent
additions, as of the development of v15.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220124030001.GQ23027@telsasoft.com
2022-01-25 09:40:04 +09:00
Robert Haas 0ad8032910 Server-side gzip compression.
pg_basebackup's --compression option now lets you write either
"client-gzip" or "server-gzip" instead of just "gzip" to specify
where the compression should be performed. If you write simply
"gzip" it's taken to mean "client-gzip" unless you also use
--target, in which case it is interpreted to mean "server-gzip",
because that's the only thing that makes any sense in that case.

To make this work, the BASE_BACKUP command now takes new
COMPRESSION and COMPRESSION_LEVEL options.

At present, pg_basebackup cannot decompress .gz files, so
server-side compression will cause a failure if (1) -Ft is not
used or (2) -R is used or (3) -D- is used without --no-manifest.

Along the way, I removed the information message added by commit
5c649fe153 which occurred if you
specified no compression level and told you that the default level
had been used instead. That seemed like more output than most
people would want.

Also along the way, this adds a check to the server for
unrecognized base backup options. This repairs a bug introduced
by commit 0ba281cb4b.

This commit also adds some new test cases for pg_verifybackup.
They take a server-side backup with and without compression, and
then extract the backup if we have the OS facilities available
to do so, and then run pg_verifybackup on the extracted
directory. That is a good test of the functionality added by
this commit and also improves test coverage for the backup target
patch (commit 3500ccc39b) and for
pg_verifybackup itself.

Patch by me, with a bug fix by Jeevan Ladhe.  The patch set of which
this is a part has also had review and/or testing from Tushar Ahuja,
Suraj Kharage, Dipesh Pandit, and Mark Dilger.

Discussion: http://postgr.es/m/CA+Tgmoa-ST7fMLsVJduOB7Eub=2WjfpHS+QxHVEpUoinf4bOSg@mail.gmail.com
2022-01-24 15:13:18 -05:00
Robert Haas aa01051418 pg_upgrade: Preserve database OIDs.
Commit 9a974cbcba arranged to preserve
relfilenodes and tablespace OIDs. For similar reasons, also arrange
to preserve database OIDs.

One problem is that, up until now, the OIDs assigned to the template0
and postgres databases have not been fixed. This could be a problem
when upgrading, because pg_upgrade might try to migrate a database
from the old cluster to the new cluster while keeping the OID and find
a different database with that OID, resulting in a failure. If it finds
a database with the same name and the same OID that's OK: it will be
dropped and recreated. But the same OID and a different name is a
problem.

To prevent that, fix the OIDs for postgres and template0 to specific
values less than 16384. To avoid running afoul of this rule, these
values should not be changed in future releases. It's not a problem
that these OIDs aren't fixed in existing releases, because the OIDs
that we're assigning here weren't used for either of these databases
in any previous release. Thus, there's no chance that an upgrade of
a cluster from any previous release will collide with the OIDs we're
assigning here. And going forward, the OIDs will always be fixed, so
the only potential collision is with a system database having the
same name and the same OID, which is OK.

This patch lets users assign a specific OID to a database as well,
provided however that it can't be less than 16384. I (rhaas) thought
it might be better not to expose this capability to users, but the
consensus was otherwise, so the syntax is documented. Letting users
assign OIDs below 16384 would not be OK, though, because a
user-created database with a low-numbered OID might collide with a
system-created database in a future release. We therefore prohibit
that.

Shruthi KC, based on an earlier patch from Antonin Houska, reviewed
and with some adjustments by me.

Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com
Discussion: http://postgr.es/m/CAASxf_Mnwm1Dh2vd5FAhVX6S1nwNSZUB1z12VddYtM++H2+p7w@mail.gmail.com
2022-01-24 14:23:43 -05:00
Tom Lane 4f02cbcb68 Doc: add cross-references between array_to_string and string_to_array.
These functions aren't exact inverses, but they're closely related;
yet we document them in two different sections.  Add cross-ref
<link>s to improve that situation.

While here, move the strpos and substr entries to re-alphabetize
Table 9.10.  Also, drop an ancient compatibility note about
string_to_array, which wasn't even on the right page, so probably
few people ever saw it.

Discussion: https://postgr.es/m/164287017550.704.5840412183184961677@wrigleys.postgresql.org
2022-01-22 15:44:51 -05:00
Michael Paquier 5c649fe153 Extend the options of pg_basebackup to control compression
The option --compress is extended to accept a compression method and an
optional compression level, as of the grammar METHOD[:LEVEL].  The
methods currently support are "none" and "gzip", for client-side
compression.  Any of those methods use only an integer value for the
compression level, but any method implemented in the future could use
more specific keywords if necessary.

This commit keeps the logic backward-compatible.  Hence, the following
compatibility rules apply for the new format of the option --compress:
* -z/--gzip is a synonym of --compress=gzip.
* --compress=NUM implies:
** --compress=none if NUM = 0.
** --compress=gzip:NUM if NUM > 0.

Note that there are also plans to extend more this grammar with
server-side compression.

Reviewed-by: Robert Haas, Magnus Hagander, Álvaro Herrera, David
G. Johnston, Georgios Kokolatos
Discussion: https://postgr.es/m/Yb3GEgWwcu4wZDuA@paquier.xyz
2022-01-21 11:08:43 +09:00
Tom Lane 512fc2dd79 Revert "Make configure prefer python3 to plain python."
This reverts commit f201da39ed.
The buildfarm is not ready for python3, evidently, so we'll
give the owners some more time to get set up.

Discussion: https://postgr.es/m/2872c9a0-4b0a-1354-d5f6-94d6f85ba354@enterprisedb.com
2022-01-20 17:32:21 -05:00
Robert Haas 3500ccc39b Support base backup targets.
pg_basebackup now has a --target=TARGET[:DETAIL] option. If specfied,
it is sent to the server as the value of the TARGET option to the
BASE_BACKUP command. If DETAIL is included, it is sent as the value of
the new TARGET_DETAIL option to the BASE_BACKUP command.  If the
target is anything other than 'client', pg_basebackup assumes that it
will now be the server's job to write the backup in a location somehow
defined by the target, and that it therefore needs to write nothing
locally. However, the server will still send messages to the client
for progress reporting purposes.

On the server side, we now support two additional types of backup
targets.  There is a 'blackhole' target, which just throws away the
backup data without doing anything at all with it. Naturally, this
should only be used for testing and debugging purposes, since you will
not actually have a backup when it finishes running. More usefully,
there is also a 'server' target, so you can now use something like
'pg_basebackup -Xnone -t server:/SOME/PATH' to write a backup to some
location on the server. We can extend this to more types of targets
in the future, and might even want to create an extensibility
mechanism for adding new target types.

Since WAL fetching is handled with separate client-side logic, it's
not part of this mechanism; thus, backups with non-default targets
must use -Xnone or -Xfetch.

Patch by me, with a bug fix by Jeevan Ladhe.  The patch set of which
this is a part has also had review and/or testing from Tushar Ahuja,
Suraj Kharage, Dipesh Pandit, and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaYZbz0=Yk797aOJwkGJC-LK3iXn+wzzMx7KdwNpZhS5g@mail.gmail.com
2022-01-20 10:46:33 -05:00
Robert Haas ab4fd4f868 Remove 'datlastsysoid'.
It hasn't been used for anything for a long time. Up until recently,
we still queried it when dumping very old servers, but since
commit 30e7c175b8, there's no longer any
code at all that cares about it.

Discussion: http://postgr.es/m/CA+Tgmoa14=BRq0WEd0eevjEMn9EkghDB1FZEkBw7+UAb7tF49A@mail.gmail.com
2022-01-20 09:01:12 -05:00
Michael Paquier b2a76bb7d0 doc: Mention the level of locks taken on objects in COMMENT
This information was nowhere to be found.  This adds one note on the
page of COMMENT, and one note in the section dedicated to explicit
locking, both telling that a SHARE UPDATE EXCLUSIVE lock is taken on the
object commented.

Author: Nikolai Berkoff
Reviewed-by: Laurenz Albe
Discussion: https://postgr.es/m/_0HDHIGcCdCsUyXn22QwI2FEuNR6Fs71rtgGX6hfyBlUh5rrnE2qMmvIFu9EY4Pijr2gUmJEAXCjuNU2Oxku9TryLp9CdHllpsCfN3gD0-Y=@pm.me
Backpatch-through: 10
2022-01-20 16:54:47 +09:00
Tom Lane f201da39ed Make configure prefer python3 to plain python.
This avoids possibly selecting Python 2.x on systems that have
both Python 2 and Python 3.  We used to feel that what "python"
links to is a user choice that we should honor.  However, we're
about to cease support for Python 2, so users will no longer have
any choice of that sort.  This small change is being made ahead
of the big Python-2-ectomy so that we can see how much of the
buildfarm is not yet prepared for that.  Systems with only
Python 2 will continue to build that way, for now.

Discussion: https://postgr.es/m/2872c9a0-4b0a-1354-d5f6-94d6f85ba354@enterprisedb.com
2022-01-19 15:38:58 -05:00
Michael Paquier 44129c0432 doc: Fix description of pg_replication_origin_oid() in error case
This function returns NULL if the replication origin given in input
argument does not exist, contrary to what the docs described
previously.

Author: Ian Barwick
Discussion: https://postgr.es/m/CAB8KJ=htJjBL=103URqjOxV2mqb4rjphDpMeKdyKq_QXt6h05w@mail.gmail.com
Backpatch-through: 10
2022-01-19 10:37:40 +09:00
Tom Lane 5987feb70b Make PQcancel use the PGconn's tcp_user_timeout and keepalives settings.
If connectivity to the server has been lost or become flaky, the
user might well try to send a query cancel.  It's highly annoying
if PQcancel hangs up in such a case, but that's exactly what's likely
to happen.  To ameliorate this problem, apply the PGconn's
tcp_user_timeout and keepalives settings to the TCP connection used
to send the cancel.  This should be safe on Unix machines, since POSIX
specifies that setsockopt() is async-signal-safe.  We are guessing
that WSAIoctl(SIO_KEEPALIVE_VALS) is similarly safe on Windows.
(Note that at least in psql and our other frontend programs, there's
no safety issue involved anyway, since we run PQcancel in its own
thread rather than in a signal handler.)

Most of the value here comes from the expectation that tcp_user_timeout
will be applied as a connection timeout.  That appears to happen on
Linux, even though its tcp(7) man page claims differently.  The
keepalive options probably won't help much, but as long as we can
apply them for not much code, we might as well.

Jelte Fennema, reviewed by Fujii Masao and myself

Discussion: https://postgr.es/m/AM5PR83MB017870DE81FC84D5E21E9D1EF7AA9@AM5PR83MB0178.EURPRD83.prod.outlook.com
2022-01-18 14:13:13 -05:00
Robert Haas cc333f3233 Modify pg_basebackup to use a new COPY subprotocol for base backups.
In the new approach, all files across all tablespaces are sent in a
single COPY OUT operation. The CopyData messages are no longer raw
archive content; rather, each message is prefixed with a type byte
that describes its purpose, e.g. 'n' signifies the start of a new
archive and 'd' signifies archive or manifest data. This protocol
is significantly more extensible than the old approach, since we can
later create more message types, though not without concern for
backward compatibility.

The new protocol sends a few things to the client that the old one
did not. First, it sends the name of each archive explicitly, instead
of letting the client compute it. This is intended to make it easier
to write future patches that might send archives in a format other
that tar (e.g. cpio, pax, tar.gz). Second, it sends explicit progress
messages rather than allowing the client to assume that progress is
defined by the number of bytes received. This will help with future
features where the server compresses the data, or sends it someplace
directly rather than transmitting it to the client.

The old protocol is still supported for compatibility with previous
releases. The new protocol is selected by means of a new
TARGET option to the BASE_BACKUP command. Currently, the
only supported target is 'client'. Support for additional
targets will be added in a later commit.

Patch by me. The patch set of which this is a part has had review
and/or testing from Jeevan Ladhe, Tushar Ahuja, Suraj Kharage,
Dipesh Pandit, and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaYZbz0=Yk797aOJwkGJC-LK3iXn+wzzMx7KdwNpZhS5g@mail.gmail.com
2022-01-18 13:47:49 -05:00
Peter Eisentraut dda42ff8e6 Revert "Replace use of deprecated Python module distutils.sysconfig"
This reverts commit e0e567a106.

On various platforms, the new approach using the sysconfig module
reported incorrect values for the include directory, and so any
Python-related compilations failed.  Revert for now and revisit later.
2022-01-18 17:42:29 +01:00
Peter Eisentraut e0e567a106 Replace use of deprecated Python module distutils.sysconfig
With Python 3.10, configure spits out warnings about the module
distutils.sysconfig being deprecated and scheduled for removal in
Python 3.12.  Change the uses in configure to use the module sysconfig
instead.  The logic stays the same.

Note that sysconfig exists since Python 2.7, so this moves the minimum
required version up from Python 2.6.

Discussion: https://www.postgresql.org/message-id/flat/c74add3c-09c4-a9dd-1a03-a846e5b2fc52%40enterprisedb.com
2022-01-18 06:37:02 +01:00
Michael Paquier 2158628864 Add support for --no-table-access-method in pg_{dump,dumpall,restore}
The logic is similar to default_tablespace in some ways, so as no SET
queries on default_table_access_method are generated before dumping or
restoring an object (table or materialized view support table AMs) when
specifying this new option.

This option is useful to enforce the use of a default access method even
if some tables included in a dump use an AM different than the system's
default.

There are already two cases in the TAP tests of pg_dump with a table and
a materialized view that use a non-default table AM, and these are
extended that the new option does not generate SET clauses on
default_table_access_method.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20211207153930.GR17618@telsasoft.com
2022-01-17 14:51:46 +09:00
Thomas Munro f47ed79cc8 Test replay of regression tests, attempt II.
See commit message for 123828a7fa.  The
only change this time is the order of the arguments passed to
pg_regress.  The previously version broke in the build farm environment
due to the contents of EXTRA_REGRESS_OPTS (see also commit 8cade04c
which had to do something similar).

Discussion: https://postgr.es/m/CA%2BhUKGKpRWQ9SxdxxDmTBCJoR0YnFpMBe7kyzY8SUQk%2BHeskxg%40mail.gmail.com
2022-01-17 16:34:55 +13:00
Michael Paquier dc686681e0 Introduce log_destination=jsonlog
"jsonlog" is a new value that can be added to log_destination to provide
logs in the JSON format, with its output written to a file, making it
the third type of destination of this kind, after "stderr" and
"csvlog".  The format is convenient to feed logs to other applications.
There is also a plugin external to core that provided this feature using
the hook in elog.c, but this had to overwrite the output of "stderr" to
work, so being able to do both at the same time was not possible.  The
files generated by this log format are suffixed with ".json", and use
the same rotation policies as the other two formats depending on the
backend configuration.

This takes advantage of the refactoring work done previously in ac7c807,
bed6ed3, 8b76f89 and 2d77d83 for the backend parts, and 72b76f7 for the
TAP tests, making the addition of any new file-based format rather
straight-forward.

The documentation is updated to list all the keys and the values that
can exist in this new format.  pg_current_logfile() also required a
refresh for the new option.

Author: Sehrope Sarkuni, Michael Paquier
Reviewed-by: Nathan Bossart, Justin Pryzby
Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
2022-01-17 10:16:53 +09:00
Tomas Vondra 269b532aef Add stxdinherit flag to pg_statistic_ext_data
Add pg_statistic_ext_data.stxdinherit flag, so that for each extended
statistics definition we can store two versions of data - one for the
relation alone, one for the whole inheritance tree. This is analogous to
pg_statistic.stainherit, but we failed to include such flag in catalogs
for extended statistics, and we had to work around it (see commits
859b3003de, 36c4bc6e72 and 20b9fa308e).

This changes the relationship between the two catalogs storing extended
statistics objects (pg_statistic_ext and pg_statistic_ext_data). Until
now, there was a simple 1:1 mapping - for each definition there was one
pg_statistic_ext_data row, and this row was inserted while creating the
statistics (and then updated during ANALYZE). With the stxdinherit flag,
we don't know how many rows there will be (child relations may be added
after the statistics object is defined), so there may be up to two rows.

We could make CREATE STATISTICS to always create both rows, but that
seems wasteful - without partitioning we only need stxdinherit=false
rows, and declaratively partitioned tables need only stxdinherit=true.
So we no longer initialize pg_statistic_ext_data in CREATE STATISTICS,
and instead make that a responsibility of ANALYZE. Which is what we do
for regular statistics too.

Patch by me, with extensive improvements and fixes by Justin Pryzby.

Author: Tomas Vondra, Justin Pryzby
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
2022-01-16 13:38:01 +01:00
Tom Lane 4483b2cf29 Remove standby_schedule and associated test files.
Since this test schedule is not run by default, it's next door to
unused.  Moreover, its test coverage is very thin, and what there is
is just about entirely superseded by the src/test/recovery tests.
Let's drop it instead of carrying obsolete tests.

Discussion: https://postgr.es/m/3911012.1641246643@sss.pgh.pa.us
2022-01-15 15:54:10 -05:00
Peter Geoghegan 49c9d9fcfa Unify VACUUM VERBOSE and autovacuum logging.
The log_autovacuum_min_duration instrumentation used its own dedicated
code for logging, which was not reused by VACUUM VERBOSE.  This was
highly duplicative, and sometimes led to each code path using slightly
different accounting for essentially the same information.

Clean things up by making VACUUM VERBOSE reuse the same instrumentation
code.  This code restructuring changes the structure of the VACUUM
VERBOSE output itself, but that seems like an overall improvement.  The
most noticeable change in VACUUM VERBOSE output is that it no longer
outputs a distinct message per index per round of index vacuuming.  Most
of the same information (about each index) is now shown in its new
per-operation summary message.  This is far more legible.

A few details are no longer displayed by VACUUM VERBOSE, but that's no
real loss in practice, especially in the common case where we don't need
multiple index scans/rounds of vacuuming.  This super fine-grained
information is still available via DEBUG2 messages, which might still be
useful in debugging scenarios.

VACUUM VERBOSE now shows new instrumentation, which is typically very
useful: all of the log_autovacuum_min_duration instrumentation that it
missed out on before now.  This includes information about WAL overhead,
buffers hit/missed/dirtied information, and I/O timing information.

VACUUM VERBOSE still retains a few INFO messages of its own.  This is
limited to output concerning the progress of heap rel truncation, as
well as some basic information about parallel workers.  These details
are still potentially quite useful.  They aren't a good fit for the log
output, which must summarize the whole operation.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-WzmW4Me7_qR4X4ka7pxP-jGmn7=Npma_-Z-9Y1eD0MQRLw@mail.gmail.com
2022-01-14 16:50:34 -08:00
Thomas Munro dccee0f2b7 Revert "Test replay of regression tests."
This reverts commit 123828a7fa.

Discussion: https://postgr.es/m/CA%2BhUKG%2BGBC-6QhOKt6Y7ccrXSjbRHB7Di295%3D0rAGhE7a7hSrQ%40mail.gmail.com
2022-01-15 00:44:32 +13:00
Thomas Munro 123828a7fa Test replay of regression tests.
Add a new TAP test under src/test/recovery to run the standard
regression tests while a streaming replica replays the WAL.  This
provides a basic workout for WAL decoding and redo code, and compares
the replicated result.

Optionally, enable (expensive) wal_consistency_checking if listed in
the env variable PG_TEST_EXTRA.

Reviewed-by: 綱川 貴之 (Takayuki Tsunakawa) <tsunakawa.takay@fujitsu.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGKpRWQ9SxdxxDmTBCJoR0YnFpMBe7kyzY8SUQk%2BHeskxg%40mail.gmail.com
2022-01-15 00:09:24 +13:00
Thomas Munro 7170f2159f Allow "in place" tablespaces.
Provide a developer-only GUC allow_in_place_tablespaces, disabled by
default.  When enabled, tablespaces can be created with an empty
LOCATION string, meaning that they should be created as a directory
directly beneath pg_tblspc.  This can be used for new testing scenarios,
in a follow-up patch.  Not intended for end-user usage, since it might
confuse backup tools that expect symlinks.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGKpRWQ9SxdxxDmTBCJoR0YnFpMBe7kyzY8SUQk%2BHeskxg%40mail.gmail.com
2022-01-15 00:09:24 +13:00
Fujii Masao 379a28b56a doc: Add "(process)" to the term "WAL receiver" in glossary.
This commit changes the term "WAL receiver" to "WAL receiver (process)"
in the glossary, so that users can easily understand WAL receiver is
obviously a process.

Author: Fujii Masao
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/51b76596-ab4f-c9f0-1005-60ce1bb14b6b@oss.nttdata.com
2022-01-13 12:17:33 +09:00
Fujii Masao 790fbda902 Enhance pg_log_backend_memory_contexts() for auxiliary processes.
Previously pg_log_backend_memory_contexts() could request to
log the memory contexts of backends, but not of auxiliary processes
such as checkpointer. This commit enhances the function so that
it can also send the request to auxiliary processes. It's useful to
look at the memory contexts of those processes for debugging purpose
and better understanding of the memory usage pattern of them.

Note that pg_log_backend_memory_contexts() cannot send the request
to logger or statistics collector. Because this logging request
mechanism is based on shared memory but those processes aren't
connected to that.

Author: Bharath Rupireddy
Reviewed-by: Vignesh C, Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/CALj2ACU1nBzpacOK2q=a65S_4+Oaz_rLTsU1Ri0gf7YUmnmhfQ@mail.gmail.com
2022-01-11 23:19:59 +09:00
Amit Kapila 85c61ba892 Update docs of logical replication for commit 8d74fc96db.
After commit 8d74fc96db, the details of logical replication conflicts can
be found in pg_stat_subscription_workers view.

Author: Masahiko Sawada
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAD21AoC+zm5tGN8x88AJZYcX0g_eiEuu5XdrksNmSeR3Xzwjfg@mail.gmail.com
2022-01-11 12:01:48 +05:30
Tom Lane 9ef2c65555 Doc: fix bogus example about ambiguous timestamps.
I had a brain fade in commit d32899157, and used 2:30AM as the
example timestamp for both spring-forward and fall-back cases.
But it's not actually ambiguous at all in the fall-back case,
because that transition is from 2AM to 1AM under USA rules.
Fix the example to use 1:30AM, which *is* ambiguous.

Noted while answering a question from Aleksander Alekseev.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/2191355.1641828552@sss.pgh.pa.us
2022-01-10 11:46:16 -05:00
Jeff Davis 96a6f11c06 More cleanup of a2ab9c06ea.
Require SELECT privileges when performing UPDATE or DELETE, to be
consistent with the way a normal UPDATE or DELETE command works.

Simplify subscription test it so that it runs faster. Also, wait for
initial table sync to complete to avoid intermittent failures.

Minor doc fixup.

Discussion: https://postgr.es/m/CAA4eK1L3-qAtLO4sNGaNhzcyRi_Ufmh2YPPnUjkROBK0tN%3Dx%3Dg%40mail.gmail.com
Discussion: https://postgr.es/m/1514479.1641664638%40sss.pgh.pa.us
Discussion: https://postgr.es/m/Ydkfj5IsZg7mQR0g@paquier.xyz
2022-01-08 20:07:16 -08:00
Jeff Davis a2ab9c06ea Respect permissions within logical replication.
Prevent logical replication workers from performing insert, update,
delete, truncate, or copy commands on tables unless the subscription
owner has permission to do so.

Prevent subscription owners from circumventing row-level security by
forbidding replication into tables with row-level security policies
which the subscription owner is subject to, without regard to whether
the policy would ordinarily allow the INSERT, UPDATE, DELETE or
TRUNCATE which is being replicated.  This seems sufficient for now, as
superusers, roles with bypassrls, and target table owners should still
be able to replicate despite RLS policies.  We can revisit the
question of applying row-level security policies on a per-row basis if
this restriction proves too severe in practice.

Author: Mark Dilger
Reviewed-by: Jeff Davis, Andrew Dunstan, Ronan Dunklau
Discussion: https://postgr.es/m/9DFC88D3-1300-4DE8-ACBC-4CEF84399A53%40enterprisedb.com
2022-01-07 17:40:56 -08:00
Bruce Momjian 27b77ecf9f Update copyright for 2022
Backpatch-through: 10
2022-01-07 19:04:57 -05:00
Tom Lane 328dfbdabd Extend psql's \lo_list/\dl to be able to print large objects' ACLs.
The ACL is printed when you add + to the command, similarly to
various other psql backslash commands.

Along the way, move the code for this into describe.c,
where it is a better fit (and can share some code).

Pavel Luzanov, reviewed by Georgios Kokolatos

Discussion: https://postgr.es/m/6d722115-6297-bc53-bb7f-5f150e765299@postgrespro.ru
2022-01-06 13:09:05 -05:00
Michael Paquier ee5822361d doc: Remove link to JSON support in the SQL specification
The link used in the documentation is dead, and the only options to have
an access to this part of the SQL specification are not free.  Like any
other books referred, just remove the link to keep some neutrality but
keep its reference.

Reported-by: Erik Rijkers
Discussion: https://postgr.es/m/989abd7d-af30-ab52-1201-bf0b4f33b872@xs4all.nl
Backpatch-through: 12
2022-01-06 11:41:09 +09:00
Magnus Hagander 69872d0bbe Fix typo
Reported-By: Eric Mutta
Backpatch-through: 10
Discussion: https://postgr.es/m/164052477973.21665.7888120874624887609@wrigleys.postgresql.org
2022-01-02 17:06:32 +01:00
Fujii Masao 6e0cb3dec1 postgres_fdw: Allow postgres_fdw.application_name to include escape sequences.
application_name that used when postgres_fdw establishes a connection to
a foreign server can be specified in either or both a connection parameter
of a server object and GUC postgres_fdw.application_name. This commit
allows those parameters to include escape sequences that begins with
% character. Then postgres_fdw replaces those escape sequences with
status information. For example, %d and %u are replaced with user name
and database name in local server, respectively. This feature enables us
to add information more easily to track remote transactions or queries,
into application_name of a remote connection.

Author: Hayato Kuroda
Reviewed-by: Kyotaro Horiguchi, Masahiro Ikeda, Hou Zhijie, Fujii Masao
Discussion: https://postgr.es/m/TYAPR01MB5866FAE71C66547C64616584F5EB9@TYAPR01MB5866.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/TYCPR01MB5870D1E8B949DAF6D3B84E02F5F29@TYCPR01MB5870.jpnprd01.prod.outlook.com
2021-12-24 16:55:11 +09:00
Bruce Momjian e2e1bbde46 doc: clarify when expression indexes evaluate their expressions
Only non-HOT updates evaluate the index expression.

Reported-by: Chris Lowder

Discussion: https://postgr.es/m/163967385701.26064.15365003480975321072@wrigleys.postgresql.org

Backpatch-through: 10
2021-12-22 16:29:16 -05:00
Michael Paquier fc95d35b94 Correct comment and some documentation about REPLICA_IDENTITY_INDEX
catalog/pg_class.h was stating that REPLICA_IDENTITY_INDEX with a
dropped index is equivalent to REPLICA_IDENTITY_DEFAULT.  The code tells
a different story, as it is equivalent to REPLICA_IDENTITY_NOTHING.

The behavior exists since the introduction of replica identities, and
fe7fd4e even added tests for this case but I somewhat forgot to fix this
comment.

While on it, this commit reorganizes the documentation about replica
identities on the ALTER TABLE page, and a note is added about the case
of dropped indexes with REPLICA_IDENTITY_INDEX.

Author: Michael Paquier, Wei Wang
Reviewed-by: Euler Taveira
Discussion: https://postgr.es/m/OS3PR01MB6275464AD0A681A0793F56879E759@OS3PR01MB6275.jpnprd01.prod.outlook.com
Backpatch-through: 10
2021-12-22 16:37:58 +09:00
Tom Lane 33d3eeadb2 Add a \getenv command to psql.
\getenv fetches the value of an environment variable into a psql
variable.  This is the inverse of the \setenv command that was added
over ten years ago.  We'd not seen a compelling use-case for \getenv
at the time, but upcoming regression test refactoring provides a
sufficient reason to add it now.

Discussion: https://postgr.es/m/1655733.1639871614@sss.pgh.pa.us
2021-12-20 13:17:58 -05:00
Peter Eisentraut 222b697ec0 doc: More documentation on regular expressions and SQL standard
Reviewed-by: Gilles Darold <gilles@darold.net>
Discussion: https://www.postgresql.org/message-id/b7988566-daa2-80ed-2fdc-6f6630462d26@enterprisedb.com
2021-12-20 10:36:44 +01:00
Michael Paquier 3d5ffccb6d Add option -N/--no-sync to pg_upgrade
This is an option consistent with what the other tools of src/bin/
(pg_checksums, pg_dump, pg_rewind and pg_basebackup) provide which is
useful for leveraging the I/O effort when testing things.  This is not
to be used in a production environment.

All the regression tests of pg_upgrade are updated to use this new
option.  This happens to cut at most a couple of seconds in environments
constrained on I/O, by avoiding a flush of data folder for the new
cluster upgraded.

Author: Michael Paquier
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/YbrhzuBmBxS/DkfX@paquier.xyz
2021-12-18 16:18:45 +09:00
Fujii Masao 58e2e6eb67 doc: Add note about postgres_fdw.application_name.
postgres_fdw.application_name can be any string of any length
and contain even non-ASCII characters.  However when it's passed
to and used as application_name in a foreign server, it's truncated
to less than NAMEDATALEN characters and any characters
other than printable ASCII ones in it will be replaced with question
marks. This commit adds these notes into the docs.

Author: Hayato Kuroda
Reviewed-by: Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/TYCPR01MB5870D1E8B949DAF6D3B84E02F5F29@TYCPR01MB5870.jpnprd01.prod.outlook.com
2021-12-16 15:18:30 +09:00
Tom Lane 2a712066d0 Remove pg_dump's --no-synchronized-snapshots switch.
Server versions for which there was a plausible reason to
use this switch are all out of support now.  Leaving it
around would accomplish little except to let careless DBAs
shoot themselves in the foot.

Discussion: https://postgr.es/m/556122.1639520324@sss.pgh.pa.us
2021-12-15 18:44:47 -05:00
Michael Paquier 7acd01015c Adjust behavior of some env settings for the TAP tests of MSVC
edc2332 has introduced in vcregress.pl some control on the environment
variables LZ4, TAR and GZIP_PROGRAM to allow any TAP tests to be able
use those commands.  This makes the settings more consistent with
src/Makefile.global.in, as the same default gets used for Make and MSVC
builds.

Each parameter can be changed in buildenv.pl, but as a default gets
assigned after loading buldenv.pl, it is not possible to unset any of
these, and using an empty value would not work with "||=" either.  As
some environments may not have a compatible command in their PATH (tar
coming from MinGW is an issue, for one), this could break tests without
an exit path to bypass any failing test.  This commit changes things so
as the default values for LZ4, TAR and GZIP_PROGRAM are assigned before
loading buildenv.pl, not after.  This way, we keep the same amount of
compatibility as a GNU build with the same defaults, and it becomes
possible to unset any of those values.

While on it, this adds some documentation about those three variables in
the section dedicated to the TAP tests for MSVC.

Per discussion with Andrew Dunstan.

Discussion: https://postgr.es/m/YbGYe483803il3X7@paquier.xyz
Backpatch-through: 10
2021-12-15 10:39:24 +09:00
Tom Lane e469f0aaf3 Remove pg_upgrade support for upgrading from pre-9.2 servers.
Per discussion, we'll limit support for old servers to those branches
that can still be built easily on modern platforms, which as of now
is 9.2 and up.

Discussion: https://postgr.es/m/2923349.1634942313@sss.pgh.pa.us
2021-12-14 19:17:55 -05:00
Tom Lane 30e7c175b8 Remove pg_dump/pg_dumpall support for dumping from pre-9.2 servers.
Per discussion, we'll limit support for old servers to those branches
that can still be built easily on modern platforms, which as of now
is 9.2 and up.  Remove over a thousand lines of code dedicated to
dumping from older server versions.  (As in previous changes of
this sort, we aren't removing pg_restore's ability to read older
archive files ... though it's fair to wonder how that might be
tested nowadays.)  This cleans up some dead code left behind by
commit 989596152.

Discussion: https://postgr.es/m/2923349.1634942313@sss.pgh.pa.us
2021-12-14 17:09:07 -05:00
Tom Lane 922b23c13b Doc: de-document unimplemented geometric operators.
In commit 791090bd7, I made an effort to fill in documentation
for all geometric operators listed in pg_operator.  However,
it now appears that at least some of the omissions may have been
intentional, because some of those operator entries point at
unimplemented stub functions.  Remove those from the docs again.

(In HEAD, poly_distance stays, because c5c192d7b just added an
implementation for it.)

Per complaint from Anton Voloshin.

Discussion: https://postgr.es/m/3426566.1638832718@sss.pgh.pa.us
2021-12-13 17:49:36 -05:00
Tom Lane c5c192d7bd Implement poly_distance().
geo_ops.c contains half a dozen functions that are just stubs throwing
ERRCODE_FEATURE_NOT_SUPPORTED.  Since it's been like that for more
than twenty years, there's clearly not a lot of interest in filling in
the stubs.  However, I'm uncomfortable with deleting poly_distance(),
since every other geometric type supports a distance-to-another-object-
of-the-same-type function.  We can easily add this capability by
cribbing from poly_overlap() and path_distance().

It's possible that the (existing) test case for this will show some
numeric instability, but hopefully the buildfarm will expose it if so.

In passing, improve the documentation to try to explain why polygons
are distinct from closed paths in the first place.

Discussion: https://postgr.es/m/3426566.1638832718@sss.pgh.pa.us
2021-12-13 17:33:32 -05:00
Robert Haas 64da07c41a Default to log_checkpoints=on, log_autovacuum_min_duration=10m
The idea here is that when a performance problem is known to have
occurred at a certain point in time, it's a good thing if there is
some information available from the logs to help figure out what
might have happened around that time.

This change attracted an above-average amount of dissent, because
it means that a server with default settings will produce some amount
of log output even if nothing has gone wrong. However, by my count,
the mailing list discussion had about twice as many people in favor
of the change as opposed. The reasons for believing that the extra
log output is not an issue in practice are: (1) the rate at which
messages can be generated by this setting is bounded to one every
few minutes on a properly-configured system and (2) production
systems tend to have a lot more junk in the log from that due to
failed connection attempts, ERROR messages generated by application
activity, and the like.

Bharath Rupireddy, reviewed by Fujii Masao and by me. Many other
people commented on the thread, but as far as I can see that was
discussion of the merits of the change rather than review of the
patch.

Discussion: https://postgr.es/m/CALj2ACX-rW_OeDcp4gqrFUAkf1f50Fnh138dmkd0JkvCNQRKGA@mail.gmail.com
2021-12-13 09:48:48 -05:00
Tom Lane 07eee5a0dc Create a new type category for "internal use" types.
Historically we've put type "char" into the S (String) typcategory,
although calling it a string is a stretch considering it can only
store one byte.  (In our actual usage, it's more like an enum.)
This choice now seems wrong in view of the special heuristics
that parse_func.c and parse_coerce.c have for TYPCATEGORY_STRING:
it's not a great idea for "char" to have those preferential casting
behaviors.

Worse than that, recent patches inventing special-purpose types
like pg_node_tree have assigned typcategory S to those types,
meaning they also get preferential casting treatment that's designed
on the assumption that they can hold arbitrary text.

To fix, invent a new category TYPCATEGORY_INTERNAL for internal-use
types, and assign that to all these types.  I used code 'Z' for
lack of a better idea ('I' was already taken).

This change breaks one query in psql/describe.c, which now needs to
explicitly cast a catalog "char" column to text before concatenating
it with an undecorated literal.  Also, a test case in contrib/citext
now needs an explicit cast to convert citext to "char".  Since the
point of this change is to not have "char" be a surprisingly-available
cast target, these breakages seem OK.

Per report from Ian Campbell.

Discussion: https://postgr.es/m/2216388.1638480141@sss.pgh.pa.us
2021-12-11 14:10:51 -05:00
Tomas Vondra 4c6145b514 Add bool to btree_gist documentation
Commit 57e3c516 added bool opclass to btree_gist, but update the list of
data types in docs to reflect this change.

Reported-by: Pavel Luzanov
Discussion: https://postgr.es/m/CAE2gYzyDKJBZngssR84VGZEN=Ux=V9FV23QfPgo+7-yYnKKg4g@mail.gmail.com
2021-12-11 04:59:15 +01:00
Michael Paquier 5d08137076 Fix some typos with {a,an}
One of the changes impacts the documentation, so backpatch.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pu6+c+r3mY24VT7u+H+E_s6vMr5OdRiZ8NT3EOa-E5Lmw@mail.gmail.com
Backpatch-through: 14
2021-12-09 15:20:36 +09:00
Tom Lane 6f0e6ab04d Doc: improve xfunc-c-type-table.
List types numeric and timestamptz, which don't seem to have ever been
included here.  Restore bigint, which was no-doubt-accidentally deleted
in v12.  Fix some errors, or at least obsolete usages (nobody declares
float arguments as "float8*" anymore, even though they might be that
under the hood).  Re-alphabetize.  Remove the seeming claim that this
is a complete list of built-in types.

Per question from Oskar Stenberg.

Discussion: https://postgr.es/m/HE1PR03MB2971DE2527ECE1E99D6C19A8F96E9@HE1PR03MB2971.eurprd03.prod.outlook.com
2021-12-08 16:54:52 -05:00
Peter Eisentraut d6f96ed94e Allow specifying column list for foreign key ON DELETE SET actions
Extend the foreign key ON DELETE actions SET NULL and SET DEFAULT by
allowing the specification of a column list, like

    CREATE TABLE posts (
        ...
        FOREIGN KEY (tenant_id, author_id) REFERENCES users ON DELETE SET NULL (author_id)
    );

If a column list is specified, only those columns are set to
null/default, instead of all the columns in the foreign-key
constraint.

This is useful for multitenant or sharded schemas, where the tenant or
shard ID is included in the primary key of all tables but shouldn't be
set to null.

Author: Paul Martinez <paulmtz@google.com>
Discussion: https://www.postgresql.org/message-id/flat/CACqFVBZQyMYJV=njbSMxf+rbDHpx=W=B7AEaMKn8dWn9OZJY7w@mail.gmail.com
2021-12-08 11:13:57 +01:00
Peter Eisentraut e9e63b7022 Fix inappropriate uses of PG_GETARG_UINT32()
The chr() function used PG_GETARG_UINT32() even though the argument is
declared as (signed) integer.  As a result, you can pass negative
arguments to this function and it internally interprets them as
positive.  Ultimately ends up being harmless, but it seems wrong, so
fix this and rearrange the internal error checking a bit to
accommodate this.

Another case was in the documentation, where example code used
PG_GETARG_UINT32() with an argument declared as signed integer.

Reviewed-by: Nathan Bossart <bossartn@amazon.com>
Discussion: https://www.postgresql.org/message-id/flat/7e43869b-d412-8f81-30a3-809783edc9a3%40enterprisedb.com
2021-12-06 13:37:11 +01:00
Daniel Gustafsson fadac33bb8 Doc: Fix misleading wording of CRL parameters
ssl_crl_file and ssl_crl_dir are both used to for client certificate
revocation, not server certificates.  The description for the params
could be easily misread to mean the opposite however,  as evidenced
by the bugreport leading to this fix.  Similarly, expand sslcrl and
and sslcrldir to explicitly mention server certificates. While there
also mention sslcrldir where previously only sslcrl was discussed.

Backpatch down to v10, with the CRL dir fixes down to 14 where they
were introduced.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/20211202.135441.590555657708629486.horikyota.ntt@gmail.com
Discussion: https://postgr.es/m/CABWY_HCBUCjY1EJHrEGePGEaSZ5b29apgTohCyygtsqe_ySYng@mail.gmail.com
Backpatch-through: 10
2021-12-03 14:15:50 +01:00
Michael Paquier f2c52eeba9 pg_waldump: Emit stats summary when interrupted by SIGINT
Previously, pg_waldump would not display its statistics summary if it
got interrupted by SIGINT (or say a simple Ctrl+C).  It gains with this
commit a signal handler for SIGINT, trapping the signal to exit at the
earliest convenience to allow a display of the stats summary before
exiting.  This makes the reports more interactive, similarly to strace
-c.

This new behavior makes the combination of the options --stats and
--follow much more useful, so as the user will get a report for any
invocation of pg_waldump in such a case.  Information about the LSN
range of the stats computed is added as a header to the report
displayed.

This implementation comes from a suggestion by Álvaro Herrera and
myself, following a complaint by the author of this patch about --stats
and --follow not being useful together originally.

As documented, this is not supported on Windows, though its support
would be possible by catching the terminal events associated to Ctrl+C,
for example (this may require a more centralized implementation, as
other tools could benefit from a common API).

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACUUx3PcK2z9h0_m7vehreZAUbcmOky9WSEpe8TofhV=PQ@mail.gmail.com
2021-12-02 13:52:16 +09:00
Robert Haas 81fca310b3 Document that tar archives are now properly terminated.
Commit 5a1007a508 changed the server
behavior, but I didn't notice that the existing behavior was
documented, and therefore did not update the documentation.
This commit does that.

I chose to mention that the behavior has changed rather than just
removing the reference to a deviation from a standard. It seemed
like that might be helpful to tool authors.

Discussion: http://postgr.es/m/CA+TgmoaYZbz0=Yk797aOJwkGJC-LK3iXn+wzzMx7KdwNpZhS5g@mail.gmail.com
2021-12-01 08:55:00 -05:00
Peter Eisentraut 5786fe154b doc: Some additional information about when to use referential actions 2021-12-01 11:25:58 +01:00
Amit Kapila eb7828f54a Doc: Add "Attach Partition" limitation during logical replication.
ATTACHing a table into a partition tree whose root is published using a
publication with publish_via_partition_root set to true does not result in
the table's existing contents being replicated. This happens because
subscriber doesn't consider replicating the newly attached partition as
the root table is already in a 'ready' state.

This behavior was introduced in PG13 (83fd4532a7) where we allowed to
publish partition changes via ancestors.

We can consider fixing this limitation in the future.

Author: Amit Langote
Reviewed-by: Hou Zhijie, Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/OS0PR01MB5716E97F00732B52DC2BBC2594989@OS0PR01MB5716.jpnprd01.prod.outlook.com
2021-12-01 10:07:45 +05:30
Tomas Vondra 5753d4ee32 Ignore BRIN indexes when checking for HOT udpates
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed only by BRIN indexes. There are no index
pointers to individual tuples in BRIN, and the page range summary will
be updated anyway as it relies on visibility info.

This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.

Patch by Josef Simanek, various fixes and improvements by me.

Author: Josef Simanek
Reviewed-by: Tomas Vondra, Alvaro Herrera
Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
2021-11-30 20:04:38 +01:00
Amit Kapila 8d74fc96db Add a view to show the stats of subscription workers.
This commit adds a new system view pg_stat_subscription_workers, that
shows information about any errors which occur during the application of
logical replication changes as well as during performing initial table
synchronization. The subscription statistics entries are removed when the
corresponding subscription is removed.

It also adds an SQL function pg_stat_reset_subscription_worker() to reset
single subscription errors.

The contents of this view can be used by an upcoming patch that skips the
particular transaction that conflicts with the existing data on the
subscriber.

This view can be extended in the future to track other xact related
statistics like the number of xacts committed/aborted for subscription
workers.

Author: Masahiko Sawada
Reviewed-by: Greg Nancarrow, Hou Zhijie, Tang Haiying, Vignesh C, Dilip Kumar, Takamichi Osumi, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
2021-11-30 08:54:30 +05:30
Tom Lane 4f33af23e7 Doc: improve documentation about ORDER BY in matviews.
Remove the confusing use of ORDER BY in an example materialized
view.  It adds nothing to the example, but might encourage
people to follow bad practice.  Clarify REFRESH MATERIALIZED
VIEW's note about whether view ordering is retained (it isn't).

Maciek Sakrejda

Discussion: https://postgr.es/m/CAOtHd0D-OvrUU0C=4hX28p4BaSE1XL78BAQ0VcDaLLt8tdUzsg@mail.gmail.com
2021-11-29 12:13:12 -05:00
Alvaro Herrera dd484c97f5
Copy-edit vacuuumdb --analyze-in-stages doc blurb
I had made a few typos, and Nikolai Berkoff made a wording change
suggestion.

Discussion: https://postgr.es/m/VMwe7-sGegrQPQ7fJjSCdsEbESKeJFOb6G4DFxxNrf45I7DzHio7sNUH88wWRMnAy5a5G0-FB31dxPM47ldigW6WdiCPncHgqO9bNl6F240=@pm.me
2021-11-26 14:42:15 -03:00
Alvaro Herrera 013bb6c8c0
Document units for max_slot_wal_keep_size
The doc blurb failed to mention units, as well as lacking the point
about changeability.

Backpatch to 13.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reported by: b1000101@pm.me
Discussion: https://postgr.es/m/163760291192.26193.10801700492025355788@wrigleys.postgresql.org
2021-11-26 14:31:57 -03:00
Tom Lane 4ac452e228 Doc: improve documentation about nextval()/setval().
Clarify that the results of nextval and setval are not guaranteed
persistent until the calling transaction commits.  Some people
seem to have drawn the opposite conclusion from the statement that
these functions are never rolled back, so re-word to avoid saying
it quite that way.

Discussion: https://postgr.es/m/CAKU4AWohO=NfM-4KiZWvdc+z3c1C9FrUBR6xnReFJ6sfy0i=Lw@mail.gmail.com
2021-11-24 13:37:11 -05:00
Heikki Linnakangas 373e552189 Fix missing space in docs.
Author: Japin Li
Discussion: https://www.postgresql.org/message-id/MEYP282MB1669C36E5F733C2EFBDCB80BB6619@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2021-11-24 18:32:56 +02:00
Michael Paquier b2265d305d Add support for Visual Studio 2022 in build scripts
Documentation and any code paths related to VS are updated to keep the
whole consistent.  Similarly to 2017 and 2019, the version of VS and the
version of nmake that we use to determine which code paths to use for
the build are still inconsistent in their own way.

Backpatch down to 10, so as buildfarm members are able to use this new
version of Visual Studio on all the stable branches supported.

Author: Hans Buschmann
Discussion: https://postgr.es/m/1633101364685.39218@nidsa.net
Backpatch-through: 10
2021-11-24 13:03:23 +09:00
Michael Paquier 1922d7c6e1 Add SQL functions to monitor the directory contents of replication slots
This commit adds a set of functions able to look at the contents of
various paths related to replication slots:
- pg_ls_logicalsnapdir, for pg_logical/snapshots/
- pg_ls_logicalmapdir, for pg_logical/mappings/
- pg_ls_replslotdir, for pg_replslot/<slot_name>/

These are intended to be used by monitoring tools.  Unlike pg_ls_dir(),
execution permission can be granted to non-superusers.  Roles members of
pg_monitor gain have access to those functions.

Bump catalog version.

Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart, Justin Pryzby
Discussion: https://postgr.es/m/CALj2ACWsfizZjMN6bzzdxOk1ADQQeSw8HhEjhmVXn_Pu+7VzLw@mail.gmail.com
2021-11-23 19:29:42 +09:00
Fujii Masao 1b06d7bac9 Report wait events for local shell commands like archive_command.
This commit introduces new wait events for archive_command,
archive_cleanup_command, restore_command and recovery_end_command.

Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Michael Paquier
Discussion: https://postgr.es/m/4ca4f920-6b48-638d-08b2-93598356f5d3@oss.nttdata.com
2021-11-22 10:28:21 +09:00
Daniel Gustafsson 3374a87b62 Doc: add see-also references to CREATE PUBLICATION.
The "See also" section on the reference page for CREATE PUBLICATION
didn't match the cross references on CREATE SUBSCRIPTION and their
ALTER counterparts. Fixed by adding an xref to the CREATE and ALTER
SUBSCRIPTION pages.  Backpatch down to v10 where CREATE PUBLICATION
was introduced.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAHut+PvGWd3-Ktn96c-z6uq-8TGVVP=TPOkEovkEfntoo2mRhw@mail.gmail.com
Backpatch-through: 10
2021-11-17 13:34:41 +01:00
Peter Eisentraut db9f287711 doc: Add referential actions to CREATE/ALTER TABLE synopsis
The general constraint synopsis references "referential_action", but
this was not further defined in the synopsis section.  Compared to the
level of detail that the synopsis gives to other subclauses, this
should surely be there.

extracted from a patch by Paul Martinez <hellopfm@gmail.com>

Discussion: https://www.postgresql.org/message-id/flat/CACqFVBZQyMYJV=njbSMxf+rbDHpx=W=B7AEaMKn8dWn9OZJY7w@mail.gmail.com
2021-11-11 10:49:44 +01:00
Tom Lane c3b33698cf Doc: improve protocol spec for logical replication Type messages.
protocol.sgml documented the layout for Type messages, but completely
dropped the ball otherwise, failing to explain what they are, when
they are sent, or what they're good for.  While at it, do a little
copy-editing on the description of Relation messages.

In passing, adjust the comment for apply_handle_type() to make it
clearer that we choose not to do anything when receiving a Type
message, not that we think it has no use whatsoever.

Per question from Stefen Hillman.

Discussion: https://postgr.es/m/CAPgW8pMknK5pup6=T4a_UG=Cz80Rgp=KONqJmTdHfaZb0RvnFg@mail.gmail.com
2021-11-10 13:13:04 -05:00
Jeff Davis 4168a47454 Add pg_checkpointer predefined role for CHECKPOINT command.
Any user with the privileges of pg_checkpointer can issue a CHECKPOINT
command.

Reviewed-by: Stephen Frost
Discussion: https://postgr.es/m/67a1d667e8ec228b5e07f232184c80348c5d93f4.camel%40j-davis.com
2021-11-09 16:59:14 -08:00
Fujii Masao ec21779a58 doc: Add index entries for pg_stat_statements configuration parameters.
Author: Ken Kato
Reviewed-by: Julien Rouhaud, Fujii Masao
Discussion: https://postgr.es/m/699cfd8170178db087e54c954b21ece4@oss.nttdata.com
2021-11-09 12:39:47 +09:00
Tom Lane 160c025880 libpq: reject extraneous data after SSL or GSS encryption handshake.
libpq collects up to a bufferload of data whenever it reads data from
the socket.  When SSL or GSS encryption is requested during startup,
any additional data received with the server's yes-or-no reply
remained in the buffer, and would be treated as already-decrypted data
once the encryption handshake completed.  Thus, a man-in-the-middle
with the ability to inject data into the TCP connection could stuff
some cleartext data into the start of a supposedly encryption-protected
database session.

This could probably be abused to inject faked responses to the
client's first few queries, although other details of libpq's behavior
make that harder than it sounds.  A different line of attack is to
exfiltrate the client's password, or other sensitive data that might
be sent early in the session.  That has been shown to be possible with
a server vulnerable to CVE-2021-23214.

To fix, throw a protocol-violation error if the internal buffer
is not empty after the encryption handshake.

Our thanks to Jacob Champion for reporting this problem.

Security: CVE-2021-23222
2021-11-08 11:14:56 -05:00
Tom Lane cbe25dcff7 Disallow making an empty lexeme via array_to_tsvector().
The tsvector data type has always forbidden lexemes to be empty.
However, array_to_tsvector() didn't get that memo, and would
allow an empty-string array element to become an empty lexeme.
This could result in dump/restore failures later, not to mention
whatever semantic issues might be behind the original prohibition.

However, other functions that take a plain text input directly as
a lexeme value do not need a similar restriction, because they only
match the string against existing tsvector entries.  In particular
it'd be a bad idea to make ts_delete() reject empty strings, since
that is the most convenient way to clean up any bad data that might
have gotten into a tsvector column via this bug.

Reflecting on that, let's also remove the prohibition against NULL
array elements in tsvector_delete_arr and tsvector_setweight_by_filter.
It seems more consistent to ignore them, as an empty-string element
would be ignored.

There's a case for back-patching this, since it's clearly a bug fix.
On balance though, it doesn't seem like something to change in a
minor release.

Jean-Christophe Arnu

Discussion: https://postgr.es/m/CAHZmTm1YVndPgUVRoag2WL0w900XcoiivDDj-gTTYBsG25c65A@mail.gmail.com
2021-11-06 13:28:53 -04:00
Alvaro Herrera df80f9da5c
Document that ALTER TABLE .. TYPE removes statistics
Co-authored-by: Nikolai Berkoff <nikolai.berkoff@pm.me>
Discussion: https://postgr.es/m/vCc8XnwDmlP4ZnHBQLIVxzD405BiYHVC9qZlhIF7IsfxK0gC9mZ4PUUOH0-3y6kv5p-87-3_ljqT1KvQVAnb8OoWhPU3kcqWn2ZpmxRBCQg=@pm.me
2021-11-05 12:09:31 -03:00
Alvaro Herrera 105c1de019
Pipeline mode disallows multicommand strings
... so mention that in appropriate places of the libpq docs.

Backpatch to 14.

Reported-by: RekGRpth <rekgrpth@gmail.com>
Discussion: https://postgr.es/m/17235-53bb38fc5be593dc@postgresql.org
2021-11-05 11:40:03 -03:00
Alvaro Herrera e543906e21
Document default and changeability of log_startup_progress_interval
Review for 9ce346eabf.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/202110292123.bnf6axcp27vx@alvherre.pgsql
2021-11-05 11:31:57 -03:00
Alvaro Herrera 00a354a135
Reword doc blurb for vacuumdb --analyze-in-stages
Make users aware that using it in a database with existing stats might
cause transient problems.

Author: Nikolai Berkoff <nikolai.berkoff@pm.me>
Discussion: https://postgr.es/m/s-kSljtWXMWgMfGTztPTPcS80R8FHdOrBxDTnrQI6GMZbT7au1A4b0fzaSFtKwCI8nwN0MhgPLfVOTvJ7DwTjkip4P3d0o4VgrMJs4OLN-o=@pm.me
2021-11-05 11:22:30 -03:00
Peter Eisentraut db7d1a7b05 pgcrypto: Remove non-OpenSSL support
pgcrypto had internal implementations of some encryption algorithms,
as an alternative to calling out to OpenSSL.  These were rarely used,
since most production installations are built with OpenSSL.  Moreover,
maintaining parallel code paths makes the code more complex and
difficult to maintain.

This patch removes these internal implementations.  Now, pgcrypto is
only built if OpenSSL support is configured.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/0b42f1df-8cba-6a30-77d7-acc241cc88c1%40enterprisedb.com
2021-11-05 14:06:59 +01:00
Michael Paquier babbbb595d Add support for LZ4 compression in pg_receivewal
pg_receivewal gains a new option, --compression-method=lz4, available
when the code is compiled with --with-lz4.  Similarly to gzip, this
gives the possibility to compress archived WAL segments with LZ4.  This
option is not compatible with --compress.

The implementation uses LZ4 frames, and is compatible with simple lz4
commands.  Like gzip, using --synchronous ensures that any data will be
flushed to disk within the current .partial segment, so as it is
possible to retrieve as much WAL data as possible even from a
non-completed segment (this requires completing the partial file with
zeros up to the WAL segment size supported by the backend after
decompression, but this is the same as gzip).

The calculation of the streaming start LSN is able to transparently find
and check LZ4-compressed segments.  Contrary to gzip where the
uncompressed size is directly stored in the object read, the LZ4 chunk
protocol does not store the uncompressed data by default.  There is
contentSize that can be used with LZ4 frames by that would not help if
using an archive that includes segments compressed with the defaults of
a "lz4" command, where this is not stored.  So, this commit has taken
the most extensible approach by decompressing the already-archived
segment to check its uncompressed size, through a blank output buffer in
chunks of 64kB (no actual performance difference noticed with 8kB, 16kB
or 32kB, and the operation in itself is actually fast).

Tests have been added to verify the creation and correctness of the
generated LZ4 files.  The latter is achieved by the use of command
"lz4", if found in the environment.

The tar-based WAL method in walmethods.c, used now only by
pg_basebackup, does not know yet about LZ4.  Its code could be extended
for this purpose.

Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Jian Guo, Magnus Hagander, Dilip Kumar
Discussion: https://postgr.es/m/ZCm1J5vfyQ2E6dYvXz8si39HQ2gwxSZ3IpYaVgYa3lUwY88SLapx9EEnOf5uEwrddhx2twG7zYKjVeuP5MwZXCNPybtsGouDsAD1o2L_I5E=@pm.me
2021-11-05 11:33:25 +09:00
Michael Paquier 9588622945 Fix some thinkos with pg_receivewal --compression-method
The option name was incorrect in one of the error messages, and the
short option 'I' was used in the code but we did not intend things to be
this way.  While on it, fix the documentation to refer to a "method",
and not a "level.

Oversights in commit d62bcc8, that I have detected after more review of
the LZ4 patch for pg_receivewal.
2021-11-04 12:32:37 +09:00
Michael Paquier d62bcc8b07 Rework compression options of pg_receivewal
pg_receivewal includes since cada1af the option --compress, to allow the
compression of WAL segments using gzip, with a value of 0 (the default)
meaning that no compression can be used.

This commit introduces a new option, called --compression-method, able
to use as values "none", the default, and "gzip", to make things more
extensible.  The case of --compress=0 becomes fuzzy with this option
layer, so we have made the choice to make pg_receivewal return an error
when using "none" and a non-zero compression level, meaning that the
authorized values of --compress are now [1,9] instead of [0,9].  Not
specifying --compress with "gzip" as compression method makes
pg_receivewal use the default of zlib instead (Z_DEFAULT_COMPRESSION).

The code in charge of finding the streaming start LSN when scanning the
existing archives is refactored and made more extensible.  While on it,
rename "compression" to "compression_level" in walmethods.c, to reduce
the confusion with the introduction of the compression method, even if
the tar method used by pg_basebackup does not rely on the compression
method (yet, at least), but just on the compression level (this area
could be improved more, actually).

This is in preparation for an upcoming patch that adds LZ4 support to
pg_receivewal.

Author: Georgios Kokolatos
Reviewed-by: Michael Paquier, Jian Guo, Magnus Hagander, Dilip Kumar,
Robert Haas
Discussion: https://postgr.es/m/ZCm1J5vfyQ2E6dYvXz8si39HQ2gwxSZ3IpYaVgYa3lUwY88SLapx9EEnOf5uEwrddhx2twG7zYKjVeuP5MwZXCNPybtsGouDsAD1o2L_I5E=@pm.me
2021-11-04 11:10:31 +09:00
Tom Lane 7d9ec0754a Doc: clean up some places that mentioned template1 but not template0.
Improve old text that wasn't updated when we added template0 to
the standard database set.

Per suggestion from P. Luzanov.

Discussion: https://postgr.es/m/163583775122.675.3700595100340939507@wrigleys.postgresql.org
2021-11-02 12:54:35 -04:00
Tom Lane af8c580e5c Doc: be more precise about conflicts between relation names.
Use verbiage like "The name of the table must be distinct from the name
of any other relation (table, sequence, index, view, materialized view,
or foreign table) in the same schema." in the reference pages for all
those object types.  The main change here is to mention materialized
views explicitly; although a couple of these pages failed to say
anything at all about name conflicts.

Per suggestion from Daniel Westermann.

Discussion: https://postgr.es/m/ZR0P278MB0920D0946509233459AF0DEFD2889@ZR0P278MB0920.CHEP278.PROD.OUTLOOK.COM
2021-11-02 12:12:02 -04:00
Fujii Masao cd29be5459 pgbench: Improve error-handling in pgbench.
Previously failures of initial connection and logfile open caused pgbench
to proceed the benchmarking, report the incomplete results and exit with
status 2. It didn't make sense to proceed the benchmarking even when
pgbench could not start as prescribed.

This commit improves pgbench so that early errors that occur when
starting benchmark such as those failures should make pgbench exit
immediately with status 1.

Author: Yugo Nagata
Reviewed-by: Fabien COELHO, Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/TYCPR01MB5870057375ACA8A73099C649F5349@TYCPR01MB5870.jpnprd01.prod.outlook.com
2021-11-02 22:49:57 +09:00
Peter Eisentraut e6c60719e6 doc: Remove some obsolete pgcrypto documentation
The pgcrypto documentation contained acknowledgments of used external
code, but some of this code has been moved to src/common/, so
mentioning it with pgcrypto no longer makes sense, so remove it.
2021-10-30 13:14:52 +02:00
Peter Eisentraut b8b62b4be2 Remove unused chunk from standalone-profile.xsl
unused since 1707a0d2aa
2021-10-30 12:38:14 +02:00
Tom Lane a2a731d6c9 Test and document the behavior of initialization cross-refs in plpgsql.
We had a test showing that a variable isn't referenceable in its
own initialization expression, nor in prior ones in the same block.
It *is* referenceable in later expressions in the same block, but
AFAICS there is no test case exercising that.  Add one, and also
add some error cases.

Also, document that this is possible, since the docs failed to
cover the point.

Per question from tomás at tuxteam.  I don't feel any need to
back-patch this, but we should ensure we don't break it in future.

Discussion: https://postgr.es/m/20211029121435.GA5414@tuxteam.de
2021-10-29 12:45:33 -04:00