Commit Graph

22462 Commits

Author SHA1 Message Date
Amit Kapila e464cb7af3 Fix origin timestamp during decoding of ROLLBACK PREPARED operation.
This happens because we were passing incorrect arguments to
ReorderBufferFinishPrepared().

Author: Masahiko Sawada
Reviewed-by: Vignesh C
Backpatch-through: 14
Discussion: https://postgr.es/m/CAD21AoBqhUqgDZUhUVnnwKRubPDNJ6m6fJDPgok3E5cWJLL+pA@mail.gmail.com
2021-12-08 15:18:56 +05:30
Amit Kapila 1a2aaeb0db Fix changing the ownership of ALL TABLES IN SCHEMA publication.
Ensure that the new owner of ALL TABLES IN SCHEMA publication must be a
superuser. The same is already ensured during CREATE PUBLICATION.

Author: Vignesh C
Reviewed-by: Nathan Bossart, Greg Nancarrow, Michael Paquier, Haiying Tang
Discussion: https://postgr.es/m/CALDaNm0E5U-RqxFuFrkZrQeG7ae5trGa=xs=iRtPPHULtT4zOw@mail.gmail.com
2021-12-08 11:31:16 +05:30
Amit Kapila a61bff2bf4 De-duplicate the result of pg_publication_tables view.
We show duplicate values for child tables in publications that have both
child and parent tables and are published with publish_via_partition_root
as false which is not what the user would expect.

We decided not to backpatch this as there is no user complaint about this
and it doesn't seem to be a critical issue.

Author: Hou Zhijie
Reviewed-by: Bharath Rupireddy, Amit Langote, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB5716E97F00732B52DC2BBC2594989@OS0PR01MB5716.jpnprd01.prod.outlook.com
2021-12-08 11:15:25 +05:30
Michael Paquier 00029deaf6 Improve parsing of options of CREATE/ALTER SUBSCRIPTION
This simplifies the code so as it is not necessary anymore for the
caller of parse_subscription_options() to zero SubOpts, holding a
bitmaps of the provided options as well as the default/parsed option
values.  This also simplifies some checks related to the options
supported by a command when checking for incompatibilities.

While on it, the errors generated for unsupported combinations with
"slot_name = NONE" are reordered.  This may generate a different errors
compared to the previous major versions, but users have to go through
all those errors to get a correct command in this case when using
incorrect values for options "enabled" and "create\slot", so at the end
the resulting command would remain the same.

Author: Peter Smith
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/CAHut+PtXHfLgLHDDJ8ZN5f5Be_37mJoxpEsRg8LNmm4XCr06Rw@mail.gmail.com
2021-12-08 12:36:31 +09:00
Michael Paquier f99870dd86 Fix corruption of toast indexes with REINDEX CONCURRENTLY
REINDEX CONCURRENTLY run on a toast index or a toast relation could
corrupt the target indexes rebuilt, as a backend running in parallel
that manipulates toast values would directly release the lock on the
toast relation when its local operation is done, rather than releasing
the lock once the transaction that manipulated the toast values
committed.

The fix done here is simple: we now hold a ROW EXCLUSIVE lock on the
toast relation when saving or deleting a toast value until the
transaction working on them is committed, so as a concurrent reindex
happening in parallel would be able to wait for any activity and see any
new rows inserted (or deleted).

An isolation test is added to check after the case fixed here, which is
a bit fancy by design as it relies on allow_system_table_mods to rename
the toast table and its index to fixed names.  This way, it is possible
to reindex them directly without any dependency on the OID of the
underlying relation.  Note that this could not use a DO block either, as
REINDEX CONCURRENTLY cannot be run in a transaction block.  The test is
backpatched down to 13, where it is possible, thanks to c4a7a39, to use
allow_system_table_mods in a test suite.

Reported-by: Alexey Ermakov
Analyzed-by: Andres Freund, Noah Misch
Author: Michael Paquier
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/17268-d2fb426e0895abd4@postgresql.org
Backpatch-through: 12
2021-12-08 11:01:08 +09:00
Tom Lane ed52c3707b On Windows, also call shutdown() while closing the client socket.
Further experimentation shows that commit 6051857fc is not sufficient
when using (some versions of?) OpenSSL.  The reason is obscure, but
calling shutdown(socket, SD_SEND) improves matters.

Per testing by Andrew Dunstan and Alexander Lakhin.
Back-patch as before.

Discussion: https://postgr.es/m/af5e0bf3-6a61-bb97-6cba-061ddf22ff6b@dunslane.net
2021-12-07 13:34:06 -05:00
Peter Eisentraut bba962f0c0 Update snowball
Update to snowball tag v2.2.0.  Minor changes only.
2021-12-07 07:04:05 +01:00
Peter Eisentraut e9e63b7022 Fix inappropriate uses of PG_GETARG_UINT32()
The chr() function used PG_GETARG_UINT32() even though the argument is
declared as (signed) integer.  As a result, you can pass negative
arguments to this function and it internally interprets them as
positive.  Ultimately ends up being harmless, but it seems wrong, so
fix this and rearrange the internal error checking a bit to
accommodate this.

Another case was in the documentation, where example code used
PG_GETARG_UINT32() with an argument declared as signed integer.

Reviewed-by: Nathan Bossart <bossartn@amazon.com>
Discussion: https://www.postgresql.org/message-id/flat/7e43869b-d412-8f81-30a3-809783edc9a3%40enterprisedb.com
2021-12-06 13:37:11 +01:00
Peter Eisentraut 37b2764593 Some RELKIND macro refactoring
Add more macros to group some RELKIND_* macros:

- RELKIND_HAS_PARTITIONS()
- RELKIND_HAS_TABLESPACE()
- RELKIND_HAS_TABLE_AM()

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/a574c8f1-9c84-93ad-a9e5-65233d6fc00f%40enterprisedb.com
2021-12-03 14:08:19 +01:00
Michael Paquier 03774f9bb3 Improve the description of various GUCs
This commit fixes a couple of inconsistencies in the descriptions of
some GUCs, while making their wording more general regarding the units
they rely on.

For most of them, this removes the use of terms like "N seconds" or "N
bytes", which may not apply easily to all the languages these strings
are translated to (from my own experience, this works in French and
English, less in Japanese).

Per debate between the authors listed below.

Author: Justin Pryzby, Michael Paquier
Discussion: https://postgr.es/m/20211129030833.GJ17618@telsasoft.com
2021-12-03 09:39:03 +09:00
Tom Lane 6051857fc9 On Windows, close the client socket explicitly during backend shutdown.
It turns out that this is necessary to keep Winsock from dropping any
not-yet-sent data, such as an error message explaining the reason for
process termination.  It's pretty weird that the implicit close done
by the kernel acts differently from an explicit close, but it's hard
to argue with experimental results.

Independently submitted by Alexander Lakhin and Lars Kanis (comments
by me, though).  Back-patch to all supported branches.

Discussion: https://postgr.es/m/90b34057-4176-7bb0-0dbb-9822a5f6425b@greiz-reinsdorf.de
Discussion: https://postgr.es/m/16678-253e48d34dc0c376@postgresql.org
2021-12-02 17:14:43 -05:00
Tom Lane babe545cae Avoid leaking memory during large-scale REASSIGN OWNED BY operations.
The various ALTER OWNER routines tend to leak memory in
CurrentMemoryContext.  That's not a problem when they're only called
once per command; but in this usage where we might be touching many
objects, it can amount to a serious memory leak.  Fix that by running
each call in a short-lived context.

(DROP OWNED BY likely has a similar issue, except that you'll probably
run out of lock table space before noticing.  REASSIGN is worth fixing
since for most non-table object types, it won't take any lock.)

Back-patch to all supported branches.  Unfortunately, in the back
branches this helps to only a limited extent, since the sinval message
queue bloats quite a lot in this usage before commit 3aafc030a,
consuming memory more or less comparable to what's actually leaked.
Still, it's clearly a leak with a simple fix, so we might as well fix it.

Justin Pryzby, per report from Guillaume Lelarge

Discussion: https://postgr.es/m/CAECtzeW2DAoioEGBRjR=CzHP6TdL=yosGku8qZxfX9hhtrBB0Q@mail.gmail.com
2021-12-01 13:44:46 -05:00
Peter Eisentraut 89d1c15d64 Remove unused includes
These haven't been needed for a long time.

Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
2021-12-01 16:10:56 +01:00
Peter Eisentraut fb7f70112f Improve some comments in scanner files
Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com
2021-12-01 16:10:52 +01:00
Peter Eisentraut 75d22069e0 Warning on SET of nonexisting setting with a prefix reserved by an extension
An extension can already de facto reserve a GUC prefix using
EmitWarningsOnPlaceholders().  But this was only checked against
settings that exist at the time the extension is loaded (or the
extension chooses to call this).  No diagnostic is given when a SET
command later uses a nonexisting setting with a custom prefix.

With this change, EmitWarningsOnPlaceholders() saves the prefixes it
reserves in a list, and SET checks when it finds a "placeholder"
setting whether it belongs to a reserved prefix and issues a warning
in that case.

Add a regression test that checks the patch using the "plpgsql"
registered prefix.

Author: Florin Irion <florin.irion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HEvJDhWuuTpGTJT9Tgbdzm4QS4EzPAwDBScWK18H2Q=FVJFw@mail.gmail.com
2021-12-01 15:08:32 +01:00
Daniel Gustafsson 018b800245 Remove mention of TimeLineID update from comments
Commit 4a92a1c3d removed the TimeLineID update from RecoveryInProgress,
update comments accordingly.

Author: Amul Sul <sulamul@gmail.com>
Discussion: https://postgr.es/m/CAAJ_b96wyzs8N45jc-kYd-bTE02hRWQieLZRpsUtNbhap7_PuQ@mail.gmail.com
2021-12-01 14:17:24 +01:00
Michael Paquier 7799d4e3bd Fix comment grammar in slotfuncs.c
Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACUkrNR2xTak+QaqxoTjPKGn8zXWripv7SR27t+Q5qF1Wg@mail.gmail.com
2021-12-01 20:28:19 +09:00
Peter Geoghegan 4bdfe68559 vacuumlazy.c: fix remaining "dead tuple" references.
Oversight in commit 4f8d9d12.

Reported-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAD21AoDm38Em0bvRqeQKr4HPvOj65Y8cUgCP4idMk39iaLrxyw@mail.gmail.com
2021-11-30 11:40:33 -08:00
Tomas Vondra 5753d4ee32 Ignore BRIN indexes when checking for HOT udpates
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed only by BRIN indexes. There are no index
pointers to individual tuples in BRIN, and the page range summary will
be updated anyway as it relies on visibility info.

This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.

Patch by Josef Simanek, various fixes and improvements by me.

Author: Josef Simanek
Reviewed-by: Tomas Vondra, Alvaro Herrera
Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
2021-11-30 20:04:38 +01:00
Alvaro Herrera 4c83e59e01
Increase size of shared memory for pg_commit_ts
Like 5364b357fb did for pg_commit, change the formula used to
determine number of pg_commit_ts buffers, which helps performance with
larger servers.

Discussion: https://postgr.es/m/20210115220744.GA24457@alvherre.pgsql
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
2021-11-30 14:29:31 -03:00
Daniel Gustafsson ac0db34e0e Remove PF_USED_FOR_ASSERTS_ONLY from variables in general use
fsstate in process_pending_requests (in postgres_fdw.c) was added in
8998e3cafa as an assertion-only variable,  1ec7fca859 stated using
the variable outside of assertions.

rd_index in get_index_column_opclass (in lsyscache.c) was introduced
in 2a6368343f, and then promptly used in the fix commit 7e04160390
shortly thereafter.

This removes the PG_USED_FOR_ASSERTS_ONLY variable decoration from
the above mentioned variables.

Reviewed-by: Greg Nancarrow <gregn4422@gmail.com>
Discussion: https://postgr.es/m/F959106C-0F21-43A5-B2AE-D007D51ACBEE@yesql.se
2021-11-30 14:02:14 +01:00
Michael Paquier be5455124b Fix flags of some GUCs and improve some descriptions
This commit fixes some issues with GUCs:
- enable_incremental_sort was not marked as GUC_EXPLAIN, causing it to
not be listed in the output of EXPLAIN (SETTINGS) if using a value
different than the default, contrary to the other planner-level GUCs.
- trace_recovery_messages missed GUC_NOT_IN_SAMPLE, like the other
developer options.
- ssl_renegotiation_limit should be marked as COMPAT_OPTIONS_PREVIOUS.

While on it, this fixes one incorrect comment related to
autovacuum_freeze_max_age, and improves the descriptions of some other
GUCs, recently introduced.

Extracted from a larger patch set by the same author.

Author: Justin Pryzby
Description: https://postgr.es/m/20211129030833.GJ17618@telsasoft.com
2021-11-30 14:38:49 +09:00
Amit Kapila 8d74fc96db Add a view to show the stats of subscription workers.
This commit adds a new system view pg_stat_subscription_workers, that
shows information about any errors which occur during the application of
logical replication changes as well as during performing initial table
synchronization. The subscription statistics entries are removed when the
corresponding subscription is removed.

It also adds an SQL function pg_stat_reset_subscription_worker() to reset
single subscription errors.

The contents of this view can be used by an upcoming patch that skips the
particular transaction that conflicts with the existing data on the
subscriber.

This view can be extended in the future to track other xact related
statistics like the number of xacts committed/aborted for subscription
workers.

Author: Masahiko Sawada
Reviewed-by: Greg Nancarrow, Hou Zhijie, Tang Haiying, Vignesh C, Dilip Kumar, Takamichi Osumi, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
2021-11-30 08:54:30 +05:30
Michael Paquier 98105e53e0 Fix typos
Author: Lingjie Qiang
Discussion: https://postgr.es/m/OSAPR01MB71654E773F62AC88DC1FC8CC80669@OSAPR01MB7165.jpnprd01.prod.outlook.com
2021-11-30 11:05:15 +09:00
Peter Geoghegan 4f8d9d1217 vacuumlazy.c: Rename dead_tuples to dead_items.
Commit 8523492d simplified what it meant for an item to be considered
"dead" to VACUUM: TIDs collected in memory (in preparation for index
vacuuming) must always come from LP_DEAD stub line pointers in heap
pages, found following pruning.  This formalized the idea that index
vacuuming (and heap vacuuming) are optional processes.  Unlike pruning,
they can be delayed indefinitely, without any risk of that violating
fundamental invariants.  For example, leaving LP_DEAD items behind
clearly won't add to the risk of transaction ID wraparound.  You can't
have transaction ID wraparound without transaction IDs.  Renaming
anything that references DEAD tuples (tuples with storage) reinforces
all this.

Code outside vacuumlazy.c continues to fudge the distinction between
dead/deleted tuples, and LP_DEAD items.  This is necessary because
autovacuum scheduling is still mostly driven by "dead items/tuples"
statistics.  In the future we may find it useful to replace this model
with something more sophisticated, as a step towards teaching autovacuum
to perform more frequent vacuuming that targeting individual indexes
that happen to be more prone to becoming bloated through version churn.

In passing, simplify some function signatures that deal with VACUUM's
dead_items array.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzktGBg4si6DEdmq3q6SoXSDqNi6MtmB8CmmTmvhsxDTLA@mail.gmail.com
2021-11-29 09:58:01 -08:00
Michael Paquier 6fb7c5d67c Centralize timestamp computation of control file on updates
This commit moves the timestamp computation of the control file within
the routine of src/common/ in charge of updating the backend's control
file, which is shared by multiple frontend tools (pg_rewind,
pg_checksums and pg_resetwal) and the backend itself.

This change has as direct effect to update the control file's timestamp
when writing the control file in pg_rewind and pg_checksums, something
that is helpful to keep track of control file updates for those
operations, something also tracked by the backend at startup within its
logs.  This part is arguably a bug, as ControlFileData->time should be
updated each time a new version of the control file is written, but this
is a behavior change so no backpatch is done.

Author: Amul Sul
Reviewed-by: Nathan Bossart, Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/CAAJ_b97nd_ghRpyFV9Djf9RLXkoTbOUqnocq11WGq9TisX09Fw@mail.gmail.com
2021-11-29 13:36:13 +09:00
Tom Lane 3804539e48 Replace random(), pg_erand48(), etc with a better PRNG API and algorithm.
Standardize on xoroshiro128** as our basic PRNG algorithm, eliminating
a bunch of platform dependencies as well as fundamentally-obsolete PRNG
code.  In addition, this API replacement will ease replacing the
algorithm again in future, should that become necessary.

xoroshiro128** is a few percent slower than the drand48 family,
but it can produce full-width 64-bit random values not only 48-bit,
and it should be much more trustworthy.  It's likely to be noticeably
faster than the platform's random(), depending on which platform you
are thinking about; and we can have non-global state vectors easily,
unlike with random().  It is not cryptographically strong, but neither
are the functions it replaces.

Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself

Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2105241211230.165418@pseudo
2021-11-28 21:33:07 -05:00
Peter Geoghegan 276db875d4 vacuumlazy.c: prefer the term "cleanup lock".
The term "super-exclusive lock" is an acceptable synonym of "cleanup
lock".  Even still, switching from one term to the other in the same
file is confusing.  Standardize on "cleanup lock" within vacuumlazy.c.

Per a complaint from Andres Freund.
2021-11-27 16:05:01 -08:00
Peter Geoghegan 12b5ade902 Update high level vacuumlazy.c comments.
Update vacuumlazy.c file header comments (as well as comments above the
lazy_scan_heap function) that were largely written before the
introduction of the HOT optimization, when lazy_scan_heap did far less,
and didn't actually prune during its initial heap pass.

Since lazy_scan_heap now outsources far more work to lower level
functions, it makes sense to introduce the function by talking about the
high level invariant that dictates the order in which each phase takes
place.  Also deemphasize the case where we run out of memory for TIDs,
since delaying that discussion makes it easier to talk about issues of
central importance.

Finally, remove discussion of parallel VACUUM from header comments.
These don't add much, and are in the wrong place.
2021-11-27 14:29:43 -08:00
Peter Geoghegan 1a6f5a0e87 Go back to considering HOT on pages marked full.
Commit 2fd8685e7f simplified the checking of modified attributes that
takes place within heap_update().  This included a micro-optimization
affecting pages marked PD_PAGE_FULL: don't even try to use HOT to save a
few cycles on determining HOT safety.  The assumption was that it won't
work out this time around, since it can't have worked out last time
around.

Remove the micro-optimization.  It could only ever save cycles that are
consumed by the vast majority of heap_update() calls, which hardly seems
worth the added complexity.  It also seems quite possible that there are
workloads that will do worse over time by repeated application of the
micro-optimization, despite saving some cycles on average, in the short
term.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAH2-WznU1L3+DMPr1F7o2eJBT7=3bAJoY6ZkWABAxNt+-afyTA@mail.gmail.com
2021-11-26 10:58:38 -08:00
Alvaro Herrera 44bd3ed332
Fix determination of broken LSN in OVERWRITTEN_CONTRECORD
In commit ff9f111bce I mixed up inconsistent definitions of the LSN of
the first record in a page, when the previous record ends exactly at the
page boundary.  The correct LSN is adjusted to skip the WAL page header;
I failed to use that when setting XLogReaderState->overwrittenRecPtr,
so at WAL replay time VerifyOverwriteContrecord would refuse to let
replay continue past that record.

Backpatch to 10.  9.6 also contains this bug, but it's no longer being
maintained.

Discussion: https://postgr.es/m/45597.1637694259@sss.pgh.pa.us
2021-11-26 11:14:27 -03:00
Daniel Gustafsson b2a459edfe Fix GRANTED BY support in REVOKE ROLE statements
Commit 6aaaa76bb added support for the GRANTED BY clause in GRANT and
REVOKE statements, but missed adding support for checking the role in
the REVOKE ROLE case. Fix by checking that the parsed role matches the
CURRENT_ROLE/CURRENT_USER requirement, and also add some tests for it.
Backpatch to v14 where GRANTED BY support was introduced.

Discussion: https://postgr.es/m/B7F6699A-A984-4943-B9BF-CEB84C003527@yesql.se
Backpatch-through: 14
2021-11-26 14:02:01 +01:00
Peter Eisentraut 36cb5e7c51 Update comments
Various places wanted to point out that tuple descriptors don't
contain the variable-length fields of pg_attribute.  This started when
attacl was added, but more fields have been added since, and these
comments haven't been kept up to date consistently.  Reword so that
the purpose is clearer and we don't have to keep updating them.
2021-11-26 09:57:23 +01:00
Michael Paquier f0d43947a1 Block ALTER TABLE .. DROP NOT NULL on columns in replica identity index
Replica identities that depend directly on an index rely on a set of
properties, one of them being that all the columns defined in this index
have to be marked as NOT NULL.  There was a hole in the logic with ALTER
TABLE DROP NOT NULL, where it was possible to remove the NOT NULL
property of a column part of an index used as replica identity, so block
it to avoid problems with logical decoding down the road.

The same check was already done columns part of a primary key, so the
fix is straight-forward.

Author: Haiying Tang, Hou Zhijie
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/OS0PR01MB6113338C102BEE8B2FFC5BD9FB619@OS0PR01MB6113.jpnprd01.prod.outlook.com
Backpatch-through: 10
2021-11-25 15:04:56 +09:00
Andres Freund 3030903dfe Replace straggling uses of ReadRecPtr/EndRecPtr.
d2ddfa681d removed ReadRecPtr/EndRecPtr, but two uses within an #ifdef
WAL_DEBUG escaped.

Discussion: https://postgr.es/m/20211124231206.gbadj5bblcljb6d5@alap3.anarazel.de
2021-11-24 16:56:14 -08:00
Robert Haas d2ddfa681d xlog.c: Remove global variables ReadRecPtr and EndRecPtr.
In most places, the variables necessarily store the same value as the
eponymous members of the XLogReaderState that we use during WAL
replay, because ReadRecord() assigns the values from the structure
members to the global variables just after XLogReadRecord() returns.
However, XLogBeginRead() adjusts the structure members but not the
global variables, so after XLogBeginRead() and before the completion
of XLogReadRecord() the values can differ. Otherwise, they must be
identical.  According to my analysis, the only place where either
variable is referenced at a point where it might not have the same
value as the structure member is the refrence to EndRecPtr within
XLogPageRead.

Therefore, at every other place where we are using the global
variable, we can just switch to using the structure member instead,
and remove the global variable. However, we can, and in fact should,
do this in XLogPageRead() as well, because at that point in the code,
the global variable will actually store the start of the record we
want to read - either because it's where the last WAL record ended, or
because the read position has been changed using XLogBeginRead since
the last record was read. The structure member, on the other hand,
will already have been updated to point to the end of the record we
just read. Elsewhere, the latter is what we use as an argument to
emode_for_corrupt_record(), so we should do the same here.

This part of the patch is perhaps a bug fix, but I don't think it has
any important consequences, so no back-patch. The point here is just
to continue to whittle down the entirely excessive use of global
variables in xlog.c.

Discussion: http://postgr.es/m/CA+Tgmoao96EuNeSPd+hspRKcsCddu=b1h-QNRuKfY8VmfNQdfg@mail.gmail.com
2021-11-24 11:27:39 -05:00
Robert Haas e7ea2fa342 Fix corner-case failure to detect improper timeline switch.
rescanLatestTimeLine() contains a guard against switching to
a timeline that forked off from the current one prior to the
current recovery point, but that guard does not work if the
timeline switch occurs before the first WAL recod (which must
be the checkpoint record) is read. Without this patch, an
improper timeline switch is therefore possible in such cases.

This happens because rescanLatestTimeLine() relies on the global
variable EndRecPtr to understand the current position of WAL
replay. However, EndRecPtr at this point in the code contains
the endpoint of the last-replayed record, not the startpoint or
endpoint of the record being replayed now. Thus, before any
records have been replayed, it's zero, which causes the sanity
check to always pass.

To fix, pass down the correct timeline explicitly. The
EndRecPtr value we want is the one from the xlogreader, which
will be the starting position of the record we're about to
try to read, rather than the global variable, which is the
ending position of the last record we successfully read.
They're usually the same, but not in the corner case described
here.

No back-patch, because in v14 and earlier branhes, we were using
the wrong TLI here as well as the wrong LSN. In master, that was
fixed by commit 4a92a1c3d1, but
that and it's prerequisite patches are too invasive to
back-patch for such a minor issue.

Patch by me, reviewed by Amul Sul.

Discussion: http://postgr.es/m/CA+Tgmoao96EuNeSPd+hspRKcsCddu=b1h-QNRuKfY8VmfNQdfg@mail.gmail.com
2021-11-24 08:13:10 -05:00
David Rowley 411137a429 Flush Memoize cache when non-key parameters change, take 2
It's possible that a subplan below a Memoize node contains a parameter
from above the Memoize node.  If this parameter changes then cache entries
may become out-dated due to the new parameter value.

Previously Memoize was mistakenly not aware of this.  We fix this here by
flushing the cache whenever a parameter that's not part of the cache
key changes.

Bug: #17213
Reported by: Elvis Pranskevichus
Author: David Rowley
Discussion: https://postgr.es/m/17213-988ed34b225a2862@postgresql.org
Backpatch-through: 14, where Memoize was added
2021-11-24 23:29:14 +13:00
Amit Kapila 875e02c2df Rename SnapBuild* macros in slot.c.
Same macro names for SnapBuildOnDiskNotChecksummedSize and
SnapBuildOnDiskChecksummedSize are being used in slot.c and snapbuild.c.
This patch renames them, in slot.c, to
ReplicationSlotOnDiskNotChecksummedSize and
ReplicationSlotOnDiskChecksummedSize similar to the other macros. This
makes all macro names look consistent in slot.c.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVZo-piDGzBOJRY4ob=_goFR6t9DhZMDMjJWN7LQs34Aw@mail.gmail.com
2021-11-24 08:09:00 +05:30
David Rowley dad20ad470 Revert "Flush Memoize cache when non-key parameters change"
This reverts commit 1050048a31.
2021-11-24 15:27:43 +13:00
David Rowley 1050048a31 Flush Memoize cache when non-key parameters change
It's possible that a subplan below a Memoize node contains a parameter
from above the Memoize node.  If this parameter changes then cache entries
may become out-dated due to the new parameter value.

Previously Memoize was mistakenly not aware of this.  We fix this here by
flushing the cache whenever a parameter that's not part of the cache
key changes.

Bug: #17213
Reported by: Elvis Pranskevichus
Author: David Rowley
Discussion: https://postgr.es/m/17213-988ed34b225a2862@postgresql.org
Backpatch-through: 14, where Memoize was added
2021-11-24 14:56:18 +13:00
David Rowley e502150f7d Allow Memoize to operate in binary comparison mode
Memoize would always use the hash equality operator for the cache key
types to determine if the current set of parameters were the same as some
previously cached set.  Certain types such as floating points where -0.0
and +0.0 differ in their binary representation but are classed as equal by
the hash equality operator may cause problems as unless the join uses the
same operator it's possible that whichever join operator is being used
would be able to distinguish the two values.  In which case we may
accidentally return in the incorrect rows out of the cache.

To fix this here we add a binary mode to Memoize to allow it to the
current set of parameters to previously cached values by comparing
bit-by-bit rather than logically using the hash equality operator.  This
binary mode is always used for LATERAL joins and it's used for normal
joins when any of the join operators are not hashable.

Reported-by: Tom Lane
Author: David Rowley
Discussion: https://postgr.es/m/3004308.1632952496@sss.pgh.pa.us
Backpatch-through: 14, where Memoize was added
2021-11-24 10:06:59 +13:00
Michael Paquier 1922d7c6e1 Add SQL functions to monitor the directory contents of replication slots
This commit adds a set of functions able to look at the contents of
various paths related to replication slots:
- pg_ls_logicalsnapdir, for pg_logical/snapshots/
- pg_ls_logicalmapdir, for pg_logical/mappings/
- pg_ls_replslotdir, for pg_replslot/<slot_name>/

These are intended to be used by monitoring tools.  Unlike pg_ls_dir(),
execution permission can be granted to non-superusers.  Roles members of
pg_monitor gain have access to those functions.

Bump catalog version.

Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart, Justin Pryzby
Discussion: https://postgr.es/m/CALj2ACWsfizZjMN6bzzdxOk1ADQQeSw8HhEjhmVXn_Pu+7VzLw@mail.gmail.com
2021-11-23 19:29:42 +09:00
Alvaro Herrera 2fed48f48f
Be more specific about OOM in XLogReaderAllocate
A couple of spots can benefit from an added errdetail(), which matches
what we were already doing in other places; and those that cannot
withstand errdetail() can get a more descriptive primary message.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://postgr.es/m/CALj2ACV+cX1eM03GfcA=ZMLXh5fSn1X1auJLz3yuS1duPSb9QA@mail.gmail.com
2021-11-22 13:43:43 -03:00
Alvaro Herrera 042412879e
autovacuum: Improve wording in a couple places
A few strings (one WARNING and some memory context names) in the
autovacuum code were written in a world where "worker" had no other
possible meaning than "autovacuum worker", but that's long time gone.
Be more specific about it.

Also, change the WARNING from elog() to ereport(), to add translability.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Nathan Bossart <bossartn@amazon.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CALj2ACX2UHp76dqdoZq92a7v4APFuV5wJQ+AUrb+2HURrKN=NQ@mail.gmail.com
2021-11-22 12:55:36 -03:00
Alvaro Herrera 67385544ce
Add missing words in comment
Reported by Zhihong Yu.

Discussion: https://postgr.es/m/CALNJ-vR6uZivg_XkB1zKjEXeyZDEgoYanFXB-++1kBT9yZQoUw@mail.gmail.com
2021-11-22 12:38:41 -03:00
Peter Eisentraut d6d1dfcc99 Add ABI extra field to fmgr magic block
This allows derived products to intentionally make their fmgr ABI
incompatible, with a clean error message.

Discussion: https://www.postgresql.org/message-id/flat/55215fda-db31-a045-d6b7-d6f2d2dc9920%40enterprisedb.com
2021-11-22 08:00:14 +01:00
Fujii Masao 1b06d7bac9 Report wait events for local shell commands like archive_command.
This commit introduces new wait events for archive_command,
archive_cleanup_command, restore_command and recovery_end_command.

Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Michael Paquier
Discussion: https://postgr.es/m/4ca4f920-6b48-638d-08b2-93598356f5d3@oss.nttdata.com
2021-11-22 10:28:21 +09:00
Peter Geoghegan 97f5aef609 Remove lazy_scan_heap parallel VACUUM comment block.
This doesn't belong next to very high level discussion of the tasks that
lazy_scan_heap performs.  There is already a similar, longer comment
block at the top of vacuumlazy.c that mentions lazy_scan_heap directly.
2021-11-21 16:22:57 -08:00
Tom Lane f4e7ae2b8a Fix SP-GiST scan initialization logic for binary-compatible cases.
Commit ac9099fc1 rearranged the logic in spgGetCache() that determines
the index's attType (nominal input data type) and leafType (actual
type stored in leaf index tuples).  Turns out this broke things for
the case where (a) the actual input data type is different from the
nominal type, (b) the opclass's config function leaves leafType
defaulted, and (c) the opclass has no "compress" function.  (b) caused
us to assign the actual input data type as leafType, and then since
that's not attType, we complained that a "compress" function is
required.  For non-polymorphic opclasses, condition (a) arises in
binary-compatible cases, such as using SP-GiST text_ops for a varchar
column, or using any opclass on a domain over its nominal input type.

To fix, use attType for leafType when the index's declared column type
is different from but binary-compatible with attType.  Do this only in
the defaulted-leafType case, to avoid overriding any explicit
selection made by the opclass.

Per bug #17294 from Ilya Anfimov.  Back-patch to v14.

Discussion: https://postgr.es/m/17294-8f6c7962ce877edc@postgresql.org
2021-11-20 14:29:56 -05:00
Andres Freund 3b34645678 Initialize backend status reporting during bootstrap.
This allows a later commit to reduce the number of branches in performance
sensitive functions during normal running, compared to a very minor saving
during bootstrapping.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAAKRu_Yeg+vh6SHNEo1+=O7e-BPX35cU0XQM=YwQRnkFyv_y+w@mail.gmail.com
2021-11-19 08:43:12 -08:00
Amit Kapila 0f0cfb4940 Fix parallel operations that prevent oldest xmin from advancing.
While determining xid horizons, we skip over backends that are running
Vacuum. We also ignore Create Index Concurrently, or Reindex Concurrently
for the purposes of computing Xmin for Vacuum. But we were not setting the
flags corresponding to these operations when they are performed in
parallel which was preventing Xid horizon from advancing.

The optimization related to skipping Create Index Concurrently, or Reindex
Concurrently operations was implemented in PG-14 but the fix is the same
for the Parallel Vacuum as well so back-patched till PG-13.

Author: Masahiko Sawada
Reviewed-by: Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/CAD21AoCLQqgM1sXh9BrDFq0uzd3RBFKi=Vfo6cjjKODm0Onr5w@mail.gmail.com
2021-11-19 09:04:40 +05:30
Tom Lane 5f1148224b Provide a variant of simple_prompt() that can be interrupted by ^C.
Up to now, you couldn't escape out of psql's \password command
by typing control-C (or other local spelling of SIGINT).  This
is pretty user-unfriendly, so improve it.  To do so, we have to
modify the functions provided by pg_get_line.c; but we don't
want to mess with psql's SIGINT handler setup, so provide an
API that lets that handler cause the cancel to occur.

This relies on the assumption that we won't do any major harm by
longjmp'ing out of fgets().  While that's obviously a little shaky,
we've long had the same assumption in the main input loop, and few
issues have been reported.

psql has some other simple_prompt() calls that could usefully
be improved the same way; for now, just deal with \password.

Nathan Bossart, minor tweaks by me

Discussion: https://postgr.es/m/747443.1635536754@sss.pgh.pa.us
2021-11-17 19:09:54 -05:00
Tom Lane a148f8bc04 Add a planner support function for starts_with().
This fills in some gaps in planner support for starts_with() and
the equivalent ^@ operator:

* A condition such as "textcol ^@ constant" can now use a regular
btree index, not only an SP-GiST index, so long as the index's
collation is C.  (This works just like "textcol LIKE 'foo%'".)

* "starts_with(textcol, constant)" can be optimized the same as
"textcol ^@ constant".

* Fixed-prefix LIKE and regex patterns are now more like starts_with()
in another way: if you apply one to an SPGiST-indexed column, you'll
get an index condition using ^@ rather than two index conditions with
>= and <.

Per a complaint from Shay Rojansky.  Patch by me; thanks to
Nathan Bossart for review.

Discussion: https://postgr.es/m/232599.1633800229@sss.pgh.pa.us
2021-11-17 16:54:12 -05:00
Tom Lane a8d8445a7b Fix display of SQL-standard function's arguments in INSERT/SELECT.
If a SQL-standard function body contains an INSERT ... SELECT statement,
any function parameters referenced within the SELECT were always printed
in $N style, rather than using the parameter name if any.  While not
strictly incorrect, this wasn't the intention, and it's inconsistent
with the way that such parameters would be printed in any other kind
of statement.

The cause is that the recursion to get_query_def from
get_insert_query_def neglected to pass down the context->namespaces
list, passing constant NIL instead.  This is a very ancient oversight,
but AFAICT it had no visible consequences before commit e717a9a18
added an outermost namespace with function parameters.  We don't allow
INSERT ... SELECT as a sub-query, except in a top-level WITH clause,
where it couldn't contain any outer references that might need to access
upper namespaces.  So although that's arguably a bug, I don't see any
point in changing it before v14.

In passing, harden the code added to get_parameter by e717a9a18 so that
it won't crash if a PARAM_EXTERN Param appears in an unexpected place.

Per report from Erki Eessaar.  Code fix by me, regression test case
by Masahiko Sawada.

Discussion: https://postgr.es/m/AM9PR01MB8268347BED344848555167FAFE949@AM9PR01MB8268.eurprd01.prod.exchangelabs.com
2021-11-17 11:31:31 -05:00
Daniel Gustafsson aa12781b0d Improve publication error messages
Commit 81d5995b4b introduced more fine-grained errormessages for
incorrect relkinds for publication, while unlogged and temporary
tables were reported with using the same message.  This provides
separate error messages for these types of relpersistence.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://postgr.es/m/CALj2ACW9S=AswyQHjtO6WMcsergMkCBTtzXGrM8DX26DzfeTLQ@mail.gmail.com
2021-11-17 14:40:38 +01:00
Michael Paquier f975fc3a35 Remove global variable "LastRec" in xlog.c
This variable is used only by StartupXLOG() now, so let's make it local
to simplify the code.

Author: Amul Sul
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CAAJ_b96Qd023itERBRN9Z7P2saNDT3CYvGuMO8RXwndVNN6z7g@mail.gmail.com
2021-11-17 11:04:18 +09:00
Robert Haas e51c46991f Move InitXLogInsert() call from InitXLOGAccess() to BaseInit().
At present, there is an undocumented coding rule that you must call
RecoveryInProgress(), or do something else that results in a call
to InitXLogInsert(), before trying to write WAL. Otherwise, the
WAL construction buffers won't be initialized, resulting in
failures.

Since it's not good to rely on a status inquiry function like
RecoveryInProgress() having the side effect of initializing
critical data structures, instead do the initialization eariler,
when the backend first starts up.

Patch by me. Reviewed by Nathan Bossart and Michael Paquier.

Discussion: http://postgr.es/m/CA+TgmoY7b65qRjzHN_tWUk8B4sJqk1vj1d31uepVzmgPnZKeLg@mail.gmail.com
2021-11-16 09:43:17 -05:00
Amit Kapila 354a1f8d22 Invalidate relcache when changing REPLICA IDENTITY index.
When changing REPLICA IDENTITY INDEX to another one, the target table's
relcache was not being invalidated. This leads to skipping update/delete
operations during apply on the subscriber side as the columns required to
search corresponding rows won't get logged.

Author: Tang Haiying, Hou Zhijie
Reviewed-by: Euler Taveira, Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/OS0PR01MB61133CA11630DAE45BC6AD95FB939@OS0PR01MB6113.jpnprd01.prod.outlook.com
2021-11-16 08:10:13 +05:30
Robert Haas 1b098da200 Fix thinko in bbsink_throttle_manifest_contents.
Report and diagnosis by Dmitry Dolgov.

Discussion: http://postgr.es/m/20211115162641.dmo6l32fklh64gnw@localhost
2021-11-15 14:22:13 -05:00
Peter Geoghegan b0f7425ec2 Explain pruning pgstats accounting subtleties.
Add a comment explaining why the pgstats accounting used during
opportunistic heap pruning operations (to maintain the current number of
dead tuples in the relation) needs to compensate by subtracting away the
number of new LP_DEAD items.  This is needed so it can avoid completely
forgetting about tuples that become LP_DEAD items during pruning -- they
should still count.

It seems more natural to discuss this issue at the only relevant call
site (opportunistic pruning), since the same issue does not apply to the
only other caller (the VACUUM call site).  Move everything there too.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wzm7f+A6ej650gi_ifTgbhsadVW5cujAL3punpupHff5Yg@mail.gmail.com
2021-11-12 19:45:58 -08:00
Michael Paquier a45ed975c5 Fix memory overrun when querying pg_stat_slru
pg_stat_get_slru() in pgstatfuncs.c would point to one element after the
end of the array PgStat_SLRUStats when finishing to scan its entries.
This had no direct consequences as no data from the extra memory area
was read, but static analyzers would rightfully complain here.  So let's
be clean.

While on it, this adds one regression test in the area reserved for
system views.

Reported-by: Alexander Kozhemyakin, via AddressSanitizer
Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/17280-37da556e86032070@postgresql.org
Backpatch-through: 13
2021-11-12 21:49:21 +09:00
Noah Misch 3354746910 Report any XLogReadRecord() error in XlogReadTwoPhaseData().
Buildfarm members kittiwake and tadarida have witnessed errors at this
site.  The site discarded key facts.  Back-patch to v10 (all supported
versions).

Reviewed by Michael Paquier and Tom Lane.

Discussion: https://postgr.es/m/20211107013157.GB790288@rfd.leadboat.com
2021-11-11 17:10:18 -08:00
Peter Geoghegan 42f9427aa9 Update heap_page_prune() free space map comments.
It is up to the heap_page_prune() caller to decide what to do about
updating the FSM for a page following pruning.  Update old comments that
address what we might want to do as if it was the responsibility of
heap_page_prune() itself.  heap_page_prune() doesn't have enough
high-level context to make a sensible choice.
2021-11-11 13:42:17 -08:00
Peter Geoghegan eb9baef8e9 Update another obsolete reference in vacuumlazy.c.
Addresses an oversight in commit 7ab96cf6.
2021-11-11 13:13:08 -08:00
Robert Haas beb4e9ba16 Improve performance of pgarch_readyXlog() with many status files.
Presently, the archive_status directory was scanned for each file to
archive.  When there are many status files, say because archive_command
has been failing for a long time, these directory scans can get very
slow.  With this change, the archiver remembers several files to archive
during each directory scan, speeding things up.

To ensure timeline history files are archived as quickly as possible,
XLogArchiveNotify() forces the archiver to do a new directory scan as
soon as the .ready file for one is created.

Nathan Bossart, per a long discussion involving many people. It is
not clear to me exactly who out of all those people reviewed this
particular patch.

Discussion: http://postgr.es/m/CA+TgmobhAbs2yabTuTRkJTq_kkC80-+jw=pfpypdOJ7+gAbQbw@mail.gmail.com
Discussion: http://postgr.es/m/620F3CE1-0255-4D66-9D87-0EADE866985A@amazon.com
2021-11-11 15:20:26 -05:00
Alvaro Herrera 0726c764bc
Restore lock level to set vacuum flags
Commit 27838981be mistakenly reduced the lock level from exclusive to
shared that is acquired to set PGPROC->statusFlags; this was reverted
by dcfff74fb1, but failed to do so in one spot.  Fix it.

Backpatch to 14.

Noted by Andres Freund.

Discussion: https://postgr.es/m/20211111020724.ggsfhcq3krq5r4hb@alap3.anarazel.de
2021-11-11 11:03:29 -03:00
Tom Lane c3b33698cf Doc: improve protocol spec for logical replication Type messages.
protocol.sgml documented the layout for Type messages, but completely
dropped the ball otherwise, failing to explain what they are, when
they are sent, or what they're good for.  While at it, do a little
copy-editing on the description of Relation messages.

In passing, adjust the comment for apply_handle_type() to make it
clearer that we choose not to do anything when receiving a Type
message, not that we think it has no use whatsoever.

Per question from Stefen Hillman.

Discussion: https://postgr.es/m/CAPgW8pMknK5pup6=T4a_UG=Cz80Rgp=KONqJmTdHfaZb0RvnFg@mail.gmail.com
2021-11-10 13:13:04 -05:00
Robert Haas 10eae82b27 Fix thinko in assertion in basebackup.c.
Commit 5a1007a508 tried to introduce
an assertion that the block size was at least twice the size of a
tar block, but I got the math wrong. My error was reported to me
off-list.
2021-11-10 10:12:20 -05:00
Robert Haas a27048cbcb More cleanup of 'ThisTimeLineID'.
In XLogCtlData, rename the structure member ThisTimeLineID to
InsertTimeLineID and update the comments to make clear that it's only
expected to be set after recovery is complete.

In StartupXLOG, replace the local variables ThisTimeLineID and
PrevTimeLineID with new local variables replayTLI and newTLI.  In the
old scheme, ThisTimeLineID was the replay TLI until we created a new
timeline, and after that the replay TLI was in PrevTimeLineID. Now,
replayTLI is the TLI from which we last replayed WAL throughout the
entire function, and newTLI is either that, or the new timeline created
upon promotion.

Remove some misleading comments from the comment block just above where
recoveryTargetTimeLineGoal and friends are declared. It's become
incorrect, not only because ThisTimeLineID as a variable is now gone,
but also because the rmgr code does not care about ThisTimeLineID and
has not since what used to be the TLI field in the page header was
repurposed to store the page checksum.

Add a comment GetFlushRecPtr that it's only supposed to be used in
normal running, and an assertion to verify that this is so.

Per some ideas from Michael Paquier and some of my own. Review by
Michael Paquier also.

Discussion: http://postgr.es/m/CA+TgmoY1a2d1AnVR3tJcKmGGkhj7GGrwiNwjtKr21dxOuLBzCQ@mail.gmail.com
2021-11-10 09:45:24 -05:00
Michael Paquier c9c401a5e1 Improve error messages for some callers of XLogReadRecord()
A couple of code paths related to logical decoding (WAL sender, slot
advancing, etc.) use XLogReadRecord(), feeding on error messages
generated by walreader.c on a failure.  All those messages have no
context, making it harder to spot from where an error could come even if
these should not happen.  All the other callers of XLogReadRecord() do
that already.

Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/YYnTH6OyOwQcAdkw@paquier.xyz
2021-11-10 12:00:33 +09:00
Jeff Davis 4168a47454 Add pg_checkpointer predefined role for CHECKPOINT command.
Any user with the privileges of pg_checkpointer can issue a CHECKPOINT
command.

Reviewed-by: Stephen Frost
Discussion: https://postgr.es/m/67a1d667e8ec228b5e07f232184c80348c5d93f4.camel%40j-davis.com
2021-11-09 16:59:14 -08:00
Robert Haas 5a1007a508 Have the server properly terminate tar archives.
Earlier versions of PostgreSQL featured a version of pg_basebackup
that wanted to edit tar archives but was too dumb to parse them
properly. The server made things easier for the client by failing
to add the two blocks of zero bytes that ought to end a tar file,
leaving it up to the client to do that.

But since commit 23a1c6578c, we
don't need this hack any more, because pg_basebackup is now smarter
and can parse tar files even if they are properly terminated! So
change the server to always properly terminate the tar files. Older
versions of pg_basebackup can't talk to new servers anyway, so
there's no compatibility break.

On the pg_basebackup side, we see still need to add the terminating
zero bytes if we're talking to an older server, but not when the
server is v15+. Hopefully at some point we'll be able to remove
some of this compatibility cruft, but it seems best to hang on to
it for now.

In passing, add a file header comment to bbstreamer_tar.c, to make
it clearer what's going on here.

Discussion: http://postgr.es/m/CA+TgmoZbNzsWwM4BE5Jb_qHncY817DYZwGf+2-7hkMQ27ZwsMQ@mail.gmail.com
2021-11-09 14:29:15 -05:00
Peter Eisentraut ee3a1a5b63 Remove check for accept() argument types
This check was used to accommodate a staggering variety in particular
in the type of the third argument of accept().  This is no longer of
concern on currently supported systems.  We can just use socklen_t in
the code and put in a simple check that substitutes int for socklen_t
if it's missing, to cover the few stragglers.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/3538f4c4-1886-64f2-dcff-aaad8267fb82@enterprisedb.com
2021-11-09 15:35:26 +01:00
Michael Paquier 4cd046c203 Make some comments use the term "ProcSignal" for consistency
The surroundings in procsignal.c prefer using "ProcSignal" rather than
"procsignal".

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACX99ghPmm1M_O4r4g+YsXFjCn=qF7PeDXntLwMpht_Gdg@mail.gmail.com
2021-11-09 12:56:34 +09:00
Amit Kapila b3812d0b9b Rename some enums to use TABLE instead of REL.
Commit 5a2832465f introduced some enums to represent all tables in schema
publications and used REL in their names. Use TABLE instead of REL in
those enums to avoid confusion with other objects like SEQUENCES that can
be part of a publication in the future.

In the passing, (a) Change one of the newly introduced error messages to
make it consistent for Create and Alter commands, (b) add missing alias in
one of the SQL Statements that is used to print publications associated
with the table.

Reported-by: Tomas Vondra, Peter Smith
Author: Vignesh C
Reviewed-by: Hou Zhijie, Peter Smith
Discussion: https://www.postgresql.org/message-id/CALDaNm0OANxuJ6RXqwZsM1MSY4s19nuH3734j4a72etDwvBETQ%40mail.gmail.com
2021-11-09 08:39:33 +05:30
Tom Lane 28e2412554 Reject extraneous data after SSL or GSS encryption handshake.
The server collects up to a bufferload of data whenever it reads data
from the client socket.  When SSL or GSS encryption is requested
during startup, any additional data received with the initial
request message remained in the buffer, and would be treated as
already-decrypted data once the encryption handshake completed.
Thus, a man-in-the-middle with the ability to inject data into the
TCP connection could stuff some cleartext data into the start of
a supposedly encryption-protected database session.

This could be abused to send faked SQL commands to the server,
although that would only work if the server did not demand any
authentication data.  (However, a server relying on SSL certificate
authentication might well not do so.)

To fix, throw a protocol-violation error if the internal buffer
is not empty after the encryption handshake.

Our thanks to Jacob Champion for reporting this problem.

Security: CVE-2021-23214
2021-11-08 11:01:43 -05:00
David Rowley 39a3105678 Fix incorrect hash equality operator bug in Memoize
In v14, because we don't have a field in RestrictInfo to cache both the
left and right type's hash equality operator, we just restrict the scope
of Memoize to only when the left and right types of a RestrictInfo are the
same.

In master we add another field to RestrictInfo and cache both hash
equality operators.

Reported-by: Jaime Casanova
Author: David Rowley
Discussion: https://postgr.es/m/20210929185544.GB24346%40ahch-to
Backpatch-through: 14
2021-11-08 14:40:33 +13:00
Tom Lane c3ec4f8fe8 Silence uninitialized-variable warning.
Quite a few buildfarm animals are warning about this, and lapwing
is actually failing (because -Werror).  It's a false positive AFAICS,
so no need to do more than zero the variable to start with.

Discussion: https://postgr.es/m/YYXJnUxgw9dZKxlX@paquier.xyz
2021-11-07 12:18:18 -05:00
Tom Lane 27ef132a80 Doc: add some notes about performance of the List functions.
Per suggestion from Andres Freund.

Discussion: https://postgr.es/m/20211104221248.pgo4h6wvnjl6uvkb@alap3.anarazel.de
2021-11-06 19:12:51 -04:00
Andres Freund 87bb606b20 windows: Remove use of WIN32_LEAN_AND_MEAN from crashdump.c.
Since 8162464a25 we do so in win32_port.h. But it likely didn't do much
before that either, because at that point windows.h was already included via
win32_port.h.

Reported-By: Tom Lane
Discussion: https://postgr.es/m/612842.1636237461@sss.pgh.pa.us
2021-11-06 15:43:22 -07:00
Tom Lane cbe25dcff7 Disallow making an empty lexeme via array_to_tsvector().
The tsvector data type has always forbidden lexemes to be empty.
However, array_to_tsvector() didn't get that memo, and would
allow an empty-string array element to become an empty lexeme.
This could result in dump/restore failures later, not to mention
whatever semantic issues might be behind the original prohibition.

However, other functions that take a plain text input directly as
a lexeme value do not need a similar restriction, because they only
match the string against existing tsvector entries.  In particular
it'd be a bad idea to make ts_delete() reject empty strings, since
that is the most convenient way to clean up any bad data that might
have gotten into a tsvector column via this bug.

Reflecting on that, let's also remove the prohibition against NULL
array elements in tsvector_delete_arr and tsvector_setweight_by_filter.
It seems more consistent to ignore them, as an empty-string element
would be ignored.

There's a case for back-patching this, since it's clearly a bug fix.
On balance though, it doesn't seem like something to change in a
minor release.

Jean-Christophe Arnu

Discussion: https://postgr.es/m/CAHZmTm1YVndPgUVRoag2WL0w900XcoiivDDj-gTTYBsG25c65A@mail.gmail.com
2021-11-06 13:28:53 -04:00
Tom Lane 1241fcbd7e Second attempt to silence SSL compile failures on hamerkop.
After further investigation, it seems the cause of the problem
is our recent decision to start defining WIN32_LEAN_AND_MEAN.
That causes <windows.h> to no longer include <wincrypt.h>, which
means that the OpenSSL headers are unable to prevent conflicts
with that header by #undef'ing the conflicting macros.  Apparently,
some other system header that be-secure-openssl.c #includes after
the OpenSSL headers is pulling in <wincrypt.h>.  It's obscure just
where that happens and why we're not seeing it on other Windows
buildfarm animals.  However, it should work to move the OpenSSL
#includes to the end of the list.  For the sake of future-proofing,
do likewise in fe-secure-openssl.c.  In passing, remove useless
double inclusions of <openssl/ssl.h>.

Thanks to Thomas Munro for running down the relevant information.

Discussion: https://postgr.es/m/1051867.1635720347@sss.pgh.pa.us
2021-11-06 12:43:18 -04:00
Alexander Korotkov 05e6e78c18 Reset lastOverflowedXid on standby when needed
Currently, lastOverflowedXid is never reset.  It's just adjusted on new
transactions known to be overflowed.  But if there are no overflowed
transactions for a long time, snapshots could be mistakenly marked as
suboverflowed due to wraparound.

This commit fixes this issue by resetting lastOverflowedXid when needed
altogether with KnownAssignedXids.

Backpatch to all supported versions.

Reported-by: Stan Hu
Discussion: https://postgr.es/m/CAMBWrQ%3DFp5UAsU_nATY7EMY7NHczG4-DTDU%3DmCvBQZAQ6wa2xQ%40mail.gmail.com
Author: Kyotaro Horiguchi, Alexander Korotkov
Reviewed-by: Stan Hu, Simon Riggs, Nikolay Samokhvalov, Andrey Borodin, Dmitry Dolgov
2021-11-06 19:13:58 +03:00
Peter Geoghegan 02f9fd1294 Update obsolete reference in vacuumlazy.c.
Oversight in commit 7ab96cf6.
2021-11-05 23:38:07 -07:00
Tomas Vondra d91353f4b2 Fix handling of NaN values in BRIN minmax multi
When calculating distance between float4/float8 values, we need to be a
bit more careful about NaN values in order not to trigger assert. We
consider NaN values to be equal (distace 0.0) and in infinite distance
from all other values.

On builds without asserts, this issue is mostly harmless - the ranges
may be merged in less efficient order, but the index is still correct.

Per report from Andreas Seltenreich. Backpatch to 14, where this new
BRIN opclass was introduced.

Reported-by: Andreas Seltenreich
Discussion: https://postgr.es/m/87r1bw9ukm.fsf@credativ.de
2021-11-06 01:50:44 +01:00
Peter Geoghegan f214960add Update obsolete heap pruning comments.
Add new comments that spell out what VACUUM expects from heap pruning:
pruning must never leave behind DEAD tuples that still have tuple
storage.  This has at least been the case since commit 8523492d, which
established the principle that vacuumlazy.c doesn't have to deal with
DEAD tuples that still have tuple storage directly, except perhaps by
simply retrying pruning (to handle a rare corner case involving
concurrent transaction abort).

In passing, update some references to old symbol names that were missed
by the snapshot scalability work (specifically commit dc7420c2c9).
2021-11-05 14:08:47 -07:00
Robert Haas 4a92a1c3d1 Change ThisTimeLineID from a global variable to a local variable.
StartupXLOG() still has ThisTimeLineID as a local variable, but the
remaining code in xlog.c now needs to the relevant TimeLineID by some
other means. Mostly, this means that we now pass it as a function
parameter to a bunch of functions where we didn't previously.
However, a few cases require special handling:

- In functions that might be called by outside callers who
  wouldn't necessarily know what timeline to specify, we get
  the timeline ID from shared memory. XLogCtl->ThisTimeLineID
  can be used in most cases since recovery is known to have
  completed by the time those functions are called.  In
  xlog_redo(), we can use XLogCtl->replayEndTLI.

- XLogFileClose() needs to know the TLI of the open logfile.
  Do that with a new global variable openLogTLI. While
  someone could argue that this is just trading one global
  variable for another, the new one has a far more narrow
  purposes and is referenced in just a few places.

- read_backup_label() now returns the TLI that it obtains
  by parsing the backup_label file. Previously, ReadRecord()
  could be called to parse the checkpoint record without
  ThisTimeLineID having been initialized. Now, the timeline
  is passed down, and I didn't want to pass an uninitialized
  variable; this change lets us avoid that. The old coding
  didn't seem to have any practical consequences that we need
  to worry about, but this is cleaner.

- In BootstrapXLOG(), it's just a constant.

Patch by me, reviewed and tested by Michael Paquier, Amul Sul, and
Álvaro Herrera.

Discussion: https://postgr.es/m/CA+TgmobfAAqhfWa1kaFBBFvX+5CjM=7TE=n4r4Q1o2bjbGYBpA@mail.gmail.com
2021-11-05 12:53:15 -04:00
Robert Haas e997a0c642 Remove all use of ThisTimeLineID global variable outside of xlog.c
All such code deals with this global variable in one of three ways.
Sometimes the same functions use it in more than one of these ways
at the same time.

First, sometimes it's an implicit argument to one or more functions
being called in xlog.c or elsewhere, and must be set to the
appropriate value before calling those functions lest they
misbehave. In those cases, it is now passed as an explicit argument
instead.

Second, sometimes it's used to obtain the current timeline after
the end of recovery, i.e. the timeline to which WAL is being
written and flushed. Such code now calls GetWALInsertionTimeLine()
or relies on the new out parameter added to GetFlushRecPtr().

Third, sometimes it's used during recovery to store the current
replay timeline. That can change, so such code must generally
update the value before each use. It can still do that, but must
now use a local variable instead.

The net effect of these changes is to reduce by a fair amount the
amount of code that is directly accessing this global variable.
That's good, because history has shown that we don't always think
clearly about which timeline ID it's supposed to contain at any
given point in time, or indeed, whether it has been or needs to
be initialized at any given point in the code.

Patch by me, reviewed and tested by Michael Paquier, Amul Sul, and
Álvaro Herrera.

Discussion: https://postgr.es/m/CA+TgmobfAAqhfWa1kaFBBFvX+5CjM=7TE=n4r4Q1o2bjbGYBpA@mail.gmail.com
2021-11-05 12:50:01 -04:00
Robert Haas caf1f675b8 Don't set ThisTimeLineID when there's no reason to do so.
In slotfuncs.c, pg_replication_slot_advance() needs to determine
the LSN up to which the slot should be advanced, but that doesn't
require us to update ThisTimeLineID, because none of the code called
from here depends on it. If the replication slot is logical,
pg_logical_replication_slot_advance will call read_local_xlog_page,
which does use ThisTimeLineID, but also takes care of making sure
it's up to date. If the replication slot is physical, the timeline
isn't used for anything at all.

In logicalfuncs.c, pg_logical_slot_get_changes_guts() has the same
issue: the only code we're going to run that cares about timelines
is in or downstream of read_local_xlog_page, which already makes
sure that the correct value gets set. Hence, don't do it here.

Patch by me, reviewed and tested by Michael Paquier, Amul Sul, and
Álvaro Herrera.

Discussion: https://postgr.es/m/CA+TgmobfAAqhfWa1kaFBBFvX+5CjM=7TE=n4r4Q1o2bjbGYBpA@mail.gmail.com
2021-11-05 12:43:04 -04:00
Alvaro Herrera d74b54b3dd
Avoid crash in rare case of concurrent DROP
When a role being dropped contains is referenced by catalog objects that
are concurrently also being dropped, a crash can result while trying to
construct the string that describes the objects.  Suppress that by
ignoring objects whose descriptions are returned as NULL.

The majority of relevant codesites were already cautious about this
already; we had just missed a couple.

This is an old bug, so backpatch all the way back.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17126-21887f04508cb5c8@postgresql.org
2021-11-05 12:29:35 -03:00
Robert Haas bef47ff85d Introduce 'bbsink' abstraction to modularize base backup code.
The base backup code has accumulated a healthy number of new
features over the years, but it's becoming increasingly difficult
to maintain and further enhance that code because there's no
real separation of concerns. For example, the code that
understands knows the details of how we send data to the client
using the libpq protocol is scattered throughout basebackup.c,
rather than being centralized in one place.

To try to improve this situation, introduce a new 'bbsink' object
which acts as a recipient for archives generated during the base
backup progress and also for the backup manifest. This commit
introduces three types of bbsink: a 'copytblspc' bbsink forwards the
backup to the client using one COPY OUT operation per tablespace and
another for the manifest, a 'progress' bbsink performs command
progress reporting, and a 'throttle' bbsink performs rate-limiting.
The 'progress' and 'throttle' bbsink types also forward the data to a
successor bbsink; at present, the last bbsink in the chain will
always be of type 'copytblspc'. There are plans to add more types
of 'bbsink' in future commits.

This abstraction is a bit leaky in the case of progress reporting,
but this still seems cleaner than what we had before.

Patch by me, reviewed and tested by Andres Freund, Sumanta Mukherjee,
Dilip Kumar, Suraj Kharage, Dipesh Pandit, Tushar Ahuja, Mark Dilger,
and Jeevan Ladhe.

Discussion: https://postgr.es/m/CA+TgmoZGwR=ZVWFeecncubEyPdwghnvfkkdBe9BLccLSiqdf9Q@mail.gmail.com
Discussion: https://postgr.es/m/CA+TgmoZvqk7UuzxsX1xjJRmMGkqoUGYTZLDCH8SmU1xTPr1Xig@mail.gmail.com
2021-11-05 10:08:30 -04:00
Peter Geoghegan e7428a99a1 Add hardening to catch invalid TIDs in indexes.
Add hardening to the heapam index tuple deletion path to catch TIDs in
index pages that point to a heap item that index tuples should never
point to.  The corruption we're trying to catch here is particularly
tricky to detect, since it typically involves "extra" (corrupt) index
tuples, as opposed to the absence of required index tuples in the index.

For example, a heap TID from an index page that turns out to point to an
LP_UNUSED item in the heap page has a good chance of being caught by one
of the new checks.  There is a decent chance that the recently fixed
parallel VACUUM bug (see commit 9bacec15) would have been caught had
that particular check been in place for Postgres 14.  No backpatch of
this extra hardening for now, though.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-Wzk-4_raTzawWGaiqNvkpwDXxv3y1AQhQyUeHfkU=tFCeA@mail.gmail.com
2021-11-04 19:54:05 -07:00
Peter Geoghegan 5cd7eb1f1c Add various assertions to heap pruning code.
These assertions document (and verify) our high level assumptions about
how pruning can and cannot affect existing items from target heap pages.
For example, one of the new assertions verifies that pruning does not
set a heap-only tuple to LP_DEAD.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-Wz=vhvBx1GjF+oueHh8YQcHoQYrMi0F0zFMHEr8yc4sCoA@mail.gmail.com
2021-11-04 19:07:54 -07:00
Heikki Linnakangas 6b1b405ebf Fix snapshot reference leak if lo_export fails.
If lo_export() fails to open the target file or to write to it, it leaks
the created LargeObjectDesc and its snapshot in the top-transaction
context and resource owner. That's pretty harmless, it's a small leak
after all, but it gives the user a "Snapshot reference leak" warning.

Fix by using a short-lived memory context and no resource owner for
transient LargeObjectDescs that are opened and closed within one function
call. The leak is easiest to reproduce with lo_export() on a directory
that doesn't exist, but in principle the other lo_* functions could also
fail.

Backpatch to all supported versions.

Reported-by: Andrew B
Reviewed-by: Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/32bf767a-2d65-71c4-f170-122f416bab7e@iki.fi
2021-11-03 10:52:38 +02:00
Peter Geoghegan c59278a1aa Fix parallel amvacuumcleanup safety bug.
Commit b4af70cb inverted the return value of the function
parallel_processing_is_safe(), but missed the amvacuumcleanup test.
Index AMs that don't support parallel cleanup at all were affected.

The practical consequences of this bug were not very serious.  Hash
indexes are affected, but since they just return the number of blocks
during hashvacuumcleanup anyway, it can't have had much impact.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAD21AoA-Em+aeVPmBbL_s1V-ghsJQSxYL-i3JP8nTfPiD1wjKw@mail.gmail.com
Backpatch: 14-, where commit b4af70cb appears.
2021-11-02 19:52:11 -07:00
Tom Lane 24f9e49e43 Blind attempt to silence SSL compile failures on hamerkop.
Buildfarm member hamerkop has been failing for the last few days
with errors that look like OpenSSL's X509-related symbols have
not been imported into be-secure-openssl.c.  It's unclear why
this should be, but let's try adding an explicit #include of
<openssl/x509v3.h>, as there has long been in fe-secure-openssl.c.

Discussion: https://postgr.es/m/1051867.1635720347@sss.pgh.pa.us
2021-11-02 15:18:07 -04:00
Peter Geoghegan 9bacec15b6 Don't overlook indexes during parallel VACUUM.
Commit b4af70cb, which simplified state managed by VACUUM, performed
refactoring of parallel VACUUM in passing.  Confusion about the exact
details of the tasks that the leader process is responsible for led to
code that made it possible for parallel VACUUM to miss a subset of the
table's indexes entirely.  Specifically, indexes that fell under the
min_parallel_index_scan_size size cutoff were missed.  These indexes are
supposed to be vacuumed by the leader (alongside any parallel unsafe
indexes), but weren't vacuumed at all.  Affected indexes could easily
end up with duplicate heap TIDs, once heap TIDs were recycled for new
heap tuples.  This had generic symptoms that might be seen with almost
any index corruption involving structural inconsistencies between an
index and its table.

To fix, make sure that the parallel VACUUM leader process performs any
required index vacuuming for indexes that happen to be below the size
cutoff.  Also document the design of parallel VACUUM with these
below-size-cutoff indexes.

It's unclear how many users might be affected by this bug.  There had to
be at least three indexes on the table to hit the bug: a smaller index,
plus at least two additional indexes that themselves exceed the size
cutoff.  Cases with just one additional index would not run into
trouble, since the parallel VACUUM cost model requires two
larger-than-cutoff indexes on the table to apply any parallel
processing.  Note also that autovacuum was not affected, since it never
uses parallel processing.

Test case based on tests from a larger patch to test parallel VACUUM by
Masahiko Sawada.

Many thanks to Kamigishi Rei for her invaluable help with tracking this
problem down.

Author: Peter Geoghegan <pg@bowt.ie>
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-By: Kamigishi Rei <iijima.yun@koumakan.jp>
Reported-By: Andrew Gierth <andrew@tao11.riddles.org.uk>
Diagnosed-By: Andres Freund <andres@anarazel.de>
Bug: #17245
Discussion: https://postgr.es/m/17245-ddf06aaf85735f36@postgresql.org
Discussion: https://postgr.es/m/20211030023740.qbnsl2xaoh2grq3d@alap3.anarazel.de
Backpatch: 14-, where the refactoring commit appears.
2021-11-02 12:06:17 -07:00
Tom Lane f3d4019da5 Ensure consistent logical replication of datetime and float8 values.
In walreceiver, set the publisher's relevant GUCs (datestyle,
intervalstyle, extra_float_digits) to the same values that pg_dump uses,
and for the same reason: we need the output to be read the same way
regardless of the receiver's settings.  Without this, it's possible
for subscribers to misinterpret transmitted values.

Although this is clearly a bug fix, it's not without downsides:
subscribers that are storing values into some other datatype, such as
text, could get different results than before, and perhaps be unhappy
about that.  Given the lack of previous complaints, it seems best
to change this only in HEAD, and to call it out as an incompatible
change in v15.

Japin Li, per report from Sadhuprasad Patro

Discussion: https://postgr.es/m/CAFF0-CF=D7pc6st-3A9f1JnOt0qmc+BcBPVzD6fLYisKyAjkGA@mail.gmail.com
2021-11-02 14:28:50 -04:00
Tom Lane 01fc652703 Fix variable lifespan in ExecInitCoerceToDomain().
This undoes a mistake in 1ec7679f1: domainval and domainnull were
meant to live across loop iterations, but they were incorrectly
moved inside the loop.  The effect was only to emit useless extra
EEOP_MAKE_READONLY steps, so it's not a big deal; nonetheless,
back-patch to v13 where the mistake was introduced.

Ranier Vilela

Discussion: https://postgr.es/m/CAEudQAqXuhbkaAp-sGH6dR6Nsq7v28_0TPexHOm6FiDYqwQD-w@mail.gmail.com
2021-11-02 13:36:47 -04:00