Commit Graph

41498 Commits

Author SHA1 Message Date
Michael Paquier 806fad7573 Fix buffer refcount leak with FDW bulk inserts
The leak would show up when using batch inserts with foreign tables
included in a partition tree, as the slots used in the batch were not
reset once processed.  In order to fix this problem, some
ExecClearTuple() are added to clean up the slots used once a batch is
filled and processed, mapping with the number of slots currently in use
as tracked by the counter ri_NumSlots.

This buffer refcount leak has been introduced in b676ac4 with the
addition of the executor facility to improve bulk inserts for FDWs, so
backpatch down to 14.

Alexander has provided the patch (slightly modified by me).  The test
for postgres_fdw comes from me, based on the test case that the author
has sent in the report.

Author: Alexander Pyhalov
Discussion: https://postgr.es/m/b035780a740efd38dc30790c76927255@postgrespro.ru
Backpatch-through: 14
2023-04-25 09:42:19 +09:00
Andres Freund 1118cd37eb Remove vacuum_defer_cleanup_age
vacuum_defer_cleanup_age was introduced before hot_standby_feedback and
replication slots existed. It is hard to use reasonably - commonly it will
either be set too low (not preventing recovery conflicts, while still causing
some bloat), or too high (causing a lot of bloat). The alternatives do not
have that issue.

That on its own might not be sufficient reason to remove
vacuum_defer_cleanup_age, but it also complicates computation of xid
horizons. See e.g. the bug fixed in be504a3e97. It also is untested.

This commit removes TransactionIdRetreatSafely(), as there are no users
anymore. There might be potential future users, hence noting that here.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230317230930.nhsgk3qfk7f4axls@awork3.anarazel.de
2023-04-24 12:21:02 -07:00
Tom Lane 441ee1677e Fix memory leakage in plpgsql DO blocks that use cast expressions.
Commit 04fe805a1 modified plpgsql so that datatype casts make use of
expressions cached by plancache.c, in place of older code where these
expression trees were managed by plpgsql itself.  However, I (tgl)
forgot that we use a separate, shorter-lived cast info hashtable in
DO blocks.  The new mechanism thus resulted in session-lifespan
leakage of the plancache data once a DO block containing one or more
casts terminated.  To fix, split the cast hash table into two parts,
one that tracks only the plancache's CachedExpressions and one that
tracks the expression state trees generated from them.  DO blocks need
their own expression state trees and hence their own version of the
second hash table, but there's no reason they can't share the
CachedExpressions with regular plpgsql functions.

Per report from Ajit Awekar.  Back-patch to v12 where the issue
was introduced.

Ajit Awekar and Tom Lane

Discussion: https://postgr.es/m/CAHv6PyrNaqdvyWUspzd3txYQguFTBSnhx+m6tS06TnM+KWc_LQ@mail.gmail.com
2023-04-24 14:19:46 -04:00
Tom Lane fce3b26e97 Rename ExecAggTransReparent, and improve its documentation.
The name of this function suggests that it ought to reparent R/W
expanded objects to be children of the persistent aggcontext, instead
of copying them.  In fact it does no such thing, and if you try to
make it do so you will see multiple regression failures.  Rename it
to the less-misleading ExecAggCopyTransValue, and add commentary
about why that attractive-sounding optimization won't work.  Also
adjust comments at call sites, some of which were describing logic
that has since been moved into ExecAggCopyTransValue.

Discussion: https://postgr.es/m/3004282.1681930251@sss.pgh.pa.us
2023-04-24 13:01:33 -04:00
Peter Eisentraut 5a8366ad75 doc: Update SQL features names
Some feature names have been adjusted over time upstream.  This pulls
in those changes.
2023-04-24 15:39:54 +02:00
Daniel Gustafsson 69537f5d17 Remove duplicate lines of code
Commit 6df7a9698b accidentally included two identical prototypes for
default_multirange_selectivi() and commit 086cf1458c added a break;
statement where one was already present, thus duplicating it.  While
there is no bug caused by this, fix by removing the duplicated lines
as they provide no value.

Backpatch the fix for duplicate prototypes to v14 and the duplicate
break statement fix to all supported branches to avoid backpatching
hazards due to the removal.

Reported-by: Anton Voloshin <a.voloshin@postgrespro.ru>
Discussion: https://postgr.es/m/0e69cb60-0176-f6d0-7e15-6478b7d85724@postgrespro.ru
2023-04-24 11:16:17 +02:00
Masahiko Sawada 781ac42d43 Use elog to report unexpected action in handle_streamed_transaction().
An oversight in commit 216a784829.

Author: Masahiko Sawada
Reviewed-by: Kyotaro Horiguchi, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDDbM8_HJt-nMCvcjTK8K9hPzXWqJj7pyaUvR4mm_NrSg@mail.gmail.com
2023-04-24 15:37:14 +09:00
Amit Kapila 19e65dff38 Display 'password_required' option for \dRs+ command.
The commit c3afe8cf5a added a new subscription option 'password_required'
which should be shown with \dRs+ command.

Author: Vignesh C
Reviewed-by: Amit Kapila, Robert Haas
Discussion: https://postgr.es/m/CAA4eK1LRz5sCZxwCW6OtpjLtWPvRwBihQOM4jzQm6ppfpexqGA@mail.gmail.com
Discussion: https://postgr.es/m/9DFC88D3-1300-4DE8-ACBC-4CEF84399A53@enterprisedb.com
2023-04-24 08:37:58 +05:30
Alexander Korotkov cd115c3530 Fix custom validators call in build_local_reloptions()
We need to call them only when validate == true.

Backpatch to 13, where opclass options were introduced.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/2656633.1681831542%40sss.pgh.pa.us
Reviewed-by: Tom Lane, Pavel Borisov
Backpatch-through: 13
2023-04-23 13:58:41 +03:00
Tom Lane 0ec0e20955 Doc: update pgindent/README some more.
In commit 2e6ba1315, I missed that there was a redundant
second reference to pg_bsd_indent's old repo.
2023-04-22 11:52:46 -04:00
Jeff Davis 5cd1a5af4d Fix initdb --no-locale.
Discussion: https://postgr.es/m/878relf7cb.fsf@news-spur.riddles.org.uk
Reported-by: Andrew Gierth
2023-04-21 13:11:18 -07:00
Jeff Davis c04c6c5d6f Avoid character classification in regex escape parsing.
For regex escape sequences, just test directly for the relevant ASCII
characters rather than using locale-sensitive character
classification.

This fixes an assertion failure when a locale considers a non-ASCII
character, such as "൧", to be a digit.

Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs49Q6UoKGeT8pBkMtJGJd+16CBFZaaWUk9Du+2ERE5g_YA@mail.gmail.com
Backpatch-through: 11
2023-04-21 08:19:41 -07:00
Daniel Gustafsson a23ab2eebf Reorder connection markers in loadbalance tests
Commit 7f5b198 introduced TAP tests that use string literals to mark
the presence of a query in server logs.  Reorder the markers to make
sure they are used in alphabetical order for easier debugging.

Author: Gurjeet Singh <gurjeet@singh.im>
Reviewed-by: Jelte Fennema <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CABwTF4WcYAENqyUQS2crAYfDuJ497v82ty2-MirjaC+zz9e8nQ@mail.gmail.com
2023-04-21 12:29:38 +02:00
Daniel Gustafsson 60ce452729 Make libpq error messages consistent for translation
The errormessage for an incorrect require_auth method wasn't using the
common "invalid %s value" errormessage which lessens the burden on our
translators.  Fix by changing to that format to make use of existing
translations and to make error messages consistent in wording.

Reported and fixed by Gurjeet Singh with some tweaking by myself.

Author: Gurjeet Singh <gurjeet@singh.im>
Discussion: https://postgr.es/m/CABwTF4Xu3g9zohJ9obu8m7MKbf8g63NgpRDjwqPHQgAtB+Gb8Q@mail.gmail.com
2023-04-21 10:23:38 +02:00
David Rowley 84e05beb11 Remove unused global variable
Author: Alexander Lakhin
Discussion: https://postgr.es/m/d5f68d19-c0fc-91a9-118d-7c6a5a3f5fad@gmail.com
2023-04-21 11:41:58 +12:00
David Rowley d91d163529 Fix incorrect function name reference
This function was renamed in 0c9d84427 but this comment wasn't updated.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/699beab4-a6ca-92c9-f152-f559caf6dc25@gmail.com
2023-04-21 10:46:08 +12:00
Michael Paquier 0ecb87e1fa Remove io prefix from pg_stat_io columns
a9c70b46 added the statistics view pg_stat_io which contained columns
"io_context" and "io_object".  Given that the columns are in the
pg_stat_io view, the "io" prefix is somewhat redundant, so remove it.

The code variables referring to these fields are kept unchanged so as
they can keep their context about I/O.

Bump catalog version.

Author: Melanie Plageman
Reviewed-by: Kyotaro Horiguchi, Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CAAKRu_aAQoJWrvT2BYYQvJChFKra_O-5ra3jhzKJZqWsTR1CPQ@mail.gmail.com
2023-04-21 07:21:50 +09:00
Tom Lane eab2d3147e Use --strip-unneeded when stripping static libraries with GNU strip.
We've long used "--strip-unneeded" for shared libraries but plain
"-x" for static libraries when stripping symbols with GNU strip.
There doesn't seem to be any really good reason for that though,
since --strip-unneeded produces smaller output (as "-x" alone
does not remove debug symbols).  Moreover it seems that
llvm-strip, although it identifies as GNU strip, misbehaves when
given "-x" for this purpose.  It's unclear whether that's
intentional or a bug in llvm-strip, but in any case it seems like
changing to use --strip-unneeded in all cases should be a win.

Note that this doesn't change our behavior when dealing with
non-GNU strip.

Per gripes from Ed Maste and Palle Girgensohn.  Back-patch,
in case anyone wants to use llvm-strip with stable branches.

Discussion: https://postgr.es/m/17898-5308d09543463266@postgresql.org
Discussion: https://postgr.es/m/20230420153338.bbj2g5jiyy3afhjz@awork3.anarazel.de
2023-04-20 18:12:32 -04:00
Daniel Gustafsson a9781ae11b Fix autovacuum cost debug logging
Commit 7d71d3dd0 introduced finer grained updates of autovacuum option
changes by increasing the frequency of reading the configuration file.
The debug logging of cost parameter was however changed such that some
initial values weren't logged.  Fix by changing logging to use the old
frequency of logging regardless of them changing.

Also avoid taking a log for rendering the log message unless the set
loglevel is such that the log entry will be emitted.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B=vVYczMugKMQ@mail.gmail.com
2023-04-20 15:45:44 +02:00
Amit Kapila c1cc4e688b Restart the apply worker if the 'password_required' option is changed.
The apply worker is restarted if any subscription option that affects the
remote connection was changed. In commit c3afe8cf5a, we added the option
'password_required' which can affect the remote connection, so we should
restart the worker if it was changed.

Author: Amit Kapila
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CAA4eK1+z9UDFEynXLsWeMMuUZc1iQkRwj2HNDtxUHTPo-u1F4A@mail.gmail.com
Discussion: https://postgr.es/m/9DFC88D3-1300-4DE8-ACBC-4CEF84399A53@enterprisedb.com
2023-04-20 08:56:18 +05:30
Thomas Munro 7d3d72b55e Remove obsolete defense against strxfrm() bugs.
Old versions of Solaris and illumos had buffer overrun bugs in their
strxfrm() implementations.  The bugs were fixed more than a decade ago
and the relevant releases are long out of vendor support.  It's time to
remove the defense added by commit be8b06c3.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+hUKGJ-ZPJwKHVLbqye92-ZXeLoCHu5wJL6L6HhNP7FkJ=meA@mail.gmail.com
2023-04-20 13:20:14 +12:00
David Rowley e35ded2956 Fix list_copy_head() with empty Lists
list_copy_head() given an empty List would crash from trying to
dereference the List to obtain its length.  Since NIL is how we represent
an empty List, we should just be returning another empty List in this
case.

list_copy_head() is new to v16, so let's fix it now before too many people
start coding around the buggy NIL behavior.

Reported-by: Miroslav Bendik
Discussion: https://postgr.es/m/CAPoEpV02WhawuWnmnKet6BqU63bEu7oec0pJc=nKMtPsHMzTXQ@mail.gmail.com
2023-04-20 10:34:46 +12:00
Peter Geoghegan 2584639653 Use nbtdesc "level" field name consistently.
The "lev" name that appeared in NEWROOT nbtree record desc output was
inconsistent with the symbol name from the underlying C struct.  It was
also inconsistent with nbtdesc output for other nearby record types with
similar level fields.

Standardize on "level" to make everything consistent.

Follow-up to commit 1c453cfd.
2023-04-19 12:15:15 -07:00
Peter Geoghegan 50547a3fae Fix wal_consistency_checking enhanced desc output.
Recent enhancements to rmgr desc routines that made the output summarize
certain block data (added by commits 7d8219a4 and 1c453cfd) dealt with
records that lack relevant block data (and so have nothing to give a
more detailed summary of) by testing !DecodedBkpBlock.has_image.  As a
result, more detailed descriptions of block data were not output when
wal_consistency_checking was enabled.

This bug affected records with summarizable block data that also
happened to have an FPI that the REDO routine isn't supposed to apply
(FPIs used for consistency checking purposes only).  The presence of
such an FPI was incorrectly taken to indicate the absence of block data.

To fix, test DecodedBkpBlock.has_data, not !DecodedBkpBlock.has_image.
This is the exact condition that we care about, not an inexact proxy.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wzm5Sc9cBg1qWV_cEBfLNJCrW9FjS-SoHVt8FLA7Ldn8yg@mail.gmail.com
2023-04-19 10:42:39 -07:00
Tom Lane 9e1e9d6560 Add missed case for tab completion of GRANT/REVOKE MAINTAIN.
We failed to offer "ON" after "GRANT MAINTAIN".
Oversight in commit 60684dd83.

Ken Kato

Discussion: https://postgr.es/m/6afe7712991882a864d6d10003264e7c@oss.nttdata.com
2023-04-19 10:49:53 -04:00
Daniel Gustafsson 0b5d1fb36a Fix errormessage for missing system CA in OpenSSL 3.1
The error message for a missing or invalid system CA when using
sslrootcert=system differs based on the OpenSSL version used.

In OpenSSL 1.0.1-3.0 it is reported as SSL Error, with varying
degrees of helpfulness in the error message. With OpenSSL 3.1 it
is reported as an SSL SYSCALL error with "Undefined error" as
the error message. This fix pulls out the particular error in
OpenSSL 3.1 as a certificate verify error in order to help the
user better figure out what happened, and to keep the ssl test
working. While there is no evidence that extracing the errors
will clobber errno, this adds a guard against that regardless
to also make the consistent with how we handle OpenSSL errors
elsewhere. It also memorizes the output from OpenSSL 3.0 in
the test in cases where the system CA isn't responding.

Reported-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/c39be3c5-c1a5-1e33-1024-16f527e251a4@enterprisedb.com
2023-04-19 12:54:58 +02:00
Peter Eisentraut 77dedeb2c4 Remove some tabs in SQL code in C string literals
This is not handled uniformly throughout the code, but at least nearby
code can be consistent.
2023-04-19 09:29:43 +02:00
David Rowley 3f58a4e296 Fix various typos and incorrect/outdated name references
Author: Alexander Lakhin
Discussion: https://postgr.es/m/699beab4-a6ca-92c9-f152-f559caf6dc25@gmail.com
2023-04-19 13:50:33 +12:00
Tom Lane 4ddee4d9de Update time zone data files to tzdata release 2023c.
DST law changes in Egypt, Greenland, Morocco, and Palestine.

When observing Moscow time, Europe/Kirov and Europe/Volgograd now
use the abbreviations MSK/MSD instead of numeric abbreviations,
for consistency with other timezones observing Moscow time.

Also, America/Yellowknife is no longer distinct from America/Edmonton;
this affects some pre-1948 timestamps in that area.
2023-04-18 14:46:39 -04:00
Peter Geoghegan 06e0652750 Remove useless argument from nbtree dedup function.
_bt_dedup_pass()'s heapRel argument hasn't been needed or used since
commit cf2acaf4dc made deleting any existing LP_DEAD index tuples the
caller's responsibility.
2023-04-18 10:33:15 -07:00
Tom Lane b124104e73 Fix Utils.pm's locale-munging so that Perl itself is also affected.
Utils.pm has a BEGIN block that editorializes on the locale-related
environment variables, primarily in order to stabilize the behavior
of child programs.  It turns out that if the calling test script
has already done "use locale", this fails to affect the behavior
of Perl itself, causing locale behavior to be different between
Perl and child programs.  That breaks commit cd82e5c79's attempt
to deal with locale-specific behavior in psql.

To fix, we just need to call setlocale() to redo the calculation
of locale.

Per report from Aleksander Alekseev.  No back-patch for now, since
there are no locale-dependent TAP tests in prior branches, and
I'm not yet convinced that this won't have side-effects of its own.

Discussion: https://postgr.es/m/CAJ7c6TO9KpYYxoVVseWEQB5KtjWDkt8NfyAeKPcHoe2Jq+ykpw@mail.gmail.com
2023-04-18 13:31:46 -04:00
Robert Haas 363e8f9115 Fix pg_basebackup with in-place tablespaces some more.
Commit c6f2f01611 purported to make
this work, but problems remained. In a plain-format backup, the
files from an in-place tablespace got included in the tar file for
the main tablespace, which is wrong but it's not clear that it
has any user-visible consequences. In a tar-format backup, the
TABLESPACE_MAP option is used, and so we never iterated over
pg_tblspc and thus never backed up the in-place tablespaces
anywhere at all.

To fix this, reverse the changes in that commit, so that when we scan
pg_tblspc during a backup, we create tablespaceinfo objects even for
in-place tablespaces. We set the field that would normally contain the
absolute pathname to the relative path pg_tblspc/${TSOID}, and that's
good enough to make basebackup.c happy without any further changes.

However, pg_basebackup needs a couple of adjustments to make it work.
First, it needs to understand that a relative path for a tablespace
means it's an in-place tablespace.  Second, it needs to tolerate the
situation where restoring the main tablespace tries to create
pg_tblspc or a subdirectory and finds that it already exists, because
we restore user-defined tablespaces before the main tablespace.

Since in-place tablespaces are only intended for use in development
and testing, no back-patch.

Patch by me, reviewed by Thomas Munro and Michael Paquier.

Discussion: http://postgr.es/m/CA+TgmobwvbEp+fLq2PykMYzizcvuNv0a7gPMJtxOTMOuuRLMHg@mail.gmail.com
2023-04-18 11:23:34 -04:00
Michael Paquier f180290847 ecpg: Fix handling of strings in ORACLE compat code with SQLDA
When compiled with -C ORACLE, ecpg_get_data() had a one-off issue where
it would incorrectly store the null terminator byte to str[-1] when
varcharsize is 0, which is something that can happen when using SQLDA.
This would eat 1 byte from the previous field stored, corrupting the
results generated.

All the callers of ecpg_get_data() estimate and allocate enough storage
for the data received, and the fix of this commit relies on this
assumption.  Note that this maps to the case where no padding or
truncation is required.

This issue has been introduced by 3b7ab43 with the Oracle compatibility
option, so backpatch down to v11.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20230410.173500.440060475837236886.horikyota.ntt@gmail.com
Backpatch-through: 11
2023-04-18 11:20:41 +09:00
David Rowley eef231e816 Fix some typos and some incorrectly duplicated words
Author: Justin Pryzby
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/ZD3D1QxoccnN8A1V@telsasoft.com
2023-04-18 14:03:49 +12:00
David Rowley b4dbf3e924 Fix various typos
This fixes many spelling mistakes in comments, but a few references to
invalid parameter names, function names and option names too in comments
and also some in string constants

Also, fix an #undef that was undefining the incorrect definition

Author: Alexander Lakhin
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/d5f68d19-c0fc-91a9-118d-7c6a5a3f5fad@gmail.com
2023-04-18 13:23:23 +12:00
Jeff Davis e39d512f3e Comment fix for 60684dd834.
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/766f3799-0269-162f-ba63-4cae34a5534f@enterprisedb.com
2023-04-17 13:45:50 -07:00
Tom Lane 38358aab9a Update Solution.pm for HAVE_GSSAPI[_GSSAPI]_EXT_H, too.
Commit d48ac0070 was still a brick shy of a load.
Per buildfarm (via Andrew Dunstan).

Discussion: https://postgr.es/m/08dd6053-deee-52e9-954b-eb69232905c9@dunslane.net
2023-04-17 16:00:39 -04:00
Tom Lane 3e383f9b68 Avoid trying to write an empty WAL record in log_newpage_range().
If the last few pages in the specified range are empty (all zero),
then log_newpage_range() could try to emit an empty WAL record
containing no FPIs.  This at least upsets an Assert in
ReserveXLogInsertLocation, and might perhaps have bad real-world
consequences in non-assert builds.

This has been broken since log_newpage_range() was introduced,
but the case was hard if not impossible to hit before commit 3d6a98457
decided it was okay to leave VM and FSM pages intentionally zero.
Nonetheless, it seems prudent to back-patch.  log_newpage_range()
was added in v12 but later back-patched, so this affects all
supported branches.

Matthias van de Meent, per report from Justin Pryzby

Discussion: https://postgr.es/m/ZD1daibg4RF50IOj@telsasoft.com
2023-04-17 14:22:26 -04:00
Peter Geoghegan cd7cdc550c Fix incorrect comment about nbtree WAL record.
The nbtree VACUUM WAL record stores its page offset number payload in
blk 0 (just like the closely related nbtree DELETE WAL record).  Commit
ebd551f5 fixed a similar issue with the DELETE WAL record, but missed
this one.
2023-04-17 09:58:18 -07:00
Tom Lane d48ac0070c Further cleanup of autoconf output files for GSSAPI changes.
Running autoheader was missed in f7431bca8.  This is cosmetic since
we aren't using these HAVE_ symbols, but let's get everything in
sync while we're looking at this.

Discussion: https://postgr.es/m/2422362.1681741814@sss.pgh.pa.us
2023-04-17 11:21:50 -04:00
Peter Eisentraut 1c77de9801 doc: Add additional SQL features codes from SQL:2023
These were mysteriously omitted in c9f57541d9.
2023-04-17 16:06:41 +02:00
Peter Eisentraut 2434d60a2a Put new command-line option into sensible order in help output
We have two existing conventions for long options: either alphabetical
among short options, or all long options after all the short options.
But the convention apparently used here, next to a functionally
related option, is not one of them.
2023-04-17 11:09:17 +02:00
Tom Lane 78d5952dd0 Ensure result of an aggregate's finalfunc is made read-only.
The finalfunc might return a read-write expanded object.  If we
de-duplicate multiple call sites for the aggregate, any function(s)
receiving the aggregate result earlier could alter or destroy the
value that reaches the ones called later.  This is a brown-paper-bag
bug in commit 42b746d4c, because we actually considered the need
for read-only-ness but failed to realize that it applied to the case
with a finalfunc as well as the case without.

Per report from Justin Pryzby.  New error in HEAD,
no need for back-patch.

Discussion: https://postgr.es/m/ZDm5TuKsh3tzoEjz@telsasoft.com
2023-04-16 14:16:40 -04:00
David Rowley c0235013c1 Improve VACUUM/ANALYZE BUFFER_USAGE_LIMIT docs
This addresses various deficiencies in the documentation for VACUUM and
ANALYZE's BUFFER_USEAGE_LIMIT docs.

Here we declare "size" in the syntax synopsis for VACUUM and ANALYZE's
BUFFER_USAGE_LIMIT option and then define exactly what values can be
specified for it in the section for that below.

Also, fix the incorrect ordering of vacuumdb options both in the documents
and in vacuumdb's --help output.  These should be in alphabetical order.

In passing also add the minimum/maximum range for the BUFFER_USAGE_LIMIT
option.  These will also serve as example values that can be modified and
used.

Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/16845cb1-b228-e157-f293-5892bced9253@enterprisedb.com
2023-04-16 12:05:34 +12:00
Tom Lane 064eb89e83 Fix assignment to array of domain over composite, redux.
Commit 3e310d837 taught isAssignmentIndirectionExpr() to look through
CoerceToDomain nodes.  That's not sufficient, because since commit
04fe805a1 it's been possible for the planner to simplify
CoerceToDomain to RelabelType when the domain has no constraints
to enforce.  So we need to look through RelabelType too.

Per bug #17897 from Alexander Lakhin.  Although 3e310d837 was
back-patched to v11, it seems sufficient to apply this change
to v12 and later, since 04fe805a1 came in in v12.

Dmitry Dolgov

Discussion: https://postgr.es/m/17897-4216c546c3874044@postgresql.org
2023-04-15 12:01:39 -04:00
David Rowley 414d66220a Adjust Valgrind macro usage to protect chunk headers
Prior to this commit we only ever protected MemoryChunk's requested_size
field with Valgrind NOACCESS.  This means that if the hdrmask field is
ever accessed accidentally then we're not going to get any warnings from
Valgrind about it.  Valgrind would have warned us about the problem fixed
in 92957ed98 had we already been doing this.

Per suggestion from Tom Lane

Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/1650235.1672694719@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAApHDvr=FZNGbj252Z6M9BSFKoq6BMxgkQ2yEAGUYoo7RquqZg@mail.gmail.com
2023-04-15 11:59:52 +12:00
Andres Freund 43a33ef54e Support RBM_ZERO_AND_CLEANUP_LOCK in ExtendBufferedRelTo(), add tests
For some reason I had not implemented RBM_ZERO_AND_CLEANUP_LOCK support in
ExtendBufferedRelTo(), likely thinking it not being reachable. But it is
reachable, e.g. when replaying a WAL record for a page in a relation that
subsequently is truncated (likely only reachable when doing crash recovery or
PITR, not during ongoing streaming replication).

As now all of the RBM_* modes are supported, remove assertions checking mode.

As we had no test coverage for this scenario, add a new TAP test. There's
plenty more that ought to be tested in this area...

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/392271.1681238924%40sss.pgh.pa.us
Discussion: https://postgr.es/m/0b5eb82b-cb99-e0a4-b932-3dc60e2e3926@gmail.com
2023-04-14 11:30:33 -07:00
Tom Lane e4d905f772 NULL is not an ideal way to spell bool "false".
Thinko in commit 6633cfb21, detected by buildfarm member hamerkop.
2023-04-14 13:31:51 -04:00
Peter Eisentraut 12829058c4 Fix whitespace 2023-04-14 10:04:57 +02:00
Peter Eisentraut 99322d6eee Add missing source files to nls.mk 2023-04-14 09:56:04 +02:00
David Rowley e0693faf79 Fix incorrect partition pruning logic for boolean partitioned tables
The partition pruning logic assumed that "b IS NOT true" was exactly the
same as "b IS FALSE".  This is not the case when considering NULL values.
Fix this so we correctly include any partition which could hold NULL
values for the NOT case.

Additionally, this fixes a bug in the partition pruning code which handles
partitioned tables partitioned like ((NOT boolcol)).  This is a seemingly
unlikely schema design, and it was untested and also broken.

Here we add tests for the ((NOT boolcol)) case and insert some actual data
into those tables and verify we do get the correct rows back when running
queries.  I've also adjusted the existing boolpart tests to include some
data and verify we get the correct results too.

Both the bugs being fixed here could lead to incorrect query results with
fewer rows being returned than expected.  No additional rows could have
been returned accidentally.

In passing, remove needless ternary expression.  It's more simple just to
pass !is_not_clause to makeBoolConst().  It makes sense to do this so the
code is consistent with the bug fix in the "else if" condition just below.

David Kimura did submit a patch to fix the first of the issues here, but
that's not what's being committed here.

Reported-by: David Kimura
Reviewed-by: Richard Guo, David Kimura
Discussion: https://postgr.es/m/CAHnPFjQ5qxs6J_p+g8=ww7GQvfn71_JE+Tygj0S7RdRci1uwPw@mail.gmail.com
Backpatch-through: 11, all supported versions
2023-04-14 16:20:27 +12:00
Thomas Munro 558c9d75fe Fix PHJ match bit initialization.
Hash join tuples reuse the HOT status bit to indicate match status
during hash join execution. Correct reuse requires clearing the bit in
all tuples. Serial hash join and parallel multi-batch hash join do so
upon inserting the tuple into the hashtable. Single batch parallel hash
join and batch 0 of unexpected multi-batch hash joins forgot to do this.

It hadn't come up before because hashtable tuple match bits are only
used for right and full outer joins and parallel ROJ and FOJ were
unsupported. 11c2d6fdf5 introduced support for parallel ROJ/FOJ but
neglected to ensure the match bits were reset.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/flat/CAMbWs48Nde1Mv%3DBJv6_vXmRKHMuHZm2Q_g4F6Z3_pn%2B3EV6BGQ%40mail.gmail.com
2023-04-14 11:02:38 +12:00
Michael Paquier a282697088 Remove code in charge of freeing regexps generation by Lab.c
bea3d7e has redesigned the regexp engine so as all the allocations go
through palloc() with a dedicated memory context.  hba.c had to cope
with the past memory management logic by going through all the HBA and
ident lines generated, then directly free all the regexps found in
AuthTokens to ensure that no leaks would happen.  Such leaks could
happen for example in the postmaster after a SIGHUP, in the event of
an HBA and/or ident reload failure where all the new content parsed must
be discarded, including all the regexps that may have been compiled.

Now that regexps are palloc()'d in their own memory context,
MemoryContextDelete() is enough to ensure that all the compiled regexps
are properly gone.  Simplifying this logic in hba.c has the effect to
only remove code.  Most of it is new in v16, except the part for regexps
compiled in ident entries for the system username, so doing this cleanup
now rather than when v17 opens for business will reduce future diffs
with the upcoming REL_16_STABLE.

Some comments were incorrect since bea3d7e, now fixed to reflect the
reality.

Reviewed-by: Bertrand Drouvot, Álvaro Herrera
Discussion: https://postgr.es/m/ZDdJ289Ky2qEj4h+@paquier.xyz
2023-04-14 07:27:44 +09:00
David Rowley 0981846b9c Remove old GUC name mapping for "force_parallel_mode"
This GUC was renamed to debug_parallel_query in 5352ca22e.  That commit
added an entry into map_old_guc_names[] to allow the old name still to
work.  That was done to allow a transition time where the buildfarm
configs could be swapped over to use debug_parallel_query instead.  That
work is now complete.

Here we remove the old name with the intention of breaking any user code
which is using force_parallel_query.  As mentioned in the commit message
for 5352ca22e, it appeared many users were misled into thinking that
setting this GUC was doing something useful for them to make queries run
more quickly.

Discussion: https://postgr.es/m/CAApHDvoR7EOz7Tvyzrd18FO-Dw2Cp4Uyq25TEWguK+XyCJtzOw@mail.gmail.com
2023-04-14 10:19:45 +12:00
Peter Eisentraut 83c4706658 Update Unicode data to CLDR 43
No actual changes result.
2023-04-13 22:10:08 +02:00
Peter Geoghegan d6f0f95a6b Harmonize some more function parameter names.
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places.  These
inconsistencies were all introduced relatively recently, after the code
base had parameter name mismatches fixed in bulk (see commits starting
with commits 4274dc22 and 035ce1fe).

pg_bsd_indent still has a couple of similar inconsistencies, which I
(pgeoghegan) have left untouched for now.

Like all earlier commits that cleaned up function parameter names, this
commit was written with help from clang-tidy.
2023-04-13 10:15:20 -07:00
Stephen Frost f7431bca8b Explicitly require MIT Kerberos for GSSAPI
WHen building with GSSAPI support, explicitly require MIT Kerberos and
check for gssapi_ext.h in configure.ac and meson.build.  Also add
documentation explicitly stating that we now require MIT Kerberos when
building with GSSAPI support.

Reveiwed by: Johnathan Katz
Discussion: https://postgr.es/m/abcc73d0-acf7-6896-e0dc-f5bc12a61bb1@postgresql.org
2023-04-13 08:55:13 -04:00
Stephen Frost 6633cfb216 De-Revert "Add support for Kerberos credential delegation"
This reverts commit 3d03b24c3 (Revert Add support for Kerberos
credential delegation) which was committed on the grounds of concern
about portability, but on further review and discussion, it's clear that
we are better off explicitly requiring MIT Kerberos as that appears to
be the only GSSAPI library currently that's under proper maintenance
and ongoing development.  The API used for storing credentials was added
to MIT Kerberos over a decade ago while for the other libraries which
appear to be mainly based on Heimdal, which exists explicitly to be a
re-implementation of MIT Kerberos, the API never made it to a released
version (even though it was added to the Heimdal git repo over 5 years
ago..).

This post-feature-freeze change was approved by the RMT.

Discussion: https://postgr.es/m/ZDDO6jaESKaBgej0%40tamriel.snowman.net
2023-04-13 08:55:07 -04:00
Thomas Munro 6ca8df2d61 Skip the 004_io_direct.pl test if a pre-flight check fails.
The test previously had a list of OSes that direct I/O was expected to
work on.  That worked well enough for the systems in our build farm, but
didn't survive contact with the Debian build bots running on tmpfs via
overlayfs.  tmpfs does not support O_DIRECT, but we don't want to
exclude Linux generally.

The new approach is to try to create an empty file with O_DIRECT from
Perl first.  If that fails, we'll skip the test and report what the
error was.

Reported-by: Christoph Berg <myon@debian.org>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/ZDYd4A78cT2ULxZZ%40msg.df7cb.de
2023-04-13 13:57:46 +12:00
Thomas Munro b37d051b0e Remove overzealous assertion from PHJ.
We can't assert that we're the only process attached to a barrier after
BarrierArriveAndDetachExceptLast().  Although that'll be true almost
always, a late-starting parallel worker can attach very briefly (that
is, immediately detach after checking the phase) right at that moment.
BarrierArriveAndDetachExceptLast() already contains an assertion like
that, but it holds a spinlock preventing the race.  This thinko caused a
one-off failure on build farm animal chimaera.

Diagnosed-by: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3590249.1680971629@sss.pgh.pa.us
2023-04-13 09:37:54 +12:00
Andres Freund 5ec69b71f1 Improve error messages introduced in be87200efd and 0fdab27ad6
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20230411.120301.93333867350615278.horikyota.ntt@gmail.com
Discussion: https://postgr.es/m/20230412174244.6njadz4uoiez3l74@awork3.anarazel.de
2023-04-12 11:00:37 -07:00
Alvaro Herrera 9ce04b50e1
Revert "Catalog NOT NULL constraints" and fallout
This reverts commit e056c557ae and minor later fixes thereof.

There's a few problems in this new feature -- most notably regarding
pg_upgrade behavior, but others as well.  This new feature is not in any
way critical on its own, so instead of scrambling to fix it we revert it
and try again in early 17 with these issues in mind.

Discussion: https://postgr.es/m/3801207.1681057430@sss.pgh.pa.us
2023-04-12 19:29:21 +02:00
Tom Lane 88ceac5d77 Fix parallel-safety marking when moving initplans to another node.
Our policy since commit ab77a5a45 has been that a plan node having
any initplans is automatically not parallel-safe.  (This could be
relaxed, but not today.)  clean_up_removed_plan_level neglected
this, and could attach initplans to a parallel-safe child plan
node without clearing the plan's parallel-safe flag.  That could
lead to "subplan was not initialized" errors at runtime, in case
an initplan referenced another one and only the referencing one
got transmitted to parallel workers.

The fix in clean_up_removed_plan_level is trivial enough.
materialize_finished_plan also moves initplans from one node
to another, but it's okay because it already copies the source
node's parallel_safe flag.  The other place that does this kind
of thing is standard_planner's hack to inject a top-level Gather
when debug_parallel_query is active.  But that's actually dead
code given that we're correctly enforcing the "initplans aren't
parallel safe" rule, so just replace it with an Assert that
there are no initplans.

Also improve some related comments.

Normally we'd add a regression test case for this sort of bug.
The mistake itself is already reached by existing tests, but there
is accidentally no visible problem.  The only known test case that
creates an actual failure seems too indirect and fragile to justify
keeping it as a regression test (not least because it fails to fail
in v11, though the bug is clearly present there too).

Per report from Justin Pryzby.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/ZDVt6MaNWkRDO1LQ@telsasoft.com
2023-04-12 10:46:38 -04:00
Peter Eisentraut 5f38a2034e Fix incorrect format placeholders 2023-04-12 10:05:50 +02:00
Michael Paquier a923e21631 Fix detection of unseekable files for fseek() and ftello() with MSVC
Calling fseek() or ftello() on a handle to a non-seeking device such as
a pipe or a communications device is not supported.  Unfortunately,
MSVC's flavor of these routines, _fseeki64() and _ftelli64(), do not
return an error when given a pipe as handle.  Some of the logic of
pg_dump and restore relies on these routines to check if a handle is
seekable, causing failures when passing the contents of pg_dump to
pg_restore through a pipe, for example.

This commit introduces wrappers for fseeko() and ftello() on MSVC so as
any callers are able to properly detect the cases of non-seekable
handles.  This relies mainly on GetFileType(), sharing a bit of code
with the MSVC port for fstat().  The code in charge of getting a file
type is refactored into a new file called win32common.c, shared by
win32stat.c and the new win32fseek.c.  It includes the MSVC ports for
fseeko() and ftello().

Like 765f5df, this is backpatched down to 14, where the fstat()
implementation for MSVC is able to understand about files larger than
4GB in size.  Using a TAP test for that is proving to be tricky as
IPC::Run handles the pipes by itself, still I have been able to check
the fix manually.

Reported-by: Daniel Watzinger
Author: Juan José Santamaría Flecha, Michael Paquier
Discussion: https://postgr.es/m/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com
Backpatch-through: 14
2023-04-12 09:09:38 +09:00
Peter Geoghegan c03c2eae0a Refine the guidelines for rmgrdesc authors.
Clarify the goals of the recently added guidelines for rmgrdesc authors:
to avoid gratuitous inconsistencies across resource managers, and to
make it reasonably easy to write a reusable custom parser.

Beyond that, the guidelines leave rmgrdesc authors with a significant
amount of leeway.  This even includes the leeway to invent custom
conventions (in cases where it's warranted).

Follow-up to commit 7d8219a4.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkbYuvwYKm-Y-72QEh6SPMQcAo9uONv+mR3bMGcu9E_Cg@mail.gmail.com
2023-04-11 15:26:24 -07:00
Peter Geoghegan 96149a180d Fix Heap rmgr's desc output for infobits arrays.
Make heap desc routines that output status bit as arrays of constants
avoid outputting array literals that contain superfluous punctuation
characters that complicate parsing the output.  Also make sure that no
heap desc routine repeats the same key name (at the same nesting level),
for the same reason.  Arguably, these were both oversights in commit
7d8219a4.

In passing, make the desc output code (which covers Heap's DELETE,
UPDATE, HOT_UPDATE, LOCK, and LOCK_UPDATED record types) consistent in
terms of the output order of each field.  This order also matches WAL
record struct order.  Heap's DELETE desc output now shows the record's
xmax field for the first time (just like UPDATE/HOT_UPDATE records).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=pNYtxiJ2Jx5Lj=fKo1OEZ4GE0p_kct+ugAUTqBwU46g@mail.gmail.com
2023-04-11 15:25:02 -07:00
Peter Geoghegan e944063294 Fix xl_heap_lock WAL record field's data type.
Make xl_heap_lock's infobits_set field of type uint8, not int8.  Using
int8 isn't appropriate given that the field just holds status bits.
This fixes an oversight in commit 0ac5ad5134.

In passing rename the nearby TransactionId field to "xmax" to make
things consistency with related records, such as xl_heap_lock_updated.

Deliberately avoid a bump in XLOG_PAGE_MAGIC.  No backpatch, either.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkCd3kOS8b7Rfxw7Mh1_6jvX=Nzo-CWR1VBTiOtVZkWHA@mail.gmail.com
2023-04-11 14:07:54 -07:00
Andres Freund 57411c82ce 035_standby_logical_decoding: Add missing waits for replication
At least one slow buildfarm system (hoverfly) showed that the database
creation was not replicated before we try to create logical replication slots
on the standby, in that database.

Reported-by: Noah Misch <noah@leadboat.com>
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/20230411053657.GA1177147@rfd.leadboat.com
2023-04-11 11:17:36 -07:00
David Rowley 4c8a1b4769 Fix uninitialized variable in transformTableLikeClause()
process_notnull_constraints should be set to false until we discover a NOT
NULL column.

Discovered while running Valgrind.

Discussion: https://postgr.es/m/CAApHDvoMyiZVi1KW5WVdqMRzWsWkD3F7n6QD+BbAO6WTeAWsUQ@mail.gmail.com
2023-04-11 23:01:12 +12:00
David Rowley 68a2a437f4 Improve ereports for VACUUM's BUFFER_USAGE_LIMIT option
There's no need to check if opt->arg is NULL since defGetString() already
does that and raises an ERROR if it is.  Let's just remove that check.

Also, combine the two remaining ERRORs into a single check.  It seems
better to give an indication about what sort of values we're looking for
rather than just to state that the value given isn't valid.  Make
BUFFER_USAGE_LIMIT uppercase in this ERROR message too.  It's already
upper case in one other error message, so make that consistent.

Reported-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20230411.102335.1643720544536884844.horikyota.ntt@gmail.com
2023-04-11 19:36:34 +12:00
Peter Geoghegan 26e65ebdb2 Clarify nbtree posting list update desc issue.
Per complaint from Melanie Plageman.

Follow-up to commit 5d6728e5.

Reported-By: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20230411002315.oyaicmcqrq2hb3ek@liskov
2023-04-10 17:55:23 -07:00
Peter Geoghegan 5d6728e588 Fix nbtree posting list update desc output.
We cannot use the generic array_desc approach with per-tuple nbtree
posting list update metadata because array_desc can only deal with fixed
width elements (e.g., page offset numbers).  Using array_desc led to
incorrect rmgr descriptions for updates from nbtree DELETE/VACUUM WAL
records.

To fix, add specialized code to describe the update metadata as array
elements in desc output.  We now iterate over the update metadata using
an approach that matches related REDO routines.

Also stop showing the updates offset number array separately in nbtree
DELETE/VACUUM desc output.  It's redundant information, since the same
page offset numbers appear in the description of each individual update
element.  Also make some small tweaks to the way that we format arrays
in all desc routines (not just nbtree desc routines) to make arrays a
little less verbose.

Oversight in commit 1c453cfd, which enhanced the nbtree rmgr desc
routines.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkbYuvwYKm-Y-72QEh6SPMQcAo9uONv+mR3bMGcu9E_Cg@mail.gmail.com
2023-04-10 11:15:41 -07:00
Daniel Gustafsson 6ff2e8cdd4 Simplify version check for SKIP clause
Checking for the required versions of IO::Pty as well as IPC::Run
can be achieved with a single eval call, and by using the VERSION
function the comparison is guaranteed to follow the same rules as
calling 'use' on the module with a version.

Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/6d880ea2-f8ca-f458-4dcd-a7a3e6d6cd7c@dunslane.net
2023-04-08 23:32:11 +02:00
Thomas Munro 980e8879f5 Use higher wal_level for 004_io_direct.pl.
The new direct I/O test deliberately uses a very small shared_buffers to
force some disk transfers without making the data set large and slow,
but ran into a problem with wal_level = minimal: log_newpage_range()
pins many buffers, leading to a few intermittent "no unpinned buffers
available" errors.

We could presumably fix that by adjusting shared_buffers, but crake
seems to be trying to tell us something interesting with these settings,
so let's just avoid wal_level = minimal in this test for now.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230408060408.n7xdwk3mxj5oykt6%40awork3.anarazel.de
2023-04-09 08:27:36 +12:00
Tom Lane 064750af4f Improve indentation of multiline initialization expressions.
If a variable has an initialization expression that wraps onto the
next line(s), pg_bsd_indent will now indent the continuation lines
one stop, instead of aligning them flush with the variable declaration.

We've been holding off applying this until the last v16 CF finished,
but now it's time.

Thomas Munro and Tom Lane

Discussion: https://postgr.es/m/20230120013137.7ky7nl4e4zjorrfa@awork3.anarazel.de
2023-04-08 11:48:45 -04:00
Andrew Dunstan bbec50de16 Try to unbreak MSVC builds for pg_waldump
remedy an omission in commit 7d8219a444
2023-04-08 11:21:53 -04:00
Tom Lane 07690aab46 Suppress bogus printout during new 035_standby_logical_decoding.pl test.
Our convention for some time has been that successful tests shouldn't
print anything on stderr.  A stray "diag" call violated that, and
for that matter messed up the normal TAP progress display.
2023-04-08 10:50:46 -04:00
Daniel Gustafsson 2e57ffe12f Skip \password TAP test on old IPC::Run versions
IPC::Run versions prior to 0.98 cause the interactive session to time out,
so SKIP the test in case these versions are detected (they are within the
base requirement for our TAP tests in general).  Error reported by the BF
and investigation by Tom Lane.

Discussion: https://postgr.es/m/414A86BD-986B-48A7-A1E4-EEBCE5AF08CB@yesql.se
2023-04-08 15:51:45 +02:00
Andrew Dunstan 0e9b271890 Try to unbreak MSVC builds for fuzzystrmatch
Commit a290378a37 neglrected to add a recipe for MSVC to build the
daitch_motokoff.h file.

Per buildfarm animal bowerbird.
2023-04-08 08:28:15 -04:00
Stephen Frost 3d03b24c35 Revert "Add support for Kerberos credential delegation"
This reverts commit 3d4fa227bc.

Per discussion and buildfarm, this depends on APIs that seem to not
be available on at least one platform (NetBSD).  Should be certainly
possible to rework to be optional on that platform if necessary but bit
late for that at this point.

Discussion: https://postgr.es/m/3286097.1680922218@sss.pgh.pa.us
2023-04-08 07:21:35 -04:00
Thomas Munro db4f21e4a3 Redesign interrupt/cancel API for regex engine.
Previously, a PostgreSQL-specific callback checked by the regex engine
had a way to trigger a special error code REG_CANCEL if it detected that
the next call to CHECK_FOR_INTERRUPTS() would certainly throw via
ereport().

A later proposed bugfix aims to move some complex logic out of signal
handlers, so that it won't run until the next CHECK_FOR_INTERRUPTS(),
which makes the above design impossible unless we split
CHECK_FOR_INTERRUPTS() into two phases, one to run logic and another to
ereport().  We may develop such a system in the future, but for the
regex code it is no longer necessary.

An earlier commit moved regex memory management over to our
MemoryContext system.  Given that the purpose of the two-phase interrupt
checking was to free memory before throwing, something we don't need to
worry about anymore, it seems simpler to inject CHECK_FOR_INTERRUPTS()
directly into cancelation points, and just let it throw.

Since the plan is to keep PostgreSQL-specific concerns separate from the
main regex engine code (with a view to bein able to stay in sync with
other projects), do this with a new macro INTERRUPT(), customizable in
regcustom.h and defaulting to nothing.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
2023-04-08 22:10:39 +12:00
Thomas Munro 4f51429dd7 Update tsearch regex memory management.
Now that our regex engine uses palloc(), it's not necessary to set up a
special memory context callback to free compiled regexes.  The regex has
no resources other than the memory that is already going to be freed in
bulk.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
2023-04-08 22:09:17 +12:00
Thomas Munro bea3d7e383 Use MemoryContext API for regex memory management.
Previously, regex_t objects' memory was managed with malloc() and free()
directly.  Switch to palloc()-based memory management instead.
Advantages:

 * memory used by cached regexes is now visible with MemoryContext
   observability tools

 * cleanup can be done automatically in certain failure modes
   (something that later commits will take advantage of)

 * cleanup can be done in bulk

On the downside, there may be more fragmentation (wasted memory) due to
per-regex MemoryContext objects.  This is a problem shared with other
cached objects in PostgreSQL and can probably be improved with later
tuning.

Thanks to Noah Misch for suggesting this general approach, which
unblocks later work on interrupts.

Suggested-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
2023-04-08 22:08:41 +12:00
Andres Freund fcd77d5321 TAP test for logical decoding on standby
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Amit Khandekar <amitdkhan.pg@gmail.com>
Author: Craig Ringer <craig@2ndquadrant.com> (in an older version)
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
2023-04-08 02:24:50 -07:00
Andres Freund 0fdab27ad6 Allow logical decoding on standbys
Unsurprisingly, this requires wal_level = logical to be set on the primary and
standby. The infrastructure added in 26669757b6 ensures that slots are
invalidated if the primary's wal_level is lowered.

Creating a slot on a standby waits for a xl_running_xact record to be
processed. If the primary is idle (and thus not emitting xl_running_xact
records), that can take a while.  To make that faster, this commit also
introduces the pg_log_standby_snapshot() function. By executing it on the
primary, completion of slot creation on the standby can be accelerated.

Note that logical decoding on a standby does not itself enforce that required
catalog rows are not removed. The user has to use physical replication slots +
hot_standby_feedback or other measures to prevent that. If catalog rows
required for a slot are removed, the slot is invalidated.

See 6af1793954 for an overall design of logical decoding on a standby.

Bumps catversion, for the addition of the pg_log_standby_snapshot() function.

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de> (in an older version)
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: FabrÌzio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
2023-04-08 02:20:05 -07:00
Andres Freund e101dfac3a For cascading replication, wake physical and logical walsenders separately
Physical walsenders can't send data until it's been flushed; logical
walsenders can't decode and send data until it's been applied. On the
standby, the WAL is flushed first, which will only wake up physical
walsenders; and then applied, which will only wake up logical
walsenders.

Previously, all walsenders were awakened when the WAL was flushed. That
was fine for logical walsenders on the primary; but on the standby the
flushed WAL would have been not applied yet, so logical walsenders were
awakened too early.

Per idea from Jeff Davis and Amit Kapila.

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1+zO5LUeisabX10c81LU-fWMKO4M9Wyg1cdkbW7Hqh6vQ@mail.gmail.com
2023-04-08 01:06:00 -07:00
Andres Freund 26669757b6 Handle logical slot conflicts on standby
During WAL replay on the standby, when a conflict with a logical slot is
identified, invalidate such slots. There are two sources of conflicts:
1) Using the information added in 6af1793954, logical slots are invalidated if
   required rows are removed
2) wal_level on the primary server is reduced to below logical

Uses the infrastructure introduced in the prior commit. FIXME: add commit
reference.

Change InvalidatePossiblyObsoleteSlot() to use a recovery conflict to
interrupt use of a slot, if called in the startup process. The new recovery
conflict is added to pg_stat_database_conflicts, as confl_active_logicalslot.

See 6af1793954 for an overall design of logical decoding on a standby.

Bumps catversion for the addition of the pg_stat_database_conflicts column.
Bumps PGSTAT_FILE_FORMAT_ID for the same reason.

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
2023-04-08 00:05:44 -07:00
Andres Freund be87200efd Support invalidating replication slots due to horizon and wal_level
Needed for logical decoding on a standby. Slots need to be invalidated because
of the horizon if rows required for logical decoding are removed. If the
primary's wal_level is lowered from 'logical', logical slots on the standby
need to be invalidated.

The new invalidation methods will be used in a subsequent commit.

Logical slots that have been invalidated can be identified via the new
pg_replication_slots.conflicting column.

See 6af1793954 for an overall design of logical decoding on a standby.

Bumps catversion for the addition of the new pg_replication_slots column.

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
2023-04-07 22:40:27 -07:00
Andres Freund 2ed16aacf1 Fix underspecified sort order in inherit.sql
Introduced in e056c557ae.

Per buildfarm member prion.
2023-04-07 22:25:46 -07:00
Andres Freund 4397abd0a2 Prevent use of invalidated logical slot in CreateDecodingContext()
Previously we had checks for this in multiple places. Support for logical
decoding on standbys will add other forms of invalidation, making it worth
while to centralize the checks.

This slightly changes the error message for both the walsender and SQL
interface. Particularly the SQL interface error was inaccurate, as the "This
slot has never previously reserved WAL" portion was unreachable.

Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
2023-04-07 22:19:05 -07:00
Andres Freund 15f8203a59 Replace replication slot's invalidated_at LSN with an enum
This is mainly useful because the upcoming logical-decoding-on-standby feature
adds further reasons for invalidating slots, and we don't want to end up with
multiple invalidated_* fields, or check different attributes.

Eventually we should consider not resetting restart_lsn when invalidating a
slot due to max_slot_wal_keep_size. But that's a user visible change, so left
for later.

Increases SLOT_VERSION, due to the changed field (with a different alignment,
no less).

Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
2023-04-07 21:47:25 -07:00
Thomas Munro d4e71df6d7 Add io_direct setting (developer-only).
Provide a way to ask the kernel to use O_DIRECT (or local equivalent)
where available for data and WAL files, to avoid or minimize kernel
caching.  This hurts performance currently and is not intended for end
users yet.  Later proposed work would introduce our own I/O clustering,
read-ahead, etc to replace the facilities the kernel disables with this
option.

The only user-visible change, if the developer-only GUC is not used, is
that this commit also removes the obscure logic that would activate
O_DIRECT for the WAL when wal_sync_method=open_[data]sync and
wal_level=minimal (which also requires max_wal_senders=0).  Those are
non-default and unlikely settings, and this behavior wasn't (correctly)
documented.  The same effect can be achieved with io_direct=wal.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg%40mail.gmail.com
2023-04-08 16:35:07 +12:00
Thomas Munro faeedbcefd Introduce PG_IO_ALIGN_SIZE and align all I/O buffers.
In order to have the option to use O_DIRECT/FILE_FLAG_NO_BUFFERING in a
later commit, we need the addresses of user space buffers to be well
aligned.  The exact requirements vary by OS and file system (typically
sectors and/or memory pages).  The address alignment size is set to
4096, which is enough for currently known systems: it matches modern
sectors and common memory page size.  There is no standard governing
O_DIRECT's requirements so we might eventually have to reconsider this
with more information from the field or future systems.

Aligning I/O buffers on memory pages is also known to improve regular
buffered I/O performance.

Three classes of I/O buffers for regular data pages are adjusted:
(1) Heap buffers are now allocated with the new palloc_aligned() or
MemoryContextAllocAligned() functions introduced by commit 439f6175.
(2) Stack buffers now use a new struct PGIOAlignedBlock to respect
PG_IO_ALIGN_SIZE, if possible with this compiler.  (3) The buffer
pool is also aligned in shared memory.

WAL buffers were already aligned on XLOG_BLCKSZ.  It's possible for
XLOG_BLCKSZ to be configured smaller than PG_IO_ALIGNED_SIZE and thus
for O_DIRECT WAL writes to fail to be well aligned, but that's a
pre-existing condition and will be addressed by a later commit.

BufFiles are not yet addressed (there's no current plan to use O_DIRECT
for those, but they could potentially get some incidental speedup even
in plain buffered I/O operations through better alignment).

If we can't align stack objects suitably using the compiler extensions
we know about, we disable the use of O_DIRECT by setting PG_O_DIRECT to
0.  This avoids the need to consider systems that have O_DIRECT but
can't align stack objects the way we want; such systems could in theory
be supported with more work but we don't currently know of any such
machines, so it's easier to pretend there is no O_DIRECT support
instead.  That's an existing and tested class of system.

Add assertions that all buffers passed into smgrread(), smgrwrite() and
smgrextend() are correctly aligned, unless PG_O_DIRECT is 0 (= stack
alignment tricks may be unavailable) or the block size has been set too
small to allow arrays of buffers to be all aligned.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CA+hUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg@mail.gmail.com
2023-04-08 16:34:50 +12:00
Tom Lane db6957bae8 Add missing .gitignore entry.
Seems an oversight in 7d8219a44.  Fix before somebody commits
a generated file.
2023-04-07 23:32:49 -04:00
Stephen Frost 3d4fa227bc Add support for Kerberos credential delegation
Support GSSAPI/Kerberos credentials being delegated to the server by a
client.  With this, a user authenticating to PostgreSQL using Kerberos
(GSSAPI) credentials can choose to delegate their credentials to the
PostgreSQL server (which can choose to accept them, or not), allowing
the server to then use those delegated credentials to connect to
another service, such as with postgres_fdw or dblink or theoretically
any other service which is able to be authenticated using Kerberos.

Both postgres_fdw and dblink are changed to allow non-superuser
password-less connections but only when GSSAPI credentials have been
delegated to the server by the client and GSSAPI is used to
authenticate to the remote system.

Authors: Stephen Frost, Peifeng Qiu
Reviewed-By: David Christensen
Discussion: https://postgr.es/m/CO1PR05MB8023CC2CB575E0FAAD7DF4F8A8E29@CO1PR05MB8023.namprd05.prod.outlook.com
2023-04-07 21:58:04 -04:00
Andres Freund ac8d53dae5 Track IO times in pg_stat_io
a9c70b46db and 8aaa04b32S added counting of IO operations to a new view,
pg_stat_io. Now, add IO timing for reads, writes, extends, and fsyncs to
pg_stat_io as well.

This combines the tracking for pgBufferUsage with the tracking for pg_stat_io
into a new function pgstat_count_io_op_time(). This should make it a bit
easier to avoid the somewhat costly instr_time conversion done for
pgBufferUsage.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ay5iKmnbXZ3DsauViF3eMxu4m1oNnJXqV_HyqYeg55Ww%40mail.gmail.com
2023-04-07 17:04:56 -07:00
Peter Geoghegan 1c453cfd89 Show more detail in nbtree rmgr descriptions.
Show a detailed description of the page offset number arrays that appear
in certain nbtree WAL records.

Also brings nbtree desc routines in line with the guidelines established
by recent commit 7d8219a4.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/flat/20230109215842.fktuhesvayno6o4g%40awork3.anarazel.de
2023-04-07 16:46:23 -07:00
Stephen Frost ce5e234085 For Kerberos testing, disable DNS lookups
Similar to 8dff2f224, this disables DNS lookups by the Kerberos library
to look up the KDC and the realm while the Kerberos tests are running.
In some environments, these lookups can take a long time and end up
timing out and causing tests to fail.  Further, since this isn't really
our domain, we shouldn't be sending out these DNS requests during our
tests.
2023-04-07 19:36:46 -04:00
Peter Geoghegan 7d8219a444 Show more detail in heapam rmgr descriptions.
Add helper functions that output arrays in a standard format, and use
the functions inside heapdesc routines.  This allows tools like
pg_walinspect to show a detailed description of the page offset number
arrays for records like PRUNE and VACUUM (unless there was an FPI).

Also document the conventions that desc routines should follow.  Only
the heapdesc routines follow the conventions for now, so they're just
guidelines for the time being.

Based on a suggestion from Andres Freund.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/flat/20230109215842.fktuhesvayno6o4g%40awork3.anarazel.de
2023-04-07 16:08:52 -07:00
Andres Freund 728015a470 Fix table name clash in recently introduced test
A few buildfarm animals recently started complaining about the "child"
relation already existing. e056c557ae added a new child table to inherit.sql,
but triggers.sql, running in the same parallel group, also uses a child table.

Rename the new table to inh_child. It maybe worth renaming child, parent in
other tests as well, but that's work for another day.

Discussion: https://postgr.es/m/20230407204530.52q3v5cu5x6dj676@awork3.anarazel.de
2023-04-07 14:02:46 -07:00
Andres Freund 704261ecc6 Improve IO accounting for temp relation writes
Both pgstat_database and pgBufferUsage count IO timing for reads of temporary
relation blocks into local buffers. However, both failed to count write IO
timing for flushes of dirty local buffers. Fix.

Additionally, FlushRelationBuffers() seems to have omitted counting write
IO (both count and timing) stats for both pgstat_database and
pgBufferUsage. Fix.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230321023451.7rzy4kjj2iktrg2r%40awork3.anarazel.de
2023-04-07 13:24:26 -07:00
Daniel Gustafsson bf5a894c55 Test SCRAM iteration changes with psql \password
A version of this test was included in the original patch for altering
SCRAM iteration count, but was omitted due to how interactive psql TAP
sessions worked before being refactored.

Discussion: https://postgr.es/m/20230130194350.zj5v467x4jgqt3d6@awork3.anarazel.de
Discussion: https://postgr.es/m/F72E7BC7-189F-4B17-BF47-9735EB72C364@yesql.se
2023-04-07 22:14:23 +02:00
Daniel Gustafsson 664d757531 Refactor background psql TAP functions
This breaks out the background and interactive psql functionality into a
new class, PostgreSQL::Test::BackgroundPsql.  Sessions are still initiated
via PostgreSQL::Test::Cluster, but once started they can be manipulated by
the new helper functions which intend to make querying easier.  A sample
session for a command which can be expected to finish at a later time can
be seen below.

  my $session = $node->background_psql('postgres');
  $bsession->query_until(qr/start/, q(
    \echo start
	CREATE INDEX CONCURRENTLY idx ON t(a);
  ));
  $bsession->quit;

Patch by Andres Freund with some additional hacking by me.

Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/20230130194350.zj5v467x4jgqt3d6@awork3.anarazel.de
2023-04-07 22:14:20 +02:00
Alvaro Herrera 32bc0d022d
Fix underspecified sort order in test query
Fail in e056c557ae.
2023-04-07 20:30:04 +02:00
Alvaro Herrera e056c557ae
Catalog NOT NULL constraints
We now create pg_constaint rows for NOT NULL constraints with
contype='n'.

We propagate these constraints during operations such as adding
inheritance relationships, creating and attaching partitions, creating
tables LIKE other tables.  We mostly follow the well-known rules of
conislocal and coninhcount that we have for CHECK constraints, with some
adaptations; for example, as opposed to CHECK constraints, we don't
match NOT NULL ones by name when descending a hierarchy to alter it;
instead we match by column number.  This means we don't require the
constraint names to be identical across a hierarchy.

For now, we omit them from system catalogs.  Maybe this is worth
reconsidering.  We don't support NOT VALID nor DEFERRABLE clauses
either; these can be added as separate features later (this patch is
already large and complicated enough.)

This has been very long in the making.  The first patch was written by
Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'),
which I (Álvaro) then hijacked in 2011 and 2012, until that one was
killed by the realization that we ought to use contype='c' instead:
manufactured CHECK constraints.  However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again.

In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring additional pg_attribute columns to
track the OID of the NOT NULL constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>

Discussion: https://postgr.es/m/CACA0E642A0267EDA387AF2B%40%5B172.26.14.62%5D
Discussion: https://postgr.es/m/AANLkTinLXMOEMz+0J29tf1POokKi4XDkWJ6-DDR9BKgU@mail.gmail.com
Discussion: https://postgr.es/m/20110707213401.GA27098@alvh.no-ip.org
Discussion: https://postgr.es/m/1343682669-sup-2532@alvh.no-ip.org
Discussion: https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Discussion: https://postgr.es/m/20220817181249.q7qvj3okywctra3c@alvherre.pgsql
2023-04-07 19:59:57 +02:00
Tom Lane ff245a3788 Doc: improve descriptions of max_[pred_]locks_per_transaction GUCs.
The old wording described these as being multiplied by max_connections
plus max_prepared_transactions, which hasn't been exactly right for
some time thanks to the addition of various auxiliary processes.
Moreover, exactness here is a bit pointless given that the lock tables
can expand into the initially-unallocated "slop" space in shared
memory.  Rather than trying to track exactly what the code is doing,
let's just use the term "server processes".

Likewise adjust these GUCs' description strings in guc_tables.c.

Wang Wei, reviewed by Nathan Bossart and myself

Discussion: https://postgr.es/m/OS3PR01MB6275BDD09C9B875C65FCC5AB9EA39@OS3PR01MB6275.jpnprd01.prod.outlook.com
2023-04-07 13:29:29 -04:00
Tom Lane 888f2ea0a8 Add array_sample() and array_shuffle() functions.
These are useful in Monte Carlo applications.

Martin Kalcher, reviewed/adjusted by Daniel Gustafsson and myself

Discussion: https://postgr.es/m/9d160a44-7675-51e8-60cf-6d64b76db831@aboutsource.net
2023-04-07 11:47:07 -04:00
Tom Lane cd82e5c79d Fix locale-dependent test case.
psql parses the interval argument of \watch with locale-dependent
strtod().  In commit 00beecfe8 I added a test case that exercises
a fractional interval, but I hard-coded 0.01 which doesn't work
in locales where the radix point isn't ".".  We don't want to
change this longstanding parsing behavior, so fix the test case
to generate a suitably locale-aware spelling.

Report and patch by Alexander Korotkov.

Discussion: https://postgr.es/m/CAPpHfdv+10Uk6FWjsh3+ju7kHYr76LaRXbYayXmrM7FBU-=Hgg@mail.gmail.com
2023-04-07 10:35:11 -04:00
Andres Freund 21d7c05a5c Fix copy-paste bug in 12f3867f55 triggering an assert after a write error
The same condition accidentally was copied to both branches. Manual testing
confirms that otherwise the error recovery path works fine.

Found while reviewing the logical-decoding-on-standby patch.
2023-04-07 01:02:46 -07:00
Amit Kapila 96c498d2f8 Add tab-completion for newly added SUBSCRIPTION options.
Commits c3afe8cf5a and 482675987b added new subscription options
"password_required" and "run_as_owner". This patch adds tab-completion
for these newly added options.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pu=pnJf=SS1583pknSQ3CbOqLCkWcJCQYt6zxTagHEdmw@mail.gmail.com
2023-04-07 10:32:36 +05:30
Michael Paquier 8fcb32db98 Add more protections in WAL record APIs against overflows
This commit adds a limit to the size of an XLogRecord at 1020MB, based
on a suggestion by Heikki Linnakangas.  This counts for the overhead
needed by the XLogReader when allocating the memory it needs to read a
record in DecodeXLogRecordRequiredSpace(), based on the record size.  An
assertion based on that is added to detect that any additions in the
XLogReader facilities would not cause any overflows.  If that's ever the
case, the upper bound allowed would need to be adjusted.

Before this, it was possible for an external module to create WAL
records large enough to be assembled but not replayable, causing
failures when replaying such WAL records on standbys.  One case
mentioned where this is possible is the in-core function
pg_logical_emit_message() (wrapper for LogLogicalMessage), that allows
to emit WAL records with an arbitrary amount of data potentially higher
than the replay limit of approximately 1GB (limit of a palloc, minus the
overhead needed by a XLogReader).

This commit is a follow-up of ffd1b6b that has added similar protections
for the block-level data.  Here, the checks are extended to the whole
record length, mainrdata_len being extended from uint32 to uint64 with
the routines registering buffer and record data still limited to uint32
to minimize the checks when assembling a record.  All the error messages
related to overflow checks are improved to provide more context about
the error happening.

Author: Matthias van de Meent
Reviewed-by: Andres Freund, Heikki Linnakangas, Michael Paquier
Discussion: https://postgr.es/m/CAEze2WgGiw+LZt+vHf8tWqB_6VxeLsMeoAuod0N=ij1q17n5pw@mail.gmail.com
2023-04-07 10:10:17 +09:00
Andres Freund 26158b852d Use ExtendBufferedRelTo() in XLogReadBufferExtended()
Instead of extending the relation block-by-block, use ExtendBufferedRelTo(),
introduced in 31966b151e. This is faster and simpler.

This also somewhat reduces the danger that disconnected segments pose (which
can be "discovered" once the previous segment reaches SEGSIZE), as
ExtendBufferedRelTo() won't extend past the block it has been asked. However,
the risk of the content of such a disconnected segment being invalid
remains.

Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
Discussion: https://postgr.es/m/20230223010147.32oir7sb66slqnjk@awork3.anarazel.de
2023-04-06 17:56:17 -07:00
David Rowley ae78cae3be Add --buffer-usage-limit option to vacuumdb
1cbbee033 added BUFFER_USAGE_LIMIT to the VACUUM and ANALYZE commands, so
here we permit that option to be specified in vacuumdb.

In passing, adjust the documents for vacuum_buffer_usage_limit and the
BUFFER_USAGE_LIMIT VACUUM option to mention "kB" rather than "KB".  Do the
same for the ERROR message in ExecVacuum() and
check_vacuum_buffer_usage_limit().  Without that we might tell a user that
the valid minimum value is 128 KB only to reject that because we accept
only "kB" and not "KB".

Also, add a small reminder comment in vacuum.h to try to trigger the
memory of anyone adding new fields to VacuumParams that they might want to
consider if vacuumdb needs to grow a new option too.

Author: Melanie Plageman
Reviewed-by: Justin Pryzby
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/ZAzTg3iEnubscvbf@telsasoft.com
2023-04-07 12:47:10 +12:00
Andres Freund 00d1e02be2 hio: Use ExtendBufferedRelBy() to extend tables more efficiently
While we already had some form of bulk extension for relations, it was fairly
limited. It only amortized the cost of acquiring the extension lock, the
relation itself was still extended one-by-one. Bulk extension was also solely
triggered by contention, not by the amount of data inserted.

To address this, use ExtendBufferedRelBy(), introduced in 31966b151e, to
extend the relation. We try to extend the relation by multiple blocks in two
situations:

1) The caller tells RelationGetBufferForTuple() that it will need multiple
   pages. For now that's only used by heap_multi_insert(), see commit FIXME.

2) If there is contention on the extension lock, use the number of waiters for
   the lock as a multiplier for the number of blocks to extend by. This is
   similar to what we already did. Previously we additionally multiplied the
   numbers of waiters by 20, but with the new relation extension
   infrastructure I could not see a benefit in doing so.

Using the freespacemap to provide empty pages can cause significant
contention, and adds measurable overhead, even if there is no contention. To
reduce that, remember the blocks the relation was extended by in the
BulkInsertState, in the extending backend. In case 1) from above, the blocks
the extending backend needs are not entered into the FSM, as we know that we
will need those blocks.

One complication with using the FSM to record empty pages, is that we need to
insert blocks into the FSM, when we already hold a buffer content lock. To
avoid doing IO while holding a content lock, release the content lock before
recording free space. Currently that opens a small window in which another
backend could fill the block, if a concurrent VACUUM records the free
space. If that happens, we retry, similar to the already existing case when
otherBuffer is provided. In the future it might be worth closing the race by
preventing VACUUM from recording the space in newly extended pages.

This change provides very significant wins (3x at 16 clients, on my
workstation) for concurrent COPY into a single relation. Even single threaded
COPY is measurably faster, primarily due to not dirtying pages while
extending, if supported by the operating system (see commit 4d330a61bb). Even
single-row INSERTs benefit, although to a much smaller degree, as the relation
extension lock rarely is the primary bottleneck.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-06 16:53:17 -07:00
David Rowley 1cbbee0338 Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option
Add new options to the VACUUM and ANALYZE commands called
BUFFER_USAGE_LIMIT to allow users more control over how large to make the
buffer access strategy that is used to limit the usage of buffers in
shared buffers.  Larger rings can allow VACUUM to run more quickly but
have the drawback of VACUUM possibly evicting more buffers from shared
buffers that might be useful for other queries running on the database.

Here we also add a new GUC named vacuum_buffer_usage_limit which controls
how large to make the access strategy when it's not specified in the
VACUUM/ANALYZE command.  This defaults to 256KB, which is the same size as
the access strategy was prior to this change.  This setting also
controls how large to make the buffer access strategy for autovacuum.

Per idea by Andres Freund.

Author: Melanie Plageman
Reviewed-by: David Rowley
Reviewed-by: Andres Freund
Reviewed-by: Justin Pryzby
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb@awork3.anarazel.de
2023-04-07 11:40:31 +12:00
Andres Freund 5279e9db8e heapam: Pass number of required pages to RelationGetBufferForTuple()
A future commit will use this information to determine how aggressively to
extend the relation by. In heap_multi_insert() we know accurately how many
pages we need once we need to extend the relation, providing an accurate lower
bound for how much to extend.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-06 16:17:16 -07:00
Daniel Gustafsson 7d71d3dd08 Refresh cost-based delay params more frequently in autovacuum
Allow autovacuum to reload the config file more often so that cost-based
delay parameters can take effect while VACUUMing a relation. Previously,
autovacuum workers only reloaded the config file once per relation
vacuumed, so config changes could not take effect until beginning to
vacuum the next table.

Now, check if a reload is pending roughly once per block, when checking
if we need to delay.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() was called, workers
vacuuming tables with no cost-related storage parameters could still
have different values for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of
cost-related storage parameters). This removes the rationale for keeping
cost limit and cost delay in shared memory. Balancing the cost limit
requires only the number of active autovacuum workers vacuuming a table
with no cost-based storage parameters.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
2023-04-07 01:00:21 +02:00
Daniel Gustafsson a85c60a945 Separate vacuum cost variables from GUCs
Vacuum code run both by autovacuum workers and a backend doing
VACUUM/ANALYZE previously inspected VacuumCostLimit and VacuumCostDelay,
which are the global variables backing the GUCs vacuum_cost_limit and
vacuum_cost_delay.

Autovacuum workers needed to override these variables with their
own values, derived from autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay and worker cost limit balancing logic.
This led to confusing code which, in some cases, both derived and
set a new value of VacuumCostLimit from VacuumCostLimit.

In preparation for refreshing these GUC values more often, introduce
new, independent global variables and add a function to update them
using the GUCs and existing logic.

Per suggestion by Kyotaro Horiguchi

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
2023-04-07 00:54:53 +02:00
Daniel Gustafsson 71a825194f Make vacuum failsafe_active globally visible
While vacuuming a table in failsafe mode, VacuumCostActive should
not be re-enabled.  This currently isn't a problem because vacuum
cost parameters are only refreshed in between vacuuming tables and
failsafe status is reset for every table.

In preparation for allowing vacuum cost parameters to be updated
more frequently, elevate LVRelState->failsafe_active to a global,
VacuumFailsafeActive, which will be checked when determining whether
or not to re-enable vacuum cost-related delays.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
2023-04-07 00:54:08 +02:00
Tom Lane 5499706bdf Stabilize just-added regression test cases.
The tests added by commits 029dea882 et al turn out to produce
different output under -DRANDOMIZE_ALLOCATED_MEMORY.  This is
not a bug exactly: that flag causes coerce_type() to invoke
the input function twice when coercing an unknown-type literal
to a specific type.  So you get tsqueryin's bleat about an empty
tsquery twice.  Revise the test query to avoid that.

Discussion: https://postgr.es/m/20230406213813.uep7plg6lvcywujo@awork3.anarazel.de
2023-04-06 18:13:49 -04:00
Tom Lane 31ae2aa9d2 psql: set SHELL_ERROR and SHELL_EXIT_CODE in more places.
Make the \g, \o, \w, and \copy commands set these variables
when closing a pipe.  We missed doing this in commit b0d8f2d98,
but it seems like a good idea.

There are some remaining places in psql that intentionally don't
update these variables after running a child program:
	* pager invocations
	* backtick evaluation within a prompt
	* \e (edit query buffer)

Corey Huinker and Tom Lane

Discussion: https://postgr.es/m/CADkLM=eSKwRGF-rnRqhtBORRtL49QsjcVUCa-kLxKTqxypsakw@mail.gmail.com
2023-04-06 17:33:38 -04:00
Tom Lane 029dea882a Fix ts_headline() edge cases for empty query and empty search text.
tsquery's GETQUERY() macro is only safe to apply to a tsquery
that is known non-empty; otherwise it gives a pointer to garbage.
Before commit 5a617d75d, ts_headline() avoided this pitfall, but
only in a very indirect, nonobvious way.  (hlCover could not reach
its TS_execute call, because if the query contains no lexemes
then hlFirstIndex would surely return -1.)  After that commit,
it fell into the trap, resulting in weird errors such as
"unrecognized operator" and/or valgrind complaints.  In HEAD,
fix this by not calling TS_execute_locations() at all for an
empty query.  In the back branches, add a defensive check to
hlCover() --- that's not fixing any live bug, but I judge the
code a bit too fragile as-is.

Also, both mark_hl_fragments() and mark_hl_words() were careless
about the possibility of empty search text: in the cases where
no match has been found, they'd end up telling mark_fragment() to
mark from word indexes 0 to 0 inclusive, even when there is no
word 0.  This is harmless since we over-allocated the prs->words
array, but it does annoy valgrind.  Fix so that the end index is -1
and thus mark_fragment() will do nothing in such cases.

Bottom line is that this fixes a live bug in HEAD, but in the
back branches it's only getting rid of a valgrind nitpick.
Back-patch anyway.

Per report from Alexander Lakhin.

Discussion: https://postgr.es/m/c27f642d-020b-01ff-ae61-086af287c4fd@gmail.com
2023-04-06 15:52:44 -04:00
Andres Freund 18103b7c5f hio: Don't pin the VM while holding buffer lock while extending
Starting with commit 7db0cd2145, RelationGetBufferForTuple() did a
visibilitymap_pin() while holding an exclusive buffer content lock on a newly
extended page, when using COPY FREEZE. We elsewhere try hard to avoid to doing
IO while holding a content lock. And until 14f98e0af9, that happened while
holding the relation extension lock.

Practically, this isn't a huge issue, because COPY FREEZE is restricted to
relations created or truncated in the current session, so it's unlikely
there's a lot of contention.

We can't avoid doing IO while holding the content lock by pinning the VM
earlier, because we don't know which page it will be on.

While we could just ignore the issue in this case, a future commit will add
bulk relation extension, which needs to enter pages into the FSM while also
trying to hold onto a buffer lock.

To address this issue, use visibilitymap_pin_ok() to see if the relevant
buffer is already pinned. If not, release the buffer, pin the VM buffer, and
acquire the lock again. This opens up a small window for other backends to
insert data onto the page - as the page is not entered into the freespacemap,
other backends won't see it normally, but a concurrent vacuum could enter the
page, if it started just after the relation is extended. In case the page is
used by another backend, retry. This is very similar to how locking
"otherBuffer" is already dealt with.

Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: http://postgr.es/m/20230325025740.wzvchp2kromw4zqz@awork3.anarazel.de
2023-04-06 11:11:13 -07:00
Andres Freund bba9003b62 hio: Relax rules for calling GetVisibilityMapPins()
GetVisibilityMapPins() insisted on the buffer1/buffer2 being in a specific
order. This required checks at the callsite. As a subsequent patch will add
another callsite, move related logic into GetVisibilityMapPins().

Discussion: https://postgr.es/m/20230403190030.fk2frxv6faklrseb@awork3.anarazel.de
2023-04-06 10:36:30 -07:00
Tom Lane 00beecfe83 psql: add an optional execution-count limit to \watch.
\watch can now be told to stop after N executions of the query.

With the idea that we might want to add more options to \watch
in future, this patch generalizes the command's syntax to a list
of name=value options, with the interval allowed to omit the name
for backwards compatibility.

Andrey Borodin, reviewed by Kyotaro Horiguchi, Nathan Bossart,
Michael Paquier, Yugo Nagata, and myself

Discussion: https://postgr.es/m/CAAhFRxiZ2-n_L1ErMm9AZjgmUK=qS6VHb+0SaMn8sqqbhF7How@mail.gmail.com
2023-04-06 13:18:14 -04:00
Tomas Vondra 2820adf775 Support long distance matching for zstd compression
zstd compression supports a special mode for finding matched in distant
past, which may result in better compression ratio, at the expense of
using more memory (the window size is 128MB).

To enable this optional mode, use the "long" keyword when specifying the
compression method (--compress=zstd:long).

Author: Justin Pryzby
Reviewed-by: Tomas Vondra, Jacob Champion
Discussion: https://postgr.es/m/20230224191840.GD1653@telsasoft.com
Discussion: https://postgr.es/m/20220327205020.GM28503@telsasoft.com
2023-04-06 17:18:42 +02:00
David Rowley b9b125b9c1 Move various prechecks from vacuum() into ExecVacuum()
vacuum() is used for both the VACUUM command and for autovacuum. There
were many prechecks being done inside vacuum() that were just not relevant
to autovacuum.  Let's move the bulk of these into ExecVacuum() so that
they're only executed when running the VACUUM command.  This removes a
small amount of overhead when autovacuum vacuums a table.

While we are at it, allocate VACUUM's BufferAccessStrategy in ExecVacuum()
and pass it into vacuum() instead of expecting vacuum() to make it if it's
not already made by the calling function.  To make this work, we need to
create the vacuum memory context slightly earlier, so we now need to pass
that down to vacuum() so that it's available for use in other memory
allocations.

Author: Melanie Plageman
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/20230405211534.4skgskbilnxqrmxg@awork3.anarazel.de
2023-04-06 15:44:52 +12:00
Andres Freund acab1b0914 Convert many uses of ReadBuffer[Extended](P_NEW) to ExtendBufferedRel()
A few places are not converted. Some because they are tackled in later
commits (e.g. hio.c, xlogutils.c), some because they are more
complicated (e.g. brin_pageops.c).  Having a few users of ReadBuffer(P_NEW) is
good anyway, to ensure the backward compat path stays working.

Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-05 18:57:29 -07:00
Andres Freund fcdda1e4b5 Use ExtendBufferedRelTo() in {vm,fsm}_extend()
This uses ExtendBufferedRelTo(), introduced in 31966b151e, to extend the
visibilitymap and freespacemap to the size needed.

It also happens to fix a warning introduced in 3d6a98457d, reported by Tom
Lane.

Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
Discussion: https://postgr.es/m/2194723.1680736788@sss.pgh.pa.us
2023-04-05 17:50:09 -07:00
David Rowley bccd6908ca Always make a BufferAccessStrategy for ANALYZE
32fbe0239 changed things so we didn't bother allocating the
BufferAccessStrategy during VACUUM (ONLY_DATABASE_STATS); and VACUUM
(FULL), however, it forgot to consider that VACUUM (FULL, ANALYZE) is a
possible combination.  That change would have resulted in such a command
allowing ANALYZE to make full use of shared buffers, which wasn't
intended, so fix that.

Reported-by: Melanie Plageman
Discussion: https://postgr.es/m/CAAKRu_bJRKe+v_=OqwC+5sA3j5qv8rqdAwy3+yHaO3wmtfrCRg@mail.gmail.com
2023-04-06 12:37:03 +12:00
Michael Paquier 1d477a907e Fix row tracking in pg_stat_statements with extended query protocol
pg_stat_statements relies on EState->es_processed to count the number of
rows processed by ExecutorRun().  This proves to be a problem under the
extended query protocol when the result of a query is fetched through
more than one call of ExecutorRun(), as es_processed is reset each time
ExecutorRun() is called.  This causes pg_stat_statements to report the
number of rows calculated in the last execute fetch, rather than the
global sum of all the rows processed.

As pquery.c tells, this is a problem when a portal does not use
holdStore.  For example, DMLs with RETURNING would report a correct
tuple count as these do one execution cycle when the query is first
executed to fill in the portal's store with one ExecutorRun(), feeding
on the portal's store for each follow-up execute fetch depending on the
fetch size requested by the client.

The fix proposed for this issue is simple with the addition of an extra
counter in EState that's preserved across multiple ExecutorRun() calls,
incremented with the value calculated in es_processed.  This approach is
not back-patchable, unfortunately.

Note that libpq does not currently give any way to control the fetch
size when using the extended v3 protocol, meaning that in-core testing
is not possible yet.  This issue can be easily verified with the JDBC
driver, though, with *autocommit disabled*.  Hence, having in-core tests
requires more features, left for future discussion:
- At least two new libpq routines splitting PQsendQueryGuts(), one for
the bind/describe and a second for a series of execute fetches with a
custom fetch size, likely in a fashion similar to what JDBC does.
- A psql meta-command for the execute phase.  This part is not strictly
mandatory, still it could be handy.

Reported-by: Andrew Dunstan (original discovery by Simon Siggs)
Author: Sami Imseih
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/EBE6C507-9EB6-4142-9E4D-38B1673363A7@amazon.com
Discussion: https://postgr.es/m/c90890e7-9c89-c34f-d3c5-d5c763a34bd8@dunslane.net
2023-04-06 09:29:03 +09:00
Andres Freund 31966b151e bufmgr: Introduce infrastructure for faster relation extension
The primary bottlenecks for relation extension are:

1) The extension lock is held while acquiring a victim buffer for the new
   page. Acquiring a victim buffer can require writing out the old page
   contents including possibly needing to flush WAL.

2) When extending via ReadBuffer() et al, we write a zero page during the
   extension, and then later write out the actual page contents. This can
   nearly double the write rate.

3) The existing bulk relation extension infrastructure in hio.c just amortized
   the cost of acquiring the relation extension lock, but none of the other
   costs.

Unfortunately 1) cannot currently be addressed in a central manner as the
callers to ReadBuffer() need to acquire the extension lock. To address that,
this this commit moves the responsibility for acquiring the extension lock
into bufmgr.c functions. That allows to acquire the relation extension lock
for just the required time. This will also allow us to improve relation
extension further, without changing callers.

The reason we write all-zeroes pages during relation extension is that we hope
to get ENOSPC errors earlier that way (largely works, except for CoW
filesystems). It is easier to handle out-of-space errors gracefully if the
page doesn't yet contain actual tuples. This commit addresses 2), by using the
recently introduced smgrzeroextend(), which extends the relation, without
dirtying the kernel page cache for all the extended pages.

To address 3), this commit introduces a function to extend a relation by
multiple blocks at a time.

There are three new exposed functions: ExtendBufferedRel() for extending the
relation by a single block, ExtendBufferedRelBy() to extend a relation by
multiple blocks at once, and ExtendBufferedRelTo() for extending a relation up
to a certain size.

To avoid duplicating code between ReadBuffer(P_NEW) and the new functions,
ReadBuffer(P_NEW) now implements relation extension with
ExtendBufferedRel(), using a flag to tell ExtendBufferedRel() that the
relation lock is already held.

Note that this commit does not yet lead to a meaningful performance or
scalability improvement - for that uses of ReadBuffer(P_NEW) will need to be
converted to ExtendBuffered*(), which will be done in subsequent commits.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-05 16:21:09 -07:00
Daniel Gustafsson 8eda731465 Allow to use system CA pool for certificate verification
This adds a new option to libpq's sslrootcert, "system", which will load
the system trusted CA roots for certificate verification. This is a more
convenient way to achieve this than pointing to the system CA roots
manually since the location can differ by installation and be locally
adjusted by env vars in OpenSSL.

When sslrootcert is set to system, sslmode is forced to be verify-full
as weaker modes aren't providing much security for public CAs.

Changing the location of the system roots by setting environment vars is
not supported by LibreSSL so the tests will use a heuristic to determine
if the system being tested is LibreSSL or OpenSSL.

The workaround in .cirrus.yml is required to handle a strange interaction
between homebrew and the openssl@3 formula; hopefully this can be removed
in the near future.

The original patch was written by Thomas Habets, which was later revived
by Jacob Champion.

Author: Jacob Champion <jchampion@timescale.com>
Author: Thomas Habets <thomas@habets.se>
Reviewed-by: Jelte Fennema <postgres@jeltef.nl>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Discussion: https://www.postgresql.org/message-id/flat/CA%2BkHd%2BcJwCUxVb-Gj_0ptr3_KZPwi3%2B67vK6HnLFBK9MzuYrLA%40mail.gmail.com
2023-04-05 23:22:17 +02:00
Andres Freund 12f3867f55 bufmgr: Support multiple in-progress IOs by using resowner
A future patch will add support for extending relations by multiple blocks at
once. To be concurrency safe, the buffers for those blocks need to be marked
as BM_IO_IN_PROGRESS. Until now we only had infrastructure for recovering from
an IO error for a single buffer. This commit extends that infrastructure to
multiple buffers by using the resource owner infrastructure.

This commit increases the size of the ResourceOwnerData struct, which appears
to have a just about measurable overhead in very extreme workloads. Medium
term we are planning to substantially shrink the size of
ResourceOwnerData. Short term the increase is small enough to not worry about
it for now.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
Discussion: https://postgr.es/m/20221029200025.w7bvlgvamjfo6z44@awork3.anarazel.de
2023-04-05 14:17:55 -07:00
Tom Lane 16dc2703c5 Support "Right Anti Join" plan shapes.
Merge and hash joins can support antijoin with the non-nullable input
on the right, using very simple combinations of their existing logic
for right join and anti join.  This gives the planner more freedom
about how to order the join.  It's particularly useful for hash join,
since we may now have the option to hash the smaller table instead
of the larger.

Richard Guo, reviewed by Ronan Dunklau and myself

Discussion: https://postgr.es/m/CAMbWs48xh9hMzXzSy3VaPzGAz+fkxXXTUbCLohX1_L8THFRm2Q@mail.gmail.com
2023-04-05 16:59:09 -04:00
Andres Freund dad50f677c bufmgr: Acquire and clean victim buffer separately
Previously we held buffer locks for two buffer mapping partitions at the same
time to change the identity of buffers.  Particularly for extending relations
needing to hold the extension lock while acquiring a victim buffer is
painful.But it also creates a bottleneck for workloads that just involve
reads.

Now we instead first acquire a victim buffer and write it out, if
necessary. Then we remove that buffer from the old partition with just the old
partition's partition lock held and insert it into the new partition with just
that partition's lock held.

By separating out the victim buffer acquisition, future commits will be able
to change relation extensions to scale better.

On my workstation, a micro-benchmark exercising buffered reads strenuously and
under a lot of concurrency, sees a >2x improvement.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-05 13:47:46 -07:00
Tom Lane 65eb2d00c6 Acquire locks on views in AcquirePlannerLocks, too.
Commit 47bb9db75 taught AcquireExecutorLocks to re-acquire locks
on views using data from their RTE_SUBQUERY replacements, but
it now seems like we should make AcquirePlannerLocks do the same.
In this way, if a view has been redefined, we will notice that
a bit earlier while checking validity of a cached plan and thereby
avoid some wasted work.

Report and patch by Amit Langote.

Discussion: https://postgr.es/m/CA+HiwqH0xZOQ+GQAdKeckY1R4NOeHdzhtfxkAMJLSchpapNk5w@mail.gmail.com
2023-04-05 15:56:35 -04:00
Tomas Vondra 84adc8e20f pg_dump: Add support for zstd compression
Allow pg_dump to use the zstd compression, in addition to gzip/lz4. Bulk
of the new compression method is implemented in compress_zstd.{c,h},
covering the pg_dump compression APIs. The rest of the patch adds test
and makes various places aware of the new compression method.

The zstd library (which this patch relies on) supports multithreaded
compression since version 1.5. We however disallow that feature for now,
as it might interfere with parallel backups on platforms that rely on
threads (e.g. Windows). This can be improved / relaxed in the future.

This also fixes a minor issue in InitDiscoverCompressFileHandle(), which
was not updated to check if the file already has the .lz4 extension.

Adding zstd compression was originally proposed in 2020 (see the second
thread), but then was reworked to use the new compression API introduced
in e9960732a9. I've considered both threads when compiling the list of
reviewers.

Author: Justin Pryzby
Reviewed-by: Tomas Vondra, Jacob Champion, Andreas Karlsson
Discussion: https://postgr.es/m/20230224191840.GD1653@telsasoft.com
Discussion: https://postgr.es/m/20201221194924.GI30237@telsasoft.com
2023-04-05 21:39:33 +02:00
Andres Freund 794f259447 bufmgr: Add Pin/UnpinLocalBuffer()
So far these were open-coded in quite a few places, without a good reason.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-05 10:42:17 -07:00
Andres Freund 819b69a81d bufmgr: Add some more error checking [infrastructure] around pinning
This adds a few more assertions against a buffer being local in places we
don't expect, and extracts the check for a buffer being pinned exactly once
from LockBufferForCleanup() into its own function. Later commits will use this
function.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: http://postgr.es/m/419312fd-9255-078c-c3e3-f0525f911d7f@iki.fi
2023-04-05 10:42:17 -07:00
Andres Freund 4d330a61bb Add smgrzeroextend(), FileZero(), FileFallocate()
smgrzeroextend() uses FileFallocate() to efficiently extend files by multiple
blocks. When extending by a small number of blocks, use FileZero() instead, as
using posix_fallocate() for small numbers of blocks is inefficient for some
file systems / operating systems. FileZero() is also used as the fallback for
FileFallocate() on platforms / filesystems that don't support fallocate.

A big advantage of using posix_fallocate() is that it typically won't cause
dirty buffers in the kernel pagecache. So far the most common pattern in our
code is that we smgrextend() a page full of zeroes and put the corresponding
page into shared buffers, from where we later write out the actual contents of
the page. If the kernel, e.g. due to memory pressure or elapsed time, already
wrote back the all-zeroes page, this can lead to doubling the amount of writes
reaching storage.

There are no users of smgrzeroextend() as of this commit. That will follow in
future commits.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
2023-04-05 10:06:39 -07:00
Tom Lane 4766eef317 Fix another issue with ENABLE/DISABLE TRIGGER on partitioned tables.
In v13 and v14, the ENABLE/DISABLE TRIGGER USER variant malfunctioned
on cloned triggers, failing to find the clones because it thought they
were system triggers.  Other variants of ENABLE/DISABLE TRIGGER would
improperly apply a superuserness check.  Fix by adjusting the is-it-
a-system-trigger check to match reality in those branches.  (As far
as I can find, this is the only place that got it wrong.)

There's no such bug in v15/HEAD, because we revised the catalog
representation of system triggers to be what this code was expecting.
However, add the test case to these branches anyway, because this area
is visibly pretty fragile.  Also remove an obsoleted comment.

The recent v15/HEAD commit 6949b921d fixed a nearby bug.  I now see
that my commit message for that was inaccurate: the behavior of
recursing to clone triggers is older than v15, but it didn't apply
to the case in v13/v14 because in those branches parent partitioned
tables have no pg_trigger entries for foreign-key triggers.  But add
the test case from that commit to v13/v14, just to show what is
happening there.

Per bug #17886 from DzmitryH.

Discussion: https://postgr.es/m/17886-5406d5d828aa4aa3@postgresql.org
2023-04-05 12:56:32 -04:00
Andres Freund 3d6a98457d Don't initialize page in {vm,fsm}_extend(), not needed
The read path needs to be able to initialize pages anyway, as relation
extensions are not durable. By avoiding initializing pages, we can, in a
future patch, extend the relation by multiple blocks at once.

Using smgrextend() for {vm,fsm}_extend() is not a good idea in general, as at
least one page of the VM/FSM will be read immediately after, always causing a
cache miss, requiring us to read content we just wrote.

Discussion: https://postgr.es/m/20230301223515.pucbj7nb54n4i4nv@awork3.anarazel.de
2023-04-05 08:19:39 -07:00
Robert Haas 86a3fc7ec8 Fix wrong word in comment.
Reported by Peter Smith.

Discussion: http://postgr.es/m/CAHut+Pt52ueOEAO-G5qeZiiPv1p9pBT_W5Vj3BTYfC8sD8LFxw@mail.gmail.com
2023-04-05 09:50:08 -04:00
Peter Eisentraut f275af8cb8 Update information_schema for SQL:2023
This is mainly a light renumbering to match the sections in the
standard.

The comments for SQL_IMPLEMENTATION_INFO and SQL_SIZING are no longer
applicable because the required information has been moved into part
9.
2023-04-05 09:57:44 +02:00
Peter Eisentraut c9f57541d9 doc: Update SQL features/conformance information to SQL:2023
Optional subfeatures have been changed to top-level features, so there
is a bit of a churn in the list for that.

Some existing functions have been added to the standard, so they are
moved from the "other" to the "standard" lists in their sections.

Discussion: https://www.postgresql.org/message-id/flat/63f285d9-4ec8-0c9e-4bf5-e76334ddc0af@enterprisedb.com
2023-04-05 09:20:25 +02:00
Peter Eisentraut c209d317e9 Fix minor signed/unsigned mixup
The chunk header is unsigned, and the output format takes unsigned, so
casting it to signed in between is incorrect.
2023-04-05 07:34:52 +02:00
Amit Kapila 8df0d3d530 Add Copyright notice in 001_basic.pl and 002_pg_upgrade.pl.
Author: Kuroda Hayato
Discussion: https://postgr.es/m/TYCPR01MB587073D91E372B8EF719931EF5929@TYCPR01MB5870.jpnprd01.prod.outlook.com
2023-04-05 09:20:14 +05:30
Andres Freund 3f695b3117 sequences: Lock buffer before initializing page
fill_seq_fork_with_data(), used to initialize a new sequence relation, only
locked the buffer after calling PageInit(), even though PageInit() modifies
page contents.

This is unlikely to cause real-world issues, as the relation is exclusively
locked at that point, and the buffer not yet marked dirty, so other processes
should not access the buffer.

This issue looks to have been present since the introduction of sequences in
e8647c45d6.

Given the low risk, it does not seem worth backpatching the fix.

Discussion: https://postgr.es/m/20230404185501.wdkmo3k7bedlx6qk@awork3.anarazel.de
2023-04-04 16:42:52 -07:00
Jeff Davis 36320cbc16 Fix MSVC warning introduced in ea1db8ae70.
Discussion: https://postgr.es/m/CA+hUKGJR1BhCORa5WdvwxztD3arhENcwaN1zEQ1Upg20BwjKWA@mail.gmail.com
Reported-by: Thomas Munro
2023-04-04 15:43:18 -07:00
Thomas Munro f303ec6210 Remove comment obsoleted by 11c2d6fd.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1604497.1680637072%40sss.pgh.pa.us
2023-04-05 09:40:34 +12:00
Jeff Davis ea1db8ae70 Canonicalize ICU locale names to language tags.
Convert to BCP47 language tags before storing in the catalog, except
during binary upgrade or when the locale comes from an existing
collation or template database.

The resulting language tags can vary slightly between ICU
versions. For instance, "@colBackwards=yes" is converted to
"und-u-kb-true" in older versions of ICU, and to the simpler (but
equivalent) "und-u-kb" in newer versions.

The process of canonicalizing to a language tag also understands more
input locale string formats than ucol_open(). For instance,
"fr_CA.UTF-8" is misinterpreted by ucol_open() and the region is
ignored; effectively treating it the same as the locale "fr" and
opening the wrong collator. Canonicalization properly interprets the
language and region, resulting in the language tag "fr-CA", which can
then be understood by ucol_open().

This commit fixes a problem in prior versions due to ucol_open()
misinterpreting locale strings as described above. For instance,
creating an ICU collation with locale "fr_CA.UTF-8" would store that
string directly in the catalog, which would later be passed to (and
misinterpreted by) ucol_open(). After this commit, the locale string
will be canonicalized to language tag "fr-CA" in the catalog, which
will be properly understood by ucol_open(). Because this fix affects
the resulting collator, we cannot change the locale string stored in
the catalog for existing databases or collations; otherwise we'd risk
corrupting indexes. Therefore, only canonicalize locales for
newly-created (not upgraded) collations/databases. For similar
reasons, do not backport.

Discussion: https://postgr.es/m/8c7af6820aed94dc7bc259d2aa7f9663518e6137.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-04-04 10:38:58 -07:00
Tom Lane d3d53f955c Add a way to get the current function's OID in pl/pgsql.
Invent "GET DIAGNOSTICS oid_variable = PG_ROUTINE_OID".
This is useful for avoiding the maintenance nuisances that come
with embedding a function's name in its body, as one might do
for logging purposes for example.  Typically users would cast the
result to regproc or regprocedure to get something human-readable,
but we won't pre-judge whether that's appropriate.

Pavel Stehule, reviewed by Kirk Wolak and myself

Discussion: https://postgr.es/m/CAFj8pRA4zMd5pY-B89Gm64bDLRt-L+akOd34aD1j4PEstHHSVQ@mail.gmail.com
2023-04-04 13:33:18 -04:00
Robert Haas 482675987b Add a run_as_owner option to subscriptions.
This option is normally false, but can be set to true to obtain
the legacy behavior where the subscription runs with the permissions
of the subscription owner rather than the permissions of the
table owner. The advantages of this mode are (1) it doesn't require
that the subscription owner have permission to SET ROLE to each
table owner and (2) since no role switching occurs, the
SECURITY_RESTRICTED_OPERATION restrictions do not apply.

On the downside, it allows any table owner to easily usurp
the privileges of the subscription owner - basically, to take
over their account. Because that's generally quite undesirable,
we don't make this mode the default, but we do make it available,
just in case the new behavior causes too many problems for someone.

Discussion: http://postgr.es/m/CA+TgmoZ-WEeG6Z14AfH7KhmpX2eFh+tZ0z+vf0=eMDdbda269g@mail.gmail.com
2023-04-04 12:03:03 -04:00
Robert Haas 1e10d49b65 Perform logical replication actions as the table owner.
Up until now, logical replication actions have been performed as the
subscription owner, who will generally be a superuser.  Commit
cec57b1a0f documented hazards
associated with that situation, namely, that any user who owns a
table on the subscriber side could assume the privileges of the
subscription owner by attaching a trigger, expression index, or
some other kind of executable code to it. As a remedy, it suggested
not creating configurations where users who are not fully trusted
own tables on the subscriber.

Although that will work, it basically precludes using logical
replication in the way that people typically want to use it,
namely, to replicate a database from one node to another
without necessarily having any restrictions on which database
users can own tables. So, instead, change logical replication to
execute INSERT, UPDATE, DELETE, and TRUNCATE operations as the
table owner when they are replicated.

Since this involves switching the active user frequently within
a session that is authenticated as the subscription user, also
impose SECURITY_RESTRICTED_OPERATION restrictions on logical
replication code. As an exception, if the table owner can SET
ROLE to the subscription owner, these restrictions have no
security value, so don't impose them in that case.

Subscription owners are now required to have the ability to
SET ROLE to every role that owns a table that the subscription
is replicating. If they don't, replication will fail. Superusers,
who normally own subscriptions, satisfy this property by default.
Non-superusers users who own subscriptions will need to be
granted the roles that own relevant tables.

Patch by me, reviewed (but not necessarily in its entirety) by
Jelte Fennema, Jeff Davis, and Noah Misch.

Discussion: http://postgr.es/m/CA+TgmoaSCkg9ww9oppPqqs+9RVqCexYCE6Aq=UsYPfnOoDeFkw@mail.gmail.com
2023-04-04 11:25:23 -04:00
Alvaro Herrera 71bfd1543f
Code review for recent SQL/JSON commits
- At the last minute and for no particularly good reason, I changed the
  WITHOUT token to be marked especially for lookahead, from the one in
  WITHOUT TIME to the one in WITHOUT UNIQUE.  Study of upcoming patches
  (where a new WITHOUT ARRAY WRAPPER clause is added) showed me that the
  former was better, so put it back the way the original patch had it.

- update exprTypmod() for JsonConstructorExpr to return the typmod of
  the RETURNING clause, as a comment there suggested.  Perhaps it's
  possible for this to make a difference with datetime types, but I
  didn't try to build a test case.

- The nodeFuncs.c support code for new nodes was calling walker()
  directly instead of the WALK() macro as introduced by commit 1c27d16e6e.
  Modernize that.  Also add exprLocation() support for a couple of nodes
  that missed it.  Lastly, reorder the code more sensibly.

The WITHOUT_LA -> WITHOUT change means that stored rules containing
either WITHOUT TIME ZONE or WITHOUT UNIQUE KEYS would change
representation.  Therefore, bump catversion.

Discussion: https://postgr.es/m/20230329181708.e64g2tpy7jyufqkr@alvherre.pgsql
2023-04-04 14:04:30 +02:00
Andres Freund 8a2b1b1477 bufmgr: Remove buffer-write-dirty tracepoints
The trace point was using the relfileno / fork / block for the to-be-read-in
buffer. Some upcoming work would make that more expensive to provide. We still
have buffer-flush-start/done, which can serve most tracing needs that
buffer-write-dirty could serve.

Discussion: https://postgr.es/m/f5164e7a-eef6-8972-75a3-8ac622ed0c6e@iki.fi
2023-04-03 18:02:41 -07:00
Peter Geoghegan 05a304a855 Make SP-GiST redirect cleanup more aggressive.
Commit 61b313e4 made VACUUM pass down a heaprel to index AM bulkdelete
and vacuumcleanup routines.  Although this was primarily intended as
preparation for logical decoding on standbys, it also made it easy to
correct an old deficiency in how we determine how to cleanup SP-GiST
redirect and placeholder tuples.

Pass the heaprel to GlobalVisTestFor() during cleanup of redirect and
placeholder tuples, rather than pessimistically passing NULL.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/02392033-f030-a3c8-c7d0-5c27eb529fec@gmail.com
2023-04-03 11:47:48 -07:00
Peter Geoghegan e48c817395 Recycle deleted nbtree pages more aggressively.
Commit 61b313e4 made nbtree consistently pass down a heaprel to low
level routines like _bt_getbuf().  Although this was primarily intended
as preparation for logical decoding on standbys, it also made it easy to
correct an old deficiency in how nbtree VACUUM determines whether or not
it's now safe to recycle deleted pages.

Pass the heaprel to GlobalVisTestFor() in nbtree routines that deal with
recycle safety.  nbtree now makes less pessimistic assumptions about
recycle safety within non-catalog relations.  This enhancement
complements the recycling enhancement added by commit 9dd963ae25.

nbtree remains just as pessimistic as ever when it comes to recycle
safety within indexes on catalog relations.  There is no fundamental
reason why we need to treat catalog relations differently, though.  The
behavioral inconsistency is a consequence of the way that nbtree uses
nextXID values to implement what Lanin and Shasha call "the drain
technique".  Note in particular that it has nothing to do with whether
or not index tuples might still be required for an older MVCC snapshot.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkaiDxCje0yPuH=3Uh2p1V_2pFGY==xfbZoZu7Ax_NB8g@mail.gmail.com
2023-04-03 11:31:43 -07:00
Peter Geoghegan a349b86603 Move heaprel struct field next to index rel field.
Commit 61b313e4 added a heaprel struct member to IndexVacuumInfo, but
placed it last.  Move the heaprel struct member next to the index struct
member to improve the code's readability.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznG=TV6S9d3VA=y0vBHbXwnLs9_LLdiML=aNJuHeriwxg@mail.gmail.com
2023-04-03 11:01:11 -07:00
Robert Haas e7e7da2f8d Fix possible logical replication crash.
Commit c3afe8cf5a added a new
password_required option but forgot that you need database access
to check whether an arbitrary role ID is a superuser.

Report and patch by Hou Zhijie. I added a comment. Thanks to
Alexander Lakhin for devising a way to reproduce the crash.

Discussion: http://postgr.es/m/OS0PR01MB5716BFD7EC44284C89F40808948F9@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-04-03 13:54:21 -04:00
Tom Lane a8a00124f1 When using valgrind, log the current query after an error is detected.
In USE_VALGRIND builds, add code to print the text of the current query
to the valgrind log whenever the valgrind error count has increased
during the query.  Valgrind will already have printed its report,
if the error is distinct from ones already seen, so that this works
out fairly well as a way of annotating the log.

Onur Tirtir and Tom Lane

Discussion: https://postgr.es/m/AM9PR83MB0498531E804DC8DF8CFF0E8FE9D09@AM9PR83MB0498.EURPRD83.prod.outlook.com
2023-04-03 10:18:38 -04:00
Alexander Korotkov b0b91ced16 Revert 764da7710b
Discussion: https://postgr.es/m/20230323003003.plgaxjqahjgkuxrk%40awork3.anarazel.de
2023-04-03 16:55:09 +03:00
Alexander Korotkov 2b65bf046d Revert 11470f544e
Discussion: https://postgr.es/m/20230323003003.plgaxjqahjgkuxrk%40awork3.anarazel.de
2023-04-03 16:54:31 +03:00
David Rowley 8d928e3a9f Rename BufferAccessStrategyData.ring_size to nbuffers
The new name better reflects what the field is - the size of the buffers[]
array.  ring_size sounded more like it is in units of bytes.

An upcoming commit allows a BufferAccessStrategy of custom sizes, so it
seems relevant to improve this beforehand.

Author: Melanie Plageman
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAAKRu_YefVbhg4rAxU2frYFdTap61UftH-kUNQZBvAs%2BYK81Zg%40mail.gmail.com
2023-04-03 23:31:16 +12:00
David Rowley 4830f10243 Disable vacuum's use of a buffer access strategy during failsafe
Traditionally, vacuum always makes use of a buffer access strategy 32
buffers in size.  This means that running vacuums tend not to cause too
many shared buffers to become dirty, however, this can cause vacuums to
run much more slowly than they otherwise could as WAL flushes will occur
more frequently due to having to flush WAL out to the LSN of the dirty
page before that page can be written to disk.

When we are performing failsafe VACUUMs (as added in 1e55e7d17), we really
want to make the vacuum work go as quickly as possible, so here we disable
the buffer access strategy when entering failsafe mode while vacuuming a
relation.

Per idea and analyis from Andres Freund.

In passing, also include some changes I had intended for 32fbe0239.

Author: Melanie Plageman
Reviewed-by: Justin Pryzby, David Rowley
Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb%40awork3.anarazel.de
2023-04-03 23:05:58 +12:00
Daniel Gustafsson 525fb0a171 Fix typo in CI README
s/fron/from/
2023-04-03 10:50:17 +02:00
David Rowley 32fbe0239b Only make buffer strategy for vacuum when it's likely needed
VACUUM FULL and VACUUM ONLY_DATABASE_STATS will not use the vacuum
strategy ring created in vacuum(), so don't waste effort making it in
those cases.

There are other conceivable cases where the buffer strategy also won't be
used, but those are probably less common and not worth troubling over too
much.  For example VACUUM (PROCESS_MAIN false, PROCESS_TOAST false).
There are other cases too, but many of these are only discovered once
inside vacuum_rel().

Author: Melanie Plageman
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAAKRu_ZLRuzkM3zKogiZAz2hUony37yLY4aaLb8fPf8fgqs5Og@mail.gmail.com
2023-04-03 19:19:45 +12:00
Peter Eisentraut 1980d3585e pg_basebackup: Correct type of WalSegSz
The pg_basebackup code had WalSegSz as uint32, whereas the rest of the
code has it as int.  This seems confusing, and using the extra range
wouldn't actually work.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/1bf15c7a-0acd-1864-081e-7a28814310fe%40enterprisedb.com
2023-04-03 07:21:06 +02:00
David Rowley 3f476c9534 Remove some global variables from vacuum.c
Using global variables because we don't want to pass these values around
as parameters does not really seem like a great idea, so let's remove
these two global variables and adjust a few functions to accept these
values as parameters instead.

This is part of a wider patch which intends to allow the size of the
buffer access strategy that vacuum uses to be adjusted.

Author: Melanie Plageman
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/CAAKRu_b1q_07uquUtAvLqTM%3DW9nzee7QbtzHwA4XdUo7KX_Cnw%40mail.gmail.com
2023-04-03 15:07:25 +12:00
Tom Lane 2e6ba13152 Doc: update pgindent/README.
I missed updating this when we pulled pg_bsd_indent into the tree.
2023-04-02 20:01:34 -04:00
Andres Freund 6af1793954 Add info in WAL records in preparation for logical slot conflict handling
This commit only implements one prerequisite part for allowing logical
decoding. The commit message contains an explanation of the overall design,
which later commits will refer back to.

Overall design:

1. We want to enable logical decoding on standbys, but replay of WAL
from the primary might remove data that is needed by logical decoding,
causing error(s) on the standby. To prevent those errors, a new replication
conflict scenario needs to be addressed (as much as hot standby does).

2. Our chosen strategy for dealing with this type of replication slot
is to invalidate logical slots for which needed data has been removed.

3. To do this we need the latestRemovedXid for each change, just as we
do for physical replication conflicts, but we also need to know
whether any particular change was to data that logical replication
might access. That way, during WAL replay, we know when there is a risk of
conflict and, if so, if there is a conflict.

4. We can't rely on the standby's relcache entries for this purpose in
any way, because the startup process can't access catalog contents.

5. Therefore every WAL record that potentially removes data from the
index or heap must carry a flag indicating whether or not it is one
that might be accessed during logical decoding.

Why do we need this for logical decoding on standby?

First, let's forget about logical decoding on standby and recall that
on a primary database, any catalog rows that may be needed by a logical
decoding replication slot are not removed.

This is done thanks to the catalog_xmin associated with the logical
replication slot.

But, with logical decoding on standby, in the following cases:

- hot_standby_feedback is off
- hot_standby_feedback is on but there is no a physical slot between
  the primary and the standby. Then, hot_standby_feedback will work,
  but only while the connection is alive (for example a node restart
  would break it)

Then, the primary may delete system catalog rows that could be needed
by the logical decoding on the standby (as it does not know about the
catalog_xmin on the standby).

So, it’s mandatory to identify those rows and invalidate the slots
that may need them if any. Identifying those rows is the purpose of
this commit.

Implementation:

When a WAL replay on standby indicates that a catalog table tuple is
to be deleted by an xid that is greater than a logical slot's
catalog_xmin, then that means the slot's catalog_xmin conflicts with
the xid, and we need to handle the conflict. While subsequent commits
will do the actual conflict handling, this commit adds a new field
isCatalogRel in such WAL records (and a new bit set in the
xl_heap_visible flags field), that is true for catalog tables, so as to
arrange for conflict handling.

The affected WAL records are the ones that already contain the
snapshotConflictHorizon field, namely:

- gistxlogDelete
- gistxlogPageReuse
- xl_hash_vacuum_one_page
- xl_heap_prune
- xl_heap_freeze_page
- xl_heap_visible
- xl_btree_reuse_page
- xl_btree_delete
- spgxlogVacuumRedirect

Due to this new field being added, xl_hash_vacuum_one_page and
gistxlogDelete do now contain the offsets to be deleted as a
FLEXIBLE_ARRAY_MEMBER. This is needed to ensure correct alignment.
It's not needed on the others struct where isCatalogRel has
been added.

This commit just introduces the WAL format changes mentioned above. Handling
the actual conflicts will follow in future commits.

Bumps XLOG_PAGE_MAGIC as the several WAL records are changed.

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de> (in an older version)
Author: Amit Khandekar <amitdkhan.pg@gmail.com>  (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
2023-04-02 12:32:19 -07:00
Noah Misch eaa1dd131c Use PG_TEST_TIMEOUT_DEFAULT in 019_replslot_limit.pl.
Oversight in f2698ea02c, which introduced
the variable.  This lowers some 1000s timeouts to the configurable
default of 180s, due to a lack of evidence for needing the longer
timeout.  The alternative was 6*PG_TEST_TIMEOUT_DEFAULT, which we can
adopt if the need arises.  Given the lack of observed trouble with these
timeouts, no back-patch.
2023-04-02 09:31:09 -07:00
Andres Freund 61b313e47e Pass down table relation into more index relation functions
This is done in preparation for logical decoding on standby, which needs to
include whether visibility affecting WAL records are about a (user) catalog
table. Which is only known for the table, not the indexes.

It's also nice to be able to pass the heap relation to GlobalVisTestFor() in
vacuumRedirectAndPlaceholder().

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/21b700c3-eecf-2e05-a699-f8c78dd31ec7@gmail.com
2023-04-01 20:18:29 -07:00
Andres Freund a88a18b125 Assert only valid flag bits are passed to visibilitymap_set()
If visibilitymap_set() is called with flags containing a higher bit than
VISIBILITYMAP_ALL_FROZEN, the state of neighboring pages is affected. While
there was an assertion that *some* valid bits were set, it did not check
that *only* valid bits were. Change that.

Discussion: https://postgr.es/m/20230331043300.gux3s5wzrapqi4oe@awork3.anarazel.de
2023-04-01 18:00:19 -07:00
Andres Freund 14f98e0af9 hio: Release extension lock before initializing page / pinning VM
PageInit() while holding the extension lock is unnecessary after 0d1fe9f74e
started to use RBM_ZERO_AND_LOCK - nobody can look at the new page before we
release the page lock. PageInit() zeroes the page, which isn't that cheap, so
deferring it until after the extension lock is released seems like a good idea.

Doing visibilitymap_pin() while holding the extension lock, introduced in
7db0cd2145, looks like an accident. Due to the restrictions on
HEAP_INSERT_FROZEN it's unlikely to be a performance issue, but it still seems
better to move it out.  We also are doing the visibilitymap_pin() while
holding the buffer lock, which will be fixed in a separate commit.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: http://postgr.es/m/419312fd-9255-078c-c3e3-f0525f911d7f@iki.fi
2023-04-01 17:50:18 -07:00
Tomas Vondra 0070b66fef pg_dump: Use only LZ4 frame format for compression
After 0da243fed0 got committed, it was reported that in some cases the
compression ratio is rather poor - particularly for custom format with
narrow tables - due to writing the LZ4 header/footer for each row.

This commit switches to LZ4F (LZ4 frame format), eliminating most of the
overhead and greatly improving the compression ratio. This makes the
compressed size about the same for plain and custom formats (just like
for gzip, for example).

LZ4F is now used by both compression APIs, which allowed refactoring and
reusing more of the code. For consistency this also renames the LZ4File
struct to LZ4State, and a number of functions are now prefixed with
LZ4Stream_ (instead of LZ4File_).

Patch by Georgios Kokolatos, based on report and initial patch by Justin
Pryzby. Review and minor cleanups by me.

Author: Georgios Kokolatos, Justin Pryzby
Reported-by: Justin Pryzby
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/20230227044910.GO1653%40telsasoft.com
2023-04-01 00:54:50 +02:00
Alvaro Herrera 6ee30209a6
SQL/JSON: support the IS JSON predicate
This patch introduces the SQL standard IS JSON predicate. It operates
on text and bytea values representing JSON, as well as on the json and
jsonb types. Each test has IS and IS NOT variants and supports a WITH
UNIQUE KEYS flag. The tests are:

IS JSON [VALUE]
IS JSON ARRAY
IS JSON OBJECT
IS JSON SCALAR

These should be self-explanatory.

The WITH UNIQUE KEYS flag makes these return false when duplicate keys
exist in any object within the value, not necessarily directly contained
in the outermost object.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com
Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
2023-03-31 22:34:04 +02:00
Tom Lane a2a0c7c29e Further tweaking of width_bucket() edge cases.
I realized that the third overflow case I posited in commit b0e9e4d76
actually should be handled in a different way: rather than tolerating
the idea that the quotient could round to 1, we should clamp so that
the output cannot be more than "count" when we know that the operand is
less than bound2.  That being the case, we don't need an overflow-aware
increment in that code path, which leads me to revert the movement of
the pg_add_s32_overflow() call.  (The diff in width_bucket_float8
might be easier to read by comparing against b0e9e4d76^.)

What's more, width_bucket_numeric also has this problem of the quotient
potentially rounding to 1, so add a clamp there too.

As before, I'm not quite convinced that a back-patch is warranted.

Discussion: https://postgr.es/m/391415.1680268470@sss.pgh.pa.us
2023-03-31 16:29:55 -04:00
Tom Lane f0d65c0eaf Reject system columns as elements of foreign keys.
Up through v11 it was sensible to use the "oid" system column as
a foreign key column, but since that was removed there's no visible
usefulness in making any of the remaining system columns a foreign
key.  Moreover, since the TupleTableSlot rewrites in v12, such cases
actively fail because of implicit assumptions that only user columns
appear in foreign keys.  The lack of complaints about that seems
like good evidence that no one is trying to do it.  Hence, rather
than trying to repair those assumptions (of which there are at least
two, maybe more), let's just forbid the case up front.

Per this patch, a system column in either the referenced or
referencing side of a foreign key will draw this error; however,
putting one in the referenced side would have failed later anyway,
since we don't allow unique indexes to be made on system columns.

Per bug #17877 from Alexander Lakhin.  Back-patch to v12; the
case still appears to work in v11, so we shouldn't break it there.

Discussion: https://postgr.es/m/17877-4bcc658e33df6de1@postgresql.org
2023-03-31 11:18:49 -04:00
Tom Lane c2d7d679c1 Ensure acquire_inherited_sample_rows sets its output parameters.
The totalrows/totaldeadrows outputs were left uninitialized in cases
where we find no analyzable child tables of a partitioned table.  This
could lead to setting the partitioned table's pg_class.reltuples value
to garbage.  It's not clear that that would have any very bad effects
in practice, but fix it anyway because it's making valgrind unhappy.

Reported and diagnosed by Alexander Lakhin (bug #17880).

Discussion: https://postgr.es/m/17880-9282037c923d856e@postgresql.org
2023-03-31 10:08:40 -04:00
Daniel Gustafsson 558fff0adf pg_regress: Emit TAP compliant output
This converts pg_regress output format to emit TAP compliant output
while keeping it as human readable as possible for use without TAP
test harnesses. As verbose harness related information isn't really
supported by TAP this also reduces the verbosity of pg_regress runs
which makes scrolling through log output in buildfarm/CI runs a bit
easier as well.

As the meson TAP parser conumes whitespace, the leading indentation
for differentiating parallel tests from sequential tests has been
changed to a single character prefix.

This patch has been around for an extended period of time, reviewers
listed below may have been involved in reviewing a version quite
different from the version in this commit.  The original idea for
this patch was a hacking session with Jinbao Chen.

TAP format testing is also enabled in meson as of this.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nikolay Shaplov <dhyan@nataraj.su>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/BD4B107D-7E53-4794-ACBA-275BEB4327C9@yesql.se
Discussion: https://postgr.es/m/20220221164736.rq3ornzjdkmwk2wo@alap3.anarazel.de
2023-03-31 13:00:02 +02:00
Alvaro Herrera 9b058f6b0d
Move ExecEvalJsonConstructor new function to a more natural place
Commit 7081ac46ac put it at the end of the file, but that doesn't look
very nice.
2023-03-31 12:55:25 +02:00
Alvaro Herrera 47a9709846
No need to add FORMAT to the keyword precedence list
Commit 7081ac46ac put it there.  Remove it.
2023-03-31 11:16:43 +02:00
Andres Freund f95c1cd6b2 Bump PGSTAT_FILE_FORMAT_ID, omitted in 8aaa04b32d
I forgot to do so in the referenced commit. While the consequences of omitting
the version change are likely to be harmless (besides discarding stats, as a
PGSTAT_FILE_FORMAT_ID bump also does), it still seems worth doing.
2023-03-30 19:48:01 -07:00
Andres Freund 8aaa04b32d Track shared buffer hits in pg_stat_io
Among other things, this should make it easier to calculate a useful cache hit
ratio by excluding buffer reads via buffer access strategies. As buffer access
strategies reuse buffers (and thus evict the prior buffer contents), it is
normal to see reads on repeated scans of the same data.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAAKRu_beMa9Hzih40%3DXPYqhDVz6tsgUGTrhZXRo%3Dunp%2Bszb%3DUA%40mail.gmail.com
2023-03-30 19:24:21 -07:00
David Rowley 6c3b697b19 Fix List memory issue in transformColumnDefinition
When calling generateSerialExtraStmts(), we would pass in the
constraint->options.  In some cases, generateSerialExtraStmts() would
modify the referenced List to remove elements from it, but doing so is
invalid without assigning the list back to all variables that point to it.
In the particular reported problem case, the List became empty, in which
cases it became NIL, but the passed in constraint->options didn't get to
find out about that and was left pointing to free'd memory.

To fix this, just perform a list_copy() inside generateSerialExtraStmts().
We could just do a list_copy() just before we perform the delete from the
list, however, that seems less robust.  Let's make sure the generated
CreateSeqStmt gets a completely different copy of the list to be safe.

Bug: #17879
Reported-by: Fei Changhong
Diagnosed-by: Fei Changhong
Discussion: https://postgr.es/m/17879-b7dfb5debee58ff5@postgresql.org
Backpatch-through: 11, all supported versions
2023-03-31 12:13:05 +13:00
Thomas Munro 11c2d6fdf5 Parallel Hash Full Join.
Full and right outer joins were not supported in the initial
implementation of Parallel Hash Join because of deadlock hazards (see
discussion).  Therefore FULL JOIN inhibited parallelism, as the other
join strategies can't do that in parallel either.

Add a new PHJ phase PHJ_BATCH_SCAN that scans for unmatched tuples on
the inner side of one batch's hash table.  For now, sidestep the
deadlock problem by terminating parallelism there.  The last process to
arrive at that phase emits the unmatched tuples, while others detach and
are free to go and work on other batches, if there are any, but
otherwise they finish the join early.

That unfairness is considered acceptable for now, because it's better
than no parallelism at all.  The build and probe phases are run in
parallel, and the new scan-for-unmatched phase, while serial, is usually
applied to the smaller of the two relations and is either limited by
some multiple of work_mem, or it's too big and is partitioned into
batches and then the situation is improved by batch-level parallelism.

Author: Melanie Plageman <melanieplageman@gmail.com>
Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BA6ftXPz4oe92%2Bx8Er%2BxpGZqto70-Q_ERwRaSyA%3DafNg%40mail.gmail.com
2023-03-31 11:34:03 +13:00
Andres Freund ca7b3c4c00 pg_stat_wal: Accumulate time as instr_time instead of microseconds
In instr_time.h it is stated that:

* When summing multiple measurements, it's recommended to leave the
* running sum in instr_time form (ie, use INSTR_TIME_ADD or
* INSTR_TIME_ACCUM_DIFF) and convert to a result format only at the end.

The reason for that is that converting to microseconds is not cheap, and can
loose precision.  Therefore this commit changes 'PendingWalStats' to use
'instr_time' instead of 'PgStat_Counter' while accumulating 'wal_write_time'
and 'wal_sync_time'.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/1feedb83-7aa9-cb4b-5086-598349d3f555@gmail.com
2023-03-30 14:23:14 -07:00
Alvaro Herrera 63cc20205c
Simplify transformJsonAggConstructor() API
There's no need for callers to pass aggregate names so that the function
can resolve them to OIDs, when callers can just pass aggregate OIDs
directly to begin with.
2023-03-30 21:07:24 +02:00
Alvaro Herrera 60966f56c3
Fix inconsistencies and style issues in new SQL/JSON code
Reported by Alexander Lakhin.

Discussion: https://postgr.es/m/60483139-5c34-851d-baee-6c0d014e1710@gmail.com
2023-03-30 21:06:31 +02:00
Alvaro Herrera 589bb81649
Fix setrefs.c code for adjusting partPruneInfos
We were transferring partPruneInfos from PlannerInfo into PlannerGlobal
wrong, essentially relying on all of them being transferred, and
adjusting their list indexes based on that.  But apparently it's
possible that some of them are skipped, so that strategy leads to a
corrupted execution tree.  Instead, adjust each Append/MergeAppend's
partpruneinfo index as we copy from one list to the other, which seems
safer anyway.  This requires adjusting the RT offset of the RTE
referenced in each partPruneInfo ahead of actually adjusting the RTE
itself, which seems a bit too ad-hoc.

This problem was introduced by commit ec38694894.  However, it may be
that we no longer require the change introduced there, so perhaps we
should revert both the present commit and that one.

Problem noticed by sqlsmith.

Author: Amit Langote <amitlangote09@gmail.com
Discussion: https://postgr.es/m/CA+HiwqG6tbc2oadsbyyy24b2AL295XHQgyLRWghmA7u_SL1K8A@mail.gmail.com
2023-03-30 21:06:31 +02:00
Andres Freund f9054b7a7c Fix format code in fd.c debugging infrastructure
These were not sufficiently adjusted in 2d4f1ba6cf.
2023-03-30 10:26:10 -07:00
Andres Freund 558cf80387 bufmgr: Fix undefined behaviour with, unrealistically, large temp_buffers
Quoting Melanie:
> Since if buffer is INT_MAX, then the -(buffer + 1) version invokes
> undefined behavior while the -buffer - 1 version doesn't.

All other places were already using the correct version. I (Andres), copied
the code into more places in a patch. Melanie caught it in review, but to
prevent more people from copying the bad code, fix it. Even if it is a
theoretical issue.

We really ought to wrap these accesses in a helper function...

As this is a theoretical issue, don't backpatch.

Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_aW2SX_LWtwHgfnqYpBrunMLfE9PD6-ioPpkh92XH0qpg@mail.gmail.com
2023-03-30 10:26:10 -07:00
Tom Lane e9d202a149 Clean up role created in new subscription test.
This oversight broke repeated runs of "make installcheck".
2023-03-30 13:07:04 -04:00
Robert Haas c3afe8cf5a Add new predefined role pg_create_subscription.
This role can be granted to non-superusers to allow them to issue
CREATE SUBSCRIPTION. The non-superuser must additionally have CREATE
permissions on the database in which the subscription is to be
created.

Most forms of ALTER SUBSCRIPTION, including ALTER SUBSCRIPTION .. SKIP,
now require only that the role performing the operation own the
subscription, or inherit the privileges of the owner. However, to
use ALTER SUBSCRIPTION ... RENAME or ALTER SUBSCRIPTION ... OWNER TO,
you also need CREATE permission on the database. This is similar to
what we do for schemas. To change the owner of a schema, you must also
have permission to SET ROLE to the new owner, similar to what we do
for other object types.

Non-superusers are required to specify a password for authentication
and the remote side must use the password, similar to what is required
for postgres_fdw and dblink.  A superuser who wants a non-superuser to
own a subscription that does not rely on password authentication may
set the new password_required=false property on that subscription. A
non-superuser may not set password_required=false and may not modify a
subscription that already has password_required=false.

This new password_required subscription property works much like the
eponymous postgres_fdw property.  In both cases, the actual semantics
are that a password is not required if either (1) the property is set
to false or (2) the relevant user is the superuser.

Patch by me, reviewed by Andres Freund, Jeff Davis, Mark Dilger,
and Stephen Frost (but some of those people did not fully endorse
all of the decisions that the patch makes).

Discussion: http://postgr.es/m/CA+TgmoaDH=0Xj7OBiQnsHTKcF2c4L+=gzPBUKSJLh8zed2_+Dg@mail.gmail.com
2023-03-30 11:37:19 -04:00
Tom Lane b0e9e4d76c Avoid overflow in width_bucket_float8().
The original coding of this function paid little attention to the
possibility of overflow.  There were actually three different hazards:

1. The range from bound1 to bound2 could exceed DBL_MAX, which on
IEEE-compliant machines produces +Infinity in the subtraction.
At best we'd lose all precision in the result, and at worst
produce NaN due to dividing Inf/Inf.  The range can't exceed
twice DBL_MAX though, so we can fix this case by scaling all the
inputs by 0.5.

2. We computed count * (operand - bound1), which is also at risk of
float overflow, before dividing.  Safer is to do the division first,
producing a quotient that should be in [0,1), and even after allowing
for roundoff error can't be outside [0,1]; then multiplying by count
can't produce a result overflowing an int.  (width_bucket_numeric does
the multiplication first on the grounds that that improves accuracy of
its result, but I don't think that a similar argument can be made in
float arithmetic.)

3. If the division result does round to 1, and count is INT_MAX,
the final addition of 1 would overflow an int.  We took care
of that in the operand >= bound2 case but did not consider that
it could be possible in the main path.  Fix that by moving the
overflow-aware addition of 1 so it is done that way in all cases.

The fix for point 2 creates a possibility that values very close to
a bucket boundary will be rounded differently than they were before.
I'm not troubled by that for HEAD, but it is an argument against
putting this into the stable branches.  Given that the cases being
fixed here are fairly extreme and unlikely to be hit in normal use,
it seems best not to back-patch.

Mats Kindahl and Tom Lane

Discussion: https://postgr.es/m/17876-61f280d1601f978d@postgresql.org
2023-03-30 11:27:36 -04:00
Daniel Gustafsson 2fe7a6df94 Fix pointer cast for seed calculation on 32-bit systems
The fallback seed for when pg_strong_random cannot generate a high
quality seed mixes in the address of the conn object, but the cast
failed to take the word size into consideration. Fix by casting to
a uintptr_t instead. The seed calculation was added in 7f5b19817e.

The code as it stood generated the following warning on mamba and
lapwing in the buildfarm:

fe-connect.c: In function 'libpq_prng_init':
fe-connect.c:1048:11: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
1048 |  rseed = ((uint64) conn) ^
     |           ^

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/TYAPR01MB58665250EDCD551CCA9AD117F58E9@TYAPR01MB5866.jpnprd01.prod.outlook.com
2023-03-30 10:53:15 +02:00
Peter Eisentraut 261cf8962b Fix incorrect format placeholders 2023-03-30 08:33:43 +02:00
Amit Kapila da324d6cd4 Refactor pgoutput_change().
Instead of mostly-duplicate code for different operation
(insert/update/delete) types, write a common code to compute old/new
tuples, and check the row filter.

Author: Hou Zhijie
Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB5716194A47FFA8D91133687D94BF9@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-03-30 11:10:38 +05:30
David Rowley 902ecd3bd4 Fix outdated comments regarding TupleTableSlots
The tts_flag is named TTS_FLAG_SHOULDFREE, so use that instead of
TTS_SHOULDFREE, which is the name of the macro that checks for that flag.

Additionally, 4da597edf got rid of the TupleTableSlot.tts_tuple field but
forgot to update a comment which referenced that field.  Fix that.

Reported-by: Zhen Mingyang <zhenmingyang@yeah.net>
Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/1a96696c.9d3.187193989c3.Coremail.zhenmingyang@yeah.net
2023-03-30 16:37:03 +13:00
Daniel Gustafsson 7f5b19817e Support connection load balancing in libpq
This adds support for load balancing connections with libpq using a
connection parameter: load_balance_hosts=<string>. When setting the
param to random, hosts and addresses will be connected to in random
order. This then results in load balancing across these addresses and
hosts when multiple clients or frequent connection setups are used.

The randomization employed performs two levels of shuffling:

  1. The given hosts are randomly shuffled, before resolving them
     one-by-one.
  2. Once a host its addresses get resolved, the returned addresses
     are shuffled, before trying to connect to them one-by-one.

Author: Jelte Fennema <postgres@jeltef.nl>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Reviewed-by: Andrey Borodin <amborodin86@gmail.com>
Discussion: https://postgr.es/m/PR3PR83MB04768E2FF04818EEB2179949F7A69@PR3PR83MB0476.EURPRD83.prod.outlook.
2023-03-29 21:53:38 +02:00
Daniel Gustafsson 44d85ba5a3 Copy and store addrinfo in libpq-owned private memory
This refactors libpq to copy addrinfos returned by getaddrinfo to
memory owned by libpq such that future improvements can alter for
example the order of entries.

As a nice side effect of this refactor the mechanism for iteration
over addresses in PQconnectPoll is now identical to its iteration
over hosts.

Author: Jelte Fennema <postgres@jeltef.nl>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Reviewed-by: Andrey Borodin <amborodin86@gmail.com>
Discussion: https://postgr.es/m/PR3PR83MB04768E2FF04818EEB2179949F7A69@PR3PR83MB0476.EURPRD83.prod.outlook.com
2023-03-29 21:41:27 +02:00
Tom Lane 8e5eef50c5 Fix dereference of dangling pointer in GiST index buffering build.
gistBuildCallback tried to fetch the size of an index tuple that
might have already been freed by gistProcessEmptyingQueue.
While this seems to usually be harmless in production builds,
in principle it could result in a SIGSEGV, or more likely a bogus
value for indtuplesSize leading to poor page-split decisions later
in the build.

The memory management here is confusing and could stand to be
refactored, but for the moment it seems to be enough to fetch
the tuple size sooner.  AFAICT the indtuples[Size] totals aren't
used in between these places; even if they were, the updated
values shouldn't be any worse to use.  So just move the
incrementing of the totals up.

It's not very clear why our valgrind-using buildfarm animals
haven't noticed this problem, because the relevant code path
does seem to be exercised according to the code coverage report.
I think the reason that we didn't fix this bug after the first
report is that I'd wanted to try to understand that better.
However, now that it's been re-discovered let's just be pragmatic
and fix it already.

Original report by Alexander Lakhin (bug #16329),
later rediscovered by Egor Chindyaskin (bug #17874).

Patch by Alexander Lakhin (commentary by Pavel Borisov and me).
Back-patch to all supported branches.

Discussion: https://postgr.es/m/16329-7a6aa9b6fa1118a1@postgresql.org
Discussion: https://postgr.es/m/17874-63ca6c7ce42d2103@postgresql.org
2023-03-29 11:31:30 -04:00
Tom Lane 3aa961378b Add missing .gitignore entries.
Oversight in commit 7081ac46ac.
2023-03-29 09:16:53 -04:00
Tom Lane 58c9600a9f Remove empty function BufmgrCommit().
This function has been a no-op for over a decade.  Even if bufmgr
regains a need to be called during commit, it seems unlikely that
the most appropriate call points would be precisely here, so it's not
doing us much good as a placeholder either.  Now, removing it probably
doesn't save any noticeable number of cycles --- but the main call is
inside the commit critical section, and the less work done there the
better.

Matthias van de Meent

Discussion: https://postgr.es/m/CAEze2Wi1=tLKbxZnXzcD+8fYKyKqBtivVakLQC_mYBsP4Y8qVA@mail.gmail.com
2023-03-29 09:13:57 -04:00
Alvaro Herrera 7081ac46ac
SQL/JSON: add standard JSON constructor functions
This commit introduces the SQL/JSON standard-conforming constructors for
JSON types:

JSON_ARRAY()
JSON_ARRAYAGG()
JSON_OBJECT()
JSON_OBJECTAGG()

Most of the functionality was already present in PostgreSQL-specific
functions, but these include some new functionality such as the ability
to skip or include NULL values, and to allow duplicate keys or throw
error when they are found, as well as the standard specified syntax to
specify output type and format.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com
Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
2023-03-29 12:11:36 +02:00
Peter Eisentraut 38b7437b90 Fix some section numbers in information_schema.sql
Some of the section numbers that appeared multiple times were not
updated completely by previous changes d61d9aa750 and eb3a1376c9.
2023-03-29 11:34:37 +02:00
Peter Eisentraut 563f21cda8 Move definition of standard collations from initdb to pg_collation.dat
The standard collations "ucs_basic" and "unicode" were defined in
initdb, even though pg_collation.dat seems like the correct place for
them.  It seems this was just forgotten during various reorganizations
of initdb and pg_collation.dat/.h over time.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/08b58ecd-0d50-9395-ed51-dc8294e3fd2b%40enterprisedb.com
2023-03-29 09:45:21 +02:00
Peter Eisentraut 0d15afc875 Simplify useless 0L constants
In ancient times, these belonged to arguments or fields that were
actually of type long, but now they are not anymore, so this "L"
decoration is just confusing.  (Some other 0L and other "L" constants
remain, where they are actually associated with a long type.)
2023-03-29 08:25:12 +02:00
Amit Kapila 062a844424 Avoid syncing data twice for the 'publish_via_partition_root' option.
When there are multiple publications for a subscription and one of those
publishes via the parent table by using publish_via_partition_root and the
other one directly publishes the child table, we end up copying the same
data twice during initial synchronization. The reason for this was that we
get both the parent and child tables from the publisher and try to copy
the data for both of them.

This patch extends the function pg_get_publication_tables() to take a
publication list as its input parameter. This allows us to exclude a
partition table whose ancestor is published by the same publication list.

This problem does exist in back-branches but we decide to fix it there in
a separate commit if required. The fix for back-branches requires quite
complicated changes to fetch the required table information from the
publisher as we can't update the function pg_get_publication_tables() in
back-branches. We are not sure whether we want to deviate and complicate
the code in back-branches for this problem as there are no field reports
yet.

Author: Wang wei
Reviewed-by: Peter Smith, Jacob Champion, Kuroda Hayato, Vignesh C, Osumi Takamichi, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB57167F45D481F78CDC5986F794B99@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-03-29 10:46:58 +05:30
Tomas Vondra 00d9dcf5be pg_dump: Fix gzip compression of empty data
The pg_dump Compressor API has three basic callbacks - Allocate, Write
and End.  The gzip implementation (since e9960732a) wrongly assumed the
Write function would always be called, and deferred the initialization
of the internal compression system until the first such call.  But when
there's no data to compress (e.g. for empty LO), this would result in
not finalizing the compression state (because it was not actually
initialized), producing invalid dump.

Fixed by initializing the internal compression system in the Allocate
call, whenever the caller provides the Write.  For decompression the
state is not needed, so we leave the private_data member unpopulated.

Introduces a pg_dump TAP test compressing an empty large object.

This also rearranges the functions to their original order, to make
diffs against older code simpler to understand.  Finally, replace an
unreachable pg_fatal() with a simple assert check.

Reported-by: Justin Pryzby
Author: Justin Pryzby, Georgios Kokolatos
Reviewed-by: Georgios Kokolatos, Tomas Vondra

https://postgr.es/m/20230228235834.GC30529%40telsasoft.com
2023-03-29 02:34:48 +02:00
Jeff Davis 1671f990dd Validate ICU locales.
For ICU collations, ensure that the locale's language exists in ICU,
and that the locale can be opened.

Basic validation helps avoid minor mistakes and misspellings, which
often fall back to the root locale instead of the intended
locale. It's even more important to avoid such mistakes in ICU
versions 54 and earlier, where the same (misspelled) locale string
could fall back to different locales depending on the environment.

Discussion: https://postgr.es/m/11b1eeb7e7667fdd4178497aeb796c48d26e69b9.camel@j-davis.com
Discussion: https://postgr.es/m/df2efad0cae7c65180df8e5ebb709e5eb4f2a82b.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-28 16:34:29 -07:00
Tom Lane 326a33a289 Fix corner-case planner failure for MERGE.
MERGE planning could fail with "variable not found in subplan target
list" if the target table is partitioned and all its partitions are
excluded at plan time, or in the case where it has no partitions but
used to have some.  This happened because distribute_row_identity_vars
thought it didn't need to make the target table's reltarget list
fully valid; but if we generate a join plan then that is required
because the dummy Result node's tlist will be made from the reltarget.

The same logic appears in distribute_row_identity_vars in v14,
but AFAICS the problem is unreachable in that branch for lack of
MERGE.  In other updating statements, the target table is always
inner-joined to any other tables, so if the target is known dummy
then the whole plan reduces to dummy, so no join nodes are created.
So I'll refrain from back-patching this code change to v14 for now.

Per report from Alvaro Herrera.

Discussion: https://postgr.es/m/20230328112248.6as34mlx5sr4kltg@alvherre.pgsql
2023-03-28 11:39:24 -04:00
Jeff Davis c1f1c1f87f initdb: emit message when using default ICU locale.
Helpful to determine from test logs whether the locale came from the
environment or a command-line option.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-28 08:24:43 -07:00
Jeff Davis f8ca22295e initdb: replace check_icu_locale() with default_icu_locale().
The extra checks done in check_icu_locale() are not necessary. An
existing comment already pointed out that the checks would be done
during post-bootstrap initialization, when the locale is opened by the
backend. This was a mistake in commit 27b62377b4.

This commit creates a simpler function default_icu_locale() to just
return the locale of the default collator.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-28 08:24:21 -07:00
Jeff Davis 8b3eb0c584 Fix error inconsistency in older ICU versions.
To support older ICU versions, we rely on
icu_set_collation_attributes() to do error checking that is handled
directly by ucol_open() in newer ICU versions. Commit 3b50275b12
introduced a slight inconsistency, where the error report includes the
fixed-up locale string, rather than the locale string passed to
pg_ucol_open().

Refactor slightly so that pg_ucol_open() handles the errors from both
ucol_open() and icu_set_collation_attributes(), making it easier to
see any differences between the error reports. It also makes
pg_ucol_open() responsible for closing the UCollator on error, which
seems like the right place.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-28 08:24:18 -07:00
Peter Eisentraut 90189eefc1 Save a few bytes in pg_attribute
Change the columns attndims, attstattarget, and attinhcount from int32
to int16, and reorder a bit.  This saves some space (currently 4
bytes) in pg_attribute and tuple descriptors, which translates into
small performance benefits and/or room for new columns in pg_attribute
needed by future features.

attndims and attinhcount are never realistically used with values
larger than int16.  Just to be sure, add some overflow checks.
attstattarget is currently limited explicitly to 10000.

For consistency, pg_constraint.coninhcount is also changed like
attinhcount.

Discussion: https://www.postgresql.org/message-id/flat/d07ffc2b-e0e8-77f7-38fb-be921dff71af%40enterprisedb.com
2023-03-28 10:05:56 +02:00
Michael Paquier 4efd0bf7ea Generate a few more functions of pgstatfuncs.c with macros
Two new macros are added with their respective functions switched to
use them.  These are for functions with millisecond stats, with and
without "xact" in their names (for the stats that can be tracked within
a transaction).

While on it, prefix the macro for float8 on database entries with "_MS",
as it does a us->ms conversion, based on a suggestion from Andres
Freund.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/6e2efb4f-6fd0-807e-f6bf-94207db8183a@gmail.com
2023-03-28 07:35:33 +09:00
Tom Lane a3c9d35ae1 Reject attempts to alter composite types used in indexes.
find_composite_type_dependencies() ignored indexes, which is a poor
decision because an expression index could have a stored column of
a composite (or other container) type even when the underlying table
does not.  Teach it to detect such cases and error out.  We have to
work a bit harder than for other relations because the pg_depend entry
won't identify the specific index column of concern, but it's not much
new code.

This does not address bug #17872's original complaint that dropping
a column in such a type might lead to violations of the uniqueness
property that a unique index is supposed to ensure.  That seems of
much less concern to me because it won't lead to crashes.

Per bug #17872 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17872-d0fbb799dc3fd85d@postgresql.org
2023-03-27 15:04:15 -04:00
Robert Haas c87aff065c amcheck: Generalize one of the recently-added update chain checks.
Commit bbc1376b39 checked that if
a redirected line pointer pointed to a tuple, the tuple should be
marked both HEAP_ONLY_TUPLE and HEAP_UPDATED. But Andres Freund
pointed out that *any* tuple that is marked HEAP_ONLY_TUPLE should
be marked HEAP_UPDATED, not just one that is the target of a
redirected line pointer. Do that instead.

To see why this is better, consider a redirect line pointer A
which points to a heap-only tuple B which points (via CTID)
to another heap-only tuple C. With the old code, we'd complain
if B was not marked HEAP_UPDATED, but with this change, we'll
complain if either B or C is not marked HEAP_UPDATED.

(Note that, with or without this commit, if either B or C were
not marked HEAP_ONLY_TUPLE, we would also complain about that.)

Discussion: http://postgr.es/m/CA%2BTgmobLypZx%3DcOH%2ByY1GZmCruaoucHm77A6y_-Bo%3Dh-_3H28g%40mail.gmail.com
2023-03-27 13:37:16 -04:00
Daniel Gustafsson b577743000 Make SCRAM iteration count configurable
Replace the hardcoded value with a GUC such that the iteration
count can be raised in order to increase protection against
brute-force attacks.  The hardcoded value for SCRAM iteration
count was defined to be 4096, which is taken from RFC 7677, so
set the default for the GUC to 4096 to match.  In RFC 7677 the
recommendation is at least 15000 iterations but 4096 is listed
as a SHOULD requirement given that it's estimated to yield a
0.5s processing time on a mobile handset of the time of RFC
writing (late 2015).

Raising the iteration count of SCRAM will make stored passwords
more resilient to brute-force attacks at a higher computational
cost during connection establishment.  Lowering the count will
reduce computational overhead during connections at the tradeoff
of reducing strength against brute-force attacks.

There are however platforms where even a modest iteration count
yields a too high computational overhead, with weaker password
encryption schemes chosen as a result.  In these situations,
SCRAM with a very low iteration count still gives benefits over
weaker schemes like md5, so we allow the iteration count to be
set to one at the low end.

The new GUC is intentionally generically named such that it can
be made to support future SCRAM standards should they emerge.
At that point the value can be made into key:value pairs with
an undefined key as a default which will be backwards compatible
with this.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>
Discussion: https://postgr.es/m/F72E7BC7-189F-4B17-BF47-9735EB72C364@yesql.se
2023-03-27 09:46:29 +02:00
Michael Paquier 850f4b4c8c Generate pg_stat_get_xact*() functions for relations using macros
This change replaces seven functions definitions by macros.

This is the same idea as 8018ffb or 83a1a1b, taking advantage of the
variable rename done in 8089517 for relation entries.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/631e3084-c5d9-8463-7540-fcff4674caa5@gmail.com
2023-03-27 09:57:41 +09:00
Tom Lane 554841699f Fix oversights in array manipulation.
The nested-arrays code path in ExecEvalArrayExpr() used palloc to
allocate the result array, whereas every other array-creating function
has used palloc0 since 18c0b4ecc.  This mostly works, but unused bits
past the end of the nulls bitmap may end up undefined.  That causes
valgrind complaints with -DWRITE_READ_PARSE_PLAN_TREES, and could
cause planner misbehavior as cited in 18c0b4ecc.  There seems no very
good reason why we should strive to avoid palloc0 in just this one case,
so fix it the easy way with s/palloc/palloc0/.

While looking at that I noted that we also failed to check for overflow
of "nbytes" and "nitems" while summing the sizes of the sub-arrays,
potentially allowing a crash due to undersized output allocation.
For "nbytes", follow the policy used by other array-munging code of
checking for overflow after each addition.  (As elsewhere, the last
addition of the array's overhead space doesn't need an extra check,
since palloc itself will catch a value between 1Gb and 2Gb.)
For "nitems", there's no very good reason to sum the inputs at all,
since we can perfectly well use ArrayGetNItems' result instead of
ignoring it.

Per discussion of this bug, also remove redundant zeroing of the
nulls bitmap in array_set_element and array_set_slice.

Patch by Alexander Lakhin and myself, per bug #17858 from Alexander
Lakhin; thanks also to Richard Guo.  These bugs are a dozen years old,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/17858-8fd287fd3663d051@postgresql.org
2023-03-26 13:41:06 -04:00
Daniel Gustafsson d435f15fff Add SysCacheGetAttrNotNull for guaranteed not-null attrs
When extracting an attr from a cached tuple in the syscache with
SysCacheGetAttr the isnull parameter must be checked in case the
attr cannot be NULL.  For cases when this is known beforehand, a
wrapper is introduced which perform the errorhandling internally
on behalf of the caller, invoking an elog in case of a NULL attr.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/AD76405E-DB45-46B6-941F-17B1EB3A9076@yesql.se
2023-03-25 22:49:33 +01:00
Noah Misch e33967b13b Comment on expectations for AutoVacuumWorkItem handlers.
This might prevent a repeat of the brin_summarize_range() vulnerability
that commit a117cebd63 fixed.
2023-03-25 13:00:27 -07:00
Tom Lane 27f5c712b2 Fix CREATE INDEX progress reporting for multi-level partitioning.
The "partitions_total" and "partitions_done" fields were updated
as though the current level of partitioning was the only one.
In multi-level cases, not only could partitions_total change
over the course of the command, but partitions_done could go
backwards or exceed the currently-reported partitions_total.

Fix by setting partitions_total to the total number of direct
and indirect children once at command start, and then just
incrementing partitions_done at appropriate points.  Invent
a new progress monitoring function "pgstat_progress_incr_param"
to simplify doing the latter.  We can avoid adding cost for the
former when doing CREATE INDEX, because ProcessUtility already
enumerates the children and it's pretty easy to pass the count
down to DefineIndex.  In principle the same could be done in
ALTER TABLE, but that's structurally difficult; for now, just
eat the cost of an extra find_all_inheritors scan in that case.

Ilya Gladyshev and Justin Pryzby

Discussion: https://postgr.es/m/a15f904a70924ffa4ca25c3c744cff31e0e6e143.camel@gmail.com
2023-03-25 15:34:03 -04:00
Jeff Davis 81a6d57e33 Fix abbreviated keys bug introduced in d87d548cd0.
Discussion: http://postgr.es/m/CAMkU=1z17XJatF-rMCY3Cjqcxer-Kyn57x6h3OSCpJ0LpAp0ig@mail.gmail.com
Reported-by: Jeff Janes
2023-03-25 11:08:32 -07:00
Tom Lane 3c05284d83 Invent GENERIC_PLAN option for EXPLAIN.
This provides a very simple way to see the generic plan for a
parameterized query.  Without this, it's necessary to define
a prepared statement and temporarily change plan_cache_mode,
which is a bit tedious.

One thing that's a bit of a hack perhaps is that we disable
execution-time partition pruning when the GENERIC_PLAN option
is given.  That's because the pruning code may attempt to
fetch the value of one of the parameters, which would fail.

Laurenz Albe, reviewed by Julien Rouhaud, Christoph Berg,
Michel Pelletier, Jim Jones, and myself

Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel@cybertec.at
2023-03-24 17:07:22 -04:00
Jeff Davis a03b3b6b4a Avoid potential UCollator leak for older ICU versions.
ICU versions 53 and earlier rely on icu_set_collation_attributes() to
process the attributes in the locale string. Avoid leaking the
already-opened UCollator object if an error is encountered.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-24 08:48:03 -07:00
Jeff Davis 9a24289915 pg_locale.c: change ereport() to elog().
Discussion: https://postgr.es/m/73553013-3926-0f34-0fb8-f37909fe4902@enterprisedb.com
Reported-by: Peter Eisentraut
2023-03-24 08:47:42 -07:00
Daniel Gustafsson a04761ac77 Fix typo in header comment
Commit 4c04be9b0 accidentally left off the _id portion of the function
name in the header comment.

Author: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3LP+ytnAXSzR=yiEaQrde+iCybMHsuPn9n=UN3puV_1tw@mail.gmail.com
2023-03-24 09:03:31 +01:00
Peter Eisentraut a9bc04b211 Fix incorrect format placeholders
The fields of NLSVERSIONINFOEX are of type DWORD, which is unsigned
long, so the results of the computations being printed are also of
type unsigned long.
2023-03-24 07:21:40 +01:00
Michael Paquier 36f40ce2dc libpq: Add sslcertmode option to control client certificates
The sslcertmode option controls whether the server is allowed and/or
required to request a certificate from the client.  There are three
modes:
- "allow" is the default and follows the current behavior, where a
configured client certificate is sent if the server requests one
(via one of its default locations or sslcert).  With the current
implementation, will happen whenever TLS is negotiated.
- "disable" causes the client to refuse to send a client certificate
even if sslcert is configured or if a client certificate is available in
one of its default locations.
- "require" causes the client to fail if a client certificate is never
sent and the server opens a connection anyway.  This doesn't add any
additional security, since there is no guarantee that the server is
validating the certificate correctly, but it may helpful to troubleshoot
more complicated TLS setups.

sslcertmode=require requires SSL_CTX_set_cert_cb(), available since
OpenSSL 1.0.2.  Note that LibreSSL does not include it.

Using a connection parameter different than require_auth has come up as
the simplest design because certificate authentication does not rely
directly on any of the AUTH_REQ_* codes, and one may want to require a
certificate to be sent in combination of a given authentication method,
like SCRAM-SHA-256.

TAP tests are added in src/test/ssl/, some of them relying on sslinfo to
check if a certificate has been set.  These are compatible across all
the versions of OpenSSL supported on HEAD (currently down to 1.0.1).

Author: Jacob Champion
Reviewed-by: Aleksander Alekseev, Peter Eisentraut, David G. Johnston,
Michael Paquier
Discussion: https://postgr.es/m/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com
2023-03-24 13:34:26 +09:00
Andres Freund e522049f23 meson: add install-{quiet, world} targets
To define our own install target, we need dependencies on the i18n targets,
which we did not collect so far.

Discussion: https://postgr.es/m/3fc3bb9b-f7f8-d442-35c1-ec82280c564a@enterprisedb.com
2023-03-23 21:20:18 -07:00
Andres Freund 614c5f5f52 meson: make install_test_files more generic, rename to install_files
Now it supports installing directories and directory contents as well. This
will be used in a subsequent patch to install documentation.

Discussion: https://postgr.es/m/3fc3bb9b-f7f8-d442-35c1-ec82280c564a@enterprisedb.com
2023-03-23 21:20:18 -07:00
Michael Paquier bcaa1fafc8 Rewrite error message related to sslmode in libpq
The same error message will be used for a different option, to be
introduced in a separate patch.  Reshaping the error message as done
here saves in translation.

Extracted from a larger patch by the same author.

Author: Jacob Champion
Discussion: https://postgr.es/m/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com
2023-03-24 10:14:33 +09:00
Michael Paquier 8089517ab8 Rename fields in pgstat structures for functions and relations
This commit renames the members of a few pgstat structures related to
functions and relations, by respectively removing their prefix "f_" and
"t_".  The statistics for functions and relations and handled in their
own file, and pgstatfuncs.c associates each field in a structure
variable named based on the object type handled, so no information is
lost with this rename.

This will help with some of the refactoring aimed for pgstatfuncs.c, as
this makes more consistent the field names with the SQL functions
retrieving them.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier, Melanie Plageman
Discussion: https://postgr.es/m/9142f62a-a422-145c-bde0-b5bc498a4ada@gmail.com
2023-03-24 08:46:29 +09:00
Tom Lane 11a0a8b529 Implement find_my_exec()'s path normalization using realpath(3).
Replace the symlink-chasing logic in find_my_exec with realpath(3),
which has been required by POSIX since SUSv2.  (Windows lacks
realpath(), but there we can use _fullpath() which is functionally
equivalent.)  The main benefit of this is that -- on all modern
platforms at least -- realpath() avoids the chdir() shenanigans
we used to perform while interpreting symlinks.  That had various
corner-case failure modes so it's good to get rid of it.

There is still ongoing discussion about whether we could skip the
replacement of symlinks in some cases, but that's really matter
for a separate patch.  Meanwhile I want to push this before we get
too close to feature freeze, so that we can find out if there are
showstopper portability issues.

Discussion: https://postgr.es/m/797232.1662075573@sss.pgh.pa.us
2023-03-23 18:17:49 -04:00
Peter Geoghegan ae4fdde135 Count updates that move row to a new page.
Add pgstat counter to track row updates that result in the successor
version going to a new heap page, leaving behind an original version
whose t_ctid points to the new version.  The current count is shown by
the n_tup_newpage_upd column of each of the pg_stat_*_tables views.

The new n_tup_newpage_upd column complements the existing n_tup_hot_upd
and n_tup_upd columns.  Tables that have high n_tup_newpage_upd values
(relative to n_tup_upd) are good candidates for tuning heap fillfactor.

Corey Huinker, with small tweaks by me.

Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CADkLM=ded21M9iZ36hHm-vj2rE2d=zcKpUQMds__Xm2pxLfHKA@mail.gmail.com
2023-03-23 11:16:17 -07:00
Jeff Davis 3b50275b12 Handle the "und" locale in ICU versions 54 and older.
The "und" locale is an alternative spelling of the root locale, but it
was not recognized until ICU 55. To maintain common behavior across
all supported ICU versions, check for "und" and replace with "root"
before opening.

Previously, the lack of support for "und" was dangerous, because
versions 54 and older fall back to the environment when a locale is
not found. If the user specified "und" for the language (which is
expected and documented), it could not only resolve to the wrong
collator, but it could unexpectedly change (which could lead to
corrupt indexes).

This effectively reverts commit d72900bded, which worked around the
problem for the built-in "unicode" collation, and is no longer
necessary.

Discussion: https://postgr.es/m/60da0cecfb512a78b8666b31631a636215d8ce73.camel@j-davis.com
Discussion: https://postgr.es/m/0c6fa66f2753217d2a40480a96bd2ccf023536a1.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-23 10:08:27 -07:00
Tom Lane dccef0f2f8 Add missing "-I." flag when building pg_bsd_indent.
This is evidently not required by most compilers, but buildfarm
member fairywren is unhappy without it.  It looks like the meson
infrastructure has this right already.
2023-03-23 13:01:44 -04:00
Tomas Vondra d0160ca11e Minor comment improvements for compress_lz4
Author: Tomas Vondra
Reviewed-by: Georgios Kokolatos, Justin Pryzby
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com
2023-03-23 17:55:52 +01:00
Tomas Vondra f081a48f9a Unify buffer sizes in pg_dump compression API
Prior to the introduction of the compression API in e9960732a9, pg_dump
would use the ZLIB_IN_SIZE/ZLIB_OUT_SIZE to size input/output buffers.
Commit 0da243fed0 introduced similar constants for LZ4, but while gzip
defined both buffers to be 4kB, LZ4 used 4kB and 16kB without any clear
reasoning why that's desirable.

Furthermore, parts of the code unaware of which compression is used
(e.g. pg_backup_directory.c) continued to use ZLIB_OUT_SIZE directly.

Simplify by replacing the various constants with DEFAULT_IO_BUFFER_SIZE,
set to 4kB. The compression implementations still have an option to use
a custom value, but considering 4kB was fine for 20+ years, I find that
unlikely (and we'd probably just increase the default buffer size).

Author: Georgios Kokolatos
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com
2023-03-23 17:55:52 +01:00
Tomas Vondra d3b57755e6 Improve type handling in pg_dump's compress file API
After 0da243fed0 got committed, we've received a report about a compiler
warning, related to the new LZ4File_gets() function:

  compress_lz4.c: In function 'LZ4File_gets':
  compress_lz4.c:492:19: warning: comparison of unsigned expression in
                                  '< 0' is always false [-Wtype-limits]
    492 |         if (dsize < 0)

The reason is very simple - dsize is declared as size_t, which is an
unsigned integer, and thus the check is pointless and we might fail to
notice an error in some cases (or fail in a strange way a bit later).

The warning could have been silenced by simply changing the type, but we
realized the API mostly assumes all the libraries use the same types and
report errors the same way (e.g. by returning 0 and/or negative value).

But we can't make this assumption - the gzip/lz4 libraries already
disagree on some of this, and even if they did a library added in the
future might not.

The right solution is to define what the API does, and translate the
library-specific behavior in consistent way (so that the internal errors
are not exposed to users of our compression API). So this adjusts the
data types in a couple places, so that we don't miss library errors, and
simplifies and unifies the error reporting to simply return true/false
(instead of e.g. size_t).

While at it, make sure LZ4File_open_write() does not clobber errno in
case open_func() fails.

Author: Georgios Kokolatos
Reported-by: Alexander Lakhin
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com
2023-03-23 17:55:17 +01:00
Jeff Davis a326aac8f1 Wrap ICU ucol_open().
Hide details of supporting older ICU versions in a wrapper
function. The current code only needs to handle
icu_set_collation_attributes(), but a subsequent commit will add
additional version-specific code.

Discussion: https://postgr.es/m/7ee414ad-deb5-1144-8a0e-b34ae3b71cd5@enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-23 09:15:25 -07:00
Amit Kapila adedf54e65 Ignore generated columns during apply of update/delete.
We fail to apply updates and deletes when the REPLICA IDENTITY FULL is
used for the table having generated columns. We didn't use to ignore
generated columns while doing tuple comparison among the tuples from
the publisher and subscriber during apply of updates and deletes.

Author: Onder Kalaci
Reviewed-by: Shi yu, Amit Kapila
Backpatch-through: 12
Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
2023-03-23 11:58:36 +05:30
Amit Kapila ecb696527c Allow logical replication to copy tables in binary format.
This patch allows copying tables in the binary format during table
synchronization when the binary option for a subscription is enabled.
Previously, tables are copied in text format even if the subscription is
created with the binary option enabled. Copying tables in binary format
may reduce the time spent depending on column types.

A binary copy for initial table synchronization is supported only when
both publisher and subscriber are v16 or later.

Author: Melih Mutlu
Reviewed-by: Peter Smith, Shi yu, Euler Taveira, Vignesh C,  Kuroda Hayato, Osumi Takamichi, Bharath Rupireddy, Hou Zhijie
Discussion: https://postgr.es/m/CAGPVpCQvAziCLknEnygY0v1-KBtg%2BOm-9JHJYZOnNPKFJPompw%40mail.gmail.com
2023-03-23 08:45:51 +05:30
Thomas Munro 8fba928fd7 Improve the naming of Parallel Hash Join phases.
* Commit 3048898e dropped -ING from PHJ wait event names.  Update the
  corresponding barrier phases names to match.

* Rename the "DONE" phases to "FREE".  That's symmetrical with
  "ALLOCATE", and names the activity that actually happens in that phase
  (as we do for the other phases) rather than a state.  The bug fixed by
  commit 8d578b9b might have been more obvious with this name.

* Rename the batch/bucket growth barriers' "ALLOCATE" phases to
  "REALLOCATE", a better description of what they do.

* Update the high level comments about phases to highlight phases
  are executed by a single process with an asterisk (mostly memory
  management phases).

No behavior change, as this is just improving internal identifiers.  The
only user-visible sign of this is that a couple of wait events' display
names change from "...Allocate" to "...Reallocate" in pg_stat_activity,
to stay in sync with the internal names.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BMDpwF2Eo2LAvzd%3DpOh81wUTsrwU1uAwR-v6OGBB6%2B7g%40mail.gmail.com
2023-03-23 13:14:25 +13:00
Alexander Korotkov 11470f544e Allow locking updated tuples in tuple_update() and tuple_delete()
Currently, in read committed transaction isolation mode (default), we have the
following sequence of actions when tuple_update()/tuple_delete() finds
the tuple updated by concurrent transaction.

1. Attempt to update/delete tuple with tuple_update()/tuple_delete(), which
   returns TM_Updated.
2. Lock tuple with tuple_lock().
3. Re-evaluate plan qual (recheck if we still need to update/delete and
   calculate the new tuple for update).
4. Second attempt to update/delete tuple with tuple_update()/tuple_delete().
   This attempt should be successful, since the tuple was previously locked.

This patch eliminates step 2 by taking the lock during first
tuple_update()/tuple_delete() call.  Heap table access method saves some
efforts by checking the updated tuple once instead of twice.  Future
undo-based table access methods, which will start from the latest row version,
can immediately place a lock there.

The code in nodeModifyTable.c is simplified by removing the nested switch/case.

Discussion: https://postgr.es/m/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg%40mail.gmail.com
Reviewed-by: Aleksander Alekseev, Pavel Borisov, Vignesh C, Mason Sharp
Reviewed-by: Andres Freund, Chris Travers
2023-03-23 00:26:59 +03:00
Alexander Korotkov 764da7710b Evade extra table_tuple_fetch_row_version() in ExecUpdate()/ExecDelete()
When we lock tuple using table_tuple_lock() then we at the same time fetch
the locked tuple to the slot.  In this case we can skip extra
table_tuple_fetch_row_version() thank to we've already fetched the 'old' tuple
and nobody can change it concurrently since it's locked.

Discussion: https://postgr.es/m/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg%40mail.gmail.com
Reviewed-by: Aleksander Alekseev, Pavel Borisov, Vignesh C, Mason Sharp
Reviewed-by: Andres Freund, Chris Travers
2023-03-23 00:26:59 +03:00
Tom Lane c75a623304 Fix new test case to work on (some?) big-endian architectures.
Use of pack("L") gets around the basic endian problem, but it doesn't
deal with the fact that the order of the bitfields within the struct
may differ.  This patch fixes it to work with gcc on NetBSD/macppc,
but I wonder whether that will be enough --- in principle, there
could be four different combinations of bitpatterns needed here.

Discussion: https://postgr.es/m/1650745.1679513221@sss.pgh.pa.us
2023-03-22 17:14:21 -04:00
Tom Lane b48af6d174 Fix initdb's handling of min_wal_size and max_wal_size.
In commit 3e51b278d, I misinterpreted the coding in setup_config()
as setting min_wal_size and max_wal_size to compile-time-constant
values.  But it's not: there's a hidden dependency on --wal-segsize.
Therefore leaving these variables commented out is the wrong thing.
Per report from Andres Freund.

Discussion: https://postgr.es/m/20230322200751.jvfvsuuhd3hgm6vv@awork3.anarazel.de
2023-03-22 16:37:41 -04:00
Tom Lane 4fe2aa7656 Reduce memory leakage in initdb.
While testing commit 3e51b278d, I noted that initdb leaks about a
megabyte worth of data due to the sloppy bookkeeping in its
string-manipulating code.  That's not a huge amount on modern machines,
but it's still kind of annoying, and it's easy to fix by recognizing
that we might as well treat these arrays of strings as
modifiable-in-place.  There's no caller that cares about preserving
the old state of the array after replace_token or replace_guc_value.

With this fix, valgrind sees only a few hundred bytes leaked during
an initdb run.

Discussion: https://postgr.es/m/2844176.1674681919@sss.pgh.pa.us
2023-03-22 14:28:45 -04:00
Tom Lane 3e51b278db Add "-c name=value" switch to initdb.
This option, or its long form --set, sets the GUC "name" to "value".
The setting applies in the bootstrap and standalone servers run by
initdb, and is also written into the generated postgresql.conf.

This can save an extra editing step when creating a new cluster,
but the real use-case is for coping with situations where the
bootstrap server fails to start due to environmental issues;
for example, if it's necessary to force huge_pages to off.

Discussion: https://postgr.es/m/2844176.1674681919@sss.pgh.pa.us
2023-03-22 13:49:05 -04:00
Andres Freund 5df319f3d5 Fix memory leak and inefficiency in CREATE DATABASE ... STRATEGY WAL_LOG
RelationCopyStorageUsingBuffer() did not free the strategies used to access
the source / target relation. They memory was released at the end of the
transaction, but when using a template database with a lot of relations, the
temporary leak can become big prohibitively big.

RelationCopyStorageUsingBuffer() acquired the buffer for the target relation
with RBM_NORMAL, therefore requiring a read of a block guaranteed to be
zero. Use RBM_ZERO_AND_LOCK instead.

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20230321070113.o2vqqxogjykwgfrr@awork3.anarazel.de
Backpatch: 15-, where STRATEGY WAL_LOG was introduced
2023-03-22 09:20:34 -07:00
Robert Haas bbc1376b39 Teach verify_heapam() to validate update chains within a page.
Prior to this commit, we only consider each tuple or line pointer
on the page in isolation, but now we can do some validation of a line
pointer against its successor. For example, a redirect line pointer
shouldn't point to another redirect line pointer, and if a tuple
is HOT-updated, the result should be a heap-only tuple.

Himanshu Upadhyaya and Robert Haas, reviewed by Aleksander Alekseev,
Andres Freund, and Peter Geoghegan.
2023-03-22 08:48:54 -04:00
Michael Paquier 88199b9d5f Fix a couple of typos
PL/pgSQL was misspelled in a few places, so fix these.

Author: Zhang Mingli
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/1bd41572-9cd9-465e-9f59-ee45385e51b4@Spark
2023-03-22 08:44:59 +09:00
Jeff Davis 869650fa86 Support language tags in older ICU versions (53 and earlier).
By calling uloc_canonicalize() before parsing the attributes, the
existing locale attribute parsing logic works on language tags as
well.

Fix a small memory leak, too.

Discussion: http://postgr.es/m/60da0cecfb512a78b8666b31631a636215d8ce73.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-21 16:12:37 -07:00
Michael Paquier e8e1f96c49 Fix make maintainer-clean with queryjumblefuncs.*.c files in src/backend/nodes/
The files generated by gen_node_support.pl for query jumbling
(queryjumblefuncs.funcs.c and queryjumblefuncs.switch.c) were not being
removed on make maintainer-clean (they need to remain around after a
simple "clean").  This commit makes the operation consistent with the
copy, equal, out and read files.

While on it, update a comment in the nodes'README where a reference to
queryjumblefuncs.funcs.c was missing.

Reported-by: Nathan Bossart
Reviewed-by: Richard Guo, Daniel Gustafsson
Discussion: https://postgr.es/m/ZBgAfTHcL6W7zGdW@paquier.xyz
2023-03-22 07:51:16 +09:00
David Rowley b94c671648 Fix incorrect comment in preptlist.c
Author: Etsuro Fujita
Reviewed-by: Richard Guo, Tom Lane
Discussion: https://postgr.es/m/CAPmGK15V8dcVxL9vcgVWPHV6pw1qzM42LzoUkQDB7-e+1onnJw@mail.gmail.com
2023-03-22 08:58:13 +13:00
David Rowley f48b4f892f Correct Memoize's estimated cache hit ratio calculation
As demonstrated by David Johnston, the Memoize cache hit ratio calculation
wasn't quite correct.

This change only affects the estimated hit ratio when the estimated number
of entries to cache is estimated not to fit inside the cache.  For
example, if we expect 2000 distinct cache key values and only expect to be
able to cache 1000 of those at once due to memory constraints, with an
estimate of 10000 calls, if we could store all entries then the hit ratio
should be 80% to account for the first 2000 of the 10000 calls to be a
cache miss due to the value not being cached yet.  If we can only store
1000 entries for each of the 2000 distinct possible values at once then
the 80% should be reduced by half to make the final estimate of 40%.
Previously, the calculation would have produced an estimated hit ratio of
30%, which wasn't correct.

Apply to master only so as not to destabilize plans in the back branches.

Reported-by: David G. Johnston
Discussion: https://postgr.es/m/CAKFQuwZEmcNk3YQo2Xj4EDUOdY6qakad31rOD1Vc4q1_s68-Ew@mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvrV44LwiF4W_qf_RpbGYWSgp1kF=cZr+kTRRaALUfmXqw@mail.gmail.com
2023-03-22 08:44:54 +13:00
Tom Lane b0d8f2d983 Add SHELL_ERROR and SHELL_EXIT_CODE magic variables to psql.
These are set after a \! command or a backtick substitution.
SHELL_ERROR is just "true" for error (nonzero exit status) or "false"
for success, while SHELL_EXIT_CODE records the actual exit status
following standard shell/system(3) conventions.

Corey Huinker, reviewed by Maxim Orlov and myself

Discussion: https://postgr.es/m/CADkLM=cWao2x2f+UDw15W1JkVFr_bsxfstw=NGea7r9m4j-7rQ@mail.gmail.com
2023-03-21 13:03:56 -04:00
Daniel Gustafsson 106f26a849 Avoid using atooid for numerical comparisons which arent Oids
The check for the number of roles in the target cluster for an upgrade
selects the existing roles and performs a COUNT(*) over the result.  A
value of one is the expected query result value indicating that only
the install user is present in the new cluster. The result was converted
with the function for converting a string containing an Oid into a numeric,
which avoids potential overflow but makes the code less readable since
it's not actually an Oid at all.

Discussion: https://postgr.es/m/41AB5F1F-4389-4B25-9668-5C430375836C@yesql.se
2023-03-21 12:57:21 +01:00
Peter Eisentraut 4c8044c044 pg_waldump: Allow hexadecimal values for -t/--timeline option
This makes it easier to specify values taken directly from WAL file
names.

The option parsing is arranged in the style of option_parse_int() (but
we need to parse unsigned int), to allow future refactoring in the
same manner.

Reviewed-by: Sébastien Lardière <sebastien@lardiere.net>
Discussion: https://www.postgresql.org/message-id/flat/8fef346e-2541-76c3-d768-6536ae052993@lardiere.net
2023-03-21 08:05:23 +01:00
Amit Kapila b797def595 Ignore dropped columns during apply of update/delete.
We fail to apply updates and deletes when the REPLICA IDENTITY FULL is
used for the table having dropped columns. We didn't use to ignore dropped
columns while doing tuple comparison among the tuples from the publisher
and subscriber during apply of updates and deletes.

Author: Onder Kalaci, Shi yu
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
2023-03-21 09:47:21 +05:30
Thomas Munro 8d578b9b2e Fix race in parallel hash join batch cleanup, take II.
With unlucky timing and parallel_leader_participation=off (not the
default), PHJ could attempt to access per-batch shared state just as it
was being freed.  There was code intended to prevent that by checking
for a cleared pointer, but it was racy.  Fix, by introducing an extra
barrier phase.  The new phase PHJ_BUILD_RUNNING means that it's safe to
access the per-batch state to find a batch to help with, and
PHJ_BUILD_DONE means that it is too late.  The last to detach will free
the array of per-batch state as before, but now it will also atomically
advance the phase, so that late attachers can avoid the hazard.  This
mirrors the way per-batch hash tables are freed (see phases
PHJ_BATCH_PROBING and PHJ_BATCH_DONE).

An earlier attempt to fix this (commit 3b8981b6, later reverted) missed
one special case.  When the inner side is empty (the "empty inner
optimization), the build barrier would only make it to
PHJ_BUILD_HASHING_INNER phase before workers attempted to detach from
the hashtable.  In that case, fast-forward the build barrier to
PHJ_BUILD_RUNNING before proceeding, so that our later assertions hold
and we can still negotiate who is cleaning up.

Revealed by build farm failures, where BarrierAttach() failed a sanity
check assertion, because the memory had been clobbered by dsa_free().
In non-assert builds, the result could be a segmentation fault.

Back-patch to all supported releases.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Michael Paquier <michael@paquier.xyz>
Reported-by: David Geier <geidav.pg@gmail.com>
Tested-by: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz
2023-03-21 14:29:34 +13:00
Andres Freund ef719e7b32 Stabilize pg_stat_io writes test
Counting writes only for io_context = 'normal' is unreliable, as backends
using a buffer access strategy could flush all of the dirty buffers out from
under the other backends and checkpointer. Change the test to count writes in
any context. This achieves roughly the same coverage anyway.

Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://www.postgresql.org/message-id/ZAnWU8WbXEDjrfUE%40telsasoft.com
2023-03-20 18:16:06 -07:00
Tom Lane 72a5b1fc88 Add @extschema:name@ and no_relocate options to extensions.
@extschema:name@ extends the existing @extschema@ feature so that
we can also insert the schema name of some required extension,
thus making cross-extension references robust even if they are in
different schemas.

However, this has the same hazard as @extschema@: if the schema
name is embedded literally in an installed object, rather than being
looked up once during extension script execution, then it's no longer
safe to relocate the other extension to another schema.  To deal with
that without restricting things unnecessarily, add a "no_relocate"
option to extension control files.  This allows an extension to
specify that it cannot handle relocation of some of its required
extensions, even if in themselves those extensions are relocatable.
We detect "no_relocate" requests of dependent extensions during
ALTER EXTENSION SET SCHEMA.

Regina Obe, reviewed by Sandro Santilli and myself

Discussion: https://postgr.es/m/003001d8f4ae$402282c0$c0678840$@pcorp.us
2023-03-20 18:37:11 -04:00
Tomas Vondra 19d8e2308b Ignore BRIN indexes when checking for HOT updates
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.

A new type TU_UpdateIndexes provides a signal to the executor to
determine which indexes to update - no indexes, all indexes, or only the
summarizing indexes.

This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.

This was originally committed as 5753d4ee32, but then got reverted by
e3fcca0d0d because of correctness issues.

Original patch by Josef Simanek, various fixes and improvements by Tomas
Vondra and me.

Authors: Matthias van de Meent, Josef Simanek, Tomas Vondra
Reviewed-by: Tomas Vondra, Alvaro Herrera
Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
2023-03-20 11:02:42 +01:00
Tomas Vondra e858312683 Fix netmask handling in inet_minmax_multi_ops
When calculating distance in brin_minmax_multi_distance_inet(), the
netmask was applied incorrectly. This results in (seemingly) incorrect
ordering of values, triggering an assert.

For builds without asserts this is mostly harmless - we may merge other
ranges, possibly resulting in slightly less efficient index. But it's
still correct and the greedy algorithm doesn't guarantee optimality
anyway.

Backpatch to 14, where minmax-multi indexes were introduced.

Reported by Dmitry Dolgov, investigation and fix by me.

Reported-by: Dmitry Dolgov
Backpatch-through: 14
Discussion: https://postgr.es/m/17774-c6f3e36dd4471e67@postgresql.org
2023-03-20 10:24:14 +01:00
David Rowley 785f709576 Have the planner account for the Memoize cache key memory
The Memoize executor node stores the cache key values along with the
tuple(s) which were found in the outer node which match each key value,
however, when the planner tried to estimate how many entries could be
stored in the cache, it didn't take into account that the cache key must
also be stored.  In many cases, this won't make a large difference as the
key is likely small in comparison to the tuple(s) being stored, however,
it's not impossible to craft cases where the key could take more memory
than the tuple(s) stored for it.

Here we adjust the planner so it takes into account the estimated amount
of memory to store the cache key.  Effectively, this change will reduce
the estimated cache hit ratio when it's thought that not all items will
fit in the cache, thus Memoize will become more expensive in such cases.

The executor already takes into account the memory consumed by the cache
key, so here we only need to adjust the planner.

Discussion: https://postgr.es/m/CAApHDvqGErGuyBfQvBQrTCHDbzLTqoiW=_G9sOzeFxWEc_7auA@mail.gmail.com
2023-03-20 16:26:04 +13:00
David Rowley 579ee5df14 Fix memory leak in Memoize cache key evaluation
When probing the Memoize cache to check if the current cache key values
exist in the cache, we perform an evaluation of the expressions making up
the cache key before probing the hash table for those values.  This
operation could leak memory as it is possible that the cache key is an
expression which requires allocation of memory, as was the case in bug
17844.

Here we fix this by correctly switching to the per tuple context before
evaluating the cache expressions so that the memory is freed next time the
per tuple context is reset.

Bug: 17844
Reported-by: Alexey Ermakov
Discussion: https://postgr.es/m/17844-d2f6f9e75a622bed@postgresql.org
Backpatch-through: 14, where Memoize was introduced
2023-03-20 13:28:47 +13:00
Tom Lane e060cd59fa Avoid copying undefined data in _readA_Const().
nodeRead() will have created a Node struct that's only allocated big
enough for the specific node type, so copying sizeof(union ValUnion)
can be copying too much.  This provokes valgrind complaints, and with
very bad luck could perhaps result in SIGSEGV.

While at it, tidy up _equalA_Const to avoid duplicate checks of isnull.

Per report from Alexander Lakhin.  This code is new as of a6bc33019,
so no need to back-patch.

Discussion: https://postgr.es/m/4995256b-cc65-170e-0b22-60ad2cd535f1@gmail.com
2023-03-19 15:36:16 -04:00
Tom Lane 75bd846b68 Add functions to do timestamptz arithmetic in a non-default timezone.
Add versions of timestamptz + interval, timestamptz - interval, and
generate_series(timestamptz, ...) in which a timezone can be specified
explicitly instead of defaulting to the TimeZone GUC setting.

The new functions for the first two are named date_add and
date_subtract.  This might seem too generic, but we could use
overloading to add additional variants if that seems useful.

Along the way, improve the docs' pretty inadequate explanation
of how timestamptz +- interval works.

Przemysław Sztoch and Gurjeet Singh; cosmetic changes and most of
the docs work by me

Discussion: https://postgr.es/m/01a84551-48dd-1359-bf7e-f6b0203a6bd0@sztoch.pl
2023-03-18 14:12:16 -04:00
Michael Paquier 0e681cf039 Add files related to query jumbling in src/include/nodes/ for meson
This caused ninja clean to not remove the two files generated by
gen_node_support.pl for the query jumbling, for example:
queryjumblefuncs.funcs.c and queryjumblefuncs.switch.c.

Reported-by: Pavel Stehule
Discussion: https://postgr.es/m/CAFj8pRBFiWVRyGYSPziyFuXJbHirNmfWwzbfTyCf8YOdiwK74w@mail.gmail.com
2023-03-18 18:04:04 +09:00
Tom Lane 3e59e5048d Refactor datetime functions' timezone lookup code to reduce duplication.
We already had five copies of essentially the same logic, and an
upcoming patch introduces yet another use-case.  That's past my
threshold of pain, so introduce a common subroutine.  There's not
that much net code savings, but the chance of typos should go down.

Inspired by a patch from Przemysław Sztoch, but different in detail.

Discussion: https://postgr.es/m/01a84551-48dd-1359-bf7e-f6b0203a6bd0@sztoch.pl
2023-03-17 17:47:19 -04:00
Peter Eisentraut cc1392d4aa Fix typo
Introduced in de4d456b40.

Reported-by: Erik Rijkers <er@xs4all.nl>
2023-03-17 21:40:25 +01:00
Jeff Davis f413941f41 Fix t_isspace(), etc., when datlocprovider=i and datctype=C.
Check whether the datctype is C to determine whether t_isspace() and
related functions use isspace() or iswspace().

Previously, t_isspace() checked whether the database default collation
was C; which is incorrect when the default collation uses the ICU
provider.

Discussion: https://postgr.es/m/79e4354d9eccfdb00483146a6b9f6295202e7890.camel@j-davis.com
Reviewed-by: Peter Eisentraut
Backpatch-through: 15
2023-03-17 12:08:46 -07:00
Tom Lane 064709f803 Simplify and speed up pg_dump's creation of parent-table links.
Instead of trying to optimize this by skipping creation of the
links for tables we don't plan to dump, just create them all in
bulk with a single scan over the pg_inherits data.  The previous
approach was more or less O(N^2) in the number of pg_inherits
entries, not to mention being way too complicated.

Also, don't create useless TableAttachInfo objects.
It's silly to create a TableAttachInfo object that we're not
going to dump, when we know perfectly well at creation time
that it won't be dumped.

Patch by me; thanks to Julien Rouhaud for review.

Discussion: https://postgr.es/m/1376149.1675268279@sss.pgh.pa.us
2023-03-17 13:43:10 -04:00
Tom Lane bc8cd50fef Fix pg_dump for hash partitioning on enum columns.
Hash partitioning on an enum is problematic because the hash codes are
derived from the OIDs assigned to the enum values, which will almost
certainly be different after a dump-and-reload than they were before.
This means that some rows probably end up in different partitions than
before, causing restore to fail because of partition constraint
violations.  (pg_upgrade dodges this problem by using hacks to force
the enum values to keep the same OIDs, but that's not possible nor
desirable for pg_dump.)

Users can work around that by specifying --load-via-partition-root,
but since that's a dump-time not restore-time decision, one might
find out the need for it far too late.  Instead, teach pg_dump to
apply that option automatically when dealing with a partitioned
table that has hash-on-enum partitioning.

Also deal with a pre-existing issue for --load-via-partition-root
mode: in a parallel restore, we try to TRUNCATE target tables just
before loading them, in order to enable some backend optimizations.
This is bad when using --load-via-partition-root because (a) we're
likely to suffer deadlocks from restore jobs trying to restore rows
into other partitions than they came from, and (b) if we miss getting
a deadlock we might still lose data due to a TRUNCATE removing rows
from some already-completed restore job.

The fix for this is conceptually simple: just don't TRUNCATE if we're
dealing with a --load-via-partition-root case.  The tricky bit is for
pg_restore to identify those cases.  In dumps using COPY commands we
can inspect each COPY command to see if it targets the nominal target
table or some ancestor.  However, in dumps using INSERT commands it's
pretty impractical to examine the INSERTs in advance.  To provide a
solution for that going forward, modify pg_dump to mark TABLE DATA
items that are using --load-via-partition-root with a comment.
(This change also responds to a complaint from Robert Haas that
the dump output for --load-via-partition-root is pretty confusing.)
pg_restore checks for the special comment as well as checking the
COPY command if present.  This will fail to identify the combination
of --load-via-partition-root and --inserts in pre-existing dump files,
but that should be a pretty rare case in the field.  If it does
happen you will probably get a deadlock failure that you can work
around by not using parallel restore, which is the same as before
this bug fix.

Having done this, there seems no remaining reason for the alarmism
in the pg_dump man page about combining --load-via-partition-root
with parallel restore, so remove that warning.

Patch by me; thanks to Julien Rouhaud for review.  Back-patch to
v11 where hash partitioning was introduced.

Discussion: https://postgr.es/m/1376149.1675268279@sss.pgh.pa.us
2023-03-17 13:31:40 -04:00
Peter Eisentraut de4d456b40 Improve several permission-related error messages.
Mainly move some detail from errmsg to errdetail, remove explicit
mention of superuser where appropriate, since that is implied in most
permission checks, and make messages more uniform.

Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/20230316234701.GA903298@nathanxps13
2023-03-17 10:33:09 +01:00
Thomas Munro bfc9497ece libpq: Use modern socket flags, if available.
Since commit 7627b91cd5, libpq has used FD_CLOEXEC so that sockets
wouldn't be leaked to subprograms.  With enough bad luck, a
multi-threaded program might fork in between the socket() and fcntl()
calls.  We can close that tiny gap by using SOCK_CLOEXEC instead of a
separate call.  While here, we might as well do the same for
SOCK_NONBLOCK, to save another syscall.

These flags are expected to appear in the next revision of the POSIX
standard, specifically to address this problem.  Our Unixoid targets
except macOS and AIX have had them for a long time, and macOS would
hopefully use guarded availability to roll them out, so it seems enough
to use a simple ifdef test for availability until we hear otherwise.
Windows doesn't have them, but has non-inheritable sockets by default.

Discussion: https://postgr.es/m/CA%2BhUKGKb6FsAdQWcRL35KJsftv%2B9zXqQbzwkfRf1i0J2e57%2BhQ%40mail.gmail.com
2023-03-17 20:40:34 +13:00
Peter Eisentraut 95a828378e Fix incorrect format placeholders
Small fixup for 9637badd9f.
2023-03-17 07:48:24 +01:00
Andres Freund 64470973b1 tests: Prevent syslog activity by slapd, take 2
Unfortunately it turns out that the logfile-only option added in b9f8d1cbad
is only available in openldap starting in 2.6.

Luckily the option to control the log level (loglevel/-s) have been around for
much longer. As it turns out loglevel/-s only control what goes into syslog,
not what ends up in the file specified with 'logfile' and stderr.

While we currently are specifying 'logfile', nothing ends up in it, as the
option only controls debug messages, and we didn't set a debug level. The
debug level can only be configured on the commandline and also prevents
forking. That'd require larger changes, so this commit doesn't tackle that
issue.

Specify the syslog level when starting slapd using -s, as that allows to
prevent all syslog messages if one uses '0' instead of 'none', while loglevel
doesn't prevent the first message.

Discussion: https://postgr.es/m/20230311233708.3yjdbjkly2q4gq2j@awork3.anarazel.de
Backpatch: 11-
2023-03-16 23:07:29 -07:00
Amit Kapila e709596b25 Add macros for ReorderBufferTXN toptxn.
Currently, there are quite a few places in reorderbuffer.c that tries to
access top-transaction for a subtransaction. This makes the code to access
top-transaction consistent and easier to follow.

Author: Peter Smith
Reviewed-by: Vignesh C, Sawada Masahiko
Discussion: https://postgr.es/m/CAHut+PuCznOyTqBQwjRUu-ibG-=KHyCv-0FTcWQtZUdR88umfg@mail.gmail.com
2023-03-17 08:29:41 +05:30
David Rowley eb7d043c9b Fix incorrect logic for determining safe WindowAgg run conditions
The logic added in 9d9c02ccd to determine when a qual can be used as a
WindowClause run condition failed to correctly check for subqueries in the
qual.  This was being done correctly for normal subquery qual pushdowns,
it's just that 9d9c02ccd failed to follow the lead on that.

This also fixes various other cases where transforming the qual into a
WindowClause run condition in the subquery should have been disallowed.

Bug: #17826
Reported-by: Anban Company
Discussion: https://postgr.es/m/17826-7d8750952f19a5f5@postgresql.org
Backpatch-through: 15, where 9d9c02ccd was introduced.
2023-03-17 15:49:53 +13:00
Andres Freund b9f8d1cbad tests: Minimize syslog activity by slapd
Until now the tests using slapd spammed syslog for every connection /
query. Use logfile-only to prevent syslog activity. Unfortunately that only
takes effect after logging the first message, but that's still much better
than the prior situation.

Discussion: https://postgr.es/m/20230311233708.3yjdbjkly2q4gq2j@awork3.anarazel.de
Backpatch: 11-
2023-03-16 19:38:02 -07:00
Michael Paquier 98ae2c84a4 libpq: Remove code for SCM credential authentication
Support for SCM credential authentication has been removed in the
backend in 9.1, and libpq has kept some code to handle it for
compatibility.

Commit be4585b, that did the cleanup of the backend code, has done
so because the code was not really portable originally.  And, as there
are likely little chances that this is used these days, this removes the
remaining code from libpq.  An error will now be raised by libpq if
attempting to connect to a server that returns AUTH_REQ_SCM_CREDS,
instead.

References to SCM credential authentication are removed from the
protocol documentation.  This removes some meson and configure checks.

Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/ZBLH8a4otfqgd6Kn@paquier.xyz
2023-03-17 10:52:26 +09:00
Thomas Munro 10b6745d31 Small tidyup for commit d41a178b, part II.
Further to commit 6a9229da, checking for NULL is now redundant.  An "out
of memory" error would have been thrown already by palloc() and treated
as FATAL, so we can delete a few more lines.

Back-patch to all releases, like those other commits.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/4040668.1679013388%40sss.pgh.pa.us
2023-03-17 14:44:12 +13:00
Tom Lane 9bfd2822b3 Enable use of Memoize atop an Append that came from UNION ALL.
create_append_path() would only apply get_baserel_parampathinfo
when the path is for a partitioned table, but it's also potentially
useful for paths for UNION ALL appendrels.  Specifically, that
supports building a Memoize path atop this one.

While we're in the vicinity, delete some dead code in
create_merge_append_plan(): there's no need for it to support
parameterized MergeAppend paths, and it doesn't look like that
is going to change anytime soon.  It'll be easy enough to undo
this when/if it becomes useful.

Richard Guo

Discussion: https://postgr.es/m/CAMbWs4_ABSu4PWG2rE1q10tJugEXHWgru3U8dAgkoFvgrb6aEA@mail.gmail.com
2023-03-16 18:13:45 -04:00
Andres Freund 0dc40196f2 Work around spurious compiler warning in inet operators
gcc 12+ has complaints like the following:

../../../../../pgsql/src/backend/utils/adt/network.c: In function 'inetnot':
../../../../../pgsql/src/backend/utils/adt/network.c:1893:34: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 1893 |                         pdst[nb] = ~pip[nb];
      |                         ~~~~~~~~~^~~~~~~~~~
../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16
   27 |         unsigned char ipaddr[16];       /* up to 128 bits of address */
      |                       ^~~~~~
../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16

This is due to a compiler bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104986

It has been a year since the bug has been reported without getting fixed. As
the warnings are verbose and use of gcc 12 is becoming more common, it seems
worth working around the bug. Particularly because a simple reformulation of
the loop condition fixes the issue and isn't any less readable.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/144536.1648326206@sss.pgh.pa.us
Backpatch: 11-
2023-03-16 14:48:45 -07:00
Thomas Munro 6a9229da65 Small tidyup for commit d41a178b.
A comment was left behind claiming that we needed to use malloc() rather
than palloc() because the corresponding free would run in another
thread, but that's not true anymore.  Remove that comment.  And, with
the reason being gone, we might as well actually use palloc().

Back-patch to supported releases, like d41a178b.

Discussion: https://postgr.es/m/CA%2BhUKG%2BpdM9v3Jv4tc2BFx2jh_daY3uzUyAGBhtDkotEQDNPYw%40mail.gmail.com
2023-03-17 10:44:46 +13:00
Tom Lane 5b3c595355 Tighten error checks in datetime input, and remove bogus "ISO" format.
DecodeDateTime and DecodeTimeOnly had support for date input in the
style "Y2023M03D16", which the comments claimed to be an "ISO" format.
However, so far as I can find there is no such format in ISO 8601;
they write units before numbers in intervals, but not in datetimes.
Furthermore, the lesser-known ISO 8601-2 spec actually defines an
incompatible format "2023Y03M16D".  None of our documentation mentions
such a format either.  So let's just drop it.

That leaves us with only two cases for a prefix unit specifier in
datetimes: Julian dates written as Jnnnn, and the "T" separator
defined by ISO 8601.  Add checks to catch misuse of these specifiers,
that is consecutive specifiers or a dangling specifier at the end of
the string.  We do not however disallow a specifier that is separated
from the field that it disambiguates (by noise words or unrelated
fields).  That being the case, remove some overly-aggressive error
checks from the ISOTIME cases.

Joseph Koshakow, editorialized a bit by me; thanks also to
Peter Eisentraut for some standards-reading.

Discussion: https://postgr.es/m/CAAvxfHf2Q1gKLiHGnuPOiyf0ASvKUM4BnMfsXuwgtYEb_Gx0Zw@mail.gmail.com
2023-03-16 14:18:33 -04:00
Andres Freund 2b7259f855 Silence pedantic compiler warning introduced in ce340e530d
.../src/common/file_utils.c: In function ‘pg_pwrite_zeros’:
.../src/common/file_utils.c:543:9: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
  543 |         const static PGAlignedBlock zbuffer = {{0}};    /* worth BLCKSZ */
2023-03-16 09:41:13 -07:00
Tom Lane 2333803d84 Use "data directory" not "current directory" in error messages.
The user receiving the message might not understand where the
server's "current directory" is.  "Data directory" seems clearer.
(This would not be good for frontend code, but both of these
messages are only issued in the backend.)

Kyotaro Horiguchi

Discussion: https://postgr.es/m/20230316.111646.1564684434328830712.horikyota.ntt@gmail.com
2023-03-16 12:04:08 -04:00
Peter Eisentraut 442f870065 Integrate superuser check into has_rolreplication()
This makes it consistent with similar functions like
has_createrole_privilege() and allows removing some explicit superuser
checks.

Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/20230310000313.GA3992372%40nathanxps13
2023-03-16 15:43:33 +01:00
Peter Eisentraut 3b7cd8c690 Small code simplification
Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/20230310000313.GA3992372%40nathanxps13
2023-03-16 15:33:43 +01:00
Peter Eisentraut 4ef1be5a0b pkg-config Requires.private entries should be comma-separated
In the .pc (pkg-config) files generated by the make and meson builds,
the Requires.private entries use different delimiters.  The make build
uses spaces, the meson build uses commas. The pkg-config documentation
says that it should be comma-separated, but apparently about half the
.pc in the wild use just spaces.  The pkg-config source code
acknowledges that both commas and spaces work.

This changes the make build to use commas, for consistency.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/1fb52d61-0964-2d8e-87d9-e8be830e2b24%40enterprisedb.com
2023-03-16 07:37:38 +01:00
Michael Paquier e731aeac89 Remove PgStat_BackendFunctionEntry
This structure included only PgStat_FunctionCounts, and removing it
facilitates some upcoming refactoring for pgstatfuncs.c to use more
macros rather that mostly-duplicated functions.

Author: Bertrand Drouvot
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/11d531fe-52fc-c6ea-7e8e-62f1b6ec626e@gmail.com
2023-03-16 14:22:34 +09:00
Michael Paquier c9a272daaa Add .gitignore to ldap_password_func
This bit has been forgotten in 419a8dd.
2023-03-16 09:36:01 +09:00
Michael Paquier 6f9ee74d45 Improve handling of psql \watch's interval argument
A failure in parsing the interval value defined in the \watch command
was silently switched to 1s of interval between two queries, which can
be confusing.  This commit improves the error handling, and a couple of
tests are added to check after:
- An incorrect value.
- An out-of-range value.
- A negative value.

A value of zero is able to work now, meaning that there is no interval
of time between two queries in a \watch loop.  No backpatch is done, as
it could break existing applications.

Author: Andrey Borodin
Reviewed-by: Kyotaro Horiguchi, Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/CAAhFRxiZ2-n_L1ErMm9AZjgmUK=qS6VHb+0SaMn8sqqbhF7How@mail.gmail.com
2023-03-16 09:32:36 +09:00
Andrew Dunstan dccb4d1c03 MSVC: Don't build ldap_password_func if not building with ldap
Blind attempt to fix issue with 419a8dd814 found on drongo.
2023-03-15 18:30:22 -04:00
Andrew Dunstan b85e91023b Don't try to read default for a non-existent attribute
Oversight in commit 9f8377f7a2 for COPY .. DEFAULT

per report from Alexander Lakhin
2023-03-15 17:20:42 -04:00
Tom Lane 483bdb2afe Support [NO] INDENT option in XMLSERIALIZE().
This adds the ability to pretty-print XML documents ... according to
libxml's somewhat idiosyncratic notions of what's pretty, anyway.
One notable divergence from a strict reading of the spec is that
libxml is willing to collapse empty nodes "<node></node>" to just
"<node/>", whereas SQL and the underlying XML spec say that this
option should only result in whitespace tweaks.  Nonetheless,
it seems close enough to justify using the SQL-standard syntax.

Jim Jones, reviewed by Peter Smith and myself

Discussion: https://postgr.es/m/2f5df461-dad8-6d7d-4568-08e10608a69b@uni-muenster.de
2023-03-15 16:59:09 -04:00
Andrew Dunstan 419a8dd814 Add a hook for modifying the ldapbind password
The hook can be installed by a shared_preload library.

A similar mechanism could be used for radius paswords, for example, and
the type name auth_password_hook_typ has been shosen with that in mind.

John Naylor and Andrew Dunstan

Discussion: https://postgr.es/m/469b06ed-69de-ba59-c13a-91d2372e52a9@dunslane.net
2023-03-15 16:37:28 -04:00
Tom Lane e3ac85014e Support PlaceHolderVars in MERGE actions.
preprocess_targetlist thought PHVs couldn't appear here.
It was mistaken, as per report from Önder Kalacı.

Surveying other pull_var_clause calls, I noted no similar errors,
but I did notice that qual_is_pushdown_safe's assertion about
!contain_window_function was pointless, because the following
pull_var_clause call would complain about them anyway.  In HEAD
only, remove the redundant Assert and improve the commentary.

Discussion: https://postgr.es/m/CACawEhUuum-gC_2S3sXLTcsk7bUSPSHOD+g1ZpfKaDK-KKPPWA@mail.gmail.com
2023-03-15 11:59:18 -04:00
Thomas Munro a948e49e2e Use nanosleep() to implement pg_usleep().
The previous coding based on select() had commentary about historical
portability concerns.  Use POSIX nanosleep() instead.

This has independently been suggested a couple of times before, but
never managed to stick.  Since recent and proposed work removes other
uses of select(), and associated code and comments relating to its
non-portable interaction with signals, it seems like a good time to tidy
up this case, too.

Also modernize the explanation of why WaitLatch() is a better way to
wait.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Suggested-by: Paul Guo <paulguo@gmail.com>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAAKRu_b-q0hXCBUCAATh0Z4Zi6UkiC0k2DFgoD3nC-r3SkR3tg%40mail.gmail.com
Discussion: https://postgr.es/m/CABQrizfxpBLZT5mZeE0js5oCh1tqEWvcGF3vMRCv5P-RwUY5dQ@mail.gmail.com
Discussion: https://postgr.es/m/4902.1552349020@sss.pgh.pa.us
2023-03-15 17:57:12 +13:00
Thomas Munro e4da2a44c1 Update obsolete comment about pg_usleep() accuracy.
There are still some systems that use traditional tick-based sleep
timing, but many including Linux, FreeBSD and macOS started using high
resolution timer hardware more directly a decade or two ago.  Update our
comment about that.  Also highlight that Windows is like the older
Unixen in that respect.

Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BogAon8_V223Ldv6taPR2uKH3X_UJ_A7LJAf3-VRARPA%40mail.gmail.com
2023-03-15 17:23:02 +13:00
Amit Kapila 805b821e77 Add the testcases for 89e46da5e5.
Forgot to add new testcases in commit 89e46da5e5.

Author: Onder Kalaci, Amit Kapila
Reviewed-by: Peter Smith, Shi yu, Hou Zhijie, Vignesh C, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/CACawEhVLqmAAyPXdHEPv1ssU2c=dqOniiGz7G73HfyS7+nGV4w@mail.gmail.com
2023-03-15 09:31:44 +05:30
Michael Paquier 765f5df726 Improve WIN32 port of fstat() to detect more file types
The current implementation of _pgfstat64() is ineffective in detecting a
terminal handle or an anonymous named pipe.  This commit improves our
port of fstat() to detect more efficiently such cases by relying on
GetFileType(), and returning more correct data when the type found is
either a FILE_TYPE_PIPE (_S_IFIFO) or a FILE_TYPE_CHAR (_S_IFCHR).

This is part of a more global fix to address failures when feeding the
output generated by pg_dump to pg_restore through a pipe, for example,
but not all of it.   We are also going to need to do something about
fseek() and ftello() which are not reliable on WIN32 for the same cases
where fstat() was incorrect.  Fixing fstat() is independent of the rest,
though, which is why both fixes are handled separately, and this is the
first part of it.

Reported-by: Daniel Watzinger
Author: Daniel Watzinger, Juan José Santamaría Flecha
Discussion: https://postgr.es/m/b1448cd7-871e-20e3-8398-895e2d1d3bf9@gmail.com
Backpatch-through: 14
2023-03-15 12:55:51 +09:00
Amit Kapila 89e46da5e5 Allow the use of indexes other than PK and REPLICA IDENTITY on the subscriber.
Using REPLICA IDENTITY FULL on the publisher can lead to a full table scan
per tuple change on the subscription when REPLICA IDENTITY or PK index is
not available. This makes REPLICA IDENTITY FULL impractical to use apart
from some small number of use cases.

This patch allows using indexes other than PRIMARY KEY or REPLICA
IDENTITY on the subscriber during apply of update/delete. The index that
can be used must be a btree index, not a partial index, and it must have
at least one column reference (i.e. cannot consist of only expressions).
We can uplift these restrictions in the future. There is no smart
mechanism to pick the index. If there is more than one index that
satisfies these requirements, we just pick the first one. We discussed
using some of the optimizer's low-level APIs for this but ruled it out
as that can be a maintenance burden in the long run.

This patch improves the performance in the vast majority of cases and the
improvement is proportional to the amount of data in the table. However,
there could be some regression in a small number of cases where the indexes
have a lot of duplicate and dead rows. It was discussed that those are
mostly impractical cases but we can provide a table or subscription level
option to disable this feature if required.

Author: Onder Kalaci, Amit Kapila
Reviewed-by: Peter Smith, Shi yu, Hou Zhijie, Vignesh C, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/CACawEhVLqmAAyPXdHEPv1ssU2c=dqOniiGz7G73HfyS7+nGV4w@mail.gmail.com
2023-03-15 08:49:04 +05:30
Thomas Munro 720de00af4 Fix fractional vacuum_cost_delay.
Commit 4753ef37 changed vacuum_delay_point() to use the WaitLatch() API,
to fix the problem that vacuum could keep running for a very long time
after the postmaster died.

Unfortunately, that broke commit caf626b2's support for fractional
vacuum_cost_delay, which shipped in PostgreSQL 12.  WaitLatch() works in
whole milliseconds.

For now, revert the change from commit 4753ef37, but add an explicit
check for postmaster death.  That's an extra system call on systems
other than Linux and FreeBSD, but that overhead doesn't matter much
considering that we willingly went to sleep and woke up again.  (In
later work, we might add higher resolution timeouts to the latch API so
that we could do this with our standard programming pattern, but that
wouldn't be back-patched.)

Back-patch to 14, where commit 4753ef37 arrived.

Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b-q0hXCBUCAATh0Z4Zi6UkiC0k2DFgoD3nC-r3SkR3tg%40mail.gmail.com
2023-03-15 13:58:18 +13:00
Thomas Munro d41a178b3a Fix waitpid() emulation on Windows.
Our waitpid() emulation didn't prevent a PID from being recycled by the
OS before the call to waitpid().  The postmaster could finish up
tracking more than one child process with the same PID, and confuse
them.

Fix, by moving the guts of pgwin32_deadchild_callback() into waitpid(),
so that resources are released synchronously.  The process and PID
continue to exist until we close the process handle, which only happens
once we're ready to adjust our book-keeping of running children.

This seems to explain a couple of failures on CI.  It had never been
reported before, despite the code being as old as the Windows port.
Perhaps Windows started recycling PIDs more rapidly, or perhaps timing
changes due to commit 7389aad6 made it more likely to break.

Thanks to Alexander Lakhin for analysis and Andres Freund for tracking
down the root cause.

Back-patch to all supported branches.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de
2023-03-15 13:24:47 +13:00
Tom Lane b081fe4199 Fix corner case bug in numeric to_char() some more.
The band-aid applied in commit f0bedf3e4 turns out to still need
some work: it made sure we didn't set Np->last_relevant too small
(to the left of the decimal point), but it didn't prevent setting
it too large (off the end of the partially-converted string).
This could result in fetching data beyond the end of the allocated
space, which with very bad luck could cause a SIGSEGV, though
I don't see any hazard of interesting memory disclosure.

Per bug #17839 from Thiago Nunes.  The bug's pretty ancient,
so back-patch to all supported versions.

Discussion: https://postgr.es/m/17839-aada50db24d7b0da@postgresql.org
2023-03-14 19:17:31 -04:00
Tom Lane a563c24c95 Allow pg_dump to include/exclude child tables automatically.
This patch adds new pg_dump switches
    --table-and-children=pattern
    --exclude-table-and-children=pattern
    --exclude-table-data-and-children=pattern
which act the same as the existing --table, --exclude-table, and
--exclude-table-data switches, except that any partitions or
inheritance child tables of the table(s) matching the pattern
are also included or excluded.

Gilles Darold, reviewed by Stéphane Tachoires

Discussion: https://postgr.es/m/5aa393b5-5f67-8447-b83e-544516990ee2@migops.com
2023-03-14 16:09:03 -04:00
Tom Lane 684ffac8c7 Remove unnecessary code in dependency_is_compatible_expression().
Scanning the expression for compatible Vars isn't really necessary,
because the subsequent match against StatisticExtInfo entries will
eliminate expressions containing other Vars just fine.  Moreover,
this code hadn't stopped to think about what to do with
PlaceHolderVars or Aggrefs in the clause; and at least for the PHV
case, that demonstrably leads to failures.  Rather than work out
whether it's reasonable to ignore those, let's just remove the
whole stanza.

Per report from Richard Guo.  Back-patch to v14 where this code
was added.

Discussion: https://postgr.es/m/CAMbWs48Mmvm-acGevXuwpB=g5JMqVSL6i9z5UaJyLGJqa-XPAA@mail.gmail.com
2023-03-14 11:10:45 -04:00
Dean Rasheed d5d574146d Add support for the error functions erf() and erfc().
Expose the standard error functions as SQL-callable functions. These
are expected to be useful to people working with normal distributions,
and we use them here to test the distribution from random_normal().

Since these functions are defined in the POSIX and C99 standards, they
should in theory be available on all supported platforms. If that
turns out not to be the case, more work will be needed.

On all platforms tested so far, using extra_float_digits = -1 in the
regression tests is sufficient to allow for variations between
implementations. However, past experience has shown that there are
almost certainly going to be additional unexpected portability issues,
so these tests may well need further adjustments, based on the
buildfarm results.

Dean Rasheed, reviewed by Nathan Bossart and Thomas Munro.

Discussion: https://postgr.es/m/CAEZATCXv5fi7+Vu-POiyai+ucF95+YMcCMafxV+eZuN1B-=MkQ@mail.gmail.com
2023-03-14 09:17:36 +00:00
Michael Paquier 3a465cc678 libpq: Add support for require_auth to control authorized auth methods
The new connection parameter require_auth allows a libpq client to
define a list of comma-separated acceptable authentication types for use
with the server.  There is no negotiation: if the server does not
present one of the allowed authentication requests, the connection
attempt done by the client fails.

The following keywords can be defined in the list:
- password, for AUTH_REQ_PASSWORD.
- md5, for AUTH_REQ_MD5.
- gss, for AUTH_REQ_GSS[_CONT].
- sspi, for AUTH_REQ_SSPI and AUTH_REQ_GSS_CONT.
- scram-sha-256, for AUTH_REQ_SASL[_CONT|_FIN].
- creds, for AUTH_REQ_SCM_CREDS (perhaps this should be removed entirely
now).
- none, to control unauthenticated connections.

All the methods that can be defined in the list can be negated, like
"!password", in which case the server must NOT use the listed
authentication type.  The special method "none" allows/disallows the use
of unauthenticated connections (but it does not govern transport-level
authentication via TLS or GSSAPI).

Internally, the patch logic is tied to check_expected_areq(), that was
used for channel_binding, ensuring that an incoming request is
compatible with conn->require_auth.  It also introduces a new flag,
conn->client_finished_auth, which is set by various authentication
routines when the client side of the handshake is finished.  This
signals to check_expected_areq() that an AUTH_REQ_OK from the server is
expected, and allows the client to complain if the server bypasses
authentication entirely, with for example the reception of a too-early
AUTH_REQ_OK message.

Regression tests are added in authentication TAP tests for all the
keywords supported (except "creds", because it is around only for
compatibility reasons).  A new TAP script has been added for SSPI, as
there was no script dedicated to it yet.  It relies on SSPI being the
default authentication method on Windows, as set by pg_regress.

Author: Jacob Champion
Reviewed-by: Peter Eisentraut, David G. Johnston, Michael Paquier
Discussion: https://postgr.es/m/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com
2023-03-14 14:00:05 +09:00
Tom Lane 25a7812cd0 Fix JSON error reporting for many cases of erroneous string values.
The majority of error exit cases in json_lex_string() failed to
set lex->token_terminator, causing problems for the error context
reporting code: it would see token_terminator less than token_start
and do something more or less nuts.  In v14 and up the end result
could be as bad as a crash in report_json_context().  Older
versions accidentally avoided that fate; but all versions produce
error context lines that are far less useful than intended,
because they'd stop at the end of the prior token instead of
continuing to where the actually-bad input is.

To fix, invent some macros that make it less notationally painful
to do the right thing.  Also add documentation about what the
function is actually required to do; and in >= v14, add an assertion
in report_json_context about token_terminator being sufficiently
far advanced.

Per report from Nikolay Shaplov.  Back-patch to all supported
versions.

Discussion: https://postgr.es/m/7332649.x5DLKWyVIX@thinkpad-pgpro
2023-03-13 15:19:00 -04:00
Tom Lane 30dbdbe753 Fix failure to detect some cases of improperly-nested aggregates.
check_agg_arguments_walker() supposed that it needn't descend into
the arguments of a lower-level aggregate function, but this is
just wrong in the presence of multiple levels of sub-select.  The
oversight would lead to executor failures on queries that should
be rejected.  (Prior to v11, they actually were rejected, thanks
to a "redundant" execution-time check.)

Per bug #17835 from Anban Company.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17835-4f29f3098b2d0ba4@postgresql.org
2023-03-13 12:40:28 -04:00
Andrew Dunstan 9f8377f7a2 Add a DEFAULT option to COPY FROM
This allows for a string which if an input field matches causes the
column's default value to be inserted. The advantage of this is that
the default can be inserted in some rows and not others, for which
non-default data is available.

The file_fdw extension is also modified to take allow use of this
option.

Israel Barth Rubio

Discussion: https://postgr.es/m/CAO_rXXAcqesk6DsvioOZ5zmeEmpUN5ktZf-9=9yu+DTr0Xr8Uw@mail.gmail.com
2023-03-13 10:01:56 -04:00
Dean Rasheed 7b14e20b12 Fix MERGE command tag for actions blocked by BEFORE ROW triggers.
This ensures that the row count in the command tag for a MERGE is
correctly computed in the case where UPDATEs or DELETEs are skipped
due to a BEFORE ROW trigger returning NULL (the INSERT case was
already handled correctly by ExecMergeNotMatched() calling
ExecInsert()).

Back-patch to v15, where MERGE was introduced.

Discussion: https://postgr.es/m/CAEZATCU8XEmR0JWKDtyb7iZ%3DqCffxS9uyJt0iOZ4TV4RT%2Bow1w%40mail.gmail.com
2023-03-13 11:12:20 +00:00
Dean Rasheed 9321c79c86 Fix concurrent update issues with MERGE.
If MERGE attempts an UPDATE or DELETE on a table with BEFORE ROW
triggers, or a cross-partition UPDATE (with or without triggers), and
a concurrent UPDATE or DELETE happens, the merge code would fail.

In some cases this would lead to a crash, while in others it would
cause the wrong merge action to be executed, or no action at all. The
immediate cause of the crash was the trigger code calling
ExecGetUpdateNewTuple() as part of the EPQ mechanism, which fails
because during a merge ri_projectNew is NULL, since merge has its own
per-action projection information, which ExecGetUpdateNewTuple() knows
nothing about.

Fix by arranging for the trigger code to exit early, returning the
TM_Result and TM_FailureData information, if a concurrent modification
is detected, allowing the merge code to do the necessary EPQ handling
in its own way. Similarly, prevent the cross-partition update code
from doing any EPQ processing for a merge, allowing the merge code to
work out what it needs to do.

This leads to a number of simplifications in nodeModifyTable.c. Most
notably, the ModifyTableContext->GetUpdateNewTuple() callback is no
longer needed, and mergeGetUpdateNewTuple() can be deleted, since
there is no longer any requirement for get-update-new-tuple during a
merge. Similarly, ModifyTableContext->cpUpdateRetrySlot is no longer
needed. Thus ExecGetUpdateNewTuple() and the retry_slot handling of
ExecCrossPartitionUpdate() can be restored to how they were in v14,
before the merge code was added, and ExecMergeMatched() no longer
needs any special-case handling for cross-partition updates.

While at it, tidy up ExecUpdateEpilogue() a bit, making it handle
recheckIndexes locally, rather than passing it in as a parameter,
ensuring that it is freed properly. This dates back to when it was
split off from ExecUpdate() to support merge.

Per bug #17809 from Alexander Lakhin, and follow-up investigation of
bug #17792, also from Alexander Lakhin.

Back-patch to v15, where MERGE was introduced, taking care to preserve
backwards-compatibility of the trigger API in v15 for any extensions
that might use it.

Discussion:
  https://postgr.es/m/17809-9e6650bef133f0fe%40postgresql.org
  https://postgr.es/m/17792-0f89452029662c36%40postgresql.org
2023-03-13 10:22:22 +00:00
Peter Eisentraut b2bd9a6796 Fix expected test output
For builds without lz4, for 208bf364a9.
2023-03-13 11:15:21 +01:00
Peter Eisentraut 208bf364a9 Remove incidental md5() function uses from main regression tests
Most of these calls were to generate some random data.  These can be
replaced by appropriately adapted sha256() calls.  To keep the diff
smaller, we wrap this into a helper function that produces the same
output format and length as the md5() call.

This will eventually allow these tests to pass in OpenSSL FIPS mode
(which does not allow MD5 use).

Similar work for other test suites will follow later.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/dbbd927f-ef1f-c9a1-4ec6-c759778ac852@enterprisedb.com
2023-03-13 10:53:28 +01:00
Peter Eisentraut d72900bded Improve support for UNICODE collation on older ICU
The recently added standard collation UNICODE (0d21d4b9bc) doesn't
give consistent results on some build farm members with old ICU
versions.  Apparently, the ICU locale specification 'und' (language
tag style) misbehaves on some older ICU versions.  Replacing it with
'' (ICU locale ID style) fixes it at least on some OS versions.  Let's
see what the build farm says.
2023-03-13 09:08:58 +01:00
Michael Paquier e0a09d4e35 Fix inconsistent error handling for GSS encryption in PQconnectPoll()
The error cases for TLS and GSS encryption were inconsistent.  After TLS
fails, the connection is marked as dead and follow-up calls of
PQconnectPoll() would return immediately, but GSS encryption was not
doing that, so the connection would still have been allowed to enter the
GSS handling code.  This was handled incorrectly when gssencmode was set
to "require".  "prefer" was working correctly, and this could not happen
under "disable" as GSS encryption would not be attempted.

This commit makes the error handling of GSS encryption on par with TLS
portion, fixing the case of gssencmode=require.

Reported-by: Jacob Champion
Author: Michael Paquier
Reviewed-by: Jacob Champion, Stephen Frost
Discussion: https://postgr.es/m/23787477-5fe1-a161-6d2a-e459f74c4713@timescale.com
Backpatch-through: 12
2023-03-13 16:36:20 +09:00
Peter Eisentraut 6a3002715e meson: Make auto the default of the ssl option
The 'ssl' option is of type 'combo', but we add a choice 'auto' that
simulates the behavior of a feature option.  This way, openssl is used
automatically by default if present, but we retain the ability to
potentially select another ssl library.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ad65ffd1-a9a7-fda1-59c6-f7dc763c3051%40enterprisedb.com
2023-03-13 07:04:11 +01:00
Tom Lane 767c598954 Work around implementation restriction in adjust_appendrel_attrs.
adjust_appendrel_attrs can't transfer nullingrel labeling to a non-Var
translation expression (mainly because it's too late to wrap such an
expression in a PlaceHolderVar).  I'd supposed in commit 2489d76c4
that that restriction was unreachable because we'd not attempt to push
problematic clauses down to an appendrel child relation.  I forgot that
set_append_rel_size blindly converts all the parent rel's joininfo
clauses to child clauses, and that list could well contain clauses
from above a nulling outer join.

We might eventually have to devise a direct fix for this implementation
restriction, but for now it seems enough to filter out troublesome
clauses while constructing the child's joininfo list.  Such clauses
are certainly not useful while constructing paths for the child rel;
they'll have to be applied later when we join the completed appendrel
to something else.  So we don't need them here, and omitting them from
the list should save a few cycles while processing the child rel.

Per bug #17832 from Marko Tiikkaja.

Discussion: https://postgr.es/m/17832-d0a8106cdf1b722e@postgresql.org
2023-03-12 14:20:34 -04:00
Andrew Dunstan 872e3d150e Mark unsafe_tests module as not runnable with installcheck
This was an omission in the original creation of the module.

Also slightly adjust some wording to avoid a double "is".

Backpatch the non-meson piece of this to release 12, where the module
was introduced.

Discussion: https://postgr.es/m/be869e1c-8e3f-4cde-8609-212c899cccf9@dunslane.net
2023-03-12 09:00:32 -04:00
Andres Freund a4f23f9b3c pg_amcheck: Minor test speedups
Freezing the relation N times and fetching the tuples one-by-one isn't that
cheap. On my machine this reduces test times by a bit less than one second, on
windows CI it's a few seconds.

Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20230309001558.b7shzvio645ebdta@awork3.anarazel.de
2023-03-11 15:41:47 -08:00
Andres Freund 4f5d461e04 amcheck: Fix FullTransactionIdFromXidAndCtx() for xids before epoch 0
64bit xids can't represent xids before epoch 0 (see also be504a3e97). When
FullTransactionIdFromXidAndCtx() was passed such an xid, it'd create a 64bit
xid far into the future. Noticed while adding assertions in the course of
investigating be504a3e97, as amcheck's test create such xids.

To fix the issue, just return FirstNormalFullTransactionId in this case. A
freshly initdb'd cluster already has a newer horizon. The most minimal version
of this would make the messages for some detected corruptions differently
inaccurate. To make those cases accurate, switch
FullTransactionIdFromXidAndCtx() to use the 32bit modulo difference between
xid and nextxid to compute the 64bit xid, yielding sensible "in the future" /
"in the past" answers.

Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20230108002923.cyoser3ttmt63bfn@awork3.anarazel.de
Backpatch: 14-, where heapam verification was introduced
2023-03-11 14:12:52 -08:00
Tom Lane d66bb048c3 Ensure COPY TO on an RLS-enabled table copies no more than it should.
The COPY documentation is quite clear that "COPY relation TO" copies
rows from only the named table, not any inheritance children it may
have.  However, if you enabled row-level security on the table then
this stopped being true, because the code forgot to apply the ONLY
modifier in the "SELECT ... FROM relation" query that it constructs
in order to allow RLS predicates to be attached.  Fix that.

Report and patch by Antonin Houska (comment adjustments and test case
by me).  Back-patch to all supported branches.

Discussion: https://postgr.es/m/3472.1675251957@antos
2023-03-10 13:52:44 -05:00
Jeff Davis c45dc7ffbb initdb: derive encoding from locale for ICU; similar to libc.
Previously, the default encoding was derived from the locale when
using libc; while the default was always UTF-8 when using ICU. That
would throw an error when the locale was not compatible with UTF-8.

This commit causes initdb to derive the default encoding from the
locale for both providers. If --no-locale is specified (or if the
locale is C or POSIX), the default encoding will be UTF-8 for ICU
(because ICU does not support SQL_ASCII) and SQL_ASCII for libc.

Per buildfarm failure on system "hoverfly" related to commit
27b62377b4.

Discussion: https://postgr.es/m/d191d5841347301a8f1238f609471ddd957fc47e.camel%40j-davis.com
2023-03-10 10:51:24 -08:00
Peter Eisentraut 3e623ebc7a Fix tests for non-ICU build
missed in 0d21d4b9bc
2023-03-10 14:27:55 +01:00
Peter Eisentraut 0d21d4b9bc Add standard collation UNICODE
This adds a new predefined collation named UNICODE, which sorts by the
default Unicode collation algorithm specifications, per SQL standard.

This only works if ICU support is built.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/flat/1293e382-2093-a2bf-a397-c04e8f83d3c2@enterprisedb.com
2023-03-10 13:35:43 +01:00
Michael Paquier 6ad5793a49 Include headers of archive/ in installation
These new headers have been recently added in 35739b8, but they were not
installed.  Sravan has provided the patch for configure/make, while I
have fixed the meson part.

Author: Sravan Kumar, Michael Paquier
Discussion: https://postgr.es/m/CA+=NbjguiQy-MbVqfQ-jQ=2Fcmx3Zs36OkKb-vjt28jMTG0OOg@mail.gmail.com
2023-03-10 20:08:10 +09:00
Peter Eisentraut 012ee84259 Add a test for UCS_BASIC collation 2023-03-10 11:18:08 +01:00
Peter Eisentraut 470103697a Fix incorrect format placeholders 2023-03-10 07:10:43 +01:00
Jeff Davis 8da2ec31ee Fix test failure caused in 27b62377b4.
Per buildfarm system "prion".
2023-03-09 15:34:41 -08:00
Tom Lane bcc704b524 Reject combining "epoch" and "infinity" with other datetime fields.
Datetime input formerly accepted combinations such as
'1995-08-06 infinity', but this seems like a clear error.
Reject any combination of regular y/m/d/h/m/s fields with
these special tokens.

Joseph Koshakow, reviewed by Keisuke Kuroda and myself

Discussion: https://postgr.es/m/CAAvxfHdm8wwXwG_FFRaJ1nTHiMWb7YXS2YKCzCt8Q0a2ZoMcHg@mail.gmail.com
2023-03-09 16:49:03 -05:00
Jeff Davis 27b62377b4 Use ICU by default at initdb time.
If the ICU locale is not specified, initialize the default collator
and retrieve the locale name from that.

Discussion: https://postgr.es/m/510d284759f6e943ce15096167760b2edcb2e700.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-09 10:52:41 -08:00
Jeff Davis 206b44bb24 Fix 9637badd9f.
Discussion: https://postgr.es/m/0a364430-266e-1e1a-d5d8-1a5273c9ddb6@dunslane.net
Reported-by: Andrew Dunstan
2023-03-09 10:26:47 -08:00
Jeff Davis 9637badd9f pg_upgrade: copy locale and encoding information to new cluster.
Previously, pg_upgrade checked that the old and new clusters were
compatible, including the locale and encoding. But the new cluster was
just created, and only template0 from the new cluster will be
preserved (template1 and postgres are both recreated during the
upgrade process).

Because template0 is not sensitive to locale or encoding, just update
the pg_database entry to be the same as template0 from the original
cluster.

This commit makes it easier to change the default initdb locale or
encoding settings without causing needless incompatibilities.

Discussion: https://postgr.es/m/d62b2874-729b-d26a-2d0a-0d64f509eca4@enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-09 08:28:05 -08:00
Stephen Frost 8dff2f224f For Kerberos testing, disable reverse DNS lookup
In our Kerberos test suite, there isn't much need to worry about the
normal canonicalization that Kerberos provides by looking up the reverse
DNS for the IP address connected to, and in some cases it can actively
cause problems (eg: a captive portal wifi where the normally not
resolvable localhost address used ends up being resolved anyway, and
not to the domain we are using for testing, causing the entire
regression test to fail with errors about not being able to get a TGT
for the remote realm for cross-realm trust).

Therefore, disable it by adding rdns = false into the krb5.conf that's
generated for the test.

Reviewed-By: Heikki Linnakangas
Discussion: https://postgr.es/m/Y/QD2zDkDYQA1GQt@tamriel.snowman.net
2023-03-09 10:32:49 -05:00
Alvaro Herrera 590a075789
Avoid criticizable perl code
Using `require` / `->import` instead of `use` avoids the use of a
"stringy eval", making for cleaner code that we don't need to silence
perlcritic about.

Per Andrew Dunstan

Discussion: https://postgr.es/m/7cd3bbbd-0216-4436-d571-8f80c9259a07@dunslane.net
2023-03-09 12:02:18 +01:00
Peter Eisentraut 36ea345f8f Improve/correct comments
Change comments for pg_cryptohash_init(), pg_cryptohash_update(),
pg_cryptohash_final() in cryptohash.c to match cryptohash_openssl.c.
In particular, the claim that these functions were "designed" to never
fail was incorrect, since by design callers need to be prepared to
handle failures, for compatibility with the cryptohash_openssl.c
versions.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/301F4EDD-27B9-460F-B462-B9DB2BDE4ACF@yesql.se
2023-03-09 09:59:46 +01:00
Peter Eisentraut 544b452a5a Disallow specifying ICU rules unless locale provider is ICU
Follow-up for 30a53b7929; this was not checked in all cases.

Reported-by: Jeff Davis <pgsql@j-davis.com>
2023-03-09 08:09:40 +01:00
Michael Paquier b6dfee28f2 Run pgindent on libpq's fe-auth.c, fe-auth-scram.c and fe-connect.c
A patch sent by Jacob Champion has been touching this area of the code,
and the set of changes done in a9e9a9f has made a run of pgindent on
these files a bit annoying to handle.  So let's clean up a bit the area,
first, to ease the work on follow-up patches.

Discussion: https://postgr.es/m/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com
2023-03-09 15:09:45 +09:00
Thomas Munro 65e388d418 Fix race in SERIALIZABLE READ ONLY.
Commit bdaabb9b started skipping doomed transactions when building the
list of possible conflicts for SERIALIZABLE READ ONLY.  That makes
sense, because doomed transactions won't commit, but a couple of subtle
things broke:

1.  If all uncommitted r/w transactions are doomed, a READ ONLY
transaction would arbitrarily not benefit from the safe snapshot
optimization.  It would not be taken immediately, and yet no other
transaction would set SXACT_FLAG_RO_SAFE later.

2.  In the same circumstances but with DEFERRABLE, GetSafeSnapshot()
would correctly exit its wait loop without sleeping and then take the
optimization in non-assert builds, but assert builds would fail a sanity
check that SXACT_FLAG_RO_SAFE had been set by another transaction.

This is similar to the case for PredXact->WritableSxactCount == 0.  We
should opt out immediately if our possibleUnsafeConflicts list is empty
after filtering.

The code to maintain the serializable global xmin is moved down below
the new opt out site, because otherwise we'd have to reverse its effects
before returning.

Back-patch to all supported releases.  Bug #17368.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17116-d6ca217acc180e30%40postgresql.org
Discussion: https://postgr.es/m/20110707212159.GF76634%40csail.mit.edu
2023-03-09 16:33:24 +13:00
Andres Freund 8bf826528a meson: tests: Adjust with_icu/ZSTD env vars for pg_dump, pg_basebackup
396d348b0 omitted adding with_icu to the pg_dump tests under
meson. Conversely, e6927270c exported ZSTD for pg_basebackup's tests, despite
pg_basebackup's ZSTD support not having any tests.

Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20230226225239.GL1653@telsasoft.com
2023-03-08 17:04:15 -08:00
Andres Freund 0d237aeeba meson: Add target for installing test files & improve install_test_files
The changes in b6a0d469ca prevented installation of the test files during a
normal install. However, the buildfarm intentionally tries to trun the tests
against a "real" installation. The new install-test-files target provides that
ability.

Because we want to install into a normal directory, I removed the necessary
munging of the target paths from meson.build and moved it into
install-test-files. I also added DESTDIR support, so that installing can
redirect the directory if desired. That's used for the tmp_install/
installation now.

I didn't like the number of arguments necessary for install_test_files, so I
changed it to use
  --install target list of files
which makes it easier to use for further directories, if/when we need them.

Discussion: https://postgr.es/m/20230308012940.edexipb3vqylcu6r@awork3.anarazel.de
2023-03-08 11:12:10 -08:00
Alvaro Herrera 87e4f24d82
001_libpq_pipeline.pl: use Test::Differences if available
When one of these tests fails to match the trace, this better shows what
the problem is.

Discussion: https://postgr.es/m/20220617183150.ilgokxp22mzywnhh@alvherre.pgsql
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
2023-03-08 18:31:55 +01:00
Peter Eisentraut 30a53b7929 Allow tailoring of ICU locales with custom rules
This exposes the ICU facility to add custom collation rules to a
standard collation.

New options are added to CREATE COLLATION, CREATE DATABASE, createdb,
and initdb to set the rules.

Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Discussion: https://www.postgresql.org/message-id/flat/821c71a4-6ef0-d366-9acf-bb8e367f739f@enterprisedb.com
2023-03-08 16:56:37 +01:00
Peter Eisentraut b1534ed99d Clean up comments
Reformat some of the comments in MergeAttributes().  A lot of code has
been added here over time, and the comments could use a bit of editing
to make the code flow read better.
2023-03-08 15:56:32 +01:00
Peter Eisentraut 2a71ad64cb Break up long GETTEXT_FILES lists
One file per line seems best.  We already did this in some cases.
This adopts the same format everywhere (except in some cases where the
list reasonably fits on one line).
2023-03-08 15:05:43 +01:00
Peter Eisentraut 822e8e2951 Update comment
There was apparently an attempt here to list all the object types that
ACL_USAGE applies to, but it wasn't complete.  So instead of trying to
keep up, put in a more timeless comment.
2023-03-08 14:22:06 +01:00
Andres Freund be504a3e97 Fix corruption due to vacuum_defer_cleanup_age underflowing 64bit xids
When vacuum_defer_cleanup_age is bigger than the current xid, including the
epoch, the subtraction of vacuum_defer_cleanup_age would lead to a wrapped
around xid. While that normally is not a problem, the subsequent conversion to
a 64bit xid results in a 64bit-xid very far into the future. As that xid is
used as a horizon to detect whether rows versions are old enough to be
removed, that allows removal of rows that are still visible (i.e. corruption).

If vacuum_defer_cleanup_age was never changed from the default, there is no
chance of this bug occurring.

This bug was introduced in dc7420c2c9.  A lesser version of it exists in
12-13, introduced by fb5344c969, affecting only GiST.

The 12-13 version of the issue can, in rare cases, lead to pages in a gist
index getting recycled too early, potentially causing index entries to be
found multiple times.

The fix is fairly simple - don't allow vacuum_defer_cleanup_age to retreat
further than FirstNormalTransactionId.

Patches to make similar bugs easier to find, by adding asserts to the 64bit
xid infrastructure, have been proposed, but are not suitable for backpatching.

Currently there are no tests for vacuum_defer_cleanup_age. A patch introducing
infrastructure to make writing a test easier has been posted to the list.

Reported-by: Michail Nikolaev <michail.nikolaev@gmail.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230108002923.cyoser3ttmt63bfn@awork3.anarazel.de
Backpatch: 12-, but impact/fix is smaller for 12-13
2023-03-07 21:52:32 -08:00
Michael Paquier a4e003338d Refine query jumbling handling for CallStmt
Previously, all the nodes of CallStmt were included in the jumbling,
causing a duplicate in the computation as the transformed state of the
CALL query was included as well as the parsed state (transformed
FuncCall with all the input arguments and potential output arguments).

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Y+MRdEq9W9XVa2AB@paquier.xyz
2023-03-08 14:38:35 +09:00
Andres Freund 401874ab02 meson: don't require 'touch' binary, make use of 'cp' optional
We already didn't use touch (some earlier version of the meson build did ),
and cp is only used for updating unicode files. The latter already depends on
the optional availability of 'wget', so doing the same for 'cp' makes sense.

Eventually we probably want a portable command for updating source code as
part of a target, but for now...

Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/70e96c34-64ee-e549-8c4a-f91a7a668804@dunslane.net
2023-03-07 18:44:42 -08:00
Michael Paquier d69cd3a2e2 Ignore IntoClause.viewQuery in query jumbling
IntoClause.viewQuery is a copy of the parsed-but-not-rewritten SELECT
clause copied to IntoClause when transforming CreateTableAsStmt for a
materialized view.  Including a second copy of the SELECT Query into the
query jumbling was leading to an incorrect numbering of the Const node
locations, as these would be counted twice instead of once.

This becomes visible once the query normalization is applied to CREATE
MATERIALIZED VIEW in pg_stat_statements in the shape of a query string
using only odd numbers for the normalized constants, (regression tests
added in pg_stat_statements as of de2aca2 would show the difference).
Including the original Query from CreateTableAsStmt is enough for the
query jumbling.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Y+MRdEq9W9XVa2AB@paquier.xyz
2023-03-08 11:41:52 +09:00
Michael Paquier ee56048b0e Improve readability of code PROCESS_MAIN in vacuum_rel()
4211fbd has been handling PROCESS_MAIN in vacuum_rel() with an "if/else
if" structure to avoid an extra level of indentation, but this has been
found as being rather parse to read.  This commit updates the code so as
we check for PROCESS_MAIN in a single place and then handle its
subpaths, FULL or non-FULL vacuums.  Some comments are added to make
that clearer for the reader.

Reported-by: Melanie Plageman
Author: Nathan Bossart
Reviewed-by: Michael Paquier, Melanie Plageman
Discussion: https://postgr.es/m/20230306194009.5cn6sp3wjotd36nu@liskov
2023-03-08 09:16:44 +09:00
Tom Lane 99be6feec9 Fix more bugs caused by adding columns to the end of a view.
If a view is defined atop another view, and then CREATE OR REPLACE
VIEW is used to add columns to the lower view, then when the upper
view's referencing RTE is expanded by ApplyRetrieveRule we will have
a subquery RTE with fewer eref->colnames than output columns.  This
confuses various code that assumes those lists are always in sync,
as they are in plain parser output.

We have seen such problems before (cf commit d5b760ecb), and now
I think the time has come to do what was speculated about in that
commit: let's make ApplyRetrieveRule synthesize some column names to
preserve the invariant that holds in parser output.  Otherwise we'll
be chasing this class of bugs indefinitely.  Moreover, it appears from
testing that this actually gives us better results in the test case
d5b760ecb added, and likely in other corner cases that we lack
coverage for.

In HEAD, I replaced d5b760ecb's hack to make expandRTE exit early with
an elog(ERROR) call, since the case is now presumably unreachable.
But it seems like changing that in back branches would bring more risk
than benefit, so there I just updated the comment.

Per bug #17811 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17811-d31686b78f0dffc9@postgresql.org
2023-03-07 18:21:53 -05:00
Peter Eisentraut ce1215d9b0 Add support for unit "B" to pg_size_bytes()
This makes it consistent with the units support in GUC.

Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0106914a-9eb5-22be-40d8-652cc88c827d%40enterprisedb.com
2023-03-07 20:31:16 +01:00
Andres Freund 1be0fdb9de Fix flakey pg_stat_io test
Wrap test of pg_stat_io's tracking of shared buffer reads in a transaction to
prevent concurrent accesses (e.g. by autovacuum) causing spurious test
failures.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://www.postgresql.org/message-id/20230306190919.ai6mxdq3sygyyths%40awork3.anarazel.de
2023-03-07 10:08:38 -08:00
Michael Paquier e20b1ea157 Make get_extension_schema() available
This routine is able to retrieve the OID of the schema used with an
extension (pg_extension.extnamespace), or InvalidOid if this information
is not available.  plpgsql_check embeds a copy of this code when
performing checks on functions, as one out-of-core example.

Author: Pavel Stehule
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/CAFj8pRD+9x55hjDoi285jCcjPc8uuY_D+FLn5RpXggdz+4O2sQ@mail.gmail.com
2023-03-07 14:18:20 +09:00
David Rowley cf96907aad Fix incorrect comment in pg_get_partkeydef()
The comment claimed the output of the function was prefixed by "PARTITION
BY".  This is incorrect.

Author: Japin Li
Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/MEYP282MB166923B446FF5FE55B9DACB7B6B69@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2023-03-07 14:33:28 +13:00
Tom Lane 7fee7871b4 Fix some more cases of missed GENERATED-column updates.
If UPDATE is forced to retry after an EvalPlanQual check, it neglected
to repeat GENERATED-column computations, even though those might well
have changed since we're dealing with a different tuple than before.
Fixing this is mostly a matter of looping back a bit further when
we retry.  In v15 and HEAD that's most easily done by altering the API
of ExecUpdateAct so that it includes computing GENERATED expressions.

Also, if an UPDATE in a partitioned table turns into a cross-partition
INSERT operation, we failed to recompute GENERATED columns.  That's a
bug since 8bf6ec3ba allowed partitions to have different generation
expressions; although it seems to have no ill effects before that.
Fixing this is messier because we can now have situations where the same
query needs both the UPDATE-aligned set of GENERATED columns and the
INSERT-aligned set, and it's unclear which set will be generated first
(else we could hack things by forcing the INSERT-aligned set to be
generated, which is indeed how fe9e658f4 made it work for MERGE).
The best fix seems to be to build and store separate sets of expressions
for the INSERT and UPDATE cases.  That would create ABI issues in the
back branches, but so far it seems we can leave this alone in the back
branches.

Per bug #17823 from Hisahiro Kauchi.  The first part of this affects all
branches back to v12 where GENERATED columns were added.

Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org
2023-03-06 18:31:27 -05:00
Michael Paquier d937904cce Silence -Wmissing-braces complaints in file_utils.c
Per buildfarm member lapwing, coupled with an offline poke from Julien
Rouhaud.

6392f2a was a similar case.
2023-03-07 07:42:36 +09:00
Tom Lane b803b7d132 Fill EState.es_rteperminfos more systematically.
While testing a fix for bug #17823, I discovered that EvalPlanQualStart
failed to copy es_rteperminfos from the parent EState, resulting in
failure if anything in EPQ execution wanted to consult that information.

This led me to conclude that commit a61b1f748 had been too haphazard
about where to fill es_rteperminfos, and that we need to be sure that
that happens exactly where es_range_table gets filled.  So I changed the
signature of ExecInitRangeTable to help ensure that this new requirement
doesn't get missed.  (Indeed, pgoutput.c was also failing to fill it.
Maybe we don't ever need it there, but I wouldn't bet on that.)

No test case yet; one will arrive with the fix for #17823.
But that needs to be back-patched, while this fix is HEAD-only.

Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org
2023-03-06 13:10:57 -05:00
Robert Haas e76cbb6cd6 Reword overly-optimistic comment about backup checksum verification.
The comment implies that a single retry is sufficient to avoid
spurious checksum failures, but in fact no number of retries is
sufficient for that purpose. Update the comment accordingly.

Patch by me, reviewed by Michael Paquier.

Discussion: http://postgr.es/m/CA+TgmoZ_fFAoU6mrHt9QBs+dcYhN6yXenGTTMRebZNhtwPwHyg@mail.gmail.com
2023-03-06 10:35:15 -05:00
Robert Haas f3948b5c91 Remove an old comment that doesn't seem especially useful.
The functions that follow are concerned with various things, of
which the tar format is only one, so this comment doesn't really
seem helpful. The file isn't really divided into sections in the
way that this comment seems to contemplate -- or at least, not
any more.

Patch by me, reviewed by Michael Paquier.

Discussion: http://postgr.es/m/CA+TgmoZ_fFAoU6mrHt9QBs+dcYhN6yXenGTTMRebZNhtwPwHyg@mail.gmail.com
2023-03-06 10:27:06 -05:00
Robert Haas 33352b9279 In basebackup.c, perform end-of-file test after checksum validation.
We read blocks of data from files that we're backing up in chunks,
some multiple of BLCKSZ for each read. If checksum verification fails,
we then try rereading just the one block for which validation failed.
If that block happened to be the first block of the chunk, and if
the file was concurrently truncated to remove that block, then we'd
reach a call to bbsink_archive_contents() with a buffer length of 0.
That causes an assertion failure.

As far as I can see, there are no particularly bad consequences if
this happens in a non-assert build, and it's pretty unlikely to happen
in the first place because it requires a series of somewhat unlikely
things to happen in very quick succession. However, assertion failures
are bad, so rearrange the code to avoid that possibility.

Patch by me, reviewed by Michael Paquier.

Discussion: http://postgr.es/m/CA+TgmoZ_fFAoU6mrHt9QBs+dcYhN6yXenGTTMRebZNhtwPwHyg@mail.gmail.com
2023-03-06 10:15:32 -05:00
Daniel Gustafsson d3406d8036 Fix handling of default option values in createuser
Add description of which one is the default between two complementary
options of --bypassrls and --replication in the help text and docs. In
correspondence let the command always include the tokens corresponding
to every options of that kind in the SQL command sent to server. Tests
are updated accordingly.

Also fix the checks of some trivalue vars which were using literal zero
for checking default value instead of the enum label TRI_DEFAULT. While
not a bug, since TRI_DEFAULT is defined as zero, fixing improves read-
ability improved readability (and avoid bugs if the enum is changed).

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220810.151243.1073197628358749087.horikyota.ntt@gmail.com
2023-03-06 14:16:32 +01:00
Michael Paquier 4211fbd841 Add PROCESS_MAIN to VACUUM
Disabling this option is useful to run VACUUM (with or without FULL) on
only the toast table of a relation, bypassing the main relation.  This
option is enabled by default.

Running directly VACUUM on a toast table was already possible without
this feature, by using the non-deterministic name of a toast relation
(as of pg_toast.pg_toast_N, where N would be the OID of the parent
relation) in the VACUUM command, and it required a scan of pg_class to
know the name of the toast table.  So this feature is basically a
shortcut to be able to run VACUUM or VACUUM FULL on a toast relation,
using only the name of the parent relation.

A new switch called --no-process-main is added to vacuumdb, to work as
an equivalent of PROCESS_MAIN.

Regression tests are added to cover VACUUM and VACUUM FULL, looking at
pg_stat_all_tables.vacuum_count to see how many vacuums have run on
each table, main or toast.

Author: Nathan Bossart
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/20221230000028.GA435655@nathanxps13
2023-03-06 16:41:05 +09:00
Michael Paquier 46d490ac19 Improve the regression tests of VACUUM (PROCESS_TOAST)
All the regression tests of VACUUM (PROCESS_TOAST) were only checking if
the commands were able to run, without checking if VACUUM was really
running on what it should.  This expands this set of tests so as we now
look at pg_stat_all_tables.vacuum_count to see how many vacuums have
been run on a given table and its toast relation.

Extracted from a larger patch by the same author, as this is useful on
its own.

Special thanks to Álvaro Herrera for the idea of using
pg_stat_all_tables to check the state of the toast relation.

Author: Nathan Bossart
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/20221230000028.GA435655@nathanxps13
2023-03-06 15:40:56 +09:00
Amit Kapila 9effa55236 Deduplicate handling of binary and text modes in logicalrep_read_tuple().
Author: Bharath Rupireddy
Reviewed-by: Peter Smith
Discussion: https://postgr.es/m/CALj2ACXdbq7kW_+bRrSGMsR6nefCvwbHBJ5J51mr3gFf7QysTA@mail.gmail.com
2023-03-06 09:54:57 +05:30
Michael Paquier ce340e530d Revise pg_pwrite_zeros()
The following changes are made to pg_write_zeros(), the API able to
write series of zeros using vectored I/O:
- Add of an "offset" parameter, to write the size from this position
(the 'p' of "pwrite" seems to mean position, though POSIX does not
outline ythat directly), hence the name of the routine is incorrect if
it is not able to handle offsets.
- Avoid memset() of "zbuffer" on every call.
- Avoid initialization of the whole IOV array if not needed.
- Group the trailing write() call with the main write() call,
simplifying the function logic.

Author: Andres Freund
Reviewed-by: Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/20230215005525.mrrlmqrxzjzhaipl@awork3.anarazel.de
2023-03-06 13:21:33 +09:00
Thomas Munro 47c0accbe0 Fix assert failures in parallel SERIALIZABLE READ ONLY.
1.  Make sure that we don't decrement SxactGlobalXminCount twice when
the SXACT_FLAG_RO_SAFE optimization is reached in a parallel query.
This could trigger a sanity check failure in assert builds.  Non-assert
builds recompute the count in SetNewSxactGlobalXmin(), so the problem
was hidden, explaining the lack of field reports.  Add a new isolation
test to exercise that case.

2.  Remove an assertion that the DOOMED flag can't be set on a partially
released SERIALIZABLEXACT.  Instead, ignore the flag (our transaction
was already determined to be read-only safe, and DOOMED is in fact set
during partial release, and there was already an assertion that it
wasn't set sooner).  Improve an existing isolation test so that it
reaches that case (previously it wasn't quite testing what it was
supposed to be testing; see discussion).

Back-patch to 12.  Bug #17116.  Defects in commit 47a338cf.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17116-d6ca217acc180e30%40postgresql.org
2023-03-06 15:07:15 +13:00
Peter Eisentraut 102a5c164a SQL JSON path enhanced numeric literals
Add support for non-decimal integer literals and underscores in
numeric literals to SQL JSON path language.  This follows the rules of
ECMAScript, as referred to by the SQL standard.

Internally, all the numeric literal parsing of jsonpath goes through
numeric_in, which already supports all this, so this patch is just a
bit of lexer work and some tests and documentation.

Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b11b25bb-6ec1-d42f-cedd-311eae59e1fb@enterprisedb.com
2023-03-05 15:19:58 +01:00
Tom Lane 6949b921d5 Avoid failure when altering state of partitioned foreign-key triggers.
Beginning in v15, if you apply ALTER TABLE ENABLE/DISABLE TRIGGER to
a partitioned table, it also affects the partitions' cloned versions
of the affected trigger(s).  The initial implementation of this
located the clones by name, but that fails on foreign-key triggers
which have names incorporating their own OIDs.  We can fix that, and
also make the behavior more bulletproof in the face of user-initiated
trigger renames, by identifying the cloned triggers by tgparentid.

Following the lead of earlier commits in this area, I took care not
to break ABI in the v15 branch, even though I rather doubt there
are any external callers of EnableDisableTrigger.

While here, update the documentation, which was not touched when
the semantics were changed.

Per bug #17817 from Alan Hodgson.  Back-patch to v15; older versions
do not have this behavior.

Discussion: https://postgr.es/m/17817-31dfb7c2100d9f3d@postgresql.org
2023-03-04 13:32:35 -05:00
Tom Lane f62975b2af Tighten header pre-inclusions in headerscheck and cpluspluscheck.
We allow our header files to depend on the appropriate one of
postgres.h, postgres_fe.h, or c.h having already been included.
However, there are a few headers such as libpq-fe.h that are
meant to be used by client applications and therefore must
compile without any assumptions about previous inclusions.
These test scripts failed to consider that, which seems quite
hazardous since we might not immediately notice such a problem
otherwise.  Hence, adjust these scripts to test relevant libpq
and ecpg headers with no prior inclusion.

While at it, we can also make an effort to actually use the
relevant one of postgres.h, postgres_fe.h, or c.h.  I added
some rules that guess which one to use based on the first-level
src subdirectory, e.g. use postgres_fe.h under src/bin/.
These rules are hardly water-tight but they seem to work today,
and we can always refine them in the future.

These changes don't reveal any live problems today, which is good,
but they should make these scripts more able to catch future bugs.

Discussion: https://postgr.es/m/2488193.1677863247@sss.pgh.pa.us
2023-03-04 12:11:50 -05:00
Robert Haas ebd551f586 Update some incorrect comments about xlog records.
The comments claim that certain pieces of data are part of the main
WAL record data when in reality they are part of the data for
block 0. Repair.

Bertrand Drouvot, reviewed by Amit Kapila. Originally reported by me.

Discussion: http://postgr.es/m/80db7836-4415-d54a-64c3-66b88b1430e7@gmail.com
2023-03-03 12:52:04 -05:00
Peter Eisentraut b6a0d469ca meson: Prevent installation of test files during main install
Previously, meson installed modules under src/test/modules/ as part of
a normal installation, even though these files are only meant for use
by tests.  This is because there is no way to set up up the build
system to install extra things only when told.

This patch fixes that with a workaround: We don't install these
modules as part of meson install, but we create a new "test" that runs
before the real tests whose action it is to install these files.  The
installation is done by manual copies using a small helper script.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/2a039e8e-f31f-31e8-afe7-bab3130ad2de%40enterprisedb.com
2023-03-03 07:45:52 +01:00
Peter Eisentraut b1307b8b60 Fix incorrect format placeholders 2023-03-03 07:01:18 +01:00
Michael Paquier d28a449854 Force testing of query jumbling in 027_stream_regress.pl
Coverage of the query jumbling code has always relied on the queries
included in the regression tests of pg_stat_statements.  This has its
limitations, as a lot of query patterns have never really stressed the
query jumbling code.  The situation got a bit worse since the query
jumbling has been added in the backend core code (5fd9dfa), hence new
nodes that should be included in the jumbling could easily be missed,
resulting in failures in pg_stat_statements or any modules that require
query ID computations.  Forcing a load of pg_stat_statements in
027_stream_regress.pl ensures that nodes are never missed in the
computations, without having to rely on a buildfarm member for this
check.

Before this commit, the line coverage of queryjumblefuncs.funcs.c was
around 48.5%, now up to 94.6% just by running 027_stream_regress.pl.
A basic check is added to show that pg_stat_statements reports are
generated after the main regression test suite is finished.

Discussion: https://postgr.es/m/Y+nD9LN70w+8eaG9@paquier.xyz
2023-03-03 10:41:51 +09:00
Tom Lane 98a88bc2bc Harden new test case against force_parallel_mode = regress.
Per buildfarm: worker processes can't see a role created in
the current transaction.
2023-03-02 17:47:20 -05:00
Tom Lane 3dfae91f7a Show "internal name" not "source code" in psql's \df+ command.
Our previous habit of showing the full function body is really
pretty unfriendly for tabular viewing of functions, and now that
we have \sf and \ef commands there seems no good reason why \df+
has to do it.  It still seems to make sense to show prosrc for
internal and C-language functions, since in those cases prosrc
is just the C function name; but then let's rename the column to
"Internal name" which is a more accurate descriptor.

Isaac Morland

Discussion: https://postgr.es/m/CAMsGm5eqKc6J1=Lwn=ZONG=6ZDYWRQ4cgZQLqMuZGB1aVt_JBg@mail.gmail.com
2023-03-02 17:15:13 -05:00
Thomas Munro 1da569ca1f Don't leak descriptors into subprograms.
Open long-lived data and WAL file descriptors with O_CLOEXEC.  This flag
was introduced by SUSv4 (POSIX.1-2008), and by now all of our target
Unix systems have it.  Our open() implementation for Windows already had
that behavior, so provide a dummy O_CLOEXEC flag on that platform.

For now, callers of open() and the "thin" wrappers in fd.c that deal in
raw descriptors need to pass in O_CLOEXEC explicitly if desired.  This
commit does that for WAL files, and automatically for everything
accessed via VFDs including SMgrRelation and BufFile.  (With more
discussion we might decide to turn it on automatically for the thin
open()-wrappers too to avoid risk of missing places that need it, but
these are typically used for short-lived descriptors where we don't
expect to fork/exec, and it's remotely possible that extensions could be
using these APIs and passing descriptors to subprograms deliberately, so
that hasn't been done here.)

Do the same for sockets and the postmaster pipe with FD_CLOEXEC.  (Later
commits might use modern interfaces to remove these extra fcntl() calls
and more where possible, but we'll need them as a fallback for a couple
of systems, so do it that way in this initial commit.)

With this change, subprograms executed for archiving, copying etc will
no longer have access to the server's descriptors, other than the ones
that we decide to pass down.

Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGKb6FsAdQWcRL35KJsftv%2B9zXqQbzwkfRf1i0J2e57%2BhQ%40mail.gmail.com
2023-03-03 10:43:33 +13:00
Tom Lane 6b661b01f4 Remove local optimizations of empty Bitmapsets into null pointers.
These are all dead code now that it's done centrally.

Patch by me; thanks to Nathan Bossart and Richard Guo for review.

Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
2023-03-02 12:01:47 -05:00
Tom Lane 00b41463c2 Require empty Bitmapsets to be represented as NULL.
When I designed the Bitmapset module, I set things up so that an empty
Bitmapset could be represented either by a NULL pointer, or by an
allocated object all of whose bits are zero.  I've recently come to
the conclusion that that was a bad idea and we should instead have a
convention like the longstanding invariant for Lists, whereby an empty
list is represented by NIL and nothing else.

To do this, we need to fix bms_intersect, bms_difference, and a couple
of other functions to check for having produced an empty result; but
then we can replace bms_is_empty(a) by a simple "a == NULL" test.

This is very likely a (marginal) win performance-wise, because we
call bms_is_empty many more times than those other functions put
together.  However, the real reason to do it is that we have various
places that have hand-implemented a rule about "this Bitmapset
variable must be exactly NULL if empty", so that they can use
checks-for-null in place of bms_is_empty calls in particularly hot
code paths.  That is a really fragile, mistake-prone way to do things,
and I'm surprised that we've seldom been bitten by it.  It's not well
documented at all which variables have this property, so you can't
readily tell which code might be violating those conventions.  By
making the convention universal, we can eliminate a subtle source of
bugs.

Patch by me; thanks to Nathan Bossart and Richard Guo for review.

Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
2023-03-02 11:47:26 -05:00
Tom Lane 141225b251 Mop up some undue familiarity with the innards of Bitmapsets.
nodeAppend.c used non-nullness of appendstate->as_valid_subplans as
a state flag to indicate whether it'd done ExecFindMatchingSubPlans
(or some sufficient approximation to that).  This was pretty
questionable even in the beginning, since it wouldn't really work
right if there are no valid subplans.  It got more questionable
after commit 27e1f1456 added logic that could reduce as_valid_subplans
to an empty set: at that point we were depending on unspecified
behavior of bms_del_members, namely that it'd not return an empty
set as NULL.  It's about to start doing that, which breaks this
logic entirely.  Hence, add a separate boolean flag to signal
whether as_valid_subplans has been computed.

Also fix a previously-cosmetic bug in nodeAgg.c, wherein it ignored
the return value of bms_del_member instead of updating its pointer.

Patch by me; thanks to Nathan Bossart and Richard Guo for review.

Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
2023-03-02 11:37:37 -05:00
Tom Lane 462bb7f128 Remove bms_first_member().
This function has been semi-deprecated ever since we invented
bms_next_member().  Its habit of scribbling on the input bitmapset
isn't great, plus for sufficiently large bitmapsets it would take
O(N^2) time to complete a loop.  Now we have the additional problem
that reducing the input to empty while leaving it still accessible
would violate a planned invariant.  So let's just get rid of it,
after updating the few extant callers to use bms_next_member().

Patch by me; thanks to Nathan Bossart and Richard Guo for review.

Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
2023-03-02 11:34:29 -05:00
Daniel Gustafsson 2f80c95740 Mark options as deprecated in usage output
Some deprecated options were not marked as such in usage output.  This
does so across the installed binaries in an attempt to provide consistent
markup for this.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/062C6A8A-A4E8-4F52-9E31-45F0C9E9915E@yesql.se
2023-03-02 14:36:37 +01:00
Daniel Gustafsson 7ab1bc2939 Fix outdated references to guc.c
Commit 0a20ff54f split out the GUC variables from guc.c into a new file
guc_tables.c. This updates comments referencing guc.c regarding variables
which are now in guc_tables.c.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/6B50C70C-8C1F-4F9A-A7C0-EEAFCC032406@yesql.se
2023-03-02 13:49:39 +01:00
Peter Eisentraut 4ac30ba4f2 Make some xlogreader messages more accurate
When you have some invalid WAL, you often get a message like "wanted
24, got 0".  This is a bit incorrect, since it really wanted *at
least* 24, not exactly 24.  This updates the messages to that effect,
and also adds that detail to one message where it was available but
not printed.

Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Jeevan Ladhe <jeevanladhe.os@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/726d782b-5e45-0c3e-d775-6686afe9aa83%40enterprisedb.com
2023-03-02 07:46:12 +01:00
Tom Lane d7056bc1c7 Avoid fetching one past the end of translate()'s "to" parameter.
This is usually harmless, but if you were very unlucky it could
provoke a segfault due to the "to" string being right up against
the end of memory.  Found via valgrind testing (so we might've
found it earlier, except that our regression tests lacked any
exercise of translate()'s deletion feature).

Fix by switching the order of the test-for-end-of-string and
advance-pointer steps.  While here, compute "to_ptr + tolen"
just once.  (Smarter compilers might figure that out for
themselves, but let's just make sure.)

Report and fix by Daniil Anisimov, in bug #17816.

Discussion: https://postgr.es/m/17816-70f3d2764e88a108@postgresql.org
2023-03-01 11:30:31 -05:00
Tomas Vondra 6095069b40 Improve wording in pg_dump compression docs
A couple minor corrections in pg_dump comments and docs, related to the
recently introduced compression API.

Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20230227044910.GO1653@telsasoft.com
2023-03-01 16:11:38 +01:00