Commit Graph

34034 Commits

Author SHA1 Message Date
Tom Lane 7ddc428ef0 Update time zone data files to tzdata release 2022g.
DST law changes in Greenland and Mexico.  Notably, a new timezone
America/Ciudad_Juarez has been split off from America/Ojinaga.

Historical corrections for northern Canada, Colombia, and Singapore.
2023-01-31 17:37:34 -05:00
Michael Paquier 403d82dd54 Remove recovery test 011_crash_recovery.pl
This test has been added as of 857ee8e that has introduced the SQL
function txid_status(), with the purpose of checking that a transaction
ID still in-progress during a crash is correctly marked as aborted after
recovery finishes.

This test is unstable, and some configuration scenarios may that easier
to reproduce (wal_level=minimal, wal_compression=on) because the WAL
holding the information about the in-progress transaction ID may not
have made it to disk yet, hence a post-crash recovery may cause the same
XID to be reused, triggering a test failure.

We have discussed a few approaches, like making this function force a
WAL flush to make it reliable across crashes, but we don't want to pay a
performance penalty in some scenarios, as well.  The test could have
been tweaked to enforce a checkpoint but that actually breaks the
promise of the test to rely on a stable result of txid_status() after
a crash.

This issue has been reported a few times across the past years, with an
original report from Kyotaro Horiguchi.  The buildfarm machines tanager,
hachi and gokiburi enable wal_compression, and fail on this test
periodically.

Discussion: https://postgr.es/m/3163112.1674762209@sss.pgh.pa.us
Discussion: https://postgr.es/m/20210305.115011.558061052471425531.horikyota.ntt@gmail.com
Backpatch-through: 11
2023-01-31 12:47:20 +09:00
Thomas Munro d95dcc9ab5 Fix rare sharedtuplestore.c corruption.
If the final chunk of an oversized tuple being written out to disk was
exactly 32760 bytes, it would be corrupted due to a fencepost bug.

Bug #17619.  Back-patch to 11 where the code arrived.

While testing that (see test module in archives), I (tmunro) noticed
that the per-participant page counter was not initialized to zero as it
should have been; that wasn't a live bug when it was written since DSM
memory was originally always zeroed, but since 14
min_dynamic_shared_memory might be configured and it supplies non-zeroed
memory, so that is also fixed here.

Author: Dmitry Astapov <dastapov@gmail.com>
Discussion: https://postgr.es/m/17619-0de62ceda812b8b5%40postgresql.org
2023-01-26 14:55:37 +13:00
Andres Freund 243373159f Fix error handling in libpqrcv_connect()
When libpqrcv_connect (also known as walrcv_connect()) failed, it leaked the
libpq connection. In most paths that's fairly harmless, as the calling process
will exit soon after. But e.g. CREATE SUBSCRIPTION could lead to a somewhat
longer lived leak.

Fix by releasing resources, including the libpq connection, on error.

Add a test exercising the error code path. To make it reliable and safe, the
test tries to connect to port=-1, which happens to fail during connection
establishment, rather than during connection string parsing.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20230121011237.q52apbvlarfv6jm6@awork3.anarazel.de
Backpatch: 11-
2023-01-23 18:27:58 -08:00
Tom Lane 6c122eddec Allow REPLICA IDENTITY to be set on an index that's not (yet) valid.
The motivation for this change is that when pg_dump dumps a
partitioned index that's marked REPLICA IDENTITY, it generates a
command sequence that applies REPLICA IDENTITY before the partitioned
index has been marked valid, causing restore to fail.  We could
perhaps change pg_dump to not do it like that, but that would be
difficult and would not fix existing dump files with the problem.
There seems to be very little reason for the backend to disallow
this anyway --- the code ignores indisreplident when the index
isn't valid --- so instead let's fix it by allowing the case.

Commit 9511fb37a previously expressed a concern that allowing
indisreplident to be set on invalid indexes might allow us to
wind up in a situation where a table could have indisreplident
set on multiple indexes.  I'm not sure I follow that concern
exactly, but in any case the only way that could happen is because
relation_mark_replica_identity is too trusting about the existing set
of markings being valid.  Let's just rip out its early-exit code path
(which sure looks like premature optimization anyway; what are we
doing expending code to make redundant ALTER TABLE ... REPLICA
IDENTITY commands marginally faster and not-redundant ones marginally
slower?) and fix it to positively guarantee that no more than one
index is marked indisreplident.

The pg_dump failure can be demonstrated in all supported branches,
so back-patch all the way.  I chose to back-patch 9511fb37a as well,
just to keep indisreplident handling the same in all branches.

Per bug #17756 from Sergey Belyashov.

Discussion: https://postgr.es/m/17756-dd50e8e0c8dd4a40@postgresql.org
2023-01-21 13:10:30 -05:00
Noah Misch 8f70de7e01 Reject CancelRequestPacket having unexpected length.
When the length was too short, the server read outside the allocation.
That yielded the same log noise as sending the correct length with
(backendPID,cancelAuthCode) matching nothing.  Change to a message about
the unexpected length.  Given the attacker's lack of control over the
memory layout and the general lack of diversity in memory layouts at the
code in question, we doubt a would-be attacker could cause a segfault.
Hence, while the report arrived via security@postgresql.org, this is not
a vulnerability.  Back-patch to v11 (all supported versions).

Andrey Borodin, reviewed by Tom Lane.  Reported by Andrey Borodin.
2023-01-21 06:08:05 -08:00
Tom Lane b69e9dfab1 Make our back branches build under -fkeep-inline-functions.
Add "#ifndef FRONTEND" where necessary to make pg_waldump build
on compilers that don't elide unused static-inline functions.

This back-patches relevant parts of commit 3e9ca5260, fixing build
breakage from dc7420c2c and back-patching of f10f0ae42.

Per recently-resurrected buildfarm member castoroides.  We aren't
expecting castoroides to build anything newer than v11, but we
might as well clean up the intermediate branches while at it.
2023-01-20 11:58:12 -05:00
Tom Lane 0a269527f6 Log the correct ending timestamp in recovery_target_xid mode.
When ending recovery based on recovery_target_xid matching with
recovery_target_inclusive = off, we printed an incorrect timestamp
(always 2000-01-01) in the "recovery stopping before ... transaction"
log message.  This is a consequence of sloppy refactoring in
c945af80c: the code to fetch recordXtime out of the commit/abort
record used to be executed unconditionally, but it was changed
to get called only in the RECOVERY_TARGET_TIME case.  We need only
flip the order of operations to restore the intended behavior.

Per report from Torsten Förtsch.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/CAKkG4_kUevPqbmyOfLajx7opAQk6Cvwkvx0HRcFjSPfRPTXanA@mail.gmail.com
2023-01-19 12:23:20 -05:00
Michael Paquier 0c2f34af7e Add missing assign hook for GUC checkpoint_completion_target
This is wrong since 88e9823, that has switched the WAL sizing
configuration from checkpoint_segments to min_wal_size and
max_wal_size.  This missed the recalculation of the internal value of
the internal "CheckPointSegments", that works as a mapping of the old
GUC checkpoint_segments, on reload, for example, and it controls the
timing of checkpoints depending on the volume of WAL generated.

Most users tend to leave checkpoint_completion_target at 0.9 to smooth
the I/O workload, which is why I guess this has gone unnoticed for so
long, still it can be useful to tweak and reload the value dynamically
in some cases to control the timing of checkpoints.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACXgPPAm28mruojSBno+F_=9cTOOxHAywu_dfZPeBdybQw@mail.gmail.com
Backpatch-through: 11
2023-01-19 13:13:34 +09:00
Michael Paquier 145bc5debf Fix failure with perlcritic in psql's create_help.pl
No buildfarm members have reported that yet, but a recently-refreshed
Debian host did.

Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/Y8ey5z4Nav62g4/K@paquier.xyz
Backpatch-through: 11
2023-01-19 10:02:15 +09:00
Tom Lane c94d684bf4 AdjustUpgrade.pm should zap test_ext_cine, too.
test_extensions' test_ext_cine extension has the same upgrade hazard
as test_ext7: the regression test leaves it in an updated state
from which no downgrade path to default is provided.  This causes
the update_extensions.sql script helpfully provided by pg_upgrade
to fail.  So drop it in cross-version-upgrade testing.

Not entirely sure how come I didn't hit this in testing yesterday;
possibly I'd built the upgrade reference databases with
testmodules-install-check disabled.

Backpatch to v10 where this module was introduced.
2023-01-17 16:01:11 -05:00
Tom Lane f02a752228 Create common infrastructure for cross-version upgrade testing.
To test pg_upgrade across major PG versions, we have to be able to
modify or drop any old objects with no-longer-supported properties,
and we have to be able to deal with cosmetic changes in pg_dump output.
Up to now, the buildfarm and pg_upgrade's own test infrastructure had
separate implementations of the former, and we had nothing but very
ad-hoc rules for the latter (including an arbitrary threshold on how
many lines of unchecked diff were okay!).  This patch creates a Perl
module that can be shared by both those use-cases, and adds logic
that deals with pg_dump output diffs in a much more tightly defined
fashion.

This largely supersedes previous efforts in commits 0df9641d3,
9814ff550, and 62be9e4cd, which developed a SQL-script-based solution
for the task of dropping old objects.  There was nothing fundamentally
wrong with that work in itself, but it had no basis for solving the
output-formatting problem.  The most plausible way to deal with
formatting is to build a Perl module that can perform editing on the
dump files; and once we commit to that, it makes more sense for the
same module to also embed the knowledge of what has to be done for
dropping old objects.

Back-patch versions of the helper module as far as 9.2, to
support buildfarm animals that still test that far back.
It's also necessary to back-patch PostgreSQL/Version.pm,
because the new code depends on that.  I fixed up pg_upgrade's
002_pg_upgrade.pl in v15, but did not look into back-patching
it further than that.

Tom Lane and Andrew Dunstan

Discussion: https://postgr.es/m/891521.1673657296@sss.pgh.pa.us
2023-01-16 20:35:53 -05:00
Thomas Munro 1b40710a8c Fix WaitEventSetWait() buffer overrun.
The WAIT_USE_EPOLL and WAIT_USE_KQUEUE implementations of
WaitEventSetWaitBlock() confused the size of their internal buffer with
the size of the caller's output buffer, and could ask the kernel for too
many events.  In fact the set of events retrieved from the kernel needs
to be able to fit in both buffers, so take the smaller of the two.

The WAIT_USE_POLL and WAIT_USE WIN32 implementations didn't have this
confusion.

This probably didn't come up before because we always used the same
number in both places, but commit 7389aad6 calculates a dynamic size at
construction time, while using MAXLISTEN for its output event buffer on
the stack.  That seems like a reasonable thing to want to do, so
consider this to be a pre-existing bug worth fixing.

As discovered by valgrind on skink.

Back-patch to all supported releases for epoll, and to release 13 for
the kqueue part, which copied the incorrect epoll code.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/901504.1673504836%40sss.pgh.pa.us
2023-01-13 10:54:20 +13:00
Dean Rasheed c54b888700 Fix tab completion of ALTER FUNCTION/PROCEDURE/ROUTINE ... SET SCHEMA.
The ALTER DATABASE|FUNCTION|PROCEDURE|ROLE|ROUTINE|USER ... SET <name>
case in psql tab completion failed to exclude <name> = "SCHEMA", which
caused ALTER FUNCTION|PROCEDURE|ROUTINE ... SET SCHEMA to complete
with "FROM CURRENT" and "TO", which won't work.

Fix that, so that those cases now complete with the list of schemas,
like other ALTER ... SET SCHEMA commands.

Noticed while testing the recent patch to improve tab completion for
ALTER FUNCTION/PROCEDURE/ROUTINE, but this is not directly related to
that patch. Rather, this is a long-standing bug, so back-patch to all
supported branches.

Discussion: https://postgr.es/m/CALDaNm0s7GQmkLP_mx5Cvk=UzYMnjhPmXBxU8DsHEunFbC5sTg@mail.gmail.com
2023-01-06 11:09:56 +00:00
Michael Paquier a80740a7c9 Fix typos in comments, code and documentation
While on it, newlines are removed from the end of two elog() strings.
The others are simple grammar mistakes.  One comment in pg_upgrade
referred incorrectly to sequences since a7e5457.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com
Backpatch-through: 11
2023-01-03 16:26:38 +09:00
Andres Freund 99f8bc335c perl: Hide warnings inside perl.h when using gcc compatible compiler
New versions of perl trigger warnings within perl.h with our compiler
flags. At least -Wdeclaration-after-statement, -Wshadow=compatible-local are
known to be problematic.

To avoid these warnings, conditionally use #pragma GCC system_header before
including plperl.h.

Alternatively, we could add the include paths for problematic headers with
-isystem, but that is a larger hammer and is harder to search for.

A more granular alternative would be to use #pragma GCC diagnostic
push/ignored/pop, but gcc warns about unknown warnings being ignored, so every
to-be-ignored-temporarily compiler warning would require its own pg_config.h
symbol and #ifdef.

As the warnings are voluminous, it makes sense to backpatch this change. But
don't do so yet, we first want gather buildfarm coverage - it's e.g. possible
that some compiler claiming to be gcc compatible has issues with the pragma.

Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: Discussion: https://postgr.es/m/20221228182455.hfdwd22zztvkojy2@awork3.anarazel.de
2023-01-02 15:51:05 -08:00
Tom Lane 982b9b1eba Avoid reference to nonexistent array element in ExecInitAgg().
When considering an empty grouping set, we fetched
phasedata->eqfunctions[-1].  Because the eqfunctions array is
palloc'd, that would always be an aset pointer in released versions,
and thus the code accidentally failed to malfunction (since it would
do nothing unless it found a null pointer).  Nonetheless this seems
like trouble waiting to happen, so add a check for length == 0.

It's depressing that our valgrind testing did not catch this.
Maybe we should reconsider the choice to not mark that word NOACCESS?

Richard Guo

Discussion: https://postgr.es/m/CAMbWs4-vZuuPOZsKOYnSAaPYGKhmacxhki+vpOKk0O7rymccXQ@mail.gmail.com
2023-01-02 16:17:00 -05:00
Michael Paquier df6fea51f0 Fix come incorrect elog() messages in aclchk.c
Three error strings used with cache lookup failures were referring to
incorrect object types for ACL checks:
- Schemas
- Types
- Foreign Servers
There errors should never be triggered, but if they do incorrect
information would be reported.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20221222153041.GN1153@telsasoft.com
Backpatch-through: 11
2022-12-23 10:04:37 +09:00
Tom Lane 8cd700cc5a Add some recursion and looping defenses in prepjointree.c.
Andrey Lepikhov demonstrated a case where we spend an unreasonable
amount of time in pull_up_subqueries().  Not only is that recursing
with no explicit check for stack overrun, but the code seems not
interruptable by control-C.  Let's stick a CHECK_FOR_INTERRUPTS
there, along with sprinkling some stack depth checks.

An actual fix for the excessive time consumption seems a bit
risky to back-patch; but this isn't, so let's do so.

Discussion: https://postgr.es/m/703c09a2-08f3-d2ec-b33d-dbecd62428b8@postgrespro.ru
2022-12-22 10:35:03 -05:00
Tom Lane f48aa5df4e Rethink handling of [Prevent|Is]InTransactionBlock in pipeline mode.
Commits f92944137 et al. made IsInTransactionBlock() set the
XACT_FLAGS_NEEDIMMEDIATECOMMIT flag before returning "false",
on the grounds that that kept its API promises equivalent to those of
PreventInTransactionBlock().  This turns out to be a bad idea though,
because it allows an ANALYZE in a pipelined series of commands to
cause an immediate commit, which is unexpected.

Furthermore, if we return "false" then we have another issue,
which is that ANALYZE will decide it's allowed to do internal
commit-and-start-transaction sequences, thus possibly unexpectedly
committing the effects of previous commands in the pipeline.

To fix the latter situation, invent another transaction state flag
XACT_FLAGS_PIPELINING, which explicitly records the fact that we
have executed some extended-protocol command and not yet seen a
commit for it.  Then, require that flag to not be set before allowing
InTransactionBlock() to return "false".

Having done that, we can remove its setting of NEEDIMMEDIATECOMMIT
without fear of causing problems.  This means that the API guarantees
of IsInTransactionBlock now diverge from PreventInTransactionBlock,
which is mildly annoying, but it seems OK given the very limited usage
of IsInTransactionBlock.  (In any case, a caller preferring the old
behavior could always set NEEDIMMEDIATECOMMIT for itself.)

For consistency also require XACT_FLAGS_PIPELINING to not be set
in PreventInTransactionBlock.  This too is meant to prevent commands
such as CREATE DATABASE from silently committing previous commands
in a pipeline.

Per report from Peter Eisentraut.  As before, back-patch to all
supported branches (which sadly no longer includes v10).

Discussion: https://postgr.es/m/65a899dd-aebc-f667-1d0a-abb89ff3abf8@enterprisedb.com
2022-12-13 14:23:59 -05:00
Tom Lane 2df0733139 Fix generate_partitionwise_join_paths() to tolerate failure.
We might fail to generate a partitionwise join, because
reparameterize_path_by_child() does not support all path types.
This should not be a hard failure condition: we should just fall back
to a non-partitioned join.  However, generate_partitionwise_join_paths
did not consider this possibility and would emit the (misleading)
error "could not devise a query plan for the given query" if we'd
failed to make any paths for a child join.  Fix it to give up on
partitionwise joining if so.  (The accepted technique for giving up
appears to be to set rel->nparts = 0, which I find pretty bizarre,
but there you have it.)

I have not added a test case because there'd be little point:
any omissions of this sort that we identify would soon get fixed
by extending reparameterize_path_by_child(), so the test would stop
proving anything.  However, right now there is a known test case based
on failure to cover MaterialPath, and with that I've found that this
is broken in all supported versions.  Hence, patch all the way back.

Original report and patch by me; thanks to Richard Guo for
identifying a test case that works against committed versions.

Discussion: https://postgr.es/m/1854233.1669949723@sss.pgh.pa.us
2022-12-04 13:17:18 -05:00
Dean Rasheed 30f9b03a08 Fix DEFAULT handling for multi-row INSERT rules.
When updating a relation with a rule whose action performed an INSERT
from a multi-row VALUES list, the rewriter might skip processing the
VALUES list, and therefore fail to replace any DEFAULTs in it. This
would lead to an "unrecognized node type" error.

The reason was that RewriteQuery() assumed that a query doing an
INSERT from a multi-row VALUES list would necessarily only have one
item in its fromlist, pointing to the VALUES RTE to read from. That
assumption is correct for the original query, but not for product
queries produced for rule actions. In such cases, there may be
multiple items in the fromlist, possibly including multiple VALUES
RTEs.

What is required instead is for RewriteQuery() to skip any RTEs from
the product query's originating query, which might include one or more
already-processed VALUES RTEs. What's left should then include at most
one VALUES RTE (from the rule action) to be processed.

Patch by me. Thanks to Tom Lane for reviewing.

Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAEZATCV39OOW7LAR_Xq4i%2BLc1Byux%3DeK3Q%3DHD_pF1o9LBt%3DphA%40mail.gmail.com
2022-12-03 12:20:02 +00:00
Andres Freund af3517c15c Prevent pgstats from getting confused when relkind of a relation changes
When the relkind of a relache entry changes, because a table is converted into
a view, pgstats can get confused in 15+, leading to crashes or assertion
failures.

For HEAD, Tom fixed this in b23cd185fd, by removing support for converting a
table to a view, removing the source of the inconsistency. This commit just
adds an assertion that a relcache entry's relkind does not change, just in
case we end up with another case of that in the future. As there's no cases of
changing relkind anymore, we can't add a test that that's handled correctly.

For 15, fix the problem by not maintaining the association with the old pgstat
entry when the relkind changes during a relcache invalidation processing. In
that case the pgstat entry needs to be unlinked first, to avoid
PgStat_TableStatus->relation getting out of sync. Also add a test reproducing
the issues.

No known problem exists in 11-14, so just add the test there.

Reported-by: vignesh C <vignesh21@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALDaNm2yXz+zOtv7y5zBd5WKT8O0Ld3YxikuU3dcyCvxF7gypA@mail.gmail.com
Discussion: https://postgr.es/m/CALDaNm3oZA-8Wbps2Jd1g5_Gjrr-x3YWrJPek-mF5Asrrvz2Dg@mail.gmail.com
Backpatch: 15-
2022-12-02 18:17:54 -08:00
Tom Lane 04f3acc97e Reject missing database name in pg_regress and cohorts.
Writing "pg_regress --dbname= ..." led to a crash, because
we weren't expecting there to be no database name supplied.
It doesn't seem like a great idea to run regression tests
in whatever is the user's default database; so rather than
supporting this case let's explicitly reject it.

Per report from Xing Guo.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/CACpMh+A8cRvtvtOWVAZsCM1DU81GK4DL26R83y6ugZ1osV=ifA@mail.gmail.com
2022-11-30 13:01:41 -05:00
Michael Paquier a282d5f034 Fix comment in fe-auth-scram.c
The frontend-side routine in charge of building a SCRAM verifier
mentioned that the restrictions applying to SASLprep on the password
with the encoding are described at the top of fe-auth-scram.c, but this
information is in auth-scram.c.

This is wrong since 8f8b9be, so backpatch all the way down as this is an
important documentation bit.

Spotted while reviewing a different patch.

Backpatch-through: 11
2022-11-30 08:38:40 +09:00
Tom Lane a6c9e1db2b Improve heuristics for compressing the KnownAssignedXids array.
Previously, we'd compress only when the active range of array entries
reached Max(4 * PROCARRAY_MAXPROCS, 2 * pArray->numKnownAssignedXids).
If max_connections is large, the first term could result in not
compressing for a long time, resulting in much wastage of cycles in
hot-standby backends scanning the array to take snapshots.  Get rid
of that term, and just bound it to 2 * pArray->numKnownAssignedXids.

That however creates the opposite risk, that we might spend too much
effort compressing.  Hence, consider compressing only once every 128
commit records.  (This frequency was chosen by benchmarking.  While
we only tried one benchmark scenario, the results seem stable over
a fairly wide range of frequencies.)

Also, force compression when processing RecoveryInfo WAL records
(which should be infrequent); the old code could perform compression
then, but would do so only after the same array-range check as for
the transaction-commit path.

Also, opportunistically run compression if the startup process is about
to wait for WAL, though not oftener than once a second.  This should
prevent cases where we waste lots of time by leaving the array
not-compressed for long intervals due to low WAL traffic.

Lastly, add a simple check to keep us from uselessly compressing
when the array storage is already compact.

Back-patch, as the performance problem is worse in pre-v14 branches
than in HEAD.

Simon Riggs and Michail Nikolaev, with help from Tom Lane and
Andres Freund.

Discussion: https://postgr.es/m/CALdSSPgahNUD_=pB_j=1zSnDBaiOtqVfzo8Ejt5J_k7qZiU1Tw@mail.gmail.com
2022-11-29 15:43:17 -05:00
Andrew Dunstan 724dd56490 Fix binary mismatch for MSVC plperl vs gcc built perl libs
When loading plperl built against Strawberry perl or the msys2 ucrt perl
that have been built with gcc, a binary mismatch has been encountered
which looks like this:

loadable library and perl binaries are mismatched (got handshake key 0000000012800080, needed 0000000012900080)

To cure this we bring the handshake keys into sync by adding
NO_THREAD_SAFE_LOCALE to the defines used to build plperl.

Discussion: https://postgr.es/m/20211005004334.tgjmro4kuachwiuc@alap3.anarazel.de
Discussion: https://postgr.es/m/c2da86a0-2906-744c-923d-16da6047875e@dunslane.net

Backpatch to all live branches.
2022-11-27 09:18:46 -05:00
Andrew Dunstan ae7c512130 Allow building with MSVC and Strawberry perl
Strawberry uses __builtin_expect which Visual C doesn't have. For this
case define it as a noop. Solution taken from vim sources.

Backpatch to all live branches
2022-11-25 15:37:34 -05:00
Amit Kapila 9b788aafdc Fix uninitialized access to InitialRunningXacts during decoding.
In commit 272248a0c, we introduced an InitialRunningXacts array to
remember transactions and subtransactions that were running when the
xl_running_xacts record that we decoded was written. This array was
allocated in the snapshot builder memory context after we restore
serialized snapshot but we forgot to reset the array while freeing the
builder memory context. So, the next time when we start decoding in the
same session where we don't restore any serialized snapshot, we ended up
using the uninitialized array and that can lead to unpredictable behavior.

This problem doesn't exist in HEAD as instead of using
InitialRunningXacts, we added the list of transaction IDs and
sub-transaction IDs, that have modified catalogs and are running during
snapshot serialization, to the serialized snapshot (see commit 7f13ac8123).

Reported-by: Maxim Orlov
Author: Masahiko Sawada
Reviewed-by: Amit Kapila, Maxim Orlov
Backpatch-through: 11
Discussion: https://postgr.es/m/CACG=ezZoz_KG+Ryh9MrU_g5e0HiVoHocEvqFF=NRrhrwKmEQJQ@mail.gmail.com
2022-11-25 08:56:54 +05:30
Alvaro Herrera b20e381422
Make multixact error message more explicit
There are recent reports involving a very old error message that we have
no history of hitting -- perhaps a recently introduced bug.  Improve the
error message in an attempt to improve our chances of investigating the
bug.

Per reports from Dimos Stamatakis and Bob Krier.

Backpatch to 11.

Discussion: https://postgr.es/m/CO2PR0801MB2310579F65529380A4E5EDC0E20A9@CO2PR0801MB2310.namprd08.prod.outlook.com
Discussion: https://postgr.es/m/17518-04e368df5ad7f2ee@postgresql.org
2022-11-24 10:45:10 +01:00
Andrew Dunstan 2f92b8ad31
Fix perl warning from commit 9b4eafcaf4
per gripe from Andres Freund and Tom Lane

Backpatch to all live branches.
2022-11-23 07:18:16 -05:00
Tom Lane b96a096dbc YA attempt at taming worst-case behavior of get_actual_variable_range.
We've made multiple attempts at preventing get_actual_variable_range
from taking an unreasonable amount of time (3ca930fc3, fccebe421).
But there's still an issue for the very first planning attempt after
deletion of a large number of extremal-valued tuples.  While that
planning attempt will set "killed" bits on the tuples it visits and
thereby reduce effort for next time, there's still a lot of work it
has to do to visit the heap and then set those bits.  It's (usually?)
not worth it to do that much work at plan time to have a slightly
better estimate, especially in a context like this where the table
contents are known to be mutating rapidly.

Therefore, let's bound the amount of work to be done by giving up
after we've visited 100 heap pages.  Giving up just means we'll
fall back on the extremal value recorded in pg_statistic, so it
shouldn't mean that planner estimates suddenly become worthless.

Note that this means we'll still gradually whittle down the problem
by setting a few more index "killed" bits in each planning attempt;
so eventually we'll reach a good state (barring further deletions),
even in the absence of VACUUM.

Simon Riggs, per a complaint from Jakub Wartak (with cosmetic
adjustments by me).  Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAKZiRmznOwi0oaV=4PHOCM4ygcH4MgSvt8=5cu_vNCfc8FSUug@mail.gmail.com
2022-11-22 14:40:46 -05:00
Andrew Dunstan 46def5267c Prevent port collisions between concurrent TAP tests
Currently there is a race condition where if concurrent TAP tests both
test that they can open a port they will assume that it is free and use
it, causing one of them to fail. To prevent this we record a reservation
using an exclusive lock, and any TAP test that discovers a reservation
checks to see if the reserving process is still alive, and looks for
another free port if it is.

Ports are reserved in a directory set by the environment setting
PG_TEST_PORT_DIR, or if that doesn't exist a subdirectory of the top
build directory as set by Makefile.global, or its own
tmp_check directory.

The prove_check recipe in Makefile.global.in is extended to export
top_builddir to the TAP tests. This was already exported by the
prove_installcheck recipes.

Per complaint from Andres Freund

Backpatched from 9b4eafcaf4 to all live branches

Discussion: https://postgr.es/m/20221002164931.d57hlutrcz4d2zi7@awork3.anarazel.de
2022-11-22 10:53:01 -05:00
Tom Lane c0eed88914 Add comments and a missing CHECK_FOR_INTERRUPTS in ts_headline.
I just spent an annoying amount of time reverse-engineering the
100%-undocumented API between ts_headline and the text search
parser's prsheadline function.  Add some commentary about that
while it's fresh in mind.  Also remove some unused macros in
wparser_def.c.

While at it, I noticed that when commit 78e73e875 added a
CHECK_FOR_INTERRUPTS call in TS_execute_recurse, it missed
doing so in the parallel function TS_phrase_execute, which
surely needs one just as much.

Back-patch because of the missing CHECK_FOR_INTERRUPTS.
Might as well back-patch the rest of this too.
2022-11-21 17:07:07 -05:00
Andres Freund 140c803723 Fix mislabeling of PROC_QUEUE->links as PGPROC, fixing UBSan on 32bit
ProcSleep() used a PGPROC* variable to point to PROC_QUEUE->links.next,
because that does "the right thing" with SHMQueueInsertBefore(). While that
largely works, it's certainly not correct and unnecessary - we can just use
SHM_QUEUE* to point to the insertion point.

Noticed when testing a 32bit of postgres with undefined behavior
sanitizer. UBSan noticed that sometimes the supposed PGPROC wasn't
sufficiently aligned (required since 46d6e5f567, ensured indirectly, via
ShmemAllocRaw() guaranteeing cacheline alignment).

For now fix this by using a SHM_QUEUE* for the insertion point. Subsequently
we should replace all the use of PROC_QUEUE and SHM_QUEUE with ilist.h, but
that's a larger change that we don't want to backpatch.

Backpatch to all supported versions - it's useful to be able to run postgres
under UBSan.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20221117014230.op5kmgypdv2dtqsf@awork3.anarazel.de
Backpatch: 11-
2022-11-19 12:36:52 -08:00
Tom Lane 8687cee44d Doc: sync src/tutorial/basics.source with SGML documentation.
basics.source is supposed to be pretty closely in step with
the examples in chapter 2 of the tutorial, but I forgot to
update it in commit f05a5e000.  Fix that, and adjust a couple
of other discrepancies that had crept in over time.

(I notice that advanced.source is nowhere near being in sync
with chapter 3, but I lack the ambition to do something
about that right now.)
2022-11-19 13:09:14 -05:00
Tom Lane b7333e8269 pg_dump: avoid unsafe function calls in getPolicies().
getPolicies() had the same disease I fixed in other places in
commit e3fcbbd62, i.e., it was calling pg_get_expr() for
expressions on tables that we don't necessarily have lock on.
To fix, restrict the query to only collect interesting rows,
rather than doing the filtering on the client side.

Back-patch of commit 3e6e86abc.  That's been in v15/HEAD long enough
to have some confidence about it, so now let's fix the problem in
older branches.

Discussion: https://postgr.es/m/2273648.1634764485@sss.pgh.pa.us
Discussion: https://postgr.es/m/7d7eb6128f40401d81b3b7a898b6b4de@W2012-02.nidsa.loc
Discussion: https://postgr.es/m/45c93d57-9973-248e-d2df-e02ca9af48d4@darold.net
2022-11-19 12:00:27 -05:00
Tom Lane b1f106420b Postpone calls of unsafe server-side functions in pg_dump.
Avoid calling pg_get_partkeydef(), pg_get_expr(relpartbound),
and regtypeout until we have lock on the relevant tables.
The existing coding is at serious risk of failure if there
are any concurrent DROP TABLE commands going on --- including
drops of other sessions' temp tables.

Back-patch of commit e3fcbbd62.  That's been in v15/HEAD long enough
to have some confidence about it, so now let's fix the problem in
older branches.

Original patch by me; thanks to Gilles Darold for back-patching
legwork.

Discussion: https://postgr.es/m/2273648.1634764485@sss.pgh.pa.us
Discussion: https://postgr.es/m/7d7eb6128f40401d81b3b7a898b6b4de@W2012-02.nidsa.loc
Discussion: https://postgr.es/m/45c93d57-9973-248e-d2df-e02ca9af48d4@darold.net
2022-11-19 11:40:30 -05:00
Tom Lane d4acf2eb94 Replace RelationOpenSmgr() with RelationGetSmgr().
This is a back-patch of the v15-era commit f10f0ae42 into older
supported branches.  The idea is to design out bugs in which an
ill-timed relcache flush clears rel->rd_smgr partway through
some code sequence that wasn't expecting that.  We had another
report today of a corner case that reliably crashes v14 under
debug_discard_caches (nee CLOBBER_CACHE_ALWAYS), and therefore
would crash once in a blue moon in the field.  We're unlikely
to get rid of all such code paths unless we adopt the more
rigorous coding rules instituted by f10f0ae42.  Therefore,
even though this is a bit invasive, it's time to back-patch.
Some comfort can be taken in the fact that f10f0ae42 has been
in v15 for 16 months without problems.

I left the RelationOpenSmgr macro present in the back branches,
even though no core code should use it anymore, in order to not break
third-party extensions in minor releases.  Such extensions might opt
to start using RelationGetSmgr instead, to reduce their code
differential between v15 and earlier branches.  This carries a hazard
of failing to compile against headers from existing minor releases.
However, once compiled the extension should work fine even with such
releases, because RelationGetSmgr is a "static inline" function so
it creates no link-time dependency.  So depending on distribution
practices, that might be an OK tradeoff.

Per report from Spyridon Dimitrios Agathos.  Original patch
by Amul Sul.

Discussion: https://postgr.es/m/CAFM5RaqdgyusQvmWkyPYaWMwoK5gigdtW-7HcgHgOeAw7mqJ_Q@mail.gmail.com
Discussion: https://postgr.es/m/CANiYTQsU7yMFpQYnv=BrcRVqK_3U3mtAzAsJCaqtzsDHfsUbdQ@mail.gmail.com
2022-11-17 16:54:31 -05:00
Noah Misch 791dd75791 Account for IPC::Run::result() Windows behavior change.
This restores compatibility with the not-yet-released successor of
version 20220807.0.  Back-patch to 9.4, which introduced this code.

Reviewed by Andrew Dunstan.

Discussion: https://postgr.es/m/20221117061805.GA4020280@rfd.leadboat.com
2022-11-17 07:39:02 -08:00
Amit Kapila 1703033f89 Fix cleanup lock acquisition in SPLIT_ALLOCATE_PAGE replay.
During XLOG_HASH_SPLIT_ALLOCATE_PAGE replay, we were checking for a
cleanup lock on the new bucket page after acquiring an exclusive lock on
it and raising a PANIC error on failure. However, it is quite possible
that checkpointer can acquire the pin on the same page before acquiring a
lock on it, and then the replay will lead to an error. So instead, directly
acquire the cleanup lock on the new bucket page during
XLOG_HASH_SPLIT_ALLOCATE_PAGE replay operation.

Reported-by: Andres Freund
Author: Robert Haas
Reviewed-By: Amit Kapila, Andres Freund, Vignesh C
Backpatch-through: 11
Discussion: https://postgr.es/m/20220810022617.fvjkjiauaykwrbse@awork3.anarazel.de
2022-11-14 09:52:06 +05:30
Jeff Davis 5eaf3e375d Fix theoretical torn page hazard.
The original report was concerned with a possible inconsistency
between the heap and the visibility map, which I was unable to
confirm. The concern has been retracted.

However, there did seem to be a torn page hazard when using
checksums. By not setting the heap page LSN during redo, the
protections of minRecoveryPoint were bypassed. Fixed, along with a
misleading comment.

It may have been impossible to hit this problem in practice, because
it would require a page tear between the checksum and the flags, so I
am marking this as a theoretical risk. But, as discussed, it did
violate expectations about the page LSN, so it may have other
consequences.

Backpatch to all supported versions.

Reported-by: Konstantin Knizhnik
Reviewed-by: Konstantin Knizhnik
Discussion: https://postgr.es/m/fed17dac-8cb8-4f5b-d462-1bb4908c029e@garret.ru
Backpatch-through: 11
2022-11-11 12:46:52 -08:00
Tom Lane 421f3363a8 Fix alter_table.sql test case to test what it claims to.
The stanza "SET STORAGE may need to add a TOAST table" does not
test what it's supposed to, and hasn't done so since we added
the ability to store constant column default values as metadata.
We need to use a non-constant default to get the expected table
rewrite to actually happen.

Fix that, and add the missing checks that would have exposed the
problem to begin with.

Noted while reviewing a patch that made changes in this test case.
Back-patch to v11 where the problem came in.
2022-11-10 17:24:26 -05:00
Tom Lane fec80da849 Doc: add comments about PreventInTransactionBlock/IsInTransactionBlock.
Add a little to the header comments for these functions to make it
clearer what guarantees about commit behavior are provided to callers.
(See commit f92944137 for context.)

Although this is only a comment change, it's really documentation
aimed at authors of extensions, so it seems appropriate to back-patch.

Yugo Nagata and Tom Lane, per further discussion of bug #17434.

Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org
2022-11-09 11:08:52 -05:00
Tom Lane eb3ff8600b Stamp 11.18. 2022-11-07 16:49:11 -05:00
Peter Eisentraut 953a0346f4 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: e10eebf472d24554d5f1a3a854f8c47508225b2d
2022-11-07 13:52:22 +01:00
Etsuro Fujita 6f47756d6f Correct error message for row-level triggers with transition tables on partitioned tables.
"Triggers on partitioned tables cannot have transition tables." is
incorrect as we allow statement-level triggers on partitioned tables to
have transition tables.

This has been wrong since commit 86f575948; back-patch to v11 where that
commit came in.

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/CAPmGK17gk4vXLzz2iG%2BG4LWRWCoVyam70nZ3OuGm1hMJwDrhcg%40mail.gmail.com
2022-11-04 19:15:08 +09:00
Tom Lane ed019b5ef9 Avoid crash after function syntax error in a replication worker.
If a syntax error occurred in a SQL-language or PL/pgSQL-language
CREATE FUNCTION or DO command executed in a logical replication worker,
we'd suffer a null pointer dereference or assertion failure.  That
seems like a rather contrived case, but nonetheless worth fixing.

The cause is that function_parse_error_transpose assumes it must be
executing within the context of a Portal, but logical/worker.c
doesn't create a Portal since it's not running the standard executor.
We can just back off the hard Assert check and make it fail gracefully
if there's not an ActivePortal.  (I have a feeling that the aggressive
check here was my fault originally, probably because I wasn't sure if
the case would always hold and wanted to find out.  Well, now we know.)

The hazard seems to exist in all branches that have logical replication,
so back-patch to v10.

Maxim Orlov, Anton Melnikov, Masahiko Sawada, Tom Lane

Discussion: https://postgr.es/m/b570c367-ba38-95f3-f62d-5f59b9808226@inbox.ru
Discussion: https://postgr.es/m/adf0452f-8c6b-7def-d35e-ab516c80088e@inbox.ru
2022-11-03 12:01:57 -04:00
Tom Lane a0f9be1f92 Allow use of __sync_lock_test_and_set for spinlocks on any machine.
If we have no special-case code in s_lock.h for the current platform,
but the compiler has __sync_lock_test_and_set, use that instead of
failing.  It's unlikely that anybody's __sync_lock_test_and_set
would be so awful as to be worse than our semaphore-based fallback,
but if it is, they can (continue to) use --disable-spinlocks.

This allows removal of the RISC-V special case installed by commit
c32fcac56, which generated exactly the same code but only on that
platform.  Usefully, the RISC-V buildfarm animals should now test
at least the int variant of this patch.

I've manually tested both variants on ARM by dint of removing the
ARM-specific stanza.  We don't want to drop that, because it already
has some special knowledge and is likely to grow more over time.
Likewise, this is not meant to preclude installing special cases
for other arches if that proves worthwhile.

Per discussion of a request to install the same code for loongarch64.
Like the previous patch, we might as well back-patch to supported
branches.

Discussion: https://postgr.es/m/761ac43d44b84d679ba803c2bd947cc0@HSMAILSVR04.hs.handsome.com.cn
2022-11-02 17:37:26 -04:00
Etsuro Fujita 36abd1fb51 Fix copy-and-pasteo in comment. 2022-11-02 18:15:08 +09:00
Tom Lane b1cb77bdf4 Update time zone data files to tzdata release 2022f.
DST law changes in Chile, Fiji, Iran, Jordan, Mexico, Palestine,
and Syria.  Historical corrections for Chile, Crimea, Iran, and
Mexico.

Also, the Europe/Kiev zone has been renamed to Europe/Kyiv
(retaining the old name as a link).

The following zones have been merged into nearby, more-populous zones
whose clocks have agreed since 1970: Antarctica/Vostok, Asia/Brunei,
Asia/Kuala_Lumpur, Atlantic/Reykjavik, Europe/Amsterdam,
Europe/Copenhagen, Europe/Luxembourg, Europe/Monaco, Europe/Oslo,
Europe/Stockholm, Indian/Christmas, Indian/Cocos, Indian/Kerguelen,
Indian/Mahe, Indian/Reunion, Pacific/Chuuk, Pacific/Funafuti,
Pacific/Majuro, Pacific/Pohnpei, Pacific/Wake and Pacific/Wallis.
(This indirectly affects zones that were already links to one of
these: Arctic/Longyearbyen, Atlantic/Jan_Mayen, Iceland,
Pacific/Ponape, Pacific/Truk, and Pacific/Yap.)  America/Nipigon,
America/Rainy_River, America/Thunder_Bay, Europe/Uzhgorod, and
Europe/Zaporozhye were also merged into nearby zones after discovering
that their claimed post-1970 differences from those zones seem to have
been errors.

While the IANA crew have been working on merging zones that have no
post-1970 differences for some time, this batch of changes affects
some zones that are significantly more populous than those merged
in the past, notably parts of Europe.  The loss of pre-1970 timezone
history for those zones may be troublesome for applications
expecting consistency of timestamptz display.  As an example, the
stored value '1944-06-01 12:00 UTC' would previously display as
'1944-06-01 13:00:00+01' if the Europe/Stockholm zone is selected,
but now it will read out as '1944-06-01 14:00:00+02'.

There exists a "packrat" option that will build the timezone data
files with this old data preserved, but the problem is that it also
resurrects a bunch of other, far less well-attested data; so much so
that actually more zones' contents change from 2022a with that option
than without it.  I have chosen not to do that here, for that reason
and because it appears that no major OS distributions are using the
"packrat" option, so that doing so would cause Postgres' behavior
to diverge significantly depending on whether it was built with
--with-system-tzdata.  However, for anyone for whom these changes pose
significant problems, there is a solution: build a set of timezone
files with the "packrat" option and use those with Postgres.
2022-11-01 17:09:16 -04:00
Michael Paquier 341fba2a62 Fix ordering issue with WAL operations in GIN fast insert path
Contrary to what is documented in src/backend/access/transam/README,
ginHeapTupleFastInsert() had a few ordering issues with the way it does
its WAL operations when inserting items in its fast path.

First, when using a separate list, XLogBeginInsert() was being always
called before START_CRIT_SECTION(), and in this case a second thing was
wrong when merging lists, as an exclusive lock was taken on the tail
page *before* calling XLogBeginInsert().  Finally, when inserting items
into a tail page, the order of XLogBeginInsert() and
START_CRIT_SECTION() was reversed.  This commit addresses all these
issues by moving the calls of XLogBeginInsert() after all the pages
logged are locked and pinned, within a critical section.

This has been applied first only on HEAD as of 56b6625, but as per
discussion with Tom Lane and Álvaro Herrera, a backpatch is preferred to
keep all the branches consistent and to respect the transam's README
where we can.

Author:  Matthias van de Meent, Zhang Mingli
Discussion: https://postgr.es/m/CAEze2WhL8uLMqynnnCu1LAPwxD5RKEo0nHV+eXGg_N6ELU88HQ@mail.gmail.com
Backpatch-through: 10
2022-10-26 09:41:28 +09:00
Robert Haas 38214dabd4 pg_basebackup: Fix cross-platform tablespace relocation.
Specifically, when pg_basebackup is invoked with -Tx=y, don't error
out if x could plausibly be an absolute path either on Windows or on
non-Windows systems. We don't know whether the remote system is
running the same OS as the local system, so it's not appropriate to
assume that our local rule about absolute pathnames is the same as
the rule on the remote system.

Patch by me, reviewed by Tom Lane, Andrew Dunstan, and
Davinder Singh.

Discussion: http://postgr.es/m/CA+TgmoY+jC3YiskomvYKDPK3FbrmsDU7_8+wMHt02HOdJeRb0g@mail.gmail.com
2022-10-21 09:05:57 -04:00
Amit Kapila 5c51afe23d Add CHECK_FOR_INTERRUPTS while restoring changes during decoding.
Previously in commit 42681dffaf, we added CFI during decoding changes but
missed another similar case that can happen while restoring changes
spilled to disk back into memory in a loop.

Reported-by: Robert Haas
Author: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/CA+TgmoaLObg0QbstbC8ykDwOdD1bDkr4AbPpB=0DPgA2JW0mFg@mail.gmail.com
2022-10-21 12:08:14 +05:30
Amit Kapila 216af69aec Fix executing invalidation messages generated by subtransactions during decoding.
This problem has been introduced by commit 272248a0c1 where we started
assigning the subtransactions to the top-level transaction when we mark
both the top-level transaction and its subtransactions as containing
catalog changes. After we assign subtransactions to the top-level
transaction, we were not allowed to execute any invalidations associated
with it when we decide to skip the transaction.

The reason to assign the subtransactions to the top-level transaction was
to avoid the assertion failure in AssertTXNLsnOrder() as they have the
same LSN when we sometimes start accumulating transaction changes for
partial transactions after the restart. Now that with commit 64ff0fe4e8,
we skip this assertion check until we reach the LSN at which we start
decoding the contents of the transaction, so, there is no reason for such
an assignment anymore.

The assignment change was introduced in 15 and prior versions but this bug
doesn't exist in branches prior to 14 since we don't add invalidation
messages to subtransactions. We decided to backpatch through 11 for
consistency but not for 10 since its final release is near.

Reported-by: Kuroda Hayato
Author: Masahiko Sawada
Reviewed-by: Amit Kapila
Backpatch-through: 11
Discussion: https://postgr.es/m/TYAPR01MB58660803BCAA7849C8584AA4F57E9%40TYAPR01MB5866.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/a89b46b6-0239-2fd5-71a9-b19b1f7a7145%40enterprisedb.com
2022-10-21 09:22:20 +05:30
Amit Kapila 5f7076cb60 Fix assertion failures while processing NEW_CID record in logical decoding.
When the logical decoding restarts from NEW_CID, since there is no
association between the top transaction and its subtransaction, both are
created as top transactions and have the same LSN. This caused the
assertion failure in AssertTXNLsnOrder().

This patch skips the assertion check until we reach the LSN at which we
start decoding the contents of the transaction, specifically
start_decoding_at LSN in SnapBuild. This is okay because we don't
guarantee to make the association between top transaction and
subtransaction until we try to decode the actual contents of transaction.
The ordering of the records prior to the start_decoding_at LSN should have
been checked before the restart.

The other assertion failure is due to the reason that we forgot to track
that we have considered top-level transaction id in the list of catalog
changing transactions that were committed when one of its subtransactions
is marked as containing catalog change.

Reported-by: Tomas Vondra, Osumi Takamichi
Author: Masahiko Sawada, Kuroda Hayato
Reviewed-by: Amit Kapila, Dilip Kumar, Kuroda Hayato, Kyotaro Horiguchi, Masahiko Sawada
Backpatch-through: 10
Discussion: https://postgr.es/m/a89b46b6-0239-2fd5-71a9-b19b1f7a7145%40enterprisedb.com
Discussion: https://postgr.es/m/TYCPR01MB83733C6CEAE47D0280814D5AED7A9%40TYCPR01MB8373.jpnprd01.prod.outlook.com
2022-10-20 09:07:04 +05:30
Thomas Munro da3a6825eb Track LLVM 15 changes.
Per https://llvm.org/docs/OpaquePointers.html, support for non-opaque
pointers still exists and we can request that on our context.  We have
until LLVM 16 to move to opaque pointers, a much larger change.

Back-patch to 11, where LLVM support arrived.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAMHz58Sf_xncdyqsekoVsNeKcruKootLtVH6cYXVhhUR1oKPCg%40mail.gmail.com
2022-10-19 22:49:25 +13:00
Tom Lane e9377e3e53 Reject non-ON-SELECT rules that are named "_RETURN".
DefineQueryRewrite() has long required that ON SELECT rules be named
"_RETURN".  But we overlooked the converse case: we should forbid
non-ON-SELECT rules that are named "_RETURN".  In particular this
prevents using CREATE OR REPLACE RULE to overwrite a view's _RETURN
rule with some other kind of rule, thereby breaking the view.

Per bug #17646 from Kui Liu.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/17646-70c93cfa40365776@postgresql.org
2022-10-17 12:14:39 -04:00
Tom Lane 6618c276b7 Rename parser token REF to REF_P to avoid a symbol conflict.
In the latest version of Apple's macOS SDK, <sys/socket.h>
fails to compile if "REF" is #define'd as something.
Apple may or may not agree that this is a bug, and even if
they do accept the bug report I filed, they probably won't
fix it very quickly.  In the meantime, our back branches will all
fail to compile gram.y.  v15 and HEAD currently escape the problem
thanks to the refactoring done in 98e93a1fc, but that's purely
accidental.  Moreover, since that patch removed a widely-visible
inclusion of <netdb.h>, back-patching it seems too likely to break
third-party code.

Instead, change the token's code name to REF_P, following our usual
convention for naming parser tokens that are likely to have symbol
conflicts.  The effects of that should be localized to the grammar
and immediately surrounding files, so it seems like a safer answer.

Per project policy that we want to keep recently-out-of-support
branches buildable on modern systems, back-patch all the way to 9.2.

Discussion: https://postgr.es/m/1803927.1665938411@sss.pgh.pa.us
2022-10-16 15:27:04 -04:00
Tom Lane 6c1de98bad Harden pmsignal.c against clobbered shared memory.
The postmaster is not supposed to do anything that depends
fundamentally on shared memory contents, because that creates
the risk that a backend crash that trashes shared memory will
take the postmaster down with it, preventing automatic recovery.
In commit 969d7cd43 I lost sight of this principle and coded
AssignPostmasterChildSlot() in such a way that it could fail
or even crash if the shared PMSignalState structure became
corrupted.  Remarkably, we've not seen field reports of such
crashes; but I managed to induce one while testing the recent
changes around palloc chunk headers.

To fix, make a semi-duplicative state array inside the postmaster
so that we need consult only local state while choosing a "child
slot" for a new backend.  Ensure that other postmaster-executed
routines in pmsignal.c don't have critical dependencies on the
shared state, either.  Corruption of PMSignalState might now
lead ReleasePostmasterChildSlot() to conclude that backend X
failed, when actually backend Y was the one that trashed things.
But that doesn't matter, because we'll force a cluster-wide reset
regardless.

Back-patch to all supported branches, since this is an old bug.

Discussion: https://postgr.es/m/3436789.1665187055@sss.pgh.pa.us
2022-10-11 18:54:31 -04:00
Tom Lane addde9bc6c Yet further fixes for multi-row VALUES lists for updatable views.
DEFAULT markers appearing in an INSERT on an updatable view
could be mis-processed if they were in a multi-row VALUES clause.
This would lead to strange errors such as "cache lookup failed
for type NNNN", or in older branches even to crashes.

The cause is that commit 41531e42d tried to re-use rewriteValuesRTE()
to remove any SetToDefault nodes (that hadn't previously been replaced
by the view's own default values) appearing in "product" queries,
that is DO ALSO queries.  That's fundamentally wrong because the
DO ALSO queries might not even be INSERTs; and even if they are,
their targetlists don't necessarily match the view's column list,
so that almost all the logic in rewriteValuesRTE() is inapplicable.

What we want is a narrow focus on replacing any such nodes with NULL
constants.  (That is, in this context we are interpreting the defaults
as being strictly those of the view itself; and we already replaced
any that aren't NULL.)  We could add still more !force_nulls tests
to further lobotomize rewriteValuesRTE(); but it seems cleaner to
split out this case to a new function, restoring rewriteValuesRTE()
to the charter it had before.

Per bug #17633 from jiye_sw.  Patch by me, but thanks to
Richard Guo and Japin Li for initial investigation.
Back-patch to all supported branches, as the previous fix was.

Discussion: https://postgr.es/m/17633-98cc85e1fa91e905@postgresql.org
2022-10-11 18:24:15 -04:00
Alvaro Herrera dd82638734
Ensure all perl test modules are installed
PostgreSQL::Test::Cluster and ::Utils were not being installed.  This is
very hard to notice, as it only seems to affect external modules that
want to run tests from 15 back in earlier versions.  Oversight in
b235d41d96.

This applies only to branches 14 and back, because 15 had already been
made correct in commit b3b4d8e68a.

Discussion: https://postgr.es/m/20221010093415.poplkyn7pjeiv2y7@alvherre.pgsql
2022-10-11 09:56:13 +02:00
Alvaro Herrera 1f2f50bc1b
Change some errdetail() to errdetail_internal()
This prevents marking the argument string for translation for gettext,
and it also prevents the given string (which is already translated) from
being translated at runtime.

Also, mark the strings used as arguments to check_rolespec_name for
translation.

Backpatch all the way back as appropriate.  None of this is caught by
any tests (necessarily so), so I verified it manually.
2022-09-28 17:14:53 +02:00
Alvaro Herrera c9a9eae569
Add missing source files to pg_waldump/nls.mk 2022-09-25 17:48:03 +02:00
Tom Lane bb8dbc9f25 Suppress more variable-set-but-not-used warnings from clang 15.
Mop up assorted set-but-not-used warnings in the back branches.
This includes back-patching relevant fixes from commit 152c9f7b8
the rest of the way, but there are also several cases that did not
appear in HEAD.  Some of those we'd fixed in a retail way but not
back-patched, and others I think just got rewritten out of existence
during nearby refactoring.

While here, also back-patch b1980f6d0 (PL/Tcl: Fix compiler warnings
with Tcl 8.6) into 9.2, so that that branch compiles warning-free
with modern Tcl.

Per project policy, this is a candidate for back-patching into
out-of-support branches: it suppresses annoying compiler warnings
but changes no behavior.  Hence, back-patch all the way to 9.2.

Discussion: https://postgr.es/m/514615.1663615243@sss.pgh.pa.us
2022-09-21 13:52:38 -04:00
Tom Lane 6ae8aee0b6 Suppress variable-set-but-not-used warnings from clang 15.
clang 15+ will issue a set-but-not-used warning when the only
use of a variable is in autoincrements (e.g., "foo++;").
That's perfectly sensible, but it detects a few more cases that
we'd not noticed before.  Silence the warnings with our usual
methods, such as PG_USED_FOR_ASSERTS_ONLY, or in one case by
actually removing a useless variable.

One thing that we can't nicely get rid of is that with %pure-parser,
Bison emits "yynerrs" as a local variable that falls foul of this
warning.  To silence those, I inserted "(void) yynerrs;" in the
top-level productions of affected grammars.

Per recently-established project policy, this is a candidate
for back-patching into out-of-support branches: it suppresses
annoying compiler warnings but changes no behavior.  Hence,
back-patch to 9.5, which is as far as these patches go without
issues.  (A preliminary check shows that the prior branches
need some other set-but-not-used cleanups too, so I'll leave
them for another day.)

Discussion: https://postgr.es/m/514615.1663615243@sss.pgh.pa.us
2022-09-20 12:04:37 -04:00
Tom Lane cb7d636e0f Future-proof the recursion inside ExecShutdownNode().
The API contract for planstate_tree_walker() callbacks is that they
take a PlanState pointer and a context pointer.  Somebody figured
they could save a couple lines of code by ignoring that, and passing
ExecShutdownNode itself as the walker even though it has but one
argument.  Somewhat remarkably, we've gotten away with that so far.
However, it seems clear that the upcoming C2x standard means to
forbid such cases, and compilers that actively break such code
likely won't be far behind.  So spend the extra few lines of code
to do it honestly with a separate walker function.

In HEAD, we might as well go further and remove ExecShutdownNode's
useless return value.  I left that as-is in back branches though,
to forestall complaints about ABI breakage.

Back-patch, with the thought that this might become of practical
importance before our stable branches are all out of service.
It doesn't seem to be fixing any live bug on any currently known
platform, however.

Discussion: https://postgr.es/m/208054.1663534665@sss.pgh.pa.us
2022-09-19 12:16:02 -04:00
Peter Geoghegan f01fd89b15 Make check_usermap() parameter names consistent.
The function has a bool argument named "case_insensitive", but that was
spelled "case_sensitive" in the declaration.  Make them consistent now
to avoid confusion in the future.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Michael Paquiër <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
Backpatch: 10-
2022-09-17 16:54:08 -07:00
Tom Lane 7391ab28a6 Improve plpgsql's ability to handle arguments declared as RECORD.
Treat arguments declared as RECORD as if that were a polymorphic type
(which it is, sort of), in that we substitute the actual argument type
while forming the function cache lookup key.  This allows the specific
composite type to be known in some cases where it was not before,
at the cost of making a separate function cache entry for each named
composite type that's passed to the function during a session.  The
particular symptom discussed in bug #17610 could be solved in other
more-efficient ways, but only at the cost of considerable development
work, and there are other cases where we'd still fail without this.

Per bug #17610 from Martin Jurča.  Back-patch to v11 where we first
allowed plpgsql functions to be declared as taking type RECORD.

Discussion: https://postgr.es/m/17610-fb1eef75bf6c2364@postgresql.org
2022-09-16 13:23:01 -04:00
Tom Lane 19bb5e46b9 In back branches, fix conditions for pullup of FROM-less subqueries.
In branches before commit 4be058fe9, we have to prevent flattening
of subqueries with empty jointrees if the subqueries' output
columns might need to be wrapped in PlaceHolderVars.  That's
because the empty jointree would result in empty phrels for the
PlaceHolderVars, meaning we'd be unable to figure out where to
evaluate them.  However, we've failed to keep is_simple_subquery's
check for this hazard in sync with what pull_up_simple_subquery
actually does.  The former is checking "lowest_outer_join != NULL",
whereas the conditions pull_up_simple_subquery actually uses are

	if (lowest_nulling_outer_join != NULL)
	if (containing_appendrel != NULL)
	if (parse->groupingSets)

So the outer-join test is overly conservative, while we missed out
checking for appendrels and groupingSets.  The appendrel omission
is harmless, because in that case we also check is_safe_append_member
which will also reject such subqueries.  The groupingSets omission
is a bug though, leading to assertion failures or planner errors
such as "variable not found in subplan target lists".

is_simple_subquery has access to none of the three variables used
in the correct tests, but its callers do, so I chose to have them
pass down a bool corresponding to the OR of these conditions.
(The need for duplicative conditions would be a maintenance
hazard in actively-developed code, but I'm not too concerned
about it in branches that have only ~ 1 year to live.)

Per bug #17614 from Wei Wei.  Patch v10 and v11 only, since we have a
better answer to this in v12 and later (indeed, the faulty check in
is_simple_subquery is gone entirely).

Discussion: https://postgr.es/m/17614-8ec20c85bdecaa2a@postgresql.org
2022-09-15 15:21:35 -04:00
Michael Paquier f01cc02259 Fix incorrect value for "strategy" with deflateParams() in walmethods.c
The zlib documentation mentions the values supported for the compression
strategy, but this code has been using a hardcoded value of 0 rather
than Z_DEFAULT_STRATEGY.  This commit adjusts the code to use
Z_DEFAULT_STRATEGY.

Backpatch down to where this code has been added to ease the backport of
any future patch touching this area.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/1400032.1662217889@sss.pgh.pa.us
Backpatch-through: 10
2022-09-14 14:52:35 +09:00
Peter Eisentraut e962235fe1 Expand palloc/pg_malloc API for more type safety
This adds additional variants of palloc, pg_malloc, etc. that
encapsulate common usage patterns and provide more type safety.

Specifically, this adds palloc_object(), palloc_array(), and
repalloc_array(), which take the type name of the object to be
allocated as its first argument and cast the return as a pointer to
that type.  There are also palloc0_object() and palloc0_array()
variants for initializing with zero, and pg_malloc_*() variants of all
of the above.

Inspired by the talloc library.

This is backpatched from master so that future backpatchable code can
make use of these APIs.  This patch by itself does not contain any
users of these APIs.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/bb755632-2a43-d523-36f8-a1e7a389a907@enterprisedb.com
2022-09-14 06:08:56 +02:00
Tom Lane fe4e151d4a Fix possible omission of variable storage markers in ECPG.
The ECPG preprocessor converted code such as

static varchar str1[10], str2[20], str3[30];

into

static  struct varchar_1  { int len; char arr[ 10 ]; }  str1 ;
        struct varchar_2  { int len; char arr[ 20 ]; }  str2 ;
        struct varchar_3  { int len; char arr[ 30 ]; }  str3 ;

thus losing the storage attribute for the later variables.
Repeat the declaration for each such variable.

(Note that this occurred only for variables declared "varchar"
or "bytea", which may help explain how it escaped detection
for so long.)

Andrey Sokolov

Discussion: https://postgr.es/m/942241662288242@mail.yandex.ru
2022-09-09 15:34:04 -04:00
Tom Lane 9bcf6fb281 Further fixes for MULTIEXPR_SUBLINK fix.
Some more things I didn't think about in commits 3f7323cbb et al:

MULTIEXPR_SUBLINK subplans might have been converted to initplans
instead of regular subplans, in which case they won't show up in
the modified targetlist.  Fortunately, this would only happen if
they have no input parameters, which means that the problem we
originally needed to fix can't happen with them.  Therefore, there's
no need to clone their output parameters, and thus it doesn't hurt
that we'll fail to see them in the first pass over the targetlist.
Nonetheless, this complicates matters greatly, because now we have
to distinguish output Params of initplans (which shouldn't get
renumbered) from those of regular subplans (which should).
This also breaks the simplistic scheme I used of assuming that the
subplans found in the targetlist have consecutive subLinkIds.
We really can't avoid the need to know the subplans' subLinkIds in
this code.  To fix that, add subLinkId as the last field of SubPlan.
We can get away with that change in back branches because SubPlan
nodes will never be stored in the catalogs, and there's no ABI
break for external code that might be looking at the existing
fields of SubPlan.

Secondly, rewriteTargetListIU might have rolled up multiple
FieldStores or SubscriptingRefs into one targetlist entry,
breaking the assumption that there's at most one Param to fix
per targetlist entry.  (That assumption is OK I think in the
ruleutils.c code I stole the logic from in 18f51083c, because
that only deals with pre-rewrite query trees.  But it's
definitely not OK here.)  Abandon that shortcut and just do a
full tree walk on the targetlist to ensure we find all the
Params we have to change.

Per bug #17606 from Andre Lin.  As before, only v10-v13 need the
patch.

Discussion: https://postgr.es/m/17606-e5c8ad18d31db96a@postgresql.org
2022-09-06 16:38:18 -04:00
Peter Geoghegan a228cca46d Backpatch nbtree page deletion hardening.
Postgres 14 commit 5b861baa taught nbtree VACUUM to tolerate buggy
opclasses.  VACUUM's inability to locate a to-be-deleted page's downlink
in the parent page was logged instead of throwing an error.  VACUUM
could just press on with vacuuming the index, and vacuuming the table as
a whole.

There are now anecdotal reports of this error causing problems that were
much more disruptive than the underlying index corruption ever could be.
Anything that makes VACUUM unable to make forward progress against one
table/index ultimately risks making the system enter xidStopLimit mode.
There is no good reason to take any chances here, so backpatch the
hardening commit.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wzm9HR6Pow=t-iQa57zT8qmX6_M4h14F-pTtb=xFDW5FBA@mail.gmail.com
Backpatch: 10-13 (all supported versions that lacked the hardening)
2022-09-05 11:20:01 -07:00
Tom Lane 56dc442447 Fix oversight in recent MULTIEXPR_SUBLINK fix.
Commits 3f7323cbb et al missed the possibility that the Params
they are looking for could be buried under implicit coercions,
as well as other stuff that processIndirection() could add to
the original targetlist entry.  Copy the code in ruleutils.c
that deals with such cases.  (I thought about refactoring so
that there's just one copy; but seeing that we only need this
in old back branches, it seems not worth the trouble.)

Per off-list report from Andre Lin.  As before, only v10-v13
need the patch.

Discussion: https://postgr.es/m/17596-c5357f61427a81dc@postgresql.org
2022-09-02 14:54:40 -04:00
David Rowley 1b11543967 Fix some possibly latent bugs in slab.c
Primarily, this fixes an incorrect calculation in SlabCheck which was
looking in the wrong byte for the sentinel check.  The reason that we've
never noticed this before in the form of a failing sentinel check is
because the pre-check to this always fails because all current core users
of slab contexts have a chunk size which is already MAXALIGNed, therefore
there's never any space for the sentinel byte.  It is possible that an
extension needs to use a slab context and if they do with a chunk size
that's not MAXALIGNed, then they'll likely get errors about overwritten
sentinel bytes.

Additionally, this patch changes various calculations which are being done
based on the sizeof(SlabBlock).  Currently, sizeof(SlabBlock) is a
multiple of 8, therefore sizeof(SlabBlock) is the same as
MAXALIGN(sizeof(SlabBlock)), however, if we were to ever have to add any
fields to that struct as part of a bug fix, then SlabAlloc could end up
returning a non-MAXALIGNed pointer.  To be safe, let's ensure we always
MAXALIGN sizeof(SlabBlock) before using it in any calculations.

This patch has already been applied to master in d5ee4db0e.

Diagnosed-by: Tomas Vondra, Tom Lane
Author: Tomas Vondra, David Rowley
Discussion: https://postgr.es/m/CAA4eK1%2B1JyW5TiL%3DyV-3Uq1CrfnTyn0Xrk5uArt31Z%3D8rgPhXQ%40mail.gmail.com
Backpatch-through: 10
2022-09-01 19:24:19 +12:00
Tom Lane cc0e506a64 Port regress-python3-mangle.mk to Solaris "sed", redux.
Per experimentation and buildfarm failures, Solaris' "sed"
has got some kind of problem with regexes that use both '*'
and '[[:alpha:]]'.  We can work around that by replacing
'[[:alpha:]]' with '[a-zA-Z]', which is plenty good enough
for our purposes, especially since this is only needed in
long-stable branches.

I chose to flat-out remove the second pattern of this sort,
's/except \([a-zA-Z][a-zA-Z.]*\), *\([a-zA-Z][a-zA-Z]*\):/except \1 as \2:/g'
because we haven't needed it since 8.4.

Follow-on to c3556f6fa, which probably missed catching this
because the problematic pattern was already gone when that
patch was written.

Patch v10-v12 only, as the problem manifests only there.
We have a line of dead code in v13-v14, which isn't worth
changing, and the whole mess is gone as of v15.

Discussion: https://postgr.es/m/165561.1661984701@sss.pgh.pa.us
2022-08-31 21:33:45 -04:00
Tom Lane cb9232d166 Prevent long-term memory leakage in autovacuum launcher.
get_database_list() failed to restore the caller's memory context,
instead leaving current context set to TopMemoryContext which is
how CommitTransactionCommand() leaves it.  The callers both think
they are using short-lived contexts, for the express purpose of
not having to worry about cleaning up individual allocations.
The net effect therefore is that supposedly short-lived allocations
could accumulate indefinitely in the launcher's TopMemoryContext.

Although this has been broken for a long time, it seems we didn't
have any obvious memory leak here until v15's rearrangement of the
stats logic.  I (tgl) am not entirely convinced that there's no
other leak at all, though, and we're surely at risk of adding one
in future back-patched fixes.  So back-patch to all supported
branches, even though this may be only a latent bug in pre-v15.

Reid Thompson

Discussion: https://postgr.es/m/972a4e12b68b0f96db514777a150ceef7dcd2e0f.camel@crunchydata.com
2022-08-31 16:23:20 -04:00
Tom Lane f5aa855cd8 In the Snowball dictionary, don't try to stem excessively-long words.
If the input word exceeds 1000 bytes, don't pass it to the stemmer;
just return it as-is after case folding.  Such an input is surely
not a word in any human language, so whatever the stemmer might
do to it would be pretty dubious in the first place.  Adding this
restriction protects us against a known recursion-to-stack-overflow
problem in the Turkish stemmer, and it seems like good insurance
against any other safety or performance issues that may exist in
the Snowball stemmers.  (I note, for example, that they contain no
CHECK_FOR_INTERRUPTS calls, so we really don't want them running
for a long time.)  The threshold of 1000 bytes is arbitrary.

An alternative definition could have been to treat such words as
stopwords, but that seems like a bigger break from the old behavior.

Per report from Egor Chindyaskin and Alexander Lakhin.
Thanks to Olly Betts for the recommendation to fix it this way.

Discussion: https://postgr.es/m/1661334672.728714027@f473.i.mail.ru
2022-08-31 10:42:05 -04:00
Tom Lane 6fd58ca770 On NetBSD, force dynamic symbol resolution at postmaster start.
The default of lazy symbol resolution means that when the postmaster
first reaches the select() call in ServerLoop, it'll need to resolve
the link to that libc entry point.  NetBSD's dynamic loader takes
an internal lock while doing that, and if a signal interrupts the
operation then there is a risk of self-deadlock should the signal
handler do anything that requires that lock, as several of the
postmaster signal handlers do.  The window for this is pretty narrow,
and timing considerations make it unlikely that a signal would arrive
right then anyway.  But it's semi-repeatable on slow single-CPU
machines, and in principle the race could happen with any hardware.

The least messy solution to this is to force binding of dynamic
symbols at postmaster start, using the "-z now" linker option.
While we're at it, also use "-z relro" so as to provide a small
security gain.

It's not entirely clear whether any other platforms share this
issue, but for now we'll assume it's NetBSD-specific.  (We might
later try to use "-z now" on more platforms for performance
reasons, but that would not likely be something to back-patch.)

Report and patch by me; the idea to fix it this way is from
Andres Freund.

Discussion: https://postgr.es/m/3384826.1661802235@sss.pgh.pa.us
2022-08-30 17:29:17 -04:00
Robert Haas 002fba80eb Prevent WAL corruption after a standby promotion.
When a PostgreSQL instance performing archive recovery but not using
standby mode is promoted, and the last WAL segment that it attempted
to read ended in a partial record, the previous code would create
invalid WAL on the new timeline. The WAL from the previously timeline
would be copied to the new timeline up until the end of the last valid
record, but instead of beginning to write WAL at immediately
afterwards, the promoted server would write an overwrite contrecord at
the beginning of the next segment. The end of the previous segment
would be left as all-zeroes, resulting in failures if anything tried
to read WAL from that file.

The root of the issue is that ReadRecord() decides whether to set
abortedRecPtr and missingContrecPtr based on the value of StandbyMode,
but ReadRecord() switches to a new timeline based on the value of
ArchiveRecoveryRequested. We shouldn't try to write an overwrite
contrecord if we're switching to a new timeline, so change the test in
ReadRecod() to check ArchiveRecoveryRequested instead.

Code fix by Dilip Kumar. Comments by me incorporating suggested
language from Álvaro Herrera. Further review from Kyotaro Horiguchi
and Sami Imseih.

Discussion: http://postgr.es/m/CAFiTN-t7umki=PK8dT1tcPV=mOUe2vNhHML6b3T7W7qqvvajjg@mail.gmail.com
Discussion: http://postgr.es/m/FB0DEA0B-E14E-43A0-811F-C1AE93D00FF3%40amazon.com
2022-08-29 12:06:30 -04:00
Tom Lane d9ebc582fb Repair rare failure of MULTIEXPR_SUBLINK subplans in inherited updates.
Prior to v14, if we have a MULTIEXPR SubPlan (that is, use of the syntax
UPDATE ... SET (c1, ...) = (SELECT ...)) in an UPDATE with an inherited
or partitioned target table, inheritance_planner() will clone the
targetlist and therefore also the MULTIEXPR SubPlan and the Param nodes
referencing it for each child target table.  Up to now, we've allowed
all the clones to share the underlying subplan as well as the output
parameter IDs -- that is, the runtime ParamExecData slots.  That
technique is borrowed from the far older code that supports initplans,
and it works okay in that case because the cloned SubPlan nodes are
essentially identical.  So it doesn't matter which one of the clones
the shared ParamExecData.execPlan field might point to.

However, this fails to hold for MULTIEXPR SubPlans, because they can
have nonempty "args" lists (values to be passed into the subplan), and
those lists could get mutated to different states in the various clones.
In the submitted reproducer, as well as the test case added here, one
clone contains Vars with varno OUTER_VAR where another has INNER_VAR,
because the child tables are respectively on the outer or inner side of
the join.  Sharing the execPlan pointer can result in trying to evaluate
an args list that doesn't match the local execution state, with mayhem
ensuing.  The result often is to trigger consistency checks in the
executor, but I believe this could end in a crash or incorrect updates.

To fix, assign new Param IDs to each of the cloned SubPlans, so that
they don't share ParamExecData slots at runtime.  It still seems fine
for the clones to share the underlying subplan, and extra ParamExecData
slots are cheap enough that this fix shouldn't cost much.

This has been busted since we invented MULTIEXPR SubPlans in 9.5.
Probably the lack of previous reports is because query plans in which
the different clones of a MULTIEXPR mutate to effectively-different
states are pretty rare.  There's no issue in v14 and later, because
without inheritance_planner() there's never a reason to clone
MULTIEXPR SubPlans.

Per bug #17596 from Andre Lin.  Patch v10-v13 only.

Discussion: https://postgr.es/m/17596-c5357f61427a81dc@postgresql.org
2022-08-27 12:11:20 -04:00
Etsuro Fujita 1d40200a98 Fix typo in comment. 2022-08-26 16:55:07 +09:00
Tom Lane 310d734efb Defend against stack overrun in a few more places.
SplitToVariants() in the ispell code, lseg_inside_poly() in geo_ops.c,
and regex_selectivity_sub() in selectivity estimation could recurse
until stack overflow; fix by adding check_stack_depth() calls.
So could next() in the regex compiler, but that case is better fixed by
converting its tail recursion to a loop.  (We probably get better code
that way too, since next() can now be inlined into its sole caller.)

There remains a reachable stack overrun in the Turkish stemmer, but
we'll need some advice from the Snowball people about how to fix that.

Per report from Egor Chindyaskin and Alexander Lakhin.  These mistakes
are old, so back-patch to all supported branches.

Richard Guo and Tom Lane

Discussion: https://postgr.es/m/1661334672.728714027@f473.i.mail.ru
2022-08-24 13:01:40 -04:00
Tom Lane d0371f190a Doc: prefer sysctl to /proc/sys in docs and comments.
sysctl is more portable than Linux's /proc/sys file tree, and
often easier to use too.  That's why most of our docs refer to
sysctl when talking about how to adjust kernel parameters.
Bring the few stragglers into line.

Discussion: https://postgr.es/m/361175.1661187463@sss.pgh.pa.us
2022-08-23 09:42:28 -04:00
Amit Kapila 51e9469a41 Add CHECK_FOR_INTERRUPTS while decoding changes.
While decoding changes in a loop, if we skip all the changes there is no
CFI making the loop uninterruptible.

Reported-by: Whale Song and Andrey Borodin
Bug: 17580
Author: Masahiko Sawada
Reviwed-by: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/17580-849c1d5b6d7eb422@postgresql.org
Discussion: https://postgr.es/m/B319ECD6-9A28-4CDF-A8F4-3591E0BF2369@yandex-team.ru
2022-08-23 08:42:51 +05:30
Tom Lane 116f20f926 Fix subtly-incorrect matching of parent and child partitioned indexes.
When creating a partitioned index, DefineIndex tries to identify
any existing indexes on the partitions that match the partitioned
index, so that it can absorb those as child indexes instead of
building new ones.  Part of the matching is to compare IndexInfo
structs --- but that wasn't done quite right.  We're comparing
the IndexInfo built within DefineIndex itself to one made from
existing catalog contents by BuildIndexInfo.  Notably, while
BuildIndexInfo will run index expressions and predicates through
expression preprocessing, that has not happened to DefineIndex's
struct.  The result is failure to match and subsequent creation
of duplicate indexes.

The easiest and most bulletproof fix is to build a new IndexInfo
using BuildIndexInfo, thereby guaranteeing that the processing done
is identical.

While here, let's also extract the opfamily and collation data
from the new partitioned index, removing ad-hoc logic that
duplicated knowledge about how those are constructed.

Per report from Christophe Pettus.  Back-patch to v11 where
we invented partitioned indexes.

Richard Guo and Tom Lane

Discussion: https://postgr.es/m/8864BFAA-81FD-4BF9-8E06-7DEB8D4164ED@thebuild.com
2022-08-18 12:11:47 -04:00
Tom Lane ee4a17e200 Add missing bad-PGconn guards in libpq entry points.
There's a convention that externally-visible libpq functions should
check for a NULL PGconn pointer, and fail gracefully instead of
crashing.  PQflush() and PQisnonblocking() didn't get that memo
though.  Also add a similar check to PQdefaultSSLKeyPassHook_OpenSSL;
while it's not clear that ordinary usage could reach that with a
null conn pointer, it's cheap enough to check, so let's be consistent.

Daniele Varrazzo and Tom Lane

Discussion: https://postgr.es/m/CA+mi_8Zm_mVVyW1iNFgyMd9Oh0Nv8-F+7Y3-BqwMgTMHuo_h2Q@mail.gmail.com
2022-08-15 15:40:07 -04:00
Michael Paquier 6e086bb1db Fix outdated --help message for postgres -f
This option switch supports a total of 8 values, as told by
set_plan_disabling_options() and the documentation, but this was not
reflected in the output generated by --help.

Author: Junwang Zhao
Discussion: https://postgr.es/m/CAEG8a3+pT3cWzyjzKs184L1XMNm8NDnoJLiSjAYSO7XqpRh_vA@mail.gmail.com
Backpatch-through: 10
2022-08-15 13:37:44 +09:00
Tom Lane 84f9691a19 Preserve memory context of VarStringSortSupport buffers.
When enlarging the work buffers of a VarStringSortSupport object,
varstrfastcmp_locale was careful to keep them in the ssup_cxt
memory context; but varstr_abbrev_convert just used palloc().
The latter creates a hazard that the buffers could be freed out
from under the VarStringSortSupport object, resulting in stomping
on whatever gets allocated in that memory later.

In practice, because we only use this code for ICU collations
(cf. 3df9c374e), the problem is confined to use of ICU collations.
I believe it may have been unreachable before the introduction
of incremental sort, too, as traditional sorting usually just
uses one context for the duration of the sort.

We could fix this by making the broken stanzas in varstr_abbrev_convert
match the non-broken ones in varstrfastcmp_locale.  However, it seems
like a better idea to dodge the issue altogether by replacing the
pfree-and-allocate-anew coding with repalloc, which automatically
preserves the chunk's memory context.  This fix does add a few cycles
because repalloc will copy the chunk's content, which the existing
coding assumes is useless.  However, we don't expect that these buffer
enlargement operations are performance-critical.  Besides that, it's
far from obvious that copying the buffer contents isn't required, since
these stanzas make no effort to mark the buffers invalid by resetting
last_returned, cache_blob, etc.  That seems to be safe upon examination,
but it's fragile and could easily get broken in future, which wouldn't
get revealed in testing with short-to-moderate-size strings.

Per bug #17584 from James Inform.  Whether or not the issue is
reachable in the older branches, this code has been broken on its
own terms from its introduction, so patch all the way back.

Discussion: https://postgr.es/m/17584-95c79b4a7d771f44@postgresql.org
2022-08-14 12:05:27 -04:00
Tom Lane b744e13b02 Catch stack overflow when recursing in transformFromClauseItem().
Most parts of the parser can expect that the stack overflow check
in transformExprRecurse() will trigger before things get desperate.
However, transformFromClauseItem() can recurse directly to self
without having analyzed any expressions, so it's possible to drive
it to a stack-overrun crash.  Add a check to prevent that.

Per bug #17583 from Egor Chindyaskin.  Back-patch to all supported
branches.

Richard Guo

Discussion: https://postgr.es/m/17583-33be55b9f981f75c@postgresql.org
2022-08-13 15:21:28 -04:00
Peter Eisentraut db9ec28d64 Add missing fields to _outConstraint()
As of 897795240c, check constraints can
be declared invalid.  But that patch didn't update _outConstraint() to
also show the relevant struct fields (which were only applicable to
foreign keys before that).  This currently only affects debugging
output, so no impact in practice.
2022-08-13 10:37:55 +02:00
Peter Eisentraut e5c2feac0e pg_upgrade: Fix some minor code issues
96ef3b8ff1 accidentally copied a not
applicable comment from the float8_pass_by_value code to the
data_checksums code.  Remove that.

87d3b35a1c changed pg_upgrade to
checking the checksum version rather than just the Boolean presence of
checksums, but didn't change the field type in its ControlData struct
from bool.  So this would not work correctly if there ever is a
checksum version larger than 1.
2022-08-13 00:16:17 +02:00
Peter Eisentraut bc11a7e2f3 Fix _outConstraint() for "identity" constraints
The set of fields printed by _outConstraint() in the CONSTR_IDENTITY
case didn't match the set of fields actually used in that case.  (The
code was probably uncarefully copied from the CONSTR_DEFAULT case.)
Fix that by using the right set of fields.  Since there is no read
support for this node type, this is really just for debugging output
right now, so it doesn't affect anything important.
2022-08-12 08:52:28 +02:00
Amit Kapila 63903546e5 Back-Patch "Add wait_for_subscription_sync for TAP tests."
This was originally done in commit 0c20dd33db for 16 only, to eliminate
duplicate code and as an infrastructure that makes it easier to write
future tests. However, it has been suggested that it would be good to
back-patch this testing infrastructure to aid future tests in
back-branches.

Backpatch to all supported versions.

Author: Masahiko Sawada
Reviewed by: Amit Kapila, Shi yu
Discussion: https://postgr.es/m/CAD21AoC-fvAkaKHa4t1urupwL8xbAcWRePeETvshvy80f6WV1A@mail.gmail.com
Discussion: https://postgr.es/m/E1oJBIf-0006sw-SA@gemulon.postgresql.org
2022-08-12 10:30:04 +05:30
Amit Kapila e721123b7a Fix catalog lookup with the wrong snapshot during logical decoding.
Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION
records to know if the transaction has modified the catalog, and that
information is not serialized to snapshot. Therefore, after the restart,
if the logical decoding decodes only the commit record of the transaction
that has actually modified a catalog, we will miss adding its XID to the
snapshot. Thus, we will end up looking at catalogs with the wrong
snapshot.

To fix this problem, this changes the snapshot builder so that it
remembers the last-running-xacts list of the decoded RUNNING_XACTS record
after restoring the previously serialized snapshot. Then, we mark the
transaction as containing catalog changes if it's in the list of initial
running transactions and its commit record has XACT_XINFO_HAS_INVALS. To
avoid ABI breakage, we store the array of the initial running transactions
in the static variables InitialRunningXacts and NInitialRunningXacts,
instead of storing those in SnapBuild or ReorderBuffer.

This approach has a false positive; we could end up adding the transaction
that didn't change catalog to the snapshot since we cannot distinguish
whether the transaction has catalog changes only by checking the COMMIT
record. It doesn't have the information on which (sub) transaction has
catalog changes, and XACT_XINFO_HAS_INVALS doesn't necessarily indicate
that the transaction has catalog change. But that won't be a problem since
we use snapshot built during decoding only to read system catalogs.

On the master branch, we took a more future-proof approach by writing
catalog modifying transactions to the serialized snapshot which avoids the
above false positive. But we cannot backpatch it because of a change in
the SnapBuild.

Reported-by: Mike Oh
Author: Masahiko Sawada
Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi
Backpatch-through: 10
Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com
2022-08-11 08:55:31 +05:30
Tom Lane 442dbd6698 Fix handling of R/W expanded datums that are passed to SQL functions.
fmgr_sql must make expanded-datum arguments read-only, because
it's possible that the function body will pass the argument to
more than one callee function.  If one of those functions takes
the datum's R/W property as license to scribble on it, then later
callees will see an unexpected value, leading to wrong answers.

From a performance standpoint, it'd be nice to skip this in the
common case that the argument value is passed to only one callee.
However, detecting that seems fairly hard, and certainly not
something that I care to attempt in a back-patched bug fix.

Per report from Adam Mackler.  This has been broken since we
invented expanded datums, so back-patch to all supported branches.

Discussion: https://postgr.es/m/WScDU5qfoZ7PB2gXwNqwGGgDPmWzz08VdydcPFLhOwUKZcdWbblbo-0Lku-qhuEiZoXJ82jpiQU4hOjOcrevYEDeoAvz6nR0IU4IHhXnaCA=@mackler.email
Discussion: https://postgr.es/m/187436.1660143060@sss.pgh.pa.us
2022-08-10 13:37:25 -04:00
Tom Lane 866f915688 Stamp 11.17. 2022-08-08 16:49:00 -04:00
Tom Lane 4c1dd68026 Stabilize output of new regression test.
Per buildfarm, the output order of \dx+ isn't consistent across
locales.  Apply NO_LOCALE to force C locale.  There might be a
more localized way, but I'm not seeing it offhand, and anyway
there is nothing in this test module that particularly cares
about locales.

Security: CVE-2022-2625
2022-08-08 12:16:01 -04:00
Tom Lane f52d2fbd8c In extensions, don't replace objects not belonging to the extension.
Previously, if an extension script did CREATE OR REPLACE and there was
an existing object not belonging to the extension, it would overwrite
the object and adopt it into the extension.  This is problematic, first
because the overwrite is probably unintentional, and second because we
didn't change the object's ownership.  Thus a hostile user could create
an object in advance of an expected CREATE EXTENSION command, and would
then have ownership rights on an extension object, which could be
modified for trojan-horse-type attacks.

Hence, forbid CREATE OR REPLACE of an existing object unless it already
belongs to the extension.  (Note that we've always forbidden replacing
an object that belongs to some other extension; only the behavior for
previously-free-standing objects changes here.)

For the same reason, also fail CREATE IF NOT EXISTS when there is
an existing object that doesn't belong to the extension.

Our thanks to Sven Klemm for reporting this problem.

Security: CVE-2022-2625
2022-08-08 11:12:31 -04:00
Alvaro Herrera 97d6614d70
Translation updates
Source-Git-URL: ssh://git@git.postgresql.org/pgtranslation/messages.git
Source-Git-Hash: ca66f38be9dcb1cf18429e1cd62b9f6b7ae7a89e
2022-08-08 12:39:52 +02:00
Alvaro Herrera 61904503b7
Remove unportable use of timezone in recent test
Per buildfarm member snapper

Discussion: https://postgr.es/m/129951.1659812518@sss.pgh.pa.us
2022-08-07 10:19:40 +02:00
Alvaro Herrera 772e6383d1
Improve recently-added test reliability
Commit 59be1c942a already tried to make
src/test/recovery/t/033_replay_tsp_drops more reliable, but it wasn't
enough.  Try to improve on that by making this use of a replication slot
to be more like others.  Also, don't drop the slot.

Make a few other stylistic changes while at it.  It's still quite slow,
which is another thing that we need to fix in this script.

Backpatch to all supported branches.

Discussion: https://postgr.es/m/349302.1659191875@sss.pgh.pa.us
2022-08-06 15:52:10 +02:00
Alvaro Herrera 39e45d3cec
BRIN: mask BRIN_EVACUATE_PAGE for WAL consistency checking
That bit is unlogged and therefore it's wrong to consider it in WAL page
comparison.

Add a test that tickles the case, as branch testing technology allows.

This has been a problem ever since wal consistency checking was
introduced (commit a507b86900 for pg10), so backpatch to all supported
branches.

Author: 王海洋 (Haiyang Wang) <wanghaiyang.001@bytedance.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/CACciXAD2UvLMOhc4jX9VvOKt7DtYLr3OYRBhvOZ-jRxtzc_7Jg@mail.gmail.com
Discussion: https://postgr.es/m/CACciXADOfErX9Bx0nzE_SkdfXr6Bbpo5R=v_B6MUTEYW4ya+cg@mail.gmail.com
2022-08-05 18:00:17 +02:00
Noah Misch cf86fddc1c Add HINT for restartpoint race with KeepFileRestoredFromArchive().
The five commits ending at cc2c7d65fc
closed this race condition for v15+.  For v14 through v10, add a HINT to
discourage studying the cosmetic problem.

Reviewed by Kyotaro Horiguchi and David Steele.

Discussion: https://postgr.es/m/20220731061747.GA3692882@rfd.leadboat.com
2022-08-05 08:31:02 -07:00
Alvaro Herrera 91130dd316
regress: fix test instability
Having additional triggers in a test table made the ORDER BY clauses in
old queries underspecified.  Add another column there for stability.

Per sporadic buildfarm pink.
2022-08-05 11:55:52 +02:00
Alvaro Herrera ce8e066521
Fix ENABLE/DISABLE TRIGGER to handle recursion correctly
Using ATSimpleRecursion() in ATPrepCmd() to do so as bbb927b4db did is
not correct, because ATPrepCmd() can't distinguish between triggers that
may be cloned and those that may not, so would wrongly try to recurse
for the latter category of triggers.

So this commit restores the code in EnableDisableTrigger() that
86f575948c had added to do the recursion, which would do it only for
triggers that may be cloned, that is, row-level triggers.  This also
changes tablecmds.c such that ATExecCmd() is able to pass the value of
ONLY flag down to EnableDisableTrigger() using its new 'recurse'
parameter.

This also fixes what seems like an oversight of 86f575948c that the
recursion to partition triggers would only occur if EnableDisableTrigger()
had actually changed the trigger.  It is more apt to recurse to inspect
partition triggers even if the parent's trigger didn't need to be
changed: only then can we be certain that all descendants share the same
state afterwards.

Backpatch all the way back to 11, like bbb927b4db.  Care is taken not
to break ABI compatibility (and that no catversion bump is needed.)

Co-authored-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com
2022-08-05 09:46:58 +02:00
Tom Lane 2e2f3435af Add CHECK_FOR_INTERRUPTS in ExecInsert's speculative insertion loop.
Ordinarily the functions called in this loop ought to have plenty
of CFIs themselves; but we've now seen a case where no such CFI is
reached, making the loop uninterruptible.  Even though that's from
a recently-introduced bug, it seems prudent to install a CFI at
the loop level in all branches.

Per discussion of bug #17558 from Andrew Kesper (an actual fix for
that bug will follow).

Discussion: https://postgr.es/m/17558-3f6599ffcf52fd4a@postgresql.org
2022-08-04 14:10:06 -04:00
Tom Lane e94c52fca6 Reduce test runtime of src/test/modules/snapshot_too_old.
The sto_using_cursor and sto_using_select tests were coded to exercise
every permutation of their test steps, but AFAICS there is no value in
exercising more than one.  This matters because each permutation costs
about six seconds, thanks to the "pg_sleep(6)".  Perhaps we could
reduce that, but the useless permutations seem worth getting rid of
in any case.  (Note that sto_using_hash_index got it right already.)

While here, clean up some other sloppiness such as an unused table.

This doesn't make too much difference in interactive testing, since the
wasted time is typically masked by parallelization with other tests.
However, the buildfarm runs this as a serial step, which means we can
expect to shave ~40 seconds from every buildfarm run.  That makes it
worth back-patching.

Discussion: https://postgr.es/m/2515192.1659454702@sss.pgh.pa.us
2022-08-03 11:14:55 -04:00
Tom Lane 51d8b52fcf Check maximum number of columns in function RTEs, too.
I thought commit fd96d14d9 had plugged all the holes of this sort,
but no, function RTEs could produce oversize tuples too, either
via long coldeflists or just from multiple functions in one RTE.
(I'm pretty sure the other variants of base RTEs aren't a problem,
because they ultimately refer to either a table or a sub-SELECT,
whose widths are enforced elsewhere.  But we explicitly allow join
RTEs to be overwidth, as long as you don't try to form their
tuple result.)

Per further discussion of bug #17561.  As before, patch all branches.

Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org
2022-08-01 12:22:35 -04:00
Andrew Dunstan 3f9c205367 Fix new recovery test for log_error_verbosity=verbose case
The new test is from commit 9e4f914b5e.

With this setting messages have SQL error numbers included, so that
needs to be provided for in the pattern looked for.

Backpatch to all live branches like the original.
2022-07-29 18:17:49 -04:00
Tom Lane 8dea18372f In transformRowExpr(), check for too many columns in the row.
A RowExpr with more than MaxTupleAttributeNumber columns would fail at
execution anyway, since we cannot form a tuple datum with more than that
many columns.  While heap_form_tuple() has a check for too many columns,
it emerges that there are some intermediate bits of code that don't
check and can be driven to failure with sufficiently many columns.
Checking this at parse time seems like the most appropriate place to
install a defense, since we already check SELECT list length there.

While at it, make the SELECT-list-length error use the same errcode
(TOO_MANY_COLUMNS) as heap_form_tuple does, rather than the generic
PROGRAM_LIMIT_EXCEEDED.

Per bug #17561 from Egor Chindyaskin.  The given test case crashes
in all supported branches (and probably a lot further back),
so patch all.

Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org
2022-07-29 13:30:50 -04:00
Alvaro Herrera fcd72cf295
Fix test instability
On FreeBSD, the new test fails due to a WAL file being removed before
the standby has had the chance to copy it.  Fix by adding a replication
slot to prevent the removal until after the standby has connected.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reported-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAEze2Wj5nau_qpjbwihvmXLfkAWOZ5TKdbnqOc6nKSiRJEoPyQ@mail.gmail.com
2022-07-29 12:50:47 +02:00
Alvaro Herrera 5a10c262fc
Fix replay of create database records on standby
Crash recovery on standby may encounter missing directories
when replaying database-creation WAL records.  Prior to this
patch, the standby would fail to recover in such a case;
however, the directories could be legitimately missing.
Consider the following sequence of commands:

    CREATE DATABASE
    DROP DATABASE
    DROP TABLESPACE

If, after replaying the last WAL record and removing the
tablespace directory, the standby crashes and has to replay the
create database record again, crash recovery must be able to continue.

A fix for this problem was already attempted in 49d9cfc68b, but it
was reverted because of design issues.  This new version is based
on Robert Haas' proposal: any missing tablespaces are created
during recovery before reaching consistency.  Tablespaces
are created as real directories, and should be deleted
by later replay.  CheckRecoveryConsistency ensures
they have disappeared.

The problems detected by this new code are reported as PANIC,
except when allow_in_place_tablespaces is set to ON, in which
case they are WARNING.  Apart from making tests possible, this
gives users an escape hatch in case things don't go as planned.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Author: Paul Guo <paulguo@gmail.com>
Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com> (older versions)
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> (older versions)
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Diagnosed-by: Paul Guo <paulguo@gmail.com>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
2022-07-28 08:26:05 +02:00
Alvaro Herrera 258c896418
Allow "in place" tablespaces.
This is a backpatch to branches 10-14 of the following commits:

7170f2159f Allow "in place" tablespaces.
c6f2f01611 Fix pg_basebackup with in-place tablespaces.
f6f0db4d62 Fix pg_tablespace_location() with in-place tablespaces
7a7cd84893 doc: Remove mention to in-place tablespaces for pg_tablespace_location()
5344723755 Remove unnecessary Windows-specific basebackup code.

In-place tablespaces were introduced as a testing helper mechanism, but
they are going to be used for a bugfix in WAL replay to be backpatched
to all stable branches.

I (Álvaro) had to adjust some code to account for lack of
get_dirent_type() in branches prior to 14.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Michaël Paquier <michael@paquier.xyz>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20220722081858.omhn2in5zt3g4nek@alvherre.pgsql
2022-07-27 07:55:12 +02:00
Tom Lane 9e3e1ac458 Force immediate commit after CREATE DATABASE etc in extended protocol.
We have a few commands that "can't run in a transaction block",
meaning that if they complete their processing but then we fail
to COMMIT, we'll be left with inconsistent on-disk state.
However, the existing defenses for this are only watertight for
simple query protocol.  In extended protocol, we didn't commit
until receiving a Sync message.  Since the client is allowed to
issue another command instead of Sync, we're in trouble if that
command fails or is an explicit ROLLBACK.  In any case, sitting
in an inconsistent state while waiting for a client message
that might not come seems pretty risky.

This case wasn't reachable via libpq before we introduced pipeline
mode, but it's always been an intended aspect of extended query
protocol, and likely there are other clients that could reach it
before.

To fix, set a flag in PreventInTransactionBlock that tells
exec_execute_message to force an immediate commit.  This seems
to be the approach that does least damage to existing working
cases while still preventing the undesirable outcomes.

While here, add some documentation to protocol.sgml that explicitly
says how to use pipelining.  That's latent in the existing docs if
you know what to look for, but it's better to spell it out; and it
provides a place to document this new behavior.

Per bug #17434 from Yugo Nagata.  It's been wrong for ages,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org
2022-07-26 13:07:03 -04:00
Tom Lane 1078742af0 Fix ruleutils issues with dropped cols in functions-returning-composite.
Due to lack of concern for the case in the dependency code, it's
possible to drop a column of a composite type even though stored
queries have references to the dropped column via functions-in-FROM
that return the composite type.  There are "soft" references,
namely FROM-clause aliases for such columns, and "hard" references,
that is actual Vars referring to them.  The right fix for hard
references is to add dependencies preventing the drop; something
we've known for many years and not done (and this commit still doesn't
address it).  A "soft" reference shouldn't prevent a drop though.
We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but
nobody had noticed that the current behavior can result in dump/reload
failures, because ruleutils.c can print more column aliases than the
underlying composite type now has.  So we need to rejigger the
column-alias-handling code to treat such columns as dropped and not
print aliases for them.

Rather than writing new code for this, I used expandRTE() which already
knows how to figure out which function result columns are dropped.
I'd initially thought maybe we could use expandRTE() in all cases, but
that fails for EXPLAIN's purposes, because the planner strips a lot of
RTE infrastructure that expandRTE() needs.  So this patch just uses it
for unplanned function RTEs and otherwise does things the old way.

If there is a hard reference (Var), then removing the column alias
causes us to fail to print the Var, since there's no longer a name
to print.  Failing seems less desirable than printing a made-up
name, so I made it print "?dropped?column?" instead.

Per report from Timo Stolz.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
2022-07-21 13:56:02 -04:00
Fujii Masao 80c3ea9185 Fix assertion failure and segmentation fault in backup code.
When a non-exclusive backup is canceled, do_pg_abort_backup() is called
and resets some variables set by pg_backup_start (pg_start_backup in v14
or before). But previously it forgot to reset the session state indicating
whether a non-exclusive backup is in progress or not in this session.

This issue could cause an assertion failure when the session running
BASE_BACKUP is terminated after it executed pg_backup_start and
pg_backup_stop (pg_stop_backup in v14 or before). Also it could cause
a segmentation fault when pg_backup_stop is called after BASE_BACKUP
in the same session is canceled.

This commit fixes the issue by making do_pg_abort_backup reset
that session state.

Back-patch to all supported branches.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
2022-07-20 09:54:10 +09:00
Fujii Masao 87e5044877 Prevent BASE_BACKUP in the middle of another backup in the same session.
Multiple non-exclusive backups are able to be run conrrently in different
sessions. But, in the same session, only one non-exclusive backup can be
run at the same moment. If pg_backup_start (pg_start_backup in v14 or before)
is called in the middle of another non-exclusive backup in the same session,
an error is thrown.

However, previously, in logical replication walsender mode, even if that
walsender session had already called pg_backup_start and started
a non-exclusive backup, it could execute BASE_BACKUP command and
start another non-exclusive backup. Which caused subsequent pg_backup_stop
to throw an error because BASE_BACKUP unexpectedly reset the session state
marked by pg_backup_start.

This commit prevents BASE_BACKUP command in the middle of another
non-exclusive backup in the same session.

Back-patch to all supported branches.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
2022-07-20 09:52:36 +09:00
Peter Eisentraut 6d61aef5db Re-add SPICleanup for ABI compatibility in stable branch
This fixes an ABI break introduced by
2f6b8c287b.

Author: Markus Wanner <markus.wanner@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/defd749a-8410-841d-1126-21398686d63d@enterprisedb.com
2022-07-18 19:38:24 +02:00
Thomas Munro 3f2344d4ae Make dsm_impl_posix_resize more future-proof.
Commit 4518c798 blocks signals for a short region of code, but it
assumed that whatever called it had the signal mask set to UnBlockSig on
entry.  That may be true today (or may even not be, in extensions in the
wild), but it would be better not to make that assumption.  We should
save-and-restore the caller's signal mask.

The PG_SETMASK() portability macro couldn't be used for that, which is
why it wasn't done before.  But... considering that commit a65e0864
established back in 9.6 that supported POSIX systems have sigprocmask(),
and that this is POSIX-only code, there is no reason not to use standard
sigprocmask() directly to achieve that.

Back-patch to all supported releases, like 4518c798 and 80845b7c.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGKx6Biq7_UuV0kn9DW%2B8QWcpJC1qwhizdtD9tN-fn0H0g%40mail.gmail.com
2022-07-16 12:23:43 +12:00
Thomas Munro 74a9ee0348 Don't clobber postmaster sigmask in dsm_impl_resize.
Commit 4518c798 intended to block signals in regular backends that
allocate DSM segments, but dsm_impl_resize() is also reached by
dsm_postmaster_startup().  It's not OK to clobber the postmaster's
signal mask, so only manipulate the signal mask when under the
postmaster.

Back-patch to all releases, like 4518c798.

Discussion: https://postgr.es/m/CA%2BhUKGKNpK%3D2OMeea_AZwpLg7Bm4%3DgYWk7eDjZ5F6YbozfOf8w%40mail.gmail.com
2022-07-15 02:07:15 +12:00
Thomas Munro 39683c69a0 Block signals while allocating DSM memory.
On Linux, we call posix_fallocate() on shm_open()'d memory to avoid
later potential SIGBUS (see commit 899bd785).

Based on field reports of systems stuck in an EINTR retry loop there,
there, we made it possible to break out of that loop via slightly odd
coding where the CHECK_FOR_INTERRUPTS() call was somewhat removed from
the loop (see commit 422952ee).

On further reflection, that was not a great choice for at least two
reasons:

1.  If interrupts were held, the CHECK_FOR_INTERRUPTS() would do nothing
and the EINTR error would be surfaced to the user.

2.  If EINTR was reported but neither QueryCancelPending nor
ProcDiePending was set, then we'd dutifully retry, but with a bit more
understanding of how posix_fallocate() works, it's now clear that you
can get into a loop that never terminates.  posix_fallocate() is not a
function that can do some of the job and tell you about progress if it's
interrupted, it has to undo what it's done so far and report EINTR, and
if signals keep arriving faster than it can complete (cf recovery
conflict signals), you're stuck.

Therefore, for now, we'll simply block most signals to guarantee
progress.  SIGQUIT is not blocked (see InitPostmasterChild()), because
its expected handler doesn't return, and unblockable signals like
SIGCONT are not expected to arrive at a high rate.  For good measure,
we'll include the ftruncate() call in the blocked region, and add a
retry loop.

Back-patch to all supported releases.

Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Nicola Contu <nicola.contu@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220701154105.jjfutmngoedgiad3%40alvherre.pgsql
2022-07-14 14:23:03 +12:00
Thomas Munro cd26139a30 Fix lock assertions in dshash.c.
dshash.c previously maintained flags to be able to assert that you
didn't hold any partition lock.  These flags could get out of sync with
reality in error scenarios.

Get rid of all that, and make assertions about the locks themselves
instead.  Since LWLockHeldByMe() loops internally, we don't want to put
that inside another loop over all partition locks.  Introduce a new
debugging-only interface LWLockAnyHeldByMe() to avoid that.

This problem was noted by Tom and Andres while reviewing changes to
support the new shared memory stats system, and later showed up in
reality while working on commit 389869af.

Back-patch to 11, where dshash.c arrived.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220311012712.botrpsikaufzteyt@alap3.anarazel.de
Discussion: https://postgr.es/m/CA%2BhUKGJ31Wce6HJ7xnVTKWjFUWQZPBngxfJVx4q0E98pDr3kAw%40mail.gmail.com
2022-07-11 15:54:24 +12:00
Thomas Munro 21ed12b14a Fix \watch's interaction with libedit on ^C.
When you hit ^C, the terminal driver in Unix-like systems echoes "^C" as
well as sending an interrupt signal (depending on stty settings).  At
least libedit (but maybe also libreadline) is then confused about the
current cursor location, and corrupts the display if you try to scroll
back.  Fix, by moving to a new line before the next prompt is displayed.

Back-patch to all supported released.

Author: Pavel Stehule <pavel.stehule@gmail.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3278793.1626198638%40sss.pgh.pa.us
2022-07-10 16:55:18 +12:00
Dean Rasheed e88b1f1e22 Fix alias matching in transformLockingClause().
When locking a specific named relation for a FOR [KEY] UPDATE/SHARE
clause, transformLockingClause() finds the relation to lock by
scanning the rangetable for an RTE with a matching eref->aliasname.
However, it failed to account for the visibility rules of a join RTE.

If a join RTE doesn't have a user-supplied alias, it will have a
generated eref->aliasname of "unnamed_join" that is not visible as a
relation name in the parse namespace. Such an RTE needs to be skipped,
otherwise it might be found in preference to a regular base relation
with a user-supplied alias of "unnamed_join", preventing it from being
locked.

In addition, if a join RTE doesn't have a user-supplied alias, but
does have a join_using_alias, then the RTE needs to be matched using
that alias rather than the generated eref->aliasname, otherwise a
misleading "relation not found" error will be reported rather than a
"join cannot be locked" error.

Backpatch all the way, except for the second part which only goes back
to 14, where JOIN USING aliases were added.

Dean Rasheed, reviewed by Tom Lane.

Discussion: https://postgr.es/m/CAEZATCUY_KOBnqxbTSPf=7fz9HWPnZ5Xgb9SwYzZ8rFXe7nb=w@mail.gmail.com
2022-07-07 13:07:54 +01:00
Noah Misch 1cad30e3ba Fix previous commit's ecpg_clocale for ppc Darwin.
Per buildfarm member prairiedog, this platform rejects uninitialized
global variables in shared libraries.  Back-patch to v10, like the
addition of the variable.

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/20220703030619.GB2378460@rfd.leadboat.com
2022-07-02 21:03:24 -07:00
Noah Misch d68b731a15 ecpglib: call newlocale() once per process.
ecpglib has been calling it once per SQL query and once per EXEC SQL GET
DESCRIPTOR.  Instead, if newlocale() has not succeeded before, call it
while establishing a connection.  This mitigates three problems:
- If newlocale() failed in EXEC SQL GET DESCRIPTOR, the command silently
  proceeded without the intended locale change.
- On AIX, each newlocale()+freelocale() cycle leaked memory.
- newlocale() CPU usage may have been nontrivial.

Fail the connection attempt if newlocale() fails.  Rearrange
ecpg_do_prologue() to validate the connection before its uselocale().

The sort of program that may regress is one running in an environment
where newlocale() fails.  If that program establishes connections
without running SQL statements, it will stop working in response to this
change.  I'm betting against the importance of such an ECPG use case.
Most SQL execution (any using ECPGdo()) has long required newlocale()
success, so there's little a connection could do without newlocale().

Back-patch to v10 (all supported versions).

Reviewed by Tom Lane.  Reported by Guillaume Lelarge.

Discussion: https://postgr.es/m/20220101074055.GA54621@rfd.leadboat.com
2022-07-02 13:00:35 -07:00
Thomas Munro facfd04678 Harden dsm_impl.c against unexpected EEXIST.
Previously, we trusted the OS not to report EEXIST unless we'd passed in
IPC_CREAT | IPC_EXCL or O_CREAT | O_EXCL, as appropriate.  Solaris's
shm_open() can in fact do that, causing us to crash because we didn't
ereport and then we blithely assumed the mapping was successful.

Let's treat EEXIST just like any other error, unless we're actually
trying to create a new segment.  This applies to shm_open(), where this
behavior has been seen, and also to the equivalent operations for our
sysv and mmap modes just on principle.

Based on the underlying reason for the error, namely contention on a
lock file managed by Solaris librt for each distinct name, this problem
is only likely to happen on 15 and later, because the new shared memory
stats system produces shm_open() calls for the same path from
potentially large numbers of backends concurrently during
authentication.  Earlier releases only shared memory segments between a
small number of parallel workers under one Gather node.  You could
probably hit it if you tried hard enough though, and we should have been
more defensive in the first place.  Therefore, back-patch to all
supported releases.

Per build farm animal margay.  This isn't the end of the story, though,
it just changes random crashes into random "File exists" errors; more
work needed for a green build farm.

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com
2022-07-01 13:21:28 +12:00
Heikki Linnakangas b49889f3c0 Fix visibility check when XID is committed in CLOG but not in procarray.
TransactionIdIsInProgress had a fast path to return 'false' if the
single-item CLOG cache said that the transaction was known to be
committed. However, that was wrong, because a transaction is first
marked as committed in the CLOG but doesn't become visible to others
until it has removed its XID from the proc array. That could lead to an
error:

    ERROR:  t_xmin is uncommitted in tuple to be updated

or for an UPDATE to go ahead without blocking, before the previous
UPDATE on the same row was made visible.

The window is usually very short, but synchronous replication makes it
much wider, because the wait for synchronous replica happens in that
window.

Another thing that makes it hard to hit is that it's hard to get such
a commit-in-progress transaction into the single item CLOG cache.
Normally, if you call TransactionIdIsInProgress on such a transaction,
it determines that the XID is in progress without checking the CLOG
and without populating the cache. One way to prime the cache is to
explicitly call pg_xact_status() on the XID. Another way is to use a
lot of subtransactions, so that the subxid cache in the proc array is
overflown, making TransactionIdIsInProgress rely on pg_subtrans and
CLOG checks.

This has been broken ever since it was introduced in 2008, but the race
condition is very hard to hit, especially without synchronous
replication. There were a couple of reports of the error starting from
summer 2021, but no one was able to find the root cause then.

TransactionIdIsKnownCompleted() is now unused. In 'master', remove it,
but I left it in place in backbranches in case it's used by extensions.

Also change pg_xact_status() to check TransactionIdIsInProgress().
Previously, it only checked the CLOG, and returned "committed" before
the transaction was actually made visible to other queries. Note that
this also means that you cannot use pg_xact_status() to reproduce the
bug anymore, even if the code wasn't fixed.

Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with
the pg_xact_status() change added by me.

Author: Simon Riggs
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
2022-06-27 08:24:37 +03:00
Noah Misch bf92b73beb Fix PostgreSQL::Test aliasing for Perl v5.10.1.
This Perl segfaults if a declaration of the to-be-aliased package
precedes the aliasing itself.  Per buildfarm members lapwing and wrasse.
Like commit 20911775de, back-patch to v10
(all supported versions).

Discussion: https://postgr.es/m/20220625171533.GA2012493@rfd.leadboat.com
2022-06-25 14:16:00 -07:00
Noah Misch 6d49cc2861 CREATE INDEX: use the original userid for more ACL checks.
Commit a117cebd63 used the original userid
for ACL checks located directly in DefineIndex(), but it still adopted
the table owner userid for more ACL checks than intended.  That broke
dump/reload of indexes that refer to an operator class, collation, or
exclusion operator in a schema other than "public" or "pg_catalog".
Back-patch to v10 (all supported versions), like the earlier commit.

Nathan Bossart and Noah Misch

Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de
2022-06-25 09:07:46 -07:00
Noah Misch c41edb3242 For PostgreSQL::Test compatibility, alias entire package symbol tables.
Remove the need to edit back-branch-specific code sites when
back-patching the addition of a PostgreSQL::Test::Utils symbol.  Replace
per-symbol, incomplete alias lists.  Give old and new package names the
same EXPORT and EXPORT_OK semantics.  Back-patch to v10 (all supported
versions).

Reviewed by Andrew Dunstan.

Discussion: https://postgr.es/m/20220622072144.GD4167527@rfd.leadboat.com
2022-06-25 09:07:45 -07:00
Amit Kapila ed2a7a6bf6 Fix memory leak due to LogicalRepRelMapEntry.attrmap.
When rebuilding the relation mapping on subscribers, we were not releasing
the attribute mapping's memory which was no longer required.

The attribute mapping used in logical tuple conversion was refactored in
PG13 (by commit e1551f96e6) but we forgot to update the related code that
frees the attribute map.

Author: Hou Zhijie
Reviewed-by: Amit Langote, Amit Kapila, Shi yu
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
2022-06-23 08:37:40 +05:30
Tom Lane 2f6b8c287b Fix SPI's handling of errors during transaction commit.
SPI_commit previously left it up to the caller to recover from any error
occurring during commit.  Since that's complicated and requires use of
low-level xact.c facilities, it's not too surprising that no caller got
it right.  Let's move the responsibility for cleanup into spi.c.  Doing
that requires redefining SPI_commit as starting a new transaction, so
that it becomes equivalent to SPI_commit_and_chain except that you get
default transaction characteristics instead of preserving the prior
transaction's characteristics.  We can make this pretty transparent
API-wise by redefining SPI_start_transaction() as a no-op.  Callers
that expect to do something in between might be surprised, but
available evidence is that no callers do so.

Having made that API redefinition, we can fix this mess by having
SPI_commit[_and_chain] trap errors and start a new, clean transaction
before re-throwing the error.  Likewise for SPI_rollback[_and_chain].
Some cleanup is also needed in AtEOXact_SPI, which was nowhere near
smart enough to deal with SPI contexts nested inside a committing
context.

While plperl and pltcl need no changes beyond removing their now-useless
SPI_start_transaction() calls, plpython needs some more work because it
hadn't gotten the memo about catching commit/rollback errors in the
first place.  Such an error resulted in longjmp'ing out of the Python
interpreter, which leaks Python stack entries at present and is reported
to crash Python 3.11 altogether.  Add the missing logic to catch such
errors and convert them into Python exceptions.

This is a back-patch of commit 2e517818f.  That's now aged long enough
to reduce the concerns about whether it will break something, and we
do need to ensure that supported branches will work with Python 3.11.

Peter Eisentraut and Tom Lane

Discussion: https://postgr.es/m/3375ffd8-d71c-2565-e348-a597d6e739e3@enterprisedb.com
Discussion: https://postgr.es/m/17416-ed8fe5d7213d6c25@postgresql.org
2022-06-22 12:12:00 -04:00
Tom Lane f7797747fc Avoid ecpglib core dump with out-of-order operations.
If an application executed operations like EXEC SQL PREPARE
without having first established a database connection, it could
get a core dump instead of the expected clean failure.  This
occurred because we did "pthread_getspecific(actual_connection_key)"
without ever having initialized the TSD key actual_connection_key.
The results of that are probably platform-specific, but at least
on Linux it often leads to a crash.

To fix, add calls to ecpg_pthreads_init() in the code paths that
might use actual_connection_key uninitialized.  It's harmless
(and hopefully inexpensive) to do that more than once.

Per bug #17514 from Okano Naoki.  The problem's ancient, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/17514-edd4fad547c5692c@postgresql.org
2022-06-14 18:16:46 -04:00
Tom Lane aaa88d3828 Revert "Fix psql's single transaction mode on client-side errors with -c/-f switches".
This reverts commits a04ccf6df et al. in the back branches only.
There was some disagreement already over whether to back-patch
157f8739a, on the grounds that it is the sort of behavioral
change that we don't like to back-patch.  Furthermore, it now
looks like the logic needs some more work, which we don't have
time for before the upcoming 14.4 release.  Revert for now, and
perhaps reconsider later.

Discussion: https://postgr.es/m/17504-76b68018e130415e@postgresql.org
2022-06-10 16:34:25 -04:00
Tom Lane 199aac8b2f Un-break whole-row Vars referencing domain-over-composite types.
In commit ec62cb0aa, I foolishly replaced ExecEvalWholeRowVar's
lookup_rowtype_tupdesc_domain call with just lookup_rowtype_tupdesc,
because I didn't see how a domain could be involved there, and
there were no regression test cases to jog my memory.  But the
existing code was correct, so revert that change and add a test
case showing why it's necessary.  (Note: per comment in struct
DatumTupleFields, it is correct to produce an output tuple that's
labeled with the base composite type, not the domain; hence just
blindly looking through the domain is correct here.)

Per bug #17515 from Dan Kubb.  Back-patch to v11 where domains over
composites became a thing.

Discussion: https://postgr.es/m/17515-a24737438363aca0@postgresql.org
2022-06-10 10:35:57 -04:00
Peter Eisentraut 6450d15e2e Fix whitespace 2022-06-08 13:23:53 +02:00
Tom Lane d628ce048d Fix off-by-one loop termination condition in pg_stat_get_subscription().
pg_stat_get_subscription scanned one more LogicalRepWorker array entry
than is really allocated.  In the worst case this could lead to SIGSEGV,
if the LogicalRepCtx data structure is near the end of shared memory.
That seems quite unlikely though (thanks to the ordering of calls in
CreateSharedMemoryAndSemaphores) and we've heard no field reports of it.
A more likely misbehavior is one row of garbage data in the function's
result, but even that is not real likely because of the check that the
pid field matches some live backend.

Report and fix by Kuntal Ghosh.  This bug is old, so back-patch
to all supported branches.

Discussion: https://postgr.es/m/CAGz5QCJykEDzW6jQK6Yz7Qh_PMtD=95de_7QoocbVR2Qy8hWZA@mail.gmail.com
2022-06-07 15:34:30 -04:00
Tom Lane d82ed5b2f8 Don't fail on libpq-generated error reports in ecpg_raise_backend().
An error PGresult generated by libpq itself, such as a report of
connection loss, won't have broken-down error fields.
ecpg_raise_backend() blithely assumed that PG_DIAG_MESSAGE_PRIMARY
would always be present, and would end up passing a NULL string
pointer to snprintf when it isn't.  That would typically crash
before 3779ac62d, and it would fail to provide a useful error report
in any case.  Best practice is to substitute PQerrorMessage(conn)
in such cases, so do that.

Per bug #17421 from Masayuki Hirose.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17421-790ff887e3188874@postgresql.org
2022-06-06 11:20:46 -04:00
Michael Paquier b0bd9327dd Fix psql's single transaction mode on client-side errors with -c/-f switches
psql --single-transaction is able to handle multiple -c and -f switches
in a single transaction since d5563d7d, but this had the surprising
behavior of forcing a transaction COMMIT even if psql failed with an
error in the client (for example incorrect path given to \copy), which
would generate an error, but still commit any changes that were already
applied in the backend.  This commit makes the behavior more consistent,
by enforcing a transaction ROLLBACK if any commands fail, both
client-side and backend-side, so as no changes are applied if one error
happens in any of them.

Some tests are added on HEAD to provide some coverage about all that.
Backend-side errors are unreliable as IPC::Run can complain on SIGPIPE
if psql quits before reading a query result, but that should work
properly in the case where any errors come from psql itself, which is
what the original report is about.

Reported-by: Christoph Berg
Author: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/17504-76b68018e130415e@postgresql.org
Backpatch-through: 10
2022-06-06 11:07:35 +09:00
Tom Lane 37290f7f5a Silence compiler warnings from some older compilers.
Since a117cebd6, some older gcc versions issue "variable may be used
uninitialized in this function" complaints for brin_summarize_range.
Silence that using the same coding pattern as in bt_index_check_internal;
arguably, a117cebd6 had too narrow a view of which compilers might give
trouble.

Nathan Bossart and Tom Lane.  Back-patch as the previous commit was.

Discussion: https://postgr.es/m/20220601163537.GA2331988@nathanxps13
2022-06-01 17:21:45 -04:00
Tom Lane b5265196e5 Fix pl/perl test case so it will still work under Perl 5.36.
Perl 5.36 has reclassified the warning condition that this test
case used, so that the expected error fails to appear.  Tweak
the test so it instead exercises a case that's handled the same
way in all Perl versions of interest.

This appears to meet our standards for back-patching into
out-of-support branches: it changes no user-visible behavior
but enables testing of old branches with newer tools.
Hence, back-patch as far as 9.2.

Dagfinn Ilmari Mannsåker, per report from Jitka Plesníková.

Discussion: https://postgr.es/m/564579.1654093326@sss.pgh.pa.us
2022-06-01 16:15:47 -04:00
Tom Lane ae758e603d Ensure ParseTzFile() closes the input file after failing.
We hadn't noticed this because (a) few people feed invalid
timezone abbreviation files to the server, and (b) in typical
scenarios guc.c would throw ereport(ERROR) and then transaction
abort handling would silently clean up the leaked file reference.
However, it was possible to observe file leakage warnings if one
breaks an already-active abbreviation file, because guc.c does
not throw ERROR when loading supposedly-validated settings during
session start or SIGHUP processing.

Report and fix by Kyotaro Horiguchi (cosmetic adjustments by me)

Discussion: https://postgr.es/m/20220530.173740.748502979257582392.horikyota.ntt@gmail.com
2022-05-31 14:47:44 -04:00
Michael Paquier c3db8a2e2e Handle NULL for short descriptions of custom GUC variables
If a short description is specified as NULL in one of the various
DefineCustomXXXVariable() functions available to external modules to
define a custom parameter, SHOW ALL would crash.  This change teaches
SHOW ALL to properly handle NULL short descriptions, as well as any code
paths that manipulate it, to gain in flexibility.  Note that
help_config.c was already able to do that, when describing a set of GUCs
for postgres --describe-config.

Author: Steve Chavez
Reviewed by: Nathan Bossart, Andres Freund, Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/CAGRrpzY6hO-Kmykna_XvsTv8P2DshGiU6G3j8yGao4mk0CqjHA%40mail.gmail.com
Backpatch-through: 10
2022-05-28 12:12:58 +09:00
Tom Lane a44bc8b8f4 Remove misguided SSL key file ownership check in libpq.
Commits a59c79564 et al. tried to sync libpq's SSL key file
permissions checks with what we've used for years in the backend.
We did not intend to create any new failure cases, but it turns out
we did: restricting the key file's ownership breaks cases where the
client is allowed to read a key file despite not having the identical
UID.  In particular a client running as root used to be able to read
someone else's key file; and having seen that I suspect that there are
other, less-dubious use cases that this restriction breaks on some
platforms.

We don't really need an ownership check, since if we can read the key
file despite its having restricted permissions, it must have the right
ownership --- under normal conditions anyway, and the point of this
patch is that any additional corner cases where that works should be
deemed allowable, as they have been historically.  Hence, just drop
the ownership check, and rearrange the permissions check to get rid
of its faulty assumption that geteuid() can't be zero.  (Note that the
comparable backend-side code doesn't have to cater for geteuid() == 0,
since the server rejects that very early on.)

This does have the end result that the permissions safety check used
for a root user's private key file is weaker than that used for
anyone else's.  While odd, root really ought to know what she's doing
with file permissions, so I think this is acceptable.

Per report from Yogendra Suralkar.  Like the previous patch,
back-patch to all supported branches.

Discussion: https://postgr.es/m/MW3PR15MB3931DF96896DC36D21AFD47CA3D39@MW3PR15MB3931.namprd15.prod.outlook.com
2022-05-26 14:14:05 -04:00
Tom Lane f3b8d7244a Show 'AS "?column?"' explicitly when it's important.
ruleutils.c was coded to suppress the AS label for a SELECT output
expression if the column name is "?column?", which is the parser's
fallback if it can't think of something better.  This is fine, and
avoids ugly clutter, so long as (1) nothing further up in the parse
tree relies on that column name or (2) the same fallback would be
assigned when the rule or view definition is reloaded.  Unfortunately
(2) is far from certain, both because ruleutils.c might print the
expression in a different form from how it was originally written
and because FigureColname's rules might change in future releases.
So we shouldn't rely on that.

Detecting exactly whether there is any outer-level use of a SELECT
column name would be rather expensive.  This patch takes the simpler
approach of just passing down a flag indicating whether there *could*
be any outer use; for example, the output column names of a SubLink
are not referenceable, and we also do not care about the names exposed
by the right-hand side of a setop.  This is sufficient to suppress
unwanted clutter in all but one case in the regression tests.  That
seems like reasonable evidence that it won't be too much in users'
faces, while still fixing the cases we need to fix.

Per bug #17486 from Nicolas Lutic.  This issue is ancient, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/17486-1ad6fd786728b8af@postgresql.org
2022-05-21 14:45:58 -04:00
Alvaro Herrera 6c6ea6ea81
Fix DDL deparse of CREATE OPERATOR CLASS
When an implicit operator family is created, it wasn't getting reported.
Make it do so.

This has always been missing.  Backpatch to 10.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Leslie LEMAIRE <leslie.lemaire@developpement-durable.gouv.fr>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Michael Paquiër <michael@paquier.xyz>
Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr
2022-05-20 18:52:55 +02:00