Commit Graph

50467 Commits

Author SHA1 Message Date
Alexander Korotkov bad202c615 Fix default signature length for gist_ltree_ops
911e702077 implemented operator class parameters including the signature length
in ltree.  Previously, the signature length for gist_ltree_ops was 8.  Because
of bug 911e702077 the default signature length for gist_ltree_ops became 28 for
ltree 1.1 (where options method is NOT provided) and 8 for ltree 1.2 (where
options method is provided).  This commit changes the default signature length
for ltree 1.1 to 8.

Existing gist_ltree_ops indexes might be corrupted in various scenarios.
Thus, we have to recommend reindexing all the gist_ltree_ops indexes after
the upgrade.

Reported-by: Victor Yegorov
Reviewed-by: Tomas Vondra, Tom Lane, Andres Freund, Nikita Glukhov
Reviewed-by: Andrew Dunstan
Author: Tomas Vondra, Alexander Korotkov
Discussion: https://postgr.es/m/17406-71e02820ae79bb40%40postgresql.org
Discussion: https://postgr.es/m/d80e0a55-6c3e-5b26-53e3-3c4f973f737c%40enterprisedb.com
2022-03-16 11:41:34 +03:00
Thomas Munro 51e760e5a6 Fix race between DROP TABLESPACE and checkpointing.
Commands like ALTER TABLE SET TABLESPACE may leave files for the next
checkpoint to clean up.  If such files are not removed by the time DROP
TABLESPACE is called, we request a checkpoint so that they are deleted.
However, there is presently a window before checkpoint start where new
unlink requests won't be scheduled until the following checkpoint.  This
means that the checkpoint forced by DROP TABLESPACE might not remove the
files we expect it to remove, and the following ERROR will be emitted:

	ERROR:  tablespace "mytblspc" is not empty

To fix, add a call to AbsorbSyncRequests() just before advancing the
unlink cycle counter.  This ensures that any unlink requests forwarded
prior to checkpoint start (i.e., when ckpt_started is incremented) will
be processed by the current checkpoint.  Since AbsorbSyncRequests()
performs memory allocations, it cannot be called within a critical
section, so we also need to move SyncPreCheckpoint() to before
CreateCheckPoint()'s critical section.

This is an old bug, so back-patch to all supported versions.

Author: Nathan Bossart <nathandbossart@gmail.com>
Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220215235845.GA2665318%40nathanxps13
2022-03-16 17:21:19 +13:00
Michael Paquier 028a3c6b1c pageinspect: Fix memory context allocation of page in brin_revmap_data()
This caused the function to fail, as the aligned copy of the raw page
given by the function caller was not saved in the correct memory
context, which needs to be multi_call_memory_ctx in this case.

Issue introduced by 076f4d9.

Per buildfarm members sifika, mylodon and longfin.  I have reproduced
that locally with macos.

Discussion: https://postgr.es/m/YjFPOtfCW6yLXUeM@paquier.xyz
Backpatch-through: 10
2022-03-16 12:29:55 +09:00
Thomas Munro cfdb303be7 Fix waiting in RegisterSyncRequest().
If we run out of space in the checkpointer sync request queue (which is
hopefully rare on real systems, but common with very small buffer pool),
we wait for it to drain.  While waiting, we should report that as a wait
event so that users know what is going on, and also handle postmaster
death, since otherwise the loop might never terminate if the
checkpointer has exited.

Back-patch to 12.  Although the problem exists in earlier releases too,
the code is structured differently before 12 so I haven't gone any
further for now, in the absence of field complaints.

Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220226213942.nb7uvb2pamyu26dj%40alap3.anarazel.de
2022-03-16 15:37:15 +13:00
Michael Paquier d3a9b83c30 pageinspect: Fix handling of page sizes and AM types
This commit fixes a set of issues related to the use of the SQL
functions in this module when the caller is able to pass down raw page
data as input argument:
- The page size check was fuzzy in a couple of places, sometimes
looking after only a sub-range, but what we are looking for is an exact
match on BLCKSZ.  After considering a few options here, I have settled
down to do a generalization of get_page_from_raw().  Most of the SQL
functions already used that, and this is not strictly required if not
accessing an 8-byte-wide value from a raw page, but this feels safer in
the long run for alignment-picky environment, particularly if a code
path begins to access such values.  This also reduces the number of
strings that need to be translated.
- The BRIN function brin_page_items() uses a Relation but it did not
check the access method of the opened index, potentially leading to
crashes.  All the other functions in need of a Relation already did
that.
- Some code paths could fail on elog(), but we should to use ereport()
for failures that can be triggered by the user.

Tests are added to stress all the cases that are fixed as of this
commit, with some junk raw pages (\set VERBOSITY ensures that this works
across all page sizes) and unexpected index types when functions open
relations.

Author: Michael Paquier, Justin Prysby
Discussion: https://postgr.es/m/20220218030020.GA1137@telsasoft.com
Backpatch-through: 10
2022-03-16 11:20:51 +09:00
Thomas Munro 5610411ac7 Back-patch LLVM 14 API changes.
Since LLVM 14 has stopped changing and is about to be released,
back-patch the following changes from the master branch:

  e6a7600202
  807fee1a39
  a56e7b6601

Back-patch to 11, where LLVM JIT support came in.
2022-03-16 11:41:13 +13:00
Michael Paquier ef00502a5a doc: Add ALTER/DROP ROUTINE to the event trigger matrix
ALTER ROUTINE triggers the events ddl_command_start and ddl_command_end,
and DROP ROUTINE triggers sql_drop, ddl_command_start and
ddl_command_end, but this was not mention on the matrix table.

Reported-by: Leslie Lemaire
Discussion: https://postgr.es/m/164647533363.646.5802968483136493025@wrigleys.postgresql.org
Backpatch-through: 11
2022-03-09 14:59:22 +09:00
Noah Misch 29ec94efd0 Introduce PG_TEST_TIMEOUT_DEFAULT for TAP suite non-elapsing timeouts.
Slow hosts may avoid load-induced, spurious failures by setting
environment variable PG_TEST_TIMEOUT_DEFAULT to some number of seconds
greater than 180.  Developers may see faster failures by setting that
environment variable to some lesser number of seconds.  In tests, write
$PostgreSQL::Test::Utils::timeout_default wherever the convention has
been to write 180.  This change raises the default for some briefer
timeouts.  Back-patch to v10 (all supported versions).

Discussion: https://postgr.es/m/20220218052842.GA3627003@rfd.leadboat.com
2022-03-04 18:53:17 -08:00
Tom Lane 59392746cb Fix pg_regress to print the correct postmaster address on Windows.
pg_regress reported "Unix socket" as the default location whenever
HAVE_UNIX_SOCKETS is defined.  However, that's not been accurate
on Windows since 8f3ec75de.  Update this logic to match what libpq
actually does now.

This is just cosmetic, but still it's potentially misleading.
Back-patch to v13 where 8f3ec75de came in.

Discussion: https://postgr.es/m/3894060.1646415641@sss.pgh.pa.us
2022-03-04 13:23:58 -05:00
Tom Lane 97031f4404 Fix bogus casting in BlockIdGetBlockNumber().
This macro cast the result to BlockNumber after shifting, not before,
which is the wrong thing.  Per the C spec, the uint16 fields would
promote to int not unsigned int, so that (for 32-bit int) the shift
potentially shifts a nonzero bit into the sign position.  I doubt
there are any production systems where this would actually end with
the wrong answer, but it is undefined behavior per the C spec, and
clang's -fsanitize=undefined option reputedly warns about it on some
platforms.  (I can't reproduce that right now, but the code is
undeniably wrong per spec.)  It's easy to fix by casting to
BlockNumber (uint32) in the proper places.

It's been wrong for ages, so back-patch to all supported branches.

Report and patch by Zhihong Yu (cosmetic tweaking by me)

Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com
2022-03-03 19:03:42 -05:00
Tom Lane 1a027e6b7b Clean up assorted failures under clang's -fsanitize=undefined checks.
Most of these are cases where we could call memcpy() or other libc
functions with a NULL pointer and a zero count, which is forbidden
by POSIX even though every production version of libc allows it.
We've fixed such things before in a piecemeal way, but apparently
never made an effort to try to get them all.  I don't claim that
this patch does so either, but it gets every failure I observe in
check-world, using clang 12.0.1 on current RHEL8.

numeric.c has a different issue that the sanitizer doesn't like:
"ln(-1.0)" will compute log10(0) and then try to assign the
resulting -Inf to an integer variable.  We don't actually use the
result in such a case, so there's no live bug.

Back-patch to all supported branches, with the idea that we might
start running a buildfarm member that tests this case.  This includes
back-patching c1132aae3 (Check the size in COPY_POINTER_FIELD),
which previously silenced some of these issues in copyfuncs.c.

Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com
2022-03-03 18:13:24 -05:00
Tom Lane 6599d8f126 Allow root-owned SSL private keys in libpq, not only the backend.
This change makes libpq apply the same private-key-file ownership
and permissions checks that we have used in the backend since commit
9a83564c5.  Namely, that the private key can be owned by either the
current user or root (with different file permissions allowed in the
two cases).  This allows system-wide management of key files, which
is just as sensible on the client side as the server, particularly
when the client is itself some application daemon.

Sync the comments about this between libpq and the backend, too.

Back-patch of a59c79564 and 50f03473e into all supported branches.

David Steele

Discussion: https://postgr.es/m/f4b7bc55-97ac-9e69-7398-335e212f7743@pgmasters.net
2022-03-02 11:57:02 -05:00
Tom Lane 9b2d762a28 Disallow execution of SPI functions during plperl function compilation.
Perl can be convinced to execute user-defined code during compilation
of a plperl function (or at least a plperlu function).  That's not
such a big problem as long as the activity is confined within the
Perl interpreter, and it's not clear we could do anything about that
anyway.  However, if such code tries to use plperl's SPI functions,
we have a bigger problem.  In the first place, those functions are
likely to crash because current_call_data->prodesc isn't set up yet.
In the second place, because it isn't set up, we lack critical info
such as whether the function is supposed to be read-only.  And in
the third place, this path allows code execution during function
validation, which is strongly discouraged because of the potential
for security exploits.  Hence, reject execution of the SPI functions
until compilation is finished.

While here, add check_spi_usage_allowed() calls to various functions
that hadn't gotten the memo about checking that.  I think that perhaps
plperl_sv_to_literal may have been intentionally omitted on the grounds
that it was safe at the time; but if so, the addition of transforms
functionality changed that.  The others are more recently added and
seem to be flat-out oversights.

Per report from Mark Murawski.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/9acdf918-7fff-4f40-f750-2ffa84f083d2@intellasoft.net
2022-02-25 17:40:44 -05:00
Andres Freund 0b1020a96d pg_waldump: Fix error message for WAL files smaller than XLOG_BLCKSZ.
When opening a WAL file smaller than XLOG_BLCKSZ (e.g. 0 bytes long) while
determining the wal_segment_size, pg_waldump checked errno, despite errno not
being set by the short read. Resulting in a bogus error message.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220214.181847.775024684568733277.horikyota.ntt@gmail.com
Backpatch: 11-, the bug was introducedin fc49e24fa
2022-02-25 10:32:38 -08:00
Andres Freund c2551483e0 Fix temporary object cleanup failing due to toast access without snapshot.
When cleaning up temporary objects during process exit the cleanup could fail
with:
  FATAL: cannot fetch toast data without an active snapshot

The bug is caused by RemoveTempRelationsCallback() not setting up a
snapshot. If an object with toasted catalog data needs to be cleaned up,
init_toast_snapshot() could fail with the above error.

Most of the time however the the problem is masked due to cached catalog
snapshots being returned by GetOldestSnapshot(). But dropping an object can
cause catalog invalidations to be emitted. If no further catalog accesses are
necessary between the invalidation processing and the next toast datum
deletion, the bug becomes visible.

It's easy to miss this bug because it typically happens after clients
disconnect and the FATAL error just ends up in the log.

Luckily temporary table cleanup at the next use of the same temporary schema
or during DISCARD ALL does not have the same problem.

Fix the bug by pushing a snapshot in RemoveTempRelationsCallback(). Also add
isolation tests for temporary object cleanup, including objects with toasted
catalog data.

A future HEAD only commit will add more assertions.

Reported-By: Miles Delahunty
Author: Andres Freund
Discussion: https://postgr.es/m/CAOFAq3BU5Mf2TTvu8D9n_ZOoFAeQswuzk7yziAb7xuw_qyw5gw@mail.gmail.com
Backpatch: 10-
2022-02-21 08:59:30 -08:00
Andrew Dunstan a5716baac0
Remove most msys special processing in TAP tests
Following migration of Windows buildfarm members running TAP tests to
use of ucrt64 perl for those tests, special processing for msys perl is
no longer necessary and so is removed.

Backpatch to release 10

Discussion: https://postgr.es/m/c65a8781-77ac-ea95-d185-6db291e1baeb@dunslane.net
2022-02-20 11:49:33 -05:00
Andrew Dunstan c27aa21e0d
Remove PostgreSQL::Test::Utils::perl2host completely
Commit f1ac4a74de disabled this processing, and as nothing has broken (as
expected) here we proceed to remove the routine and adjust all the call
sites.

Backpatch to release 10

Discussion: https://postgr.es/m/0ba775a2-8aa0-0d56-d780-69427cf6f33d@dunslane.net
Discussion: https://postgr.es/m/20220125023609.5ohu3nslxgoygihl@alap3.anarazel.de
2022-02-20 10:35:49 -05:00
Tom Lane 46b738ffc2 Suppress warning about stack_base_ptr with late-model GCC.
GCC 12 complains that set_stack_base is storing the address of
a local variable in a long-lived pointer.  This is an entirely
reasonable warning (indeed, it just helped us find a bug);
but that behavior is intentional here.  We can work around it
by using __builtin_frame_address(0) instead of a specific local
variable; that produces an address a dozen or so bytes different,
in my testing, but we don't care about such a small difference.
Maybe someday a compiler lacking that function will start to issue
a similar warning, but we'll worry about that when it happens.

Patch by me, per a suggestion from Andres Freund.  Back-patch to
v12, which is as far back as the patch will go without some pain.
(Recently-established project policy would permit a back-patch as
far as 9.2, but I'm disinclined to expend the work until GCC 12
is much more widespread.)

Discussion: https://postgr.es/m/3773792.1645141467@sss.pgh.pa.us
2022-02-17 22:45:34 -05:00
Etsuro Fujita 0dc0284ade Doc: Update documentation for modifying postgres_fdw foreign tables.
Document that they can be modified using COPY as well.

Back-patch to v11 where commit 3d956d956 added support for COPY in
postgres_fdw.
2022-02-16 15:15:04 +09:00
Amit Kapila caa231be97 WAL log unchanged toasted replica identity key attributes.
Currently, during UPDATE, the unchanged replica identity key attributes
are not logged separately because they are getting logged as part of the
new tuple. But if they are stored externally then the untoasted values are
not getting logged as part of the new tuple and logical replication won't
be able to replicate such UPDATEs. So we need to log such attributes as
part of the old_key_tuple during UPDATE.

Reported-by: Haiying Tang
Author: Dilip Kumar and Amit Kapila
Reviewed-by: Alvaro Herrera, Haiying Tang, Andres Freund
Backpatch-through: 10
Discussion: https://postgr.es/m/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
2022-02-14 08:24:44 +05:30
Alexander Korotkov ac2303aa03 Fix memory leak in IndexScan node with reordering
Fix ExecReScanIndexScan() to free the referenced tuples while emptying the
priority queue.  Backpatch to all supported versions.

Discussion: https://postgr.es/m/CAHqSB9gECMENBQmpbv5rvmT3HTaORmMK3Ukg73DsX5H7EJV7jw%40mail.gmail.com
Author: Aliaksandr Kalenik
Reviewed-by: Tom Lane, Alexander Korotkov
Backpatch-through: 10
2022-02-14 03:32:34 +03:00
Tom Lane 51ee561f56 Fix thinko in PQisBusy().
In commit 1f39a1c06 I made PQisBusy consider conn->write_failed, but
that is now looking like complete brain fade.  In the first place, the
logic is quite wrong: it ought to be like "and not" rather than "or".
This meant that once we'd gotten into a write_failed state, PQisBusy
would always return true, probably causing the calling application to
iterate its loop until PQconsumeInput returns a hard failure thanks
to connection loss.  That's not what we want: the intended behavior
is to return an error PGresult, which the application probably has
much cleaner support for.

But in the second place, checking write_failed here seems like the
wrong thing anyway.  The idea of the write_failed mechanism is to
postpone handling of a write failure until we've read all we can from
the server; so that flag should not interfere with input-processing
behavior.  (Compare 7247e243a.)  What we *should* check for is
status = CONNECTION_BAD, ie, socket already closed.  (Most places that
close the socket don't touch asyncStatus, but they do reset status.)
This primarily ensures that if PQisBusy() returns true then there is
an open socket, which is assumed by several call sites in our own
code, and probably other applications too.

While at it, fix a nearby thinko in libpq's my_sock_write: we should
only consult errno for res < 0, not res == 0.  This is harmless since
pqsecure_raw_write would force errno to zero in such a case, but it
still could confuse readers.

Noted by Andres Freund.  Backpatch to v12 where 1f39a1c06 came in.

Discussion: https://postgr.es/m/20220211011025.ek7exh6owpzjyudn@alap3.anarazel.de
2022-02-12 13:23:20 -05:00
Tom Lane 0778b24ced Don't use_physical_tlist for an IOS with non-returnable columns.
createplan.c tries to save a runtime projection step by specifying
a scan plan node's output as being exactly the table's columns, or
index's columns in the case of an index-only scan, if there is not a
reason to do otherwise.  This logic did not previously pay attention
to whether an index's columns are returnable.  That worked, sort of
accidentally, until commit 9a3ddeb51 taught setrefs.c to reject plans
that try to read a non-returnable column.  I have no desire to loosen
setrefs.c's new check, so instead adjust use_physical_tlist() to not
try to optimize this way when there are non-returnable column(s).

Per report from Ryan Kelly.  Like the previous patch, back-patch
to all supported branches.

Discussion: https://postgr.es/m/CAHUie24ddN+pDNw7fkhNrjrwAX=fXXfGZZEHhRuofV_N_ftaSg@mail.gmail.com
2022-02-11 15:23:52 -05:00
Tom Lane d0e1fd9587 Make pg_ctl stop/restart/promote recheck postmaster aliveness.
"pg_ctl stop/restart" checked that the postmaster PID is valid just
once, as a side-effect of sending the stop signal, and then would
wait-till-timeout for the postmaster.pid file to go away.  This
neglects the case wherein the postmaster dies uncleanly after we
signal it.  Similarly, once "pg_ctl promote" has sent the signal,
it'd wait for the corresponding on-disk state change to occur
even if the postmaster dies.

I'm not sure how we've managed not to notice this problem, but it
seems to explain slow execution of the 017_shm.pl test script on AIX
since commit 4fdbf9af5, which added a speculative "pg_ctl stop" with
the idea of making real sure that the postmaster isn't there.  In the
test steps that kill-9 and then restart the postmaster, it's possible
to get past the initial signal attempt before kill() stops working
for the doomed postmaster.  If that happens, pg_ctl waited till
PGCTLTIMEOUT before giving up ... and the buildfarm's AIX members
have that set very high.

To fix, include a "kill(pid, 0)" test (similar to what
postmaster_is_alive uses) in these wait loops, so that we'll
give up immediately if the postmaster PID disappears.

While here, I chose to refactor those loops out of where they were.
do_stop() and do_restart() can perfectly well share one copy of the
wait-for-stop loop, and it seems desirable to put a similar function
beside that for wait-for-promote.

Back-patch to all supported versions, since pg_ctl's wait logic
is substantially identical in all, and we're seeing the slow test
behavior in all branches.

Discussion: https://postgr.es/m/20220210023537.GA3222837@rfd.leadboat.com
2022-02-10 16:49:39 -05:00
Andrew Dunstan eec7c640fe
Use gendef instead of pexports for building windows .def files
Modern msys systems lack pexports but have gendef instead, so use that.

Discussion: https://postgr.es/m/3ccde7a9-e4f9-e194-30e0-0936e6ad68ba@dunslane.net

Backpatch to release 9.4 to enable building with perl on older branches.
Before that pexports is not used for plperl.
2022-02-10 13:51:40 -05:00
Noah Misch b3558cc96c Use Test::Builder::todo_start(), replacing $::TODO.
Some pre-2017 Test::More versions need perfect $Test::Builder::Level
maintenance to find the variable.  Buildfarm member snapper reported an
overall failure that the file intended to hide via the TODO construct.
That trouble was reachable in v11 and v10.  For later branches, this
serves as defense in depth.  Back-patch to v10 (all supported versions).

Discussion: https://postgr.es/m/20220202055556.GB2745933@rfd.leadboat.com
2022-02-09 18:17:03 -08:00
Noah Misch 1b8462bf1d Fix back-patch of "Avoid race in RelationBuildDesc() ..."
The back-patch of commit fdd965d074 broke
CLOBBER_CACHE_ALWAYS for v9.6 through v13.  It updated the
InvalidateSystemCaches() call for CLOBBER_CACHE_RECURSIVELY, neglecting
the one for CLOBBER_CACHE_ALWAYS.  Back-patch to v13, v12, v11, and v10.

Reviewed by Tomas Vondra.  Reported by Tomas Vondra.

Discussion: https://postgr.es/m/df7b4c0b-7d92-f03f-75c4-9e08b269a716@enterprisedb.com
2022-02-09 18:16:56 -08:00
Tom Lane 5ea3b99de0 Remove ppport.h's broken re-implementation of eval_pv().
Recent versions of Devel::PPPort try to redefine eval_pv() to
dodge a bug in pre-5.31 Perl versions.  Unfortunately the redefinition
fails on compilers that don't support statements nested within
expressions.  However, we aren't actually interested in this bug fix,
since we always call eval_pv() with croak_on_error = FALSE.
So, until there's an upstream fix for this breakage, just comment
out the macro to revert to the older behavior.

Per report from Wei Sun, as well as previous buildfarm failure
on pademelon (which I'd unfortunately not looked at carefully
enough to understand the cause).  Back-patch to all supported
versions, since we're using the same ppport.h in all.

Discussion: https://postgr.es/m/tencent_2EFCC8BA0107B6EC0F97179E019A8A43C806@qq.com
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pademelon&dt=2022-02-02%2001%3A22%3A58
2022-02-08 19:26:17 -05:00
Tom Lane d7db957207 Stamp 13.6. 2022-02-07 16:17:41 -05:00
Peter Eisentraut 7f7148b72e Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 9aa8bc576f9af5c61de4a6fc8119abfa36493d01
2022-02-07 13:36:22 +01:00
Tom Lane 763c7aac57 Release notes for 14.2, 13.6, 12.10, 11.15, 10.20. 2022-02-06 14:24:55 -05:00
Tom Lane 7d74ff8243 Doc: be clearer that foreign-table partitions need user-added constraints.
A very well-informed user might deduce this from what we said already,
but I'd bet against it.  Lay it out explicitly.

While here, rewrite the comment about tuple routing to be more
intelligible to an average SQL user.

Per bug #17395 from Alexander Lakhin.  Back-patch to v11.  (The text
in this area is different in v10 and I'm not sufficiently excited
about this point to adapt the patch.)

Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org
2022-02-05 12:55:44 -05:00
Tom Lane 9c7cf083f2 Test, don't just Assert, that mergejoin's inputs are in order.
There are two Asserts in nodeMergejoin.c that are reachable if
the input data is not in the expected order.  This seems way too
fragile.  Alexander Lakhin reported a case where the assertions
could be triggered with misconfigured foreign-table partitions,
and bitter experience with unstable operating system collation
definitions suggests another easy route to hitting them.  Neither
Assert is in a place where we can't afford one more test-and-branch,
so replace 'em with plain test-and-elog logic.

Per bug #17395.  While the reported symptom is relatively recent,
collation changes could happen anytime, so back-patch to all
supported branches.

Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org
2022-02-05 11:59:30 -05:00
Bruce Momjian 58e9d28091 doc: clarify syntax notation, particularly parentheses
Also move TCL syntax to the PL/tcl section.

Reported-by: davs2rt@gmail.com

Discussion: https://postgr.es/m/164308146320.12460.3590769444508751574@wrigleys.postgresql.org

Backpatch-through: 10
2022-02-02 21:53:51 -05:00
Peter Eisentraut 31c9712610 doc: Fix mistake in PL/Python documentation
Small thinko introduced by 94aceed317

Reported-by: nassehk@gmail.com
2022-02-02 09:16:19 +01:00
Tom Lane 4d7d196ff6 Replace use of deprecated Python module distutils.sysconfig, take 2.
With Python 3.10, configure spits out warnings about the module
distutils.sysconfig being deprecated and scheduled for removal in
Python 3.12.  Change the uses in configure to use the module sysconfig
instead.  The logic stays largely the same, although we have to
rely on INCLUDEPY instead of the deprecated get_python_inc function.

Note that sysconfig exists since Python 2.7, so this moves the
minimum required version up from Python 2.6 (or 2.4, before v13).
Also, sysconfig didn't exist in Python 3.1, so the minimum 3.x
version is now 3.2.

Back-patch of commit bd233bdd8 into all supported branches.

In v10, this also includes back-patching v11's beff4bb9c, primarily
because this opinion is clearly out-of-date:

    While at it, get rid of the code's assumption that both the major and
    minor numbers contain exactly one digit.  That will foreseeably be
    broken by Python 3.10 in perhaps four or five years.  That's far enough
    out that we probably don't need to back-patch this.

Peter Eisentraut, Tom Lane, Andres Freund

Discussion: https://postgr.es/m/c74add3c-09c4-a9dd-1a03-a846e5b2fc52@enterprisedb.com
2022-02-01 19:03:41 -05:00
Tom Lane ff86ee3d21 Revert "plperl: Fix breakage of c89f409749 in back branches."
This reverts commits d81cac47a et al.  We shouldn't need that
hack after the preceding commits.

Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de
2022-01-31 15:07:39 -05:00
Tom Lane bc89af248e plperl: update ppport.h to Perl 5.34.0.
Also apply the changes suggested by running
perl ppport.h --compat-version=5.8.0

And remove some no-longer-required NEED_foo declarations.

Dagfinn Ilmari Mannsåker

Back-patch of commit 05798c9f7 into all supported branches.
At the time we thought this update was mostly cosmetic, but the
lack of it has caused trouble, while the patch itself hasn't.

Discussion: https://postgr.es/m/87y278s6iq.fsf@wibble.ilmari.org
Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de
2022-01-31 15:01:05 -05:00
Andres Freund 4afb5aeb9f plperl: Fix breakage of c89f409749 in back branches.
ppport.h was only updated in 05798c9f7f (master). Unfortunately my commit
c89f409749 uses PERL_VERSION_LT which came in with that update. Breaking most
buildfarm animals.

I should have noticed that...

We might want to backpatch the ppport update instead, but for now lets get the
buildfarm green again.

Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de
Backpatch: 10-14, master doesn't need it
2022-01-30 18:00:57 -08:00
Andres Freund 0dc0fe7b6c plperl: windows: Use Perl_setlocale on 5.28+, fixing compile failure.
For older versions we need our own copy of perl's setlocale(), because it was
not exposed (why we need the setlocale in the first place is explained in
plperl_init_interp) . The copy stopped working in 5.28, as some of the used
macros are not public anymore.  But Perl_setlocale is available in 5.28, so
use that.

Author: Victor Wagner <vitus@wagner.pp.ru>
Reviewed-By: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/20200501134711.08750c5f@antares.wagner.home
Backpatch: all versions
2022-01-30 16:42:45 -08:00
Tom Lane 5ad70564f4 Fix failure to validate the result of select_common_type().
Although select_common_type() has a failure-return convention, an
apparent successful return just provides a type OID that *might* work
as a common supertype; we've not validated that the required casts
actually exist.  In the mainstream use-cases that doesn't matter,
because we'll proceed to invoke coerce_to_common_type() on each input,
which will fail appropriately if the proposed common type doesn't
actually work.  However, a few callers didn't read the (nonexistent)
fine print, and thought that if they got back a nonzero OID then the
coercions were sure to work.

This affects in particular the recently-added "anycompatible"
polymorphic types; we might think that a function/operator using
such types matches cases it really doesn't.  A likely end result
of that is unexpected "ambiguous operator" errors, as for example
in bug #17387 from James Inform.  Another, much older, case is that
the parser might try to transform an "x IN (list)" construct to
a ScalarArrayOpExpr even when the list elements don't actually have
a common supertype.

It doesn't seem desirable to add more checking to select_common_type
itself, as that'd just slow down the mainstream use-cases.  Instead,
write a separate function verify_common_type that performs the
missing checks, and add a call to that where necessary.  Likewise add
verify_common_type_from_oids to go with select_common_type_from_oids.

Back-patch to v13 where the "anycompatible" types came in.  (The
symptom complained of in bug #17387 doesn't appear till v14, but
that's just because we didn't get around to converting || to use
anycompatible till then.)  In principle the "x IN (list)" fix could
go back all the way, but I'm not currently convinced that it makes
much difference in real-world cases, so I won't bother for now.

Discussion: https://postgr.es/m/17387-5dfe54b988444963@postgresql.org
2022-01-29 11:41:12 -05:00
Tomas Vondra e90f258aca Fix ordering of XIDs in ProcArrayApplyRecoveryInfo
Commit 8431e296ea reworked ProcArrayApplyRecoveryInfo to sort XIDs
before adding them to KnownAssignedXids. But the XIDs are sorted using
xidComparator, which compares the XIDs simply as uint32 values, not
logically. KnownAssignedXidsAdd() however expects XIDs in logical order,
and calls TransactionIdFollowsOrEquals() to enforce that. If there are
XIDs for which the two orderings disagree, an error is raised and the
recovery fails/restarts.

Hitting this issue is fairly easy - you just need two transactions, one
started before the 4B limit (e.g. XID 4294967290), the other sometime
after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when
compared using xidComparator we try to add them in the opposite order.
Which makes KnownAssignedXidsAdd() fail with an error like this:

  ERROR: out-of-order XID insertion in KnownAssignedXids

This only happens during replica startup, while processing RUNNING_XACTS
records to build the snapshot. Once we reach STANDBY_SNAPSHOT_READY, we
skip these records. So this does not affect already running replicas,
but if you restart (or create) a replica while there are transactions
with XIDs for which the two orderings disagree, you may hit this.

Long-running transactions and frequent replica restarts increase the
likelihood of hitting this issue. Once the replica gets into this state,
it can't be started (even if the old transactions are terminated).

Fixed by sorting the XIDs logically - this is fine because we're dealing
with normal XIDs (because it's XIDs assigned to backends) and from the
same wraparound epoch (otherwise the backends could not be running at
the same time on the primary node). So there are no problems with the
triangle inequality, which is why xidComparator compares raw values.

Investigation and root cause analysis by Abhijit Menon-Sen. Patch by me.

This issue is present in all releases since 9.4, however releases up to
9.6 are EOL already so backpatch to 10 only.

Reviewed-by: Abhijit Menon-Sen
Reviewed-by: Alvaro Herrera
Backpatch-through: 10
Discussion: https://postgr.es/m/36b8a501-5d73-277c-4972-f58a4dce088a%40enterprisedb.com
2022-01-27 20:16:39 +01:00
Noah Misch 785a28e422 On sparc64+ext4, suppress test failures from known WAL read failure.
Buildfarm members kittiwake, tadarida and snapper began to fail
frequently when commits 3cd9c3b921 and
f47ed79cc8 added tests of concurrency, but
the problem was reachable before those commits.  Back-patch to v10 (all
supported versions).

Discussion: https://postgr.es/m/20220116210241.GC756210@rfd.leadboat.com
2022-01-26 18:06:23 -08:00
Magnus Hagander 81596645ca Fix pg_hba_file_rules for authentication method cert
For authentication method cert, clientcert=verify-full is implied. But
the pg_hba_file_rules entry would incorrectly show clientcert=verify-ca.

Per bug #17354

Reported-By: Feike Steenbergen
Reviewed-By: Jonathan Katz
Backpatch-through: 12
2022-01-26 09:59:19 +01:00
Tom Lane 213c5aa3bd Revert "graceful shutdown" changes for Windows, in back branches only.
This reverts commits 6051857fc and ed52c3707, but only in the back
branches.  Further testing has shown that while those changes do fix
some things, they also break others; in particular, it looks like
walreceivers fail to detect walsender-initiated connection close
reliably if the walsender shuts down this way.  We'll keep trying to
improve matters in HEAD, but it now seems unwise to push these changes
into stable releases.

Discussion: https://postgr.es/m/CA+hUKG+OeoETZQ=Qw5Ub5h3tmwQhBmDA=nuNO3KG=zWfUypFAw@mail.gmail.com
2022-01-25 12:17:40 -05:00
David Rowley f8807e7742 Consider parallel awareness when removing single-child Appends
8edd0e794 added some code to remove Append and MergeAppend nodes when they
contained a single child node.  As it turned out, this was unsafe to do
when the Append/MergeAppend was parallel_aware and the child node was not.
Removing the Append/MergeAppend, in this case, could lead to the child plan
being called multiple times by parallel workers when it was unsafe to do
so.

Here we fix this by just not removing the Append/MergeAppend when the
parallel_aware flag of the parent and child node don't match.

Reported-by: Yura Sokolov
Bug: #17335
Discussion: https://postgr.es/m/b59605fecb20ba9ea94e70ab60098c237c870628.camel%40postgrespro.ru
Backpatch-through: 12, where 8edd0e794 was first introduced
2022-01-25 21:15:00 +13:00
Michael Paquier 35893cc968 doc: Fix some grammar
This is an extraction of the user-visible changes done in 410aa24,
including all the relevant documentation parts.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220124030001.GQ23027@telsasoft.com
Backpatch-through: 10
2022-01-25 10:49:41 +09:00
Tom Lane d67354d870 Fix limitations on what SQL commands can be issued to a walsender.
In logical replication mode, a WalSender is supposed to be able
to execute any regular SQL command, as well as the special
replication commands.  Poor design of the replication-command
parser caused it to fail in various cases, notably:

* semicolons embedded in a command, or multiple SQL commands
sent in a single message;

* dollar-quoted literals containing odd numbers of single
or double quote marks;

* commands starting with a comment.

The basic problem here is that we're trying to run repl_scanner.l
across the entire input string even when it's not a replication
command.  Since repl_scanner.l does not understand all of the
token types known to the core lexer, this is doomed to have
failure modes.

We certainly don't want to make repl_scanner.l as big as scan.l,
so instead rejigger stuff so that we only lex the first token of
a non-replication command.  That will usually look like an IDENT
to repl_scanner.l, though a comment would end up getting reported
as a '-' or '/' single-character token.  If the token is a replication
command keyword, we push it back and proceed normally with repl_gram.y
parsing.  Otherwise, we can drop out of exec_replication_command()
without examining the rest of the string.

(It's still theoretically possible for repl_scanner.l to fail on
the first token; but that could only happen if it's an unterminated
single- or double-quoted string, in which case you'd have gotten
largely the same error from the core lexer too.)

In this way, repl_gram.y isn't involved at all in handling general
SQL commands, so we can get rid of the SQLCmd node type.  (In
the back branches, we can't remove it because renumbering enum
NodeTag would be an ABI break; so just leave it sit there unused.)

I failed to resist the temptation to clean up some other sloppy
coding in repl_scanner.l while at it.  The only externally-visible
behavior change from that is it now accepts \r and \f as whitespace,
same as the core lexer.

Per bug #17379 from Greg Rychlewski.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17379-6a5c6cfb3f1f5e77@postgresql.org
2022-01-24 15:33:34 -05:00
Tom Lane c94c6612da Remember to reset yy_start state when firing up repl_scanner.l.
Without this, we get odd behavior when the previous cycle of
lexing exited in a non-default exclusive state.  Every other
copy of this code is aware that it has to do BEGIN(INITIAL),
but repl_scanner.l did not get that memo.

The real-world impact of this is probably limited, since most
replication clients would abandon their connection after getting
a syntax error.  Still, it's a bug.

This mistake is old, so back-patch to all supported branches.

Discussion: https://postgr.es/m/1874781.1643035952@sss.pgh.pa.us
2022-01-24 12:09:46 -05:00
Tom Lane 0cbc507378 Suppress variable-set-but-not-used warning from clang 13.
In the normal configuration where GEQO_DEBUG isn't defined,
recent clang versions have started to complain that geqo_main.c
accumulates the edge_failures count but never does anything
with it.  As a minimal back-patchable fix, insert a void cast
to silence this warning.  (I'd speculated about ripping out the
GEQO_DEBUG logic altogether, but I don't think we'd wish to
back-patch that.)

Per recently-established project policy, this is a candidate
for back-patching into out-of-support branches: it suppresses
an annoying compiler warning but changes no behavior.  Hence,
back-patch all the way to 9.2.

Discussion: https://postgr.es/m/CA+hUKGLTSZQwES8VNPmWO9AO0wSeLt36OCPDAZTccT1h7Q7kTQ@mail.gmail.com
2022-01-23 11:09:25 -05:00