Commit Graph

50118 Commits

Author SHA1 Message Date
Bruce Momjian 650e635901 doc: adjust "cities" example to be consistent with other SQL
Reported-by: tom@crystae.net

Discussion: https://postgr.es/m/162345756191.14472.9754568432103008703@wrigleys.postgresql.org

Backpatch-through: 9.6
2021-07-02 20:42:46 -04:00
Tom Lane 7fc97752d5 Don't try to print data type names in slot_store_error_callback().
The existing code tried to do syscache lookups in an already-failed
transaction, which is problematic to say the least.  After some
consideration of alternatives, the best fix seems to be to just drop
type names from the error message altogether.  The table and column
names seem like sufficient localization.  If the user is unsure what
types are involved, she can check the local and remote table
definitions.

Having done that, we can also discard the LogicalRepTypMap hash
table, which had no other use.  Arguably, LOGICAL_REP_MSG_TYPE
replication messages are now obsolete as well; but we should
probably keep them in case some other use emerges.  (The complexity
of removing something from the replication protocol would likely
outweigh any savings anyhow.)

Masahiko Sawada and Bharath Rupireddy, per complaint from Andres
Freund.  Back-patch to v10 where this code originated.

Discussion: https://postgr.es/m/20210106020229.ne5xnuu6wlondjpe@alap3.anarazel.de
2021-07-02 16:04:54 -04:00
Andrew Dunstan 8d2be14028
add missing tag from commit b8c4261e5e 2021-07-01 15:40:42 -04:00
Andrew Dunstan bd0be7f7a4
Add new make targets world-bin and install-world-bin
These are the same as world and install-world respectively, but without
building or installing the documentation. There are many reasons for
wanting to be able to do this, including speed, lack of documentation
building tools, and wanting to build other formats of the documentation.
Plans for simplifying the buildfarm client code include using these
targets.

Backpatch to all live branches.

Discussion: https://postgr.es/m/6a421136-d462-b043-a8eb-e75b2861f3df@dunslane.net
2021-07-01 14:31:10 -04:00
Andrew Dunstan a8b564b0c9
Fix prove_installcheck to use correct paths when used with PGXS
The prove_installcheck recipe in src/Makefile.global.in was emitting
bogus paths for a couple of elements when used with PGXS. Here we create
a separate recipe for the PGXS case that does it correctly. We also take
the opportunity to make the make the file more readable by breaking up
the prove_installcheck and prove_check recipes across several lines, and
to remove the setting for REGRESS_SHLIB to src/test/recovery/Makefile,
which is the only set of tests that actually need it.

Backpatch to all live branches

Discussion: https://postgr.es/m/f2401388-936b-f4ef-a07c-a0bcc49b3300@dunslane.net
2021-07-01 08:46:38 -04:00
Michael Paquier 41edb2db1b Fix incorrect PITR message for transaction ROLLBACK PREPARED
Reaching PITR on such a transaction would cause the generation of a LOG
message mentioning a transaction committed, not aborted.

Oversight in 4f1b890.

Author: Simon Riggs
Discussion: https://postgr.es/m/CANbhV-GJ6KijeCgdOrxqMCQ+C8QiK657EMhCy4csjrPcEUFv_Q@mail.gmail.com
Backpatch-through: 9.6
2021-06-30 11:49:16 +09:00
Tom Lane 1603deca34 Don't use abort(3) in libpq's fe-print.c.
Causing a core dump on out-of-memory seems pretty unfriendly,
and surely is far outside the expected behavior of a general-purpose
library.  Just print an error message (as we did already) and return.
These functions unfortunately don't have an error return convention,
but code using them is probably just looking for a quick-n-dirty
print method and wouldn't bother to check anyway.

Although these functions are semi-deprecated, it still seems
appropriate to back-patch this.  In passing, also back-patch
b90e6cef1, just to reduce cosmetic differences between the
branches.

Discussion: https://postgr.es/m/3122443.1624735363@sss.pgh.pa.us
2021-06-28 14:17:42 -04:00
Tom Lane be567deb3a Don't depend on -fwrapv semantics in pgbench's random() function.
Instead use the common/int.h functions to check for integer overflow
in a more C-standard-compliant fashion.  This is motivated by recent
failures on buildfarm member moonjelly, where it appears that
development-tip gcc is optimizing without regard to the -fwrapv
switch.  Presumably that's a gcc bug that will be fixed soon, but
we might as well install cleaner coding here rather than wait.

(This does not address the question of whether we'll ever be able
to get rid of using -fwrapv.  Testing shows that this spot is the
only place where doing so creates visible regression test failures,
but unfortunately that proves very little.)

Back-patch to v12.  The common/int.h functions exist in v11, but
that branch doesn't use them in any client-side code.  I judge
that this case isn't interesting enough in the real world to take
even a small risk of issues from being the first such use.

Tom Lane and Fabien Coelho

Discussion: https://postgr.es/m/73927.1624815543@sss.pgh.pa.us
2021-06-28 12:40:37 -04:00
Amit Kapila 741deb2603 Fix race condition in TransactionGroupUpdateXidStatus().
When we cannot immediately acquire XactSLRULock in exclusive mode at
commit time, we add ourselves to a list of processes that need their XIDs
status update. We do this if the clog page where we need to update the
current transaction status is the same as the group leader's clog page,
otherwise, we allow the caller to clear it by itself. Now, when we can't
add ourselves to any group, we were not clearing the current proc if it
has already become a member of some group which was leading to an
assertion failure when the same proc was assigned to another backend after
the current backend exits.

Reported-by: Alexander Lakhin
Bug: 17072
Author: Amit Kapila
Tested-By: Alexander Lakhin
Backpatch-through: 11, where it was introduced
Discussion: https://postgr.es/m/17072-2f8764857ef2c92a@postgresql.org
2021-06-28 08:42:48 +05:30
Michael Paquier fd7bc10ab4 Add test for CREATE INDEX CONCURRENTLY with not-so-immutable predicate
83158f7 has improved index_set_state_flags() so as it is possible to use
transactional updates when updating pg_index state flags, but there was
not really a test case which stressed directly the possibility it fixed.
This commit adds such a test, using a predicate that looks valid in
appearance but calls a stable function.

Author: Andrey Lepikhov
Discussion: https://postgr.es/m/9b905019-5297-7372-0ad2-e1a4bb66a719@postgrespro.ru
Backpatch-through: 9.6
2021-06-28 11:17:12 +09:00
Michael Paquier acb60edf02 Make index_set_state_flags() transactional
3c84046 is the original commit that introduced index_set_state_flags(),
where the presence of SnapshotNow made necessary the use of an in-place
update.  SnapshotNow has been removed in 813fb03, so there is no actual
reasons to not make this operation transactional.

As reported by Andrey, it is possible to trigger the assertion of this
routine expecting no transactional updates when switching the pg_index
state flags, using a predicate mark as immutable but calling stable or
volatile functions.  83158f7 has been around for a couple of months on
HEAD now with no issues found related to it, so it looks safe enough for
a backpatch.

Reported-by: Andrey Lepikhov
Author: Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/20200903080440.GA8559@paquier.xyz
Discussion: https://postgr.es/m/9b905019-5297-7372-0ad2-e1a4bb66a719@postgrespro.ru
Backpatch-through: 9.6
2021-06-28 10:39:09 +09:00
Tom Lane 2d09448654 Remove memory leaks in isolationtester.
specscanner.l leaked a kilobyte of memory per token of the spec file.
Apparently somebody thought that the introductory code block would be
executed once; but it's once per yylex() call.

A couple of functions in isolationtester.c leaked small amounts of
memory due to not bothering to free one-time allocations.  Might
as well improve these so that valgrind gives this program a clean
bill of health.  Also get rid of an ugly static variable.

Coverity complained about one of the one-time leaks, which led me
to try valgrind'ing isolationtester, which led to discovery of the
larger leak.
2021-06-27 12:45:04 -04:00
Michael Paquier 88274a7a31 Remove non-existing variable reference in MSVC's Solution.pm
The version string is grabbed from PACKAGE_VERSION in pg_config.h in the
MSVC build since 8f4fb4c6, but an error message referenced a variable
that existed before that.  This had no consequences except if one messes
up enough with the version number of the build.

Author: Anton Voloshin
Discussion: https://postgr.es/m/af79ee1b-9962-b299-98e1-f90a289e19e6@postgrespro.ru
Backpatch-through: 13
2021-06-26 13:52:54 +09:00
Michael Paquier 0455b7ccce Remove some useless logs from the TAP tests of pgbench
002_pgbench_no_server was printing some array pointers instead of the
actual contents of those arrays for the expected outputs of stdout and
stderr for a tested command.  This does not add any new information that
can help with debugging as the test names allow to track failure
locations, if any.

This commit simply removes those logs as the rest of the printed
information is redundant with command_checks_all().

Per discussion with Andrew Dunstan and Álvaro Herrera.

Discussion: https://postgr.es/m/YNXNFaG7IgkzZanD@paquier.xyz
Backpatch-through: 11
2021-06-26 12:40:03 +09:00
Tom Lane ba815f00a0 Remove unnecessary failure cases in RemoveRoleFromObjectPolicy().
It's not really necessary for this function to open or lock the
relation associated with the pg_policy entry it's modifying.  The
error checks it's making on the rel are if anything counterproductive
(e.g., if we don't want to allow installation of policies on system
catalogs, here is not the place to prevent that).  In particular, it
seems just wrong to insist on an ownership check.  That has the net
effect of forcing people to use superuser for DROP OWNED BY, which
surely is not an effect we want.  Also there is no point in rebuilding
the dependencies of the policy expressions, which aren't being
changed.  Lastly, locking the table also seems counterproductive; it's
not helping to prevent race conditions, since we failed to re-read the
pg_policy row after acquiring the lock.  That means that concurrent
DDL would likely result in "tuple concurrently updated/deleted"
errors; which is the same behavior this code will produce, with less
overhead.

Per discussion of bug #17062.  Back-patch to all supported versions,
as the failure cases this eliminates seem just as undesirable in 9.6
as in HEAD.

Discussion: https://postgr.es/m/1573181.1624220108@sss.pgh.pa.us
2021-06-25 13:59:38 -04:00
Tom Lane 4a20de9d9e Make walsenders show their replication commands in pg_stat_activity.
A walsender process that has executed a SQL command left the text of
that command in pg_stat_activity.query indefinitely, which is quite
confusing if it's in RUNNING state but not doing that query.  An easy
and useful fix is to treat replication commands as if they were SQL
queries, and show them in pg_stat_activity according to the same rules
as for regular queries.  While we're at it, it seems also sensible to
set debug_query_string, allowing error logging and debugging to see
the replication command.

While here, clean up assorted silliness in exec_replication_command:
* Clean up SQLCmd code path, and fix its only-accidentally-not-buggy
  memory management.
* Remove useless duplicate call of SnapBuildClearExportedSnapshot().
* replication_scanner_finish() was never called.

Back-patch of commit f560209c6 into v10-v13.  I'd originally felt
that this didn't merit back-patching, but subsequent confusion
while debugging walsender problems suggests that it'll be useful.
Also, the original commit has now aged long enough to provide some
comfort that it won't cause problems.

Discussion: https://postgr.es/m/2673480.1624557299@sss.pgh.pa.us
Discussion: https://postgr.es/m/880181.1600026471@sss.pgh.pa.us
2021-06-25 10:46:10 -04:00
Michael Paquier af2e67b47a Cleanup some code related to pgbench log checks in TAP tests
This fixes a couple of problems within the so-said code of this commit
subject:
- Replace the use of open() with slurp_file(), fixing an issue reported
by buildfarm member fairywren whose perl installation keep around CRLF
characters, causing the matching patterns for the logs to fail.
- Remove the eval block, which is not really necessary.

This set of issues has come into light after fixing a different issue
with c13585fe, and this is wrong since this code has been introduced.

Reported-by: Andrew Dunstan, and buildfarm member fairywren
Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/0f49303e-7784-b3ee-200b-cbf67be2eb9e@dunslane.net
Backpatch-through: 11
2021-06-25 20:15:31 +09:00
Peter Eisentraut 372a2806eb doc: Change reloption data type spelling for consistency
Use "floating point" rather than "float4", like everywhere else in
this context.

Author: Shinya11.Kato@nttdata.com
Discussion: https://www.postgresql.org/message-id/flat/TYAPR01MB28965989AF84B57FC351B97BC40F9@TYAPR01MB2896.jpnprd01.prod.outlook.com
2021-06-25 08:12:52 +02:00
Thomas Munro d9c05a9ec4 Prepare for forthcoming LLVM 13 API change.
LLVM 13 (due out in September) has changed the semantics of
LLVMOrcAbsoluteSymbols(), so we need to bump some reference counts to
avoid a double-free that causes crashes and bad query results.

A proactive change seems necessary to avoid having a window of time
where our respective latest releases would interact badly.  It's
possible that the situation could change before then, though.

Thanks to Fabien Coelho for monitoring bleeding edge LLVM and Andres
Freund for tracking down the change.

Back-patch to 11, where the JIT code arrived.

Discussion: https://postgr.es/m/CA%2BhUKGLEy8mgtN7BNp0ooFAjUedDTJj5dME7NxLU-m91b85siA%40mail.gmail.com
2021-06-25 11:29:00 +12:00
Michael Paquier 7a9eaf111a Fix pattern matching logic for logs in TAP tests of pgbench
The logic checking for the format of per-thread logs used grep() with
directly "$re", which would cause the test to consider all the logs as
a match without caring about their format at all.  Using "/$re/" makes
grep() perform a regex test, which is what we want here.

While on it, improve some of the tests to be more picky with the
patterns expected and add more comments to describe the tests.

Issue discovered while digging into a separate patch.

Author: Fabien Coelho, Michael Paquier
Discussion: https://postgr.es/m/YNPsPAUoVDCpPOGk@paquier.xyz
Backpatch-through: 11
2021-06-25 06:52:47 +09:00
Amit Kapila 56e366f675 Fix ABI break introduced by commit 4daa140a2f.
Move the newly defined enum value REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT
at the end to avoid ABI break in the back branches. We need to back-patch
this till v11 because before that it is already at the end.

Reported-by: Tomas Vondra
Backpatch-through: 11
Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com
2021-06-24 15:21:50 +05:30
Heikki Linnakangas 6fb377e5f7 Another fix to relmapper race condition.
In previous commit, I missed that relmap_redo() was also not acquiring the
RelationMappingLock. Thanks to Thomas Munro for pointing that out.

Backpatch-through: 9.6, like previous commit.
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGLev%3DPpOSaL3WRZgOvgk217et%2BbxeJcRr4eR-NttP1F6Q%40mail.gmail.com
2021-06-24 11:19:31 +03:00
Heikki Linnakangas 2a0ab13f8d Prevent race condition while reading relmapper file.
Contrary to the comment here, POSIX does not guarantee atomicity of a
read(), if another process calls write() concurrently. Or at least Linux
does not. Add locking to load_relmap_file() to avoid the race condition.

Fixes bug #17064. Thanks to Alexander Lakhin for the report and test case.

Backpatch-through: 9.6, all supported versions.
Discussion: https://www.postgresql.org/message-id/17064-bb0d7904ef72add3@postgresql.org
2021-06-24 10:45:34 +03:00
Amit Kapila 7a4ecefe9d Doc: Update caveats in synchronous logical replication.
Reported-by: Simon Riggs
Author: Takamichi Osumi
Reviewed-by: Amit Kapila
Backpatch-through: 9.6
Discussion: https://www.postgresql.org/message-id/20210222222847.tpnb6eg3yiykzpky@alap3.anarazel.de
2021-06-24 09:31:51 +05:30
Tom Lane 5179a1ab77 Allow non-quoted identifiers as isolation test session/step names.
For no obvious reason, isolationtester has always insisted that
session and step names be written with double quotes.  This is
fairly tedious and does little for test readability, especially
since the names that people actually choose almost always look
like normal identifiers.  Hence, let's tweak the lexer to allow
SQL-like identifiers not only double-quoted strings.

(They're SQL-like, not exactly SQL, because I didn't add any
case-folding logic.  Also there's no provision for U&"..." names,
not that anyone's likely to care.)

There is one incompatibility introduced by this change: if you write
"foo""bar" with no space, that used to be taken as two identifiers,
but now it's just one identifier with an embedded quote mark.

I converted all the src/test/isolation/ specfiles to remove
unnecessary double quotes, but stopped there because my
eyes were glazing over already.

Like 741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.

Discussion: https://postgr.es/m/759113.1623861959@sss.pgh.pa.us
2021-06-23 18:41:39 -04:00
Tom Lane 9aa06956ea Doc: fix confusion about LEAKPROOF in syntax summaries.
The syntax summaries for CREATE FUNCTION and allied commands
made it look like LEAKPROOF is an alternative to
IMMUTABLE/STABLE/VOLATILE, when of course it is an orthogonal
option.  Improve that.

Per gripe from aazamrafeeque0.  Thanks to David Johnston for
suggestions.

Discussion: https://postgr.es/m/162444349581.694.5818572718530259025@wrigleys.postgresql.org
2021-06-23 14:27:13 -04:00
Tom Lane 13f3655683 Don't assume GSSAPI result strings are null-terminated.
Our uses of gss_display_status() and gss_display_name() assumed
that the gss_buffer_desc strings returned by those functions are
null-terminated.  It appears that they generally are, given the
lack of field complaints up to now.  However, the available
documentation does not promise this, and some man pages
for gss_display_status() show examples that rely on the
gss_buffer_desc.length field instead of expecting null
termination.  Also, we now have a report that on some
implementations, clang's address sanitizer is of the opinion
that the byte after the specified length is undefined.

Hence, change the code to rely on the length field instead.

This might well be cosmetic rather than fixing any real bug, but
it's hard to be sure, so back-patch to all supported branches.
While here, also back-patch the v12 changes that made pg_GSS_error
deal honestly with multiple messages available from
gss_display_status.

Per report from Sudheer H R.

Discussion: https://postgr.es/m/5372B6D4-8276-42C0-B8FB-BD0918826FC3@tekenlight.com
2021-06-23 14:01:32 -04:00
Tom Lane b961bdfe16 Improve display of query results in isolation tests.
Previously, isolationtester displayed SQL query results using some
ad-hoc code that clearly hadn't had much effort expended on it.
Field values longer than 14 characters weren't separated from
the next field, and usually caused misalignment of the columns
too.  Also there was no visual separation of a query's result
from subsequent isolationtester output.  This made test result
files confusing and hard to read.

To improve matters, let's use libpq's PQprint() function.  Although
that's long since unused by psql, it's still plenty good enough
for the purpose here.

Like 741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.

Discussion: https://postgr.es/m/582362.1623798221@sss.pgh.pa.us
2021-06-23 11:12:31 -04:00
Tom Lane e2cde85ef2 Use annotations to reduce instability of isolation-test results.
We've long contended with isolation test results that aren't entirely
stable.  Some test scripts insert long delays to try to force stable
results, which is not terribly desirable; but other erratic failure
modes remain, causing unrepeatable buildfarm failures.  I've spent a
fair amount of time trying to solve this by improving the server-side
support code, without much success: that way is fundamentally unable
to cope with diffs that stem from chance ordering of arrival of
messages from different server processes.

We can improve matters on the client side, however, by annotating
the test scripts themselves to show the desired reporting order
of events that might occur in different orders.  This patch adds
three types of annotations to deal with (a) test steps that might or
might not complete their waits before the isolationtester can see them
waiting; (b) test steps in different sessions that can legitimately
complete in either order; and (c) NOTIFY messages that might arrive
before or after the completion of a step in another session.  We might
need more annotation types later, but this seems to be enough to deal
with the instabilities we've seen in the buildfarm.  It also lets us
get rid of all the long delays that were previously used, cutting more
than a minute off the runtime of the isolation tests.

Back-patch to all supported branches, because the buildfarm
instabilities affect all the branches, and because it seems desirable
to keep isolationtester's capabilities the same across all branches
to simplify possible future back-patching of tests.

Discussion: https://postgr.es/m/327948.1623725828@sss.pgh.pa.us
2021-06-22 21:43:12 -04:00
Tom Lane 6f1321d5ae Restore the portal-level snapshot for simple expressions, too.
Commits 84f5c2908 et al missed the need to cover plpgsql's "simple
expression" code path.  If the first thing we execute after a
COMMIT/ROLLBACK is one of those, rather than a full-fledged SPI command,
we must explicitly do EnsurePortalSnapshotExists() to make sure we have
an outer snapshot.  Note that it wouldn't be good enough to just push a
snapshot for the duration of the expression execution: what comes back
might be toasted, so we'd better have a snapshot protecting it.

The test case demonstrating this fact cheats a bit by marking a SQL
function immutable even though it fetches from a table.  That's
nothing that users haven't been seen to do, though.

Per report from Jim Nasby.  Back-patch to v11, like the previous fix.

Discussion: https://postgr.es/m/378885e4-f85f-fc28-6c91-c4d1c080bf26@amazon.com
2021-06-22 17:48:39 -04:00
Thomas Munro 6dcb185bfc Back-patch "Tolerate version lookup failure for old style Windows locale names."
If users provide old style pre-standardized Windows locale names in a
CREATE COLLATION command, the OS is unable to provide version
information.  Continue without capturing version information, rather
than exposing an OS error.

This was originally done in commit 9f12a3b9 for 14 only, to support
future features that might encounter old style names from initdb's
default.  It wasn't done in 13 because I didn't consider that users
might actually want to use the old format explicitly (something we
should consider blocking in a future release with a better error
message, but that's not a policy we've decided on yet).

Back-patch to 13, based on the field complaint in pgsql-bugs #17058.

Reported-by: Yasushi Yamashita <developer@yamashi-ta.jp>
Discussion: https://postgr.es/m/17058-b49f5793c912c5aa%40postgresql.org
2021-06-22 13:21:25 +12:00
Tom Lane 33af10c598 Fix misbehavior of DROP OWNED BY with duplicate polroles entries.
Ordinarily, a pg_policy.polroles array wouldn't list the same role
more than once; but CREATE POLICY does not prevent that.  If we
perform DROP OWNED BY on a role that is listed more than once,
RemoveRoleFromObjectPolicy either suffered an assertion failure
or encountered a tuple-updated-by-self error.  Rewrite it to cope
correctly with duplicate entries, and add a CommandCounterIncrement
call to prevent the other problem.

Per discussion, there's other cleanup that ought to happen here,
but this seems like the minimum essential fix.

Per bug #17062 from Alexander Lakhin.  It's been broken all along,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/17062-11f471ae3199ca23@postgresql.org
2021-06-18 18:00:09 -04:00
Tom Lane 102f31a208 Avoid scribbling on input node tree in CREATE/ALTER DOMAIN.
This works fine in the "simple Query" code path; but if the
statement is in the plan cache then it's corrupted for future
re-execution.  Apply copyObject() to protect the original
tree from modification, as we've done elsewhere.

This narrow fix is applied only to the back branches.  In HEAD,
the problem was fixed more generally by commit 7c337b6b5; but
that changed ProcessUtility's API, so it's infeasible to
back-patch.

Per bug #17053 from Charles Samborski.

Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us
Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org
2021-06-18 12:09:22 -04:00
Andrew Dunstan 5b6b5e5ee5
Don't set a fast default for anything but a plain table
The fast default code added in Release 11 omitted to check that the
table a fast default was being added to was a plain table. Thus one
could be added to a foreign table, which predicably blows up. Here we
perform that check.

In addition, on the back branches, since some of these might have
escaped into the wild, if we encounter a missing value for
an attribute of something other than a plain table we ignore it.

Fixes bug #17056

Backpatch to release 11,

Reviewed by: Andres Freund, Álvaro Herrera and Tom Lane
2021-06-18 07:44:58 -04:00
Amit Kapila 357cb8f07f Fix valgrind issue in pgoutput.c.
We use a tuple conversion map for partitions when replicating using an
ancestor's schema to convert tuples from partition's type to the
ancestor's. Before this map got initialized, we were processing
invalidation messages which access this map.

This issue happens only in version 13 as in HEAD we already have a code
that initializes each relation entry before we can process any
invalidation message. This issue is introduced by commit d250568121 in
version 13.

Reported-by: Tom Lane, as per buildfarm meber skink
Author: Amit Langote
Reviewed-by: Dilip Kumar, Amit Kapila
Discussion: https://www.postgresql.org/message-id/648020.1623854904@sss.pgh.pa.us
2021-06-18 08:51:18 +05:30
Heikki Linnakangas 42f782be91 Tidy up GetMultiXactIdMembers()'s behavior on error
One of the error paths left *members uninitialized. That's not a live
bug, because most callers don't look at *members when the function
returns -1, but let's be tidy. One caller, in heap_lock_tuple(), does
"if (members != NULL) pfree(members)", but AFAICS it never passes an
invalid 'multi' value so it should not reach that error case.

The callers are also a bit inconsistent in their expectations.
heap_lock_tuple() pfrees the 'members' array if it's not-NULL, others
pfree() it if "nmembers >= 0", and others if "nmembers > 0". That's
not a live bug either, because the function should never return 0, but
add an Assert for that to make it more clear. I left the callers alone
for now.

I also moved the line where we set *nmembers. It wasn't wrong before,
but I like to do that right next to the 'return' statement, to make it
clear that it's always set on return.

Also remove one unreachable return statement after ereport(ERROR), for
brevity and for consistency with the similar if-block right after it.

Author: Greg Nancarrow with the additional changes by me
Backpatch-through: 9.6, all supported versions
2021-06-17 14:52:45 +03:00
Peter Eisentraut 3989f8fb9d Fix subtransaction test for Python 3.10
Starting with Python 3.10, the stacktrace looks differently:
  -  PL/Python function "subtransaction_exit_subtransaction_in_with", line 3, in <module>
  -    s.__exit__(None, None, None)
  +  PL/Python function "subtransaction_exit_subtransaction_in_with", line 2, in <module>
  +    with plpy.subtransaction() as s:
Using try/except specifically makes the error look always the same.

(See https://github.com/python/cpython/pull/25719 for the discussion
of this change in Python.)

Author: Honza Horak <hhorak@redhat.com>
Discussion: https://www.postgresql.org/message-id/flat/853083.1620749597%40sss.pgh.pa.us
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1959080
2021-06-17 09:01:22 +02:00
Amit Kapila 9f7bba2629 Document a few caveats in synchronous logical replication.
In a synchronous logical setup, locking [user] catalog tables can cause
deadlock. This is because logical decoding of transactions can lock
catalog tables to access them so exclusively locking those in transactions
can lead to deadlock. To avoid this users must refrain from having
exclusive locks on catalog tables.

Author: Takamichi Osumi
Reviewed-by: Vignesh C, Amit Kapila
Backpatch-through: 9.6
Discussion: https://www.postgresql.org/message-id/20210222222847.tpnb6eg3yiykzpky%40alap3.anarazel.de
2021-06-17 10:52:01 +05:30
Tom Lane d03a41d1c8 Fix plancache refcount leak after error in ExecuteQuery.
When stuffing a plan from the plancache into a Portal, one is
not supposed to risk throwing an error between GetCachedPlan and
PortalDefineQuery; if that happens, the plan refcount incremented
by GetCachedPlan will be leaked.  I managed to break this rule
while refactoring code in 9dbf2b7d7.  There is no visible
consequence other than some memory leakage, and since nobody is
very likely to trigger the relevant error conditions many times
in a row, it's not surprising we haven't noticed.  Nonetheless,
it's a bug, so rearrange the order of operations to remove the
hazard.

Noted on the way to looking for a better fix for bug #17053.
This mistake is pretty old, so back-patch to all supported
branches.
2021-06-16 19:30:17 -04:00
Heikki Linnakangas e89a8e30e0 Fix outdated comment that talked about seek position of WAL file.
Since commit c24dcd0cfd, we have been using pg_pread() to read the WAL
file, which doesn't change the seek position (unless we fall back to
the implementation in src/port/pread.c). Update comment accordingly.

Backpatch-through: 12, where we started to use pg_pread()
2021-06-16 12:45:53 +03:00
Andrew Dunstan d906d106f8
Further refinement of stuck_on_old_timeline recovery test
TestLib::perl2host can take a file argument as well as a directory
argument, so that code becomes substantially simpler. Also add comments
on why we're using forward slashes, and why we're setting
PERL_BADLANG=0.

Discussion: https://postgr.es/m/e9947bcd-20ee-027c-f0fe-01f736b7e345@dunslane.net
2021-06-15 15:36:50 -04:00
Amit Kapila 602a32a685 Fix decoding of speculative aborts.
During decoding for speculative inserts, we were relying for cleaning
toast hash on confirmation records or next change records. But that
could lead to multiple problems (a) memory leak if there is neither a
confirmation record nor any other record after toast insertion for a
speculative insert in the transaction, (b) error and assertion failures
if the next operation is not an insert/update on the same table.

The fix is to start queuing spec abort change and clean up toast hash
and change record during its processing. Currently, we are queuing the
spec aborts for both toast and main table even though we perform cleanup
while processing the main table's spec abort record. Later, if we have a
way to distinguish between the spec abort record of toast and the main
table, we can avoid queuing the change for spec aborts of toast tables.

Reported-by: Ashutosh Bapat
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 9.6, where it was introduced
Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com
2021-06-15 08:41:16 +05:30
Tom Lane 2da5ddf834 Update variant expected-result file.
This should have been updated in d2d8a229b, but it was overlooked.
According to 31a877f18 which added it, this file is meant to show the
results you get under default_transaction_isolation = serializable.
We've largely lost track of that goal in other isolation tests, but
as long as we've got this one, it should be right.

Noted while fooling about with the isolationtester.
2021-06-14 21:58:26 -04:00
Tom Lane 0483ae7c51 Remove orphaned expected-result file.
This should have been removed in 43e084197, which removed the
corresponding spec file.  Noted while fooling about with the
isolationtester.
2021-06-14 21:28:21 -04:00
Tom Lane bc7885b7f5 Work around portability issue with newer versions of mktime().
Recent glibc versions have made mktime() fail if tm_isdst is
inconsistent with the prevailing timezone; in particular it fails for
tm_isdst = 1 when the zone is UTC.  (This seems wildly inconsistent
with the POSIX-mandated treatment of "incorrect" values for the other
fields of struct tm, so if you ask me it's a bug, but I bet they'll
say it's intentional.)  This has been observed to cause cosmetic
problems when pg_restore'ing an archive created in a different
timezone.

To fix, do mktime() using the field values from the archive, and if
that fails try again with tm_isdst = -1.  This will give a result
that's off by the UTC-offset difference from the original zone, but
that was true before, too.  It's not terribly critical since we don't
do anything with the result except possibly print it.  (Someday we
should flush this entire bit of logic and record a standard-format
timestamp in the archive instead.  That's not okay for a back-patched
bug fix, though.)

Also, guard our only other use of mktime() by having initdb's
build_time_t() set tm_isdst = -1 not 0.  This case could only have
an issue in zones that are DST year-round; but I think some do exist,
or could in future.

Per report from Wells Oliver.  Back-patch to all supported
versions, since any of them might need to run with a newer glibc.

Discussion: https://postgr.es/m/CAOC+FBWDhDHO7G-i1_n_hjRzCnUeFO+H-Czi1y10mFhRWpBrew@mail.gmail.com
2021-06-13 14:32:42 -04:00
Andrew Dunstan 47d5781cb1
Further tweaks to stuck_on_old_timeline recovery test
Translate path slashes on target directory path. This was confusing old
branches, but is applied to all branches for the sake of uniformity.
Perl is perfectly able to understand paths with forward slashes.

Along the way, restore the previous archive_wait query, for the sake of
uniformity with other tests, per gripe from Tom Lane.
2021-06-13 07:19:34 -04:00
Michael Paquier 545aa6a593 Ignore more environment variables in pg_regress.c
This is similar to the work done in 8279f68 for TestLib.pm, where
environment variables set may cause unwanted failures if using a
temporary installation with pg_regress.  The list of variables reset is
adjusted in each stable branch depending on what is supported.

Comments are added to remember that the lists in TestLib.pm and
pg_regress.c had better be kept in sync.

Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/YMNR9GYDn+fHlMta@paquier.xyz
Backpatch-through: 9.6
2021-06-13 20:07:45 +09:00
Tom Lane 94e21021cd Restore robustness of TAP tests that wait for postmaster restart.
Several TAP tests use poll_query_until() to wait for the postmaster
to restart.  They were checking to see if a trivial query
(e.g. "SELECT 1") succeeds.  However, that's problematic in the wake
of commit 11e9caff8, because now that we feed said query to psql
via stdin, we risk IPC::Run whining about a SIGPIPE failure if psql
quits before reading the query.  Hence, we can't use a nonempty
query in cases where we need to wait for connection failures to
stop happening.

Per the precedent of commits c757a3da0 and 6d41dd045, we can pass
"undef" as the query in such cases to ensure that IPC::Run has
nothing to write.  However, then we have to say that the expected
output is empty, and this exposes a deficiency in poll_query_until:
if psql fails altogether and returns empty stdout, poll_query_until
will treat that as a success!  That's because, contrary to its
documentation, it makes no actual check for psql failure, looking
neither at the exit status nor at stderr.

To fix that, adjust poll_query_until to insist on empty stderr as
well as a stdout match.  (I experimented with checking exit status
instead, but it seems that psql often does exit(1) in cases that we
need to consider successes.  That might be something to fix someday,
but it would be a non-back-patchable behavior change.)

Back-patch to v10.  The test cases needing this exist only as far
back as v11, but it seems wise to keep poll_query_until's behavior
the same in v10, in case we back-patch another such test case in
future.  (9.6 does not currently need this change, because in that
branch poll_query_until can't be told to accept empty stdout as
a success case.)

Per assorted buildfarm failures, mostly on hoverfly.

Discussion: https://postgr.es/m/CAA4eK1+zM6L4QSA1XMvXY_qqWwdUmqkOS1+hWvL8QcYEBGA1Uw@mail.gmail.com
2021-06-12 15:12:10 -04:00
Tom Lane f479ea94bd Ensure pg_filenode_relation(0, 0) returns NULL.
Previously, a zero value for the relfilenode resulted in
a confusing error message about "unexpected duplicate".
This function returns NULL for other invalid relfilenode
values, so zero should be treated likewise.

It's been like this all along, so back-patch to all supported
branches.

Justin Pryzby

Discussion: https://postgr.es/m/20210612023324.GT16435@telsasoft.com
2021-06-12 13:29:24 -04:00
Tom Lane 8b5055812c Don't use Asserts to check for violations of replication protocol.
Using an Assert to check the validity of incoming messages is an
extremely poor decision.  In a debug build, it should not be that easy
for a broken or malicious remote client to crash the logrep worker.
The consequences could be even worse in non-debug builds, which will
fail to make such checks at all, leading to who-knows-what misbehavior.
Hence, promote every Assert that could possibly be triggered by wrong
or out-of-order replication messages to a full test-and-ereport.

To avoid bloating the set of messages the translation team has to cope
with, establish a policy that replication protocol violation error
reports don't need to be translated.  Hence, all the new messages here
use errmsg_internal().  A couple of old messages are changed likewise
for consistency.

Along the way, fix some non-idiomatic or outright wrong uses of
hash_search().

Most of these mistakes are new with the "streaming replication"
patch (commit 464824323), but a couple go back a long way.
Back-patch as appropriate.

Discussion: https://postgr.es/m/1719083.1623351052@sss.pgh.pa.us
2021-06-12 12:59:15 -04:00