Commit Graph

47005 Commits

Author SHA1 Message Date
Andrew Dunstan 58ac8300bd
Fix prove_installcheck to use correct paths when used with PGXS
The prove_installcheck recipe in src/Makefile.global.in was emitting
bogus paths for a couple of elements when used with PGXS. Here we create
a separate recipe for the PGXS case that does it correctly. We also take
the opportunity to make the make the file more readable by breaking up
the prove_installcheck and prove_check recipes across several lines, and
to remove the setting for REGRESS_SHLIB to src/test/recovery/Makefile,
which is the only set of tests that actually need it.

Backpatch to all live branches

Discussion: https://postgr.es/m/f2401388-936b-f4ef-a07c-a0bcc49b3300@dunslane.net
2021-07-01 08:47:21 -04:00
Michael Paquier 93d3d0cf30 Fix incorrect PITR message for transaction ROLLBACK PREPARED
Reaching PITR on such a transaction would cause the generation of a LOG
message mentioning a transaction committed, not aborted.

Oversight in 4f1b890.

Author: Simon Riggs
Discussion: https://postgr.es/m/CANbhV-GJ6KijeCgdOrxqMCQ+C8QiK657EMhCy4csjrPcEUFv_Q@mail.gmail.com
Backpatch-through: 9.6
2021-06-30 11:49:24 +09:00
Tom Lane 34c24e5a43 Don't use abort(3) in libpq's fe-print.c.
Causing a core dump on out-of-memory seems pretty unfriendly,
and surely is far outside the expected behavior of a general-purpose
library.  Just print an error message (as we did already) and return.
These functions unfortunately don't have an error return convention,
but code using them is probably just looking for a quick-n-dirty
print method and wouldn't bother to check anyway.

Although these functions are semi-deprecated, it still seems
appropriate to back-patch this.  In passing, also back-patch
b90e6cef1, just to reduce cosmetic differences between the
branches.

Discussion: https://postgr.es/m/3122443.1624735363@sss.pgh.pa.us
2021-06-28 14:17:42 -04:00
Amit Kapila c62c3769ff Fix race condition in TransactionGroupUpdateXidStatus().
When we cannot immediately acquire CLogControlLock in exclusive mode at
commit time, we add ourselves to a list of processes that need their XIDs
status update. We do this if the clog page where we need to update the
current transaction status is the same as the group leader's clog page,
otherwise, we allow the caller to clear it by itself. Now, when we can't
add ourselves to any group, we were not clearing the current proc if it
has already become a member of some group which was leading to an
assertion failure when the same proc was assigned to another backend after
the current backend exits.

Reported-by: Alexander Lakhin
Bug: 17072
Author: Amit Kapila
Tested-By: Alexander Lakhin
Backpatch-through: 11, where it was introduced
Discussion: https://postgr.es/m/17072-2f8764857ef2c92a@postgresql.org
2021-06-28 09:09:42 +05:30
Michael Paquier 38ca11adeb Add test for CREATE INDEX CONCURRENTLY with not-so-immutable predicate
83158f7 has improved index_set_state_flags() so as it is possible to use
transactional updates when updating pg_index state flags, but there was
not really a test case which stressed directly the possibility it fixed.
This commit adds such a test, using a predicate that looks valid in
appearance but calls a stable function.

Author: Andrey Lepikhov
Discussion: https://postgr.es/m/9b905019-5297-7372-0ad2-e1a4bb66a719@postgrespro.ru
Backpatch-through: 9.6
2021-06-28 11:17:20 +09:00
Michael Paquier 08acba558b Make index_set_state_flags() transactional
3c84046 is the original commit that introduced index_set_state_flags(),
where the presence of SnapshotNow made necessary the use of an in-place
update.  SnapshotNow has been removed in 813fb03, so there is no actual
reasons to not make this operation transactional.

As reported by Andrey, it is possible to trigger the assertion of this
routine expecting no transactional updates when switching the pg_index
state flags, using a predicate mark as immutable but calling stable or
volatile functions.  83158f7 has been around for a couple of months on
HEAD now with no issues found related to it, so it looks safe enough for
a backpatch.

Reported-by: Andrey Lepikhov
Author: Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/20200903080440.GA8559@paquier.xyz
Discussion: https://postgr.es/m/9b905019-5297-7372-0ad2-e1a4bb66a719@postgrespro.ru
Backpatch-through: 9.6
2021-06-28 10:43:04 +09:00
Tom Lane 1acab12096 Remove memory leaks in isolationtester.
specscanner.l leaked a kilobyte of memory per token of the spec file.
Apparently somebody thought that the introductory code block would be
executed once; but it's once per yylex() call.

A couple of functions in isolationtester.c leaked small amounts of
memory due to not bothering to free one-time allocations.  Might
as well improve these so that valgrind gives this program a clean
bill of health.  Also get rid of an ugly static variable.

Coverity complained about one of the one-time leaks, which led me
to try valgrind'ing isolationtester, which led to discovery of the
larger leak.
2021-06-27 12:45:04 -04:00
Michael Paquier dcb0e243dd Remove some useless logs from the TAP tests of pgbench
002_pgbench_no_server was printing some array pointers instead of the
actual contents of those arrays for the expected outputs of stdout and
stderr for a tested command.  This does not add any new information that
can help with debugging as the test names allow to track failure
locations, if any.

This commit simply removes those logs as the rest of the printed
information is redundant with command_checks_all().

Per discussion with Andrew Dunstan and Álvaro Herrera.

Discussion: https://postgr.es/m/YNXNFaG7IgkzZanD@paquier.xyz
Backpatch-through: 11
2021-06-26 12:40:12 +09:00
Tom Lane fea89d64e8 Remove unnecessary failure cases in RemoveRoleFromObjectPolicy().
It's not really necessary for this function to open or lock the
relation associated with the pg_policy entry it's modifying.  The
error checks it's making on the rel are if anything counterproductive
(e.g., if we don't want to allow installation of policies on system
catalogs, here is not the place to prevent that).  In particular, it
seems just wrong to insist on an ownership check.  That has the net
effect of forcing people to use superuser for DROP OWNED BY, which
surely is not an effect we want.  Also there is no point in rebuilding
the dependencies of the policy expressions, which aren't being
changed.  Lastly, locking the table also seems counterproductive; it's
not helping to prevent race conditions, since we failed to re-read the
pg_policy row after acquiring the lock.  That means that concurrent
DDL would likely result in "tuple concurrently updated/deleted"
errors; which is the same behavior this code will produce, with less
overhead.

Per discussion of bug #17062.  Back-patch to all supported versions,
as the failure cases this eliminates seem just as undesirable in 9.6
as in HEAD.

Discussion: https://postgr.es/m/1573181.1624220108@sss.pgh.pa.us
2021-06-25 13:59:38 -04:00
Tom Lane c39983600b Make walsenders show their replication commands in pg_stat_activity.
A walsender process that has executed a SQL command left the text of
that command in pg_stat_activity.query indefinitely, which is quite
confusing if it's in RUNNING state but not doing that query.  An easy
and useful fix is to treat replication commands as if they were SQL
queries, and show them in pg_stat_activity according to the same rules
as for regular queries.  While we're at it, it seems also sensible to
set debug_query_string, allowing error logging and debugging to see
the replication command.

While here, clean up assorted silliness in exec_replication_command:
* Clean up SQLCmd code path, and fix its only-accidentally-not-buggy
  memory management.
* Remove useless duplicate call of SnapBuildClearExportedSnapshot().
* replication_scanner_finish() was never called.

Back-patch of commit f560209c6 into v10-v13.  I'd originally felt
that this didn't merit back-patching, but subsequent confusion
while debugging walsender problems suggests that it'll be useful.
Also, the original commit has now aged long enough to provide some
comfort that it won't cause problems.

Discussion: https://postgr.es/m/2673480.1624557299@sss.pgh.pa.us
Discussion: https://postgr.es/m/880181.1600026471@sss.pgh.pa.us
2021-06-25 10:46:10 -04:00
Michael Paquier c4c9c77e56 Cleanup some code related to pgbench log checks in TAP tests
This fixes a couple of problems within the so-said code of this commit
subject:
- Replace the use of open() with slurp_file(), fixing an issue reported
by buildfarm member fairywren whose perl installation keep around CRLF
characters, causing the matching patterns for the logs to fail.
- Remove the eval block, which is not really necessary.

This set of issues has come into light after fixing a different issue
with c13585fe, and this is wrong since this code has been introduced.

Reported-by: Andrew Dunstan, and buildfarm member fairywren
Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/0f49303e-7784-b3ee-200b-cbf67be2eb9e@dunslane.net
Backpatch-through: 11
2021-06-25 20:15:39 +09:00
Thomas Munro 6ada4fd066 Prepare for forthcoming LLVM 13 API change.
LLVM 13 (due out in September) has changed the semantics of
LLVMOrcAbsoluteSymbols(), so we need to bump some reference counts to
avoid a double-free that causes crashes and bad query results.

A proactive change seems necessary to avoid having a window of time
where our respective latest releases would interact badly.  It's
possible that the situation could change before then, though.

Thanks to Fabien Coelho for monitoring bleeding edge LLVM and Andres
Freund for tracking down the change.

Back-patch to 11, where the JIT code arrived.

Discussion: https://postgr.es/m/CA%2BhUKGLEy8mgtN7BNp0ooFAjUedDTJj5dME7NxLU-m91b85siA%40mail.gmail.com
2021-06-25 11:29:47 +12:00
Michael Paquier 0efd2a1a66 Fix pattern matching logic for logs in TAP tests of pgbench
The logic checking for the format of per-thread logs used grep() with
directly "$re", which would cause the test to consider all the logs as
a match without caring about their format at all.  Using "/$re/" makes
grep() perform a regex test, which is what we want here.

While on it, improve some of the tests to be more picky with the
patterns expected and add more comments to describe the tests.

Issue discovered while digging into a separate patch.

Author: Fabien Coelho, Michael Paquier
Discussion: https://postgr.es/m/YNPsPAUoVDCpPOGk@paquier.xyz
Backpatch-through: 11
2021-06-25 06:52:56 +09:00
Tom Lane c6cb62f613 Stabilize results of insert-conflict-toast.spec.
This back-branch test script was later absorbed into
insert-conflict-specconflict.spec, which required some stabilization
in commit 741d7f104, so perhaps it's not surprising that it needs a
bit of love too.

It's odd though that we hadn't seen it fail before now, because
I thought that 741d7f104 did not change isolationtester's timing
behavior for scripts without any annotation markers.  In any case,
this script is racy on its face, so add an annotation to force stable
reporting order.

Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2021-06-24%2009%3A54%3A56
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2021-06-24%2010%3A10%3A00
2021-06-24 11:30:32 -04:00
Amit Kapila e95f617acc Fix ABI break introduced by commit 4daa140a2f.
Move the newly defined enum value REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT
at the end to avoid ABI break in the back branches. We need to back-patch
this till v11 because before that it is already at the end.

Reported-by: Tomas Vondra
Backpatch-through: 11
Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com
2021-06-24 15:26:37 +05:30
Heikki Linnakangas eb3bd243a2 Another fix to relmapper race condition.
In previous commit, I missed that relmap_redo() was also not acquiring the
RelationMappingLock. Thanks to Thomas Munro for pointing that out.

Backpatch-through: 9.6, like previous commit.
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGLev%3DPpOSaL3WRZgOvgk217et%2BbxeJcRr4eR-NttP1F6Q%40mail.gmail.com
2021-06-24 11:19:37 +03:00
Heikki Linnakangas c78bb32c19 Prevent race condition while reading relmapper file.
Contrary to the comment here, POSIX does not guarantee atomicity of a
read(), if another process calls write() concurrently. Or at least Linux
does not. Add locking to load_relmap_file() to avoid the race condition.

Fixes bug #17064. Thanks to Alexander Lakhin for the report and test case.

Backpatch-through: 9.6, all supported versions.
Discussion: https://www.postgresql.org/message-id/17064-bb0d7904ef72add3@postgresql.org
2021-06-24 10:45:46 +03:00
Amit Kapila e00e5db22d Doc: Update caveats in synchronous logical replication.
Reported-by: Simon Riggs
Author: Takamichi Osumi
Reviewed-by: Amit Kapila
Backpatch-through: 9.6
Discussion: https://www.postgresql.org/message-id/20210222222847.tpnb6eg3yiykzpky@alap3.anarazel.de
2021-06-24 09:49:23 +05:30
Tom Lane 94d8d8d89f Allow non-quoted identifiers as isolation test session/step names.
For no obvious reason, isolationtester has always insisted that
session and step names be written with double quotes.  This is
fairly tedious and does little for test readability, especially
since the names that people actually choose almost always look
like normal identifiers.  Hence, let's tweak the lexer to allow
SQL-like identifiers not only double-quoted strings.

(They're SQL-like, not exactly SQL, because I didn't add any
case-folding logic.  Also there's no provision for U&"..." names,
not that anyone's likely to care.)

There is one incompatibility introduced by this change: if you write
"foo""bar" with no space, that used to be taken as two identifiers,
but now it's just one identifier with an embedded quote mark.

I converted all the src/test/isolation/ specfiles to remove
unnecessary double quotes, but stopped there because my
eyes were glazing over already.

Like 741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.

Discussion: https://postgr.es/m/759113.1623861959@sss.pgh.pa.us
2021-06-23 18:41:39 -04:00
Tom Lane 5c22cf0e73 Doc: fix confusion about LEAKPROOF in syntax summaries.
The syntax summaries for CREATE FUNCTION and allied commands
made it look like LEAKPROOF is an alternative to
IMMUTABLE/STABLE/VOLATILE, when of course it is an orthogonal
option.  Improve that.

Per gripe from aazamrafeeque0.  Thanks to David Johnston for
suggestions.

Discussion: https://postgr.es/m/162444349581.694.5818572718530259025@wrigleys.postgresql.org
2021-06-23 14:27:13 -04:00
Tom Lane 361acef7e4 Don't assume GSSAPI result strings are null-terminated.
Our uses of gss_display_status() and gss_display_name() assumed
that the gss_buffer_desc strings returned by those functions are
null-terminated.  It appears that they generally are, given the
lack of field complaints up to now.  However, the available
documentation does not promise this, and some man pages
for gss_display_status() show examples that rely on the
gss_buffer_desc.length field instead of expecting null
termination.  Also, we now have a report that on some
implementations, clang's address sanitizer is of the opinion
that the byte after the specified length is undefined.

Hence, change the code to rely on the length field instead.

This might well be cosmetic rather than fixing any real bug, but
it's hard to be sure, so back-patch to all supported branches.
While here, also back-patch the v12 changes that made pg_GSS_error
deal honestly with multiple messages available from
gss_display_status.

Per report from Sudheer H R.

Discussion: https://postgr.es/m/5372B6D4-8276-42C0-B8FB-BD0918826FC3@tekenlight.com
2021-06-23 14:01:32 -04:00
Tom Lane b1aa0f2284 Improve display of query results in isolation tests.
Previously, isolationtester displayed SQL query results using some
ad-hoc code that clearly hadn't had much effort expended on it.
Field values longer than 14 characters weren't separated from
the next field, and usually caused misalignment of the columns
too.  Also there was no visual separation of a query's result
from subsequent isolationtester output.  This made test result
files confusing and hard to read.

To improve matters, let's use libpq's PQprint() function.  Although
that's long since unused by psql, it's still plenty good enough
for the purpose here.

Like 741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.

Discussion: https://postgr.es/m/582362.1623798221@sss.pgh.pa.us
2021-06-23 11:12:31 -04:00
Tom Lane a1417e4378 Use annotations to reduce instability of isolation-test results.
We've long contended with isolation test results that aren't entirely
stable.  Some test scripts insert long delays to try to force stable
results, which is not terribly desirable; but other erratic failure
modes remain, causing unrepeatable buildfarm failures.  I've spent a
fair amount of time trying to solve this by improving the server-side
support code, without much success: that way is fundamentally unable
to cope with diffs that stem from chance ordering of arrival of
messages from different server processes.

We can improve matters on the client side, however, by annotating
the test scripts themselves to show the desired reporting order
of events that might occur in different orders.  This patch adds
three types of annotations to deal with (a) test steps that might or
might not complete their waits before the isolationtester can see them
waiting; (b) test steps in different sessions that can legitimately
complete in either order; and (c) NOTIFY messages that might arrive
before or after the completion of a step in another session.  We might
need more annotation types later, but this seems to be enough to deal
with the instabilities we've seen in the buildfarm.  It also lets us
get rid of all the long delays that were previously used, cutting more
than a minute off the runtime of the isolation tests.

Back-patch to all supported branches, because the buildfarm
instabilities affect all the branches, and because it seems desirable
to keep isolationtester's capabilities the same across all branches
to simplify possible future back-patching of tests.

Discussion: https://postgr.es/m/327948.1623725828@sss.pgh.pa.us
2021-06-22 21:43:12 -04:00
Tom Lane 77200c5692 Restore the portal-level snapshot for simple expressions, too.
Commits 84f5c2908 et al missed the need to cover plpgsql's "simple
expression" code path.  If the first thing we execute after a
COMMIT/ROLLBACK is one of those, rather than a full-fledged SPI command,
we must explicitly do EnsurePortalSnapshotExists() to make sure we have
an outer snapshot.  Note that it wouldn't be good enough to just push a
snapshot for the duration of the expression execution: what comes back
might be toasted, so we'd better have a snapshot protecting it.

The test case demonstrating this fact cheats a bit by marking a SQL
function immutable even though it fetches from a table.  That's
nothing that users haven't been seen to do, though.

Per report from Jim Nasby.  Back-patch to v11, like the previous fix.

Discussion: https://postgr.es/m/378885e4-f85f-fc28-6c91-c4d1c080bf26@amazon.com
2021-06-22 17:48:39 -04:00
Tom Lane ea5ae3ae1a Fix misbehavior of DROP OWNED BY with duplicate polroles entries.
Ordinarily, a pg_policy.polroles array wouldn't list the same role
more than once; but CREATE POLICY does not prevent that.  If we
perform DROP OWNED BY on a role that is listed more than once,
RemoveRoleFromObjectPolicy either suffered an assertion failure
or encountered a tuple-updated-by-self error.  Rewrite it to cope
correctly with duplicate entries, and add a CommandCounterIncrement
call to prevent the other problem.

Per discussion, there's other cleanup that ought to happen here,
but this seems like the minimum essential fix.

Per bug #17062 from Alexander Lakhin.  It's been broken all along,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/17062-11f471ae3199ca23@postgresql.org
2021-06-18 18:00:09 -04:00
Tom Lane 4b8b3562e1 Avoid scribbling on input node tree in CREATE/ALTER DOMAIN.
This works fine in the "simple Query" code path; but if the
statement is in the plan cache then it's corrupted for future
re-execution.  Apply copyObject() to protect the original
tree from modification, as we've done elsewhere.

This narrow fix is applied only to the back branches.  In HEAD,
the problem was fixed more generally by commit 7c337b6b5; but
that changed ProcessUtility's API, so it's infeasible to
back-patch.

Per bug #17053 from Charles Samborski.

Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us
Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org
2021-06-18 12:09:22 -04:00
Tom Lane 0d3b69ae00 s/table_close/heap_close/ in v11.
Back-patching thinko in 306c31804.  Per buildfarm.
2021-06-18 11:45:45 -04:00
Andrew Dunstan 306c318043
Don't set a fast default for anything but a plain table
The fast default code added in Release 11 omitted to check that the
table a fast default was being added to was a plain table. Thus one
could be added to a foreign table, which predicably blows up. Here we
perform that check.

In addition, on the back branches, since some of these might have
escaped into the wild, if we encounter a missing value for
an attribute of something other than a plain table we ignore it.

Fixes bug #17056

Backpatch to release 11,

Reviewed by: Andres Freund, Álvaro Herrera and Tom Lane
2021-06-18 07:53:08 -04:00
Peter Eisentraut ba529a6ff4 Update plpython_subtransaction alternative expected files
The original patch only targeted Python 2.6 and newer, since that is
what we have supported in PostgreSQL 13 and newer.  For older
branches, we need to fix it up for older Python versions.
2021-06-18 06:51:56 +02:00
Heikki Linnakangas 25c171f322 Tidy up GetMultiXactIdMembers()'s behavior on error
One of the error paths left *members uninitialized. That's not a live
bug, because most callers don't look at *members when the function
returns -1, but let's be tidy. One caller, in heap_lock_tuple(), does
"if (members != NULL) pfree(members)", but AFAICS it never passes an
invalid 'multi' value so it should not reach that error case.

The callers are also a bit inconsistent in their expectations.
heap_lock_tuple() pfrees the 'members' array if it's not-NULL, others
pfree() it if "nmembers >= 0", and others if "nmembers > 0". That's
not a live bug either, because the function should never return 0, but
add an Assert for that to make it more clear. I left the callers alone
for now.

I also moved the line where we set *nmembers. It wasn't wrong before,
but I like to do that right next to the 'return' statement, to make it
clear that it's always set on return.

Also remove one unreachable return statement after ereport(ERROR), for
brevity and for consistency with the similar if-block right after it.

Author: Greg Nancarrow with the additional changes by me
Backpatch-through: 9.6, all supported versions
2021-06-17 14:52:51 +03:00
Peter Eisentraut 1a2752be81 Fix subtransaction test for Python 3.10
Starting with Python 3.10, the stacktrace looks differently:
  -  PL/Python function "subtransaction_exit_subtransaction_in_with", line 3, in <module>
  -    s.__exit__(None, None, None)
  +  PL/Python function "subtransaction_exit_subtransaction_in_with", line 2, in <module>
  +    with plpy.subtransaction() as s:
Using try/except specifically makes the error look always the same.

(See https://github.com/python/cpython/pull/25719 for the discussion
of this change in Python.)

Author: Honza Horak <hhorak@redhat.com>
Discussion: https://www.postgresql.org/message-id/flat/853083.1620749597%40sss.pgh.pa.us
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1959080
2021-06-17 09:02:44 +02:00
Amit Kapila 5a456034b8 Document a few caveats in synchronous logical replication.
In a synchronous logical setup, locking [user] catalog tables can cause
deadlock. This is because logical decoding of transactions can lock
catalog tables to access them so exclusively locking those in transactions
can lead to deadlock. To avoid this users must refrain from having
exclusive locks on catalog tables.

Author: Takamichi Osumi
Reviewed-by: Vignesh C, Amit Kapila
Backpatch-through: 9.6
Discussion: https://www.postgresql.org/message-id/20210222222847.tpnb6eg3yiykzpky%40alap3.anarazel.de
2021-06-17 10:59:11 +05:30
Michael Paquier 8f32299424 Detect unused steps in isolation specs and do some cleanup
This is useful for developers to find out if an isolation spec is
over-engineered or if it needs more work by warning at the end of a
test run if a step is not used, generating a failure with extra diffs.

While on it, clean up all the specs which include steps not used in any
permutations to simplify them.

This is a backpatch of 989d23b and 06fdc4e, as it is becoming useful to
make all the branches consistent for an upcoming patch that will improve
the output generated by isolationtester.

Author: Michael Paquier
Reviewed-by: Asim Praveen, Melanie Plageman
Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz
Discussion: https://postgr.es/m/794820.1623872009@sss.pgh.pa.us
Backpatch-through: 9.6
2021-06-17 11:57:26 +09:00
Michael Paquier 834cb72691 Remove dry-run mode from isolationtester
The original purpose of the dry-run mode is to be able to print all the
possible permutations from a spec file, but it has become less useful
since isolation tests have improved regarding deadlock detection as one
step not wanted by the author could block indefinitely now (originally
the step blocked would have been detected rather quickly).  Per
discussion, let's remove it.

This is a backpatch of 9903338 for 9.6~12.  It is proving to become
useful to have on those branches so as the code gets consistent across
all supported versions, as a matter of improving the output generated by
isolationtester.

Author: Michael Paquier
Reviewed-by: Asim Praveen, Melanie Plageman
Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz
Discussion: https://postgr.es/m/794820.1623872009@sss.pgh.pa.us
Backpatch-through: 9.6
2021-06-17 11:01:20 +09:00
Tom Lane 9cf1632660 Fix plancache refcount leak after error in ExecuteQuery.
When stuffing a plan from the plancache into a Portal, one is
not supposed to risk throwing an error between GetCachedPlan and
PortalDefineQuery; if that happens, the plan refcount incremented
by GetCachedPlan will be leaked.  I managed to break this rule
while refactoring code in 9dbf2b7d7.  There is no visible
consequence other than some memory leakage, and since nobody is
very likely to trigger the relevant error conditions many times
in a row, it's not surprising we haven't noticed.  Nonetheless,
it's a bug, so rearrange the order of operations to remove the
hazard.

Noted on the way to looking for a better fix for bug #17053.
This mistake is pretty old, so back-patch to all supported
branches.
2021-06-16 19:30:17 -04:00
Andrew Dunstan c0a7587807
Further refinement of stuck_on_old_timeline recovery test
TestLib::perl2host can take a file argument as well as a directory
argument, so that code becomes substantially simpler. Also add comments
on why we're using forward slashes, and why we're setting
PERL_BADLANG=0.

Discussion: https://postgr.es/m/e9947bcd-20ee-027c-f0fe-01f736b7e345@dunslane.net
2021-06-15 15:37:07 -04:00
Amit Kapila 1f8a934e0a Fix decoding of speculative aborts.
During decoding for speculative inserts, we were relying for cleaning
toast hash on confirmation records or next change records. But that
could lead to multiple problems (a) memory leak if there is neither a
confirmation record nor any other record after toast insertion for a
speculative insert in the transaction, (b) error and assertion failures
if the next operation is not an insert/update on the same table.

The fix is to start queuing spec abort change and clean up toast hash
and change record during its processing. Currently, we are queuing the
spec aborts for both toast and main table even though we perform cleanup
while processing the main table's spec abort record. Later, if we have a
way to distinguish between the spec abort record of toast and the main
table, we can avoid queuing the change for spec aborts of toast tables.

Reported-by: Ashutosh Bapat
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 9.6, where it was introduced
Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com
2021-06-15 09:02:32 +05:30
Tom Lane 73fa762417 Work around portability issue with newer versions of mktime().
Recent glibc versions have made mktime() fail if tm_isdst is
inconsistent with the prevailing timezone; in particular it fails for
tm_isdst = 1 when the zone is UTC.  (This seems wildly inconsistent
with the POSIX-mandated treatment of "incorrect" values for the other
fields of struct tm, so if you ask me it's a bug, but I bet they'll
say it's intentional.)  This has been observed to cause cosmetic
problems when pg_restore'ing an archive created in a different
timezone.

To fix, do mktime() using the field values from the archive, and if
that fails try again with tm_isdst = -1.  This will give a result
that's off by the UTC-offset difference from the original zone, but
that was true before, too.  It's not terribly critical since we don't
do anything with the result except possibly print it.  (Someday we
should flush this entire bit of logic and record a standard-format
timestamp in the archive instead.  That's not okay for a back-patched
bug fix, though.)

Also, guard our only other use of mktime() by having initdb's
build_time_t() set tm_isdst = -1 not 0.  This case could only have
an issue in zones that are DST year-round; but I think some do exist,
or could in future.

Per report from Wells Oliver.  Back-patch to all supported
versions, since any of them might need to run with a newer glibc.

Discussion: https://postgr.es/m/CAOC+FBWDhDHO7G-i1_n_hjRzCnUeFO+H-Czi1y10mFhRWpBrew@mail.gmail.com
2021-06-13 14:32:42 -04:00
Andrew Dunstan 8cb3d95c20
Further tweaks to stuck_on_old_timeline recovery test
Translate path slashes on target directory path. This was confusing old
branches, but is applied to all branches for the sake of uniformity.
Perl is perfectly able to understand paths with forward slashes.

Along the way, restore the previous archive_wait query, for the sake of
uniformity with other tests, per gripe from Tom Lane.
2021-06-13 07:19:36 -04:00
Michael Paquier 15cb0bf471 Ignore more environment variables in pg_regress.c
This is similar to the work done in 8279f68 for TestLib.pm, where
environment variables set may cause unwanted failures if using a
temporary installation with pg_regress.  The list of variables reset is
adjusted in each stable branch depending on what is supported.

Comments are added to remember that the lists in TestLib.pm and
pg_regress.c had better be kept in sync.

Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/YMNR9GYDn+fHlMta@paquier.xyz
Backpatch-through: 9.6
2021-06-13 20:07:54 +09:00
Tom Lane b170ca021c Restore robustness of TAP tests that wait for postmaster restart.
Several TAP tests use poll_query_until() to wait for the postmaster
to restart.  They were checking to see if a trivial query
(e.g. "SELECT 1") succeeds.  However, that's problematic in the wake
of commit 11e9caff8, because now that we feed said query to psql
via stdin, we risk IPC::Run whining about a SIGPIPE failure if psql
quits before reading the query.  Hence, we can't use a nonempty
query in cases where we need to wait for connection failures to
stop happening.

Per the precedent of commits c757a3da0 and 6d41dd045, we can pass
"undef" as the query in such cases to ensure that IPC::Run has
nothing to write.  However, then we have to say that the expected
output is empty, and this exposes a deficiency in poll_query_until:
if psql fails altogether and returns empty stdout, poll_query_until
will treat that as a success!  That's because, contrary to its
documentation, it makes no actual check for psql failure, looking
neither at the exit status nor at stderr.

To fix that, adjust poll_query_until to insist on empty stderr as
well as a stdout match.  (I experimented with checking exit status
instead, but it seems that psql often does exit(1) in cases that we
need to consider successes.  That might be something to fix someday,
but it would be a non-back-patchable behavior change.)

Back-patch to v10.  The test cases needing this exist only as far
back as v11, but it seems wise to keep poll_query_until's behavior
the same in v10, in case we back-patch another such test case in
future.  (9.6 does not currently need this change, because in that
branch poll_query_until can't be told to accept empty stdout as
a success case.)

Per assorted buildfarm failures, mostly on hoverfly.

Discussion: https://postgr.es/m/CAA4eK1+zM6L4QSA1XMvXY_qqWwdUmqkOS1+hWvL8QcYEBGA1Uw@mail.gmail.com
2021-06-12 15:12:10 -04:00
Tom Lane 25d1ef1aaf Ensure pg_filenode_relation(0, 0) returns NULL.
Previously, a zero value for the relfilenode resulted in
a confusing error message about "unexpected duplicate".
This function returns NULL for other invalid relfilenode
values, so zero should be treated likewise.

It's been like this all along, so back-patch to all supported
branches.

Justin Pryzby

Discussion: https://postgr.es/m/20210612023324.GT16435@telsasoft.com
2021-06-12 13:29:24 -04:00
Tom Lane 9eecea7f37 Don't use Asserts to check for violations of replication protocol.
Using an Assert to check the validity of incoming messages is an
extremely poor decision.  In a debug build, it should not be that easy
for a broken or malicious remote client to crash the logrep worker.
The consequences could be even worse in non-debug builds, which will
fail to make such checks at all, leading to who-knows-what misbehavior.
Hence, promote every Assert that could possibly be triggered by wrong
or out-of-order replication messages to a full test-and-ereport.

To avoid bloating the set of messages the translation team has to cope
with, establish a policy that replication protocol violation error
reports don't need to be translated.  Hence, all the new messages here
use errmsg_internal().  A couple of old messages are changed likewise
for consistency.

Along the way, fix some non-idiomatic or outright wrong uses of
hash_search().

Most of these mistakes are new with the "streaming replication"
patch (commit 464824323), but a couple go back a long way.
Back-patch as appropriate.

Discussion: https://postgr.es/m/1719083.1623351052@sss.pgh.pa.us
2021-06-12 12:59:15 -04:00
Andrew Dunstan 8b9e1275c5
Fix new recovery test for use under msys
Commit caba8f0d43 wasn't quite right for msys, as demonstrated by
several buildfarm animals, including jacana and fairywren. We need to
use the msys perl in the archive command, but call it in such a way that
Windows will understand the path. Furthermore, inside the copy script we
need to convert a Windows path to an msys path.
2021-06-12 08:55:29 -04:00
Michael Paquier a04eb75dec Remove PGSSLCRLDIR from the list of variables ignored in TAP tests
This variable was present in the list added by 9d660670, but it is not
supported by this branch.  Issue noticed while diving into a similar
change for pg_regress.c.

Backpatch-through: 9.6
2021-06-12 10:39:33 +09:00
Tom Lane eea081ad01 Rearrange logrep worker's snapshot handling some more.
It turns out that worker.c's code path for TRUNCATE was also
careless about establishing a snapshot while executing user-defined
code, allowing the checks added by commit 84f5c2908 to fail when
a trigger is fired in that context.

We could just wrap Push/PopActiveSnapshot around the truncate call,
but it seems better to establish a policy of holding a snapshot
throughout execution of a replication step.  To help with that and
possible future requirements, replace the previous ensure_transaction
calls with pairs of begin/end_replication_step calls.

Per report from Mark Dilger.  Back-patch to v11, like the previous
changes.

Discussion: https://postgr.es/m/B4A3AF82-79ED-4F4C-A4E5-CD2622098972@enterprisedb.com
2021-06-10 12:27:27 -04:00
Robert Haas 534b9be805 Adjust new test case to set wal_keep_segments.
Per buildfarm member conchuela and Kyotaro Horiguchi, it's possible
for the WAL segment that the cascading standby needs to be removed
too quickly. Hopefully this will prevent that.

Kyotaro Horiguchi

Discussion: http://postgr.es/m/20210610.101240.1270925505780628275.horikyota.ntt@gmail.com
2021-06-10 09:43:35 -04:00
Robert Haas ca158c168e Fix corner case failure of new standby to follow new primary.
This only happens if (1) the new standby has no WAL available locally,
(2) the new standby is starting from the old timeline, (3) the promotion
happened in the WAL segment from which the new standby is starting,
(4) the timeline history file for the new timeline is available from
the archive but the WAL files for are not (i.e. this is a race),
(5) the WAL files for the new timeline are available via streaming,
and (6) recovery_target_timeline='latest'.

Commit ee994272ca introduced this
logic and was an improvement over the previous code, but it mishandled
this case. If recovery_target_timeline='latest' and restore_command is
set, validateRecoveryParameters() can change recoveryTargetTLI to be
different from receiveTLI. If streaming is then tried afterward,
expectedTLEs gets initialized with the history of the wrong timeline.
It's supposed to be a list of entries explaining how to get to the
target timeline, but in this case it ends up with a list of entries
explaining how to get to the new standby's original timeline, which
isn't right.

Dilip Kumar and Robert Haas, reviewed by Kyotaro Horiguchi.

Discussion: http://postgr.es/m/CAFiTN-sE-jr=LB8jQuxeqikd-Ux+jHiXyh4YDiZMPedgQKup0g@mail.gmail.com
2021-06-09 16:20:10 -04:00
Robert Haas 38982b8b7b Allow PostgresNode.pm's backup method to accept backup_options.
Partial back-port of commit 081876d75e.
A test case for a pending bug fix needs this capability, but the code
on 9.6 is significantly different, so I'm only back-patching this
change as far as v10. We'll have to work around the problem another
way in v9.6.

Discussion: http://postgr.es/m/CAFiTN-tcivNvL0Rg6rD7_CErNfE75H7+gh9WbMxjbgsattja1Q@mail.gmail.com
2021-06-09 12:30:28 -04:00
Michael Paquier f0f879e106 Fix inconsistencies in psql --help=commands
The set of subcommands supported by \dAp, \do and \dy was described
incorrectly in psql's --help.  The documentation was already consistent
with the code.

Reported-by: inoas, from IRC
Author: Matthijs van der Vleuten
Reviewed-by: Neil Chen
Discussion: https://postgr.es/m/6a984e24-2171-4039-9050-92d55e7b23fe@www.fastmail.com
Backpatch-through: 9.6
2021-06-09 16:26:02 +09:00