Commit Graph

49243 Commits

Author SHA1 Message Date
Noah Misch 7f098f7b53 Make relation-enumerating operations be security-restricted operations.
When a feature enumerates relations and runs functions associated with
all found relations, the feature's user shall not need to trust every
user having permission to create objects.  BRIN-specific functionality
in autovacuum neglected to account for this, as did pg_amcheck and
CLUSTER.  An attacker having permission to create non-temp objects in at
least one schema could execute arbitrary SQL functions under the
identity of the bootstrap superuser.  CREATE INDEX (not a
relation-enumerating operation) and REINDEX protected themselves too
late.  This change extends to the non-enumerating amcheck interface.
Back-patch to v10 (all supported versions).

Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin.
Reported by Alexander Lakhin.

Security: CVE-2022-1552
2022-05-09 08:35:12 -07:00
Peter Eisentraut d387588601 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 4a507135ecc39274887f0f0ce760f964f1725579
2022-05-09 12:27:26 +02:00
Andres Freund 1e1542833e Disable 031_recovery_conflict.pl until after minor releases.
f40d362a66 disabled part of 031_recovery_conflict.pl due to instability
that's not trivial to fix in the back branches. That fixed most of the
issues. But there was one more failure (on lapwing / REL_10_STABLE).

That failure looks like it might be caused by a genuine problem. Disable the
test until after the set of releases, to avoid packagers etc potentially
having to fight with a test failure they can't do anything about.

Discussion: https://postgr.es/m/3447060.1652032749@sss.pgh.pa.us
Backpatch: 10-14
2022-05-08 18:06:38 -07:00
Tom Lane ec76d00738 Release notes for 14.3, 13.7, 12.11, 11.16, 10.21. 2022-05-08 12:36:38 -04:00
Noah Misch 27e66539a9 Fix back-patch of "Under has_wal_read_bug, skip .../001_wal.pl."
Per buildfarm members tadarida, snapper, and kittiwake.  Back-patch to
v10 (all supported versions).
2022-05-07 09:13:35 -07:00
Noah Misch 49d63b2e36 Under has_wal_read_bug, skip contrib/bloom/t/001_wal.pl.
Per buildfarm members snapper and kittiwake.  Back-patch to v10 (all
supported versions).

Discussion: https://postgr.es/m/20220116210241.GC756210@rfd.leadboat.com
2022-05-07 00:40:17 -07:00
Andres Freund 26f9e21cb6 Temporarily skip recovery deadlock test in back branches.
The recovery deadlock test has a timing issue that was fixed in 5136967f1e in
HEAD. Unfortunately the same fix doesn't quite work in the back branches: 1)
adjust_conf() doesn't exist, which is easy enough to work around 2) a restart
cleares the recovery conflict stats < 15.

These issues can be worked around, but given the upcoming set of minor
releases, skip the problematic test for now. The buildfarm doesn't show
failures in other parts of 031_recovery_conflict.pl.

Discussion: https://postgr.es/m/20220506155827.dfnaheq6ufylwrqf@alap3.anarazel.de
Backpatch: 10-14
2022-05-06 09:07:40 -07:00
Andres Freund fbf659bc40 Backpatch addition of pump_until() more completely.
In a2ab9c06ea I just backpatched the introduction of pump_until(), without
changing the existing local definitions (as 6da65a3f9a). The necessary
changes seemed more verbose than desirable. However, that leads to warnings,
as I failed to realize...

Backpatch to all versions containing pump_until() calls before
f74496dd61 (there's none in 10).

Discussion: https://postgr.es/m/2808491.1651802860@sss.pgh.pa.us
Discussion: https://postgr.es/m/18b37361-b482-b9d8-f30d-6115cd5ce25c@enterprisedb.com
Backpatch: 11-14
2022-05-06 08:39:11 -07:00
Tom Lane 2bb9f7501a Update time zone data files to tzdata release 2022a.
DST law changes in Palestine.  Historical corrections for
Chile and Ukraine.
2022-05-05 14:55:17 -04:00
Andres Freund 8019423764 Revert "Fix timing issue in deadlock recovery conflict test."
This reverts commit 512d76f6db.
2022-05-04 14:15:42 -07:00
Andres Freund 512d76f6db Fix timing issue in deadlock recovery conflict test.
Per buildfarm members longfin and skink.

Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de
Backpatch: 10-
2022-05-04 12:57:25 -07:00
Andres Freund a5ede13916 Backpatch 031_recovery_conflict.pl.
The prior commit showed that the introduction of recovery conflict tests was a
good idea. Without these tests it's hard to know that the fix didn't break
something...

031_recovery_conflict.pl was introduced in 9f8a050f68 and extended in
21e184403b.

Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de
Backpatch: 10-14
2022-05-02 18:29:52 -07:00
Andres Freund edfc03ec91 Fix possibility of self-deadlock in ResolveRecoveryConflictWithBufferPin().
The tests added in 9f8a050f68 failed nearly reliably on FreeBSD in CI, and
occasionally on the buildfarm. That turns out to be caused not by a bug in the
test, but by a longstanding bug in recovery conflict handling.

The standby timeout handler, used by ResolveRecoveryConflictWithBufferPin(),
executed SendRecoveryConflictWithBufferPin() inside a signal handler. A bad
idea, because the deadlock timeout handler (or a spurious latch set) could
have interrupted ProcWaitForSignal(). If unlucky that could cause a
self-deadlock on ProcArrayLock, if the deadlock check is in
SendRecoveryConflictWithBufferPin()->CancelDBBackends().

To fix, set a flag in StandbyTimeoutHandler(), and check the flag in
ResolveRecoveryConflictWithBufferPin().

Subsequently the recovery conflict tests will be backpatched.

Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de
Backpatch: 10-
2022-05-02 18:29:52 -07:00
Andres Freund 5c8b14a71d Backpatch addition of wait_for_log(), pump_until().
These were originally introduced in a2ab9c06ea and a2ab9c06ea, as they are
needed by a about-to-be-backpatched test.

Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de
Backpatch: 10-14
2022-05-02 18:09:43 -07:00
Etsuro Fujita 09b684647a Fix typo in comment. 2022-05-02 16:45:06 +09:00
Andrew Dunstan 01f2bc5af4 Inhibit mingw CRT's auto-globbing of command line arguments
For some reason by default the mingw C Runtime takes it upon itself to
expand program arguments that look like shell globbing characters. That
has caused much scratching of heads and mis-attribution of the causes of
some TAP test failures, so stop doing that.

This removes an inconsistency with Windows binaries built with MSVC,
which have no such behaviour.

Per suggestion from Noah Misch.

Backpatch to all live branches.

Discussion: https://postgr.es/m/20220423025927.GA1274057@rfd.leadboat.com
2022-04-25 15:50:07 -04:00
Tom Lane eade0043ca Remove inadequate assertion check in CTE inlining.
inline_cte() expected to find exactly as many references to the
target CTE as its cterefcount indicates.  While that should be
accurate for the tree as emitted by the parser, there are some
optimizations that occur upstream of here that could falsify it,
notably removal of unused subquery output expressions.

Trying to make the accounting 100% accurate seems expensive and
doomed to future breakage.  It's not really worth it, because
all this code is protecting is downstream assumptions that every
referenced CTE has a plan.  Let's convert those assertions to
regular test-and-elog just in case there's some actual problem,
and then drop the failing assertion.

Per report from Tomas Vondra (thanks also to Richard Guo for
analysis).  Back-patch to v12 where the faulty code came in.

Discussion: https://postgr.es/m/29196a1e-ed47-c7ca-9be2-b1c636816183@enterprisedb.com
2022-04-21 17:58:52 -04:00
Andrew Dunstan 17bdba3ee6 Support new perl module namespace in stable branches
Commit b3b4d8e68a moved our perl test modules to a better namespace
structure, but this has made life hard for people wishing to backpatch
improvements in the TAP tests. Here we alleviate much of that difficulty
by implementing the new module names on top of the old modules, mostly
by using a little perl typeglob aliasing magic, so that we don't have a
dual maintenance burden. This should work both for the case where a new
test is backpatched and the case where a fix to an existing test that
uses the new namespace is backpatched.

Reviewed by Michael Paquier

Per complaint from Andres Freund

Discussion: https://postgr.es/m/20220418141530.nfxtkohefvwnzncl@alap3.anarazel.de

Applied to branches 10 through 14
2022-04-21 09:23:27 -04:00
Peter Geoghegan 5487585e37 Fix CLUSTER tuplesorts on abbreviated expressions.
CLUSTER sort won't use the datum1 SortTuple field when clustering
against an index whose leading key is an expression.  This makes it
unsafe to use the abbreviated keys optimization, which was missed by the
logic that sets up SortSupport state.  Affected tuplesorts output tuples
in a completely bogus order as a result (the wrong SortSupport based
comparator was used for the leading attribute).

This issue is similar to the bug fixed on the master branch by recent
commit cc58eecc5d.  But it's a far older issue, that dates back to the
introduction of the abbreviated keys optimization by commit 4ea51cdfe8.

Backpatch to all supported versions.

Author: Peter Geoghegan <pg@bowt.ie>
Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKG+bA+bmwD36_oDxAoLrCwZjVtST2fqe=b4=qZcmU7u89A@mail.gmail.com
Backpatch: 10-
2022-04-20 17:17:37 -07:00
Tom Lane 33fe55c06b Disallow infinite endpoints in generate_series() for timestamps.
Such cases will lead to infinite loops, so they're of no practical
value.  The numeric variant of generate_series() already threw error
for this, so borrow its message wording.

Per report from Richard Wesley.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/91B44E7B-68D5-448F-95C8-B4B3B0F5DEAF@duckdblabs.com
2022-04-20 18:08:15 -04:00
Tom Lane 481a99811a Fix breakage in AlterFunction().
An ALTER FUNCTION command that tried to update both the function's
proparallel property and its proconfig list failed to do the former,
because it stored the new proparallel value into a tuple that was
no longer the interesting one.  Carelessness in 7aea8e4f2.

(I did not bother with a regression test, because the only likely
future breakage would be for someone to ignore the comment I added
and add some other field update after the heap_modify_tuple step.
A test using existing function properties could not catch that.)

Per report from Bryn Llewellyn.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/8AC9A37F-99BD-446F-A2F7-B89AD0022774@yugabyte.com
2022-04-19 23:03:59 -04:00
Amit Kapila 59348fbdeb Fix the check to limit sync workers.
We don't allow to invoke more sync workers once we have reached the sync
worker limit per subscription. But the check to enforce this also doesn't
allow to launch an apply worker if it gets restarted.

This code was introduced by commit de43897122 but we caught the problem
only with the test added by recent commit c91f71b9dc which started failing
occasionally in the buildfarm.

As per buildfarm.
Diagnosed-by: Amit Kapila, Masahiko Sawada, Tomas Vondra
Author: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
	    https://postgr.es/m/f90d2b03-4462-ce95-a524-d91464e797c8@enterprisedb.com
2022-04-19 09:18:44 +05:30
Tom Lane 0795da8695 Avoid invalid array reference in transformAlterTableStmt().
Don't try to look at the attidentity field of system attributes,
because they're not there in the TupleDescAttr array.  Sometimes
this is harmless because we accidentally pick up a zero, but
otherwise we'll report "no owned sequence found" from an attempt
to alter a system attribute.  (It seems possible that a SIGSEGV
could occur, too, though I've not seen it in testing.)

It's not in this function's charter to complain that you can't
alter a system column, so instead just hard-wire an assumption
that system attributes aren't identities.  I didn't bother with
a regression test because the appearance of the bug is very
erratic.

Per bug #17465 from Roman Zharkov.  Back-patch to all supported
branches.  (There's not actually a live bug before v12, because
before that get_attidentity() did the right thing anyway.
But for consistency I changed the test in the older branches too.)

Discussion: https://postgr.es/m/17465-f2a554a6cb5740d3@postgresql.org
2022-04-18 12:16:45 -04:00
Michael Paquier a57399d70c Fix race in TAP test 002_archiving.pl when restoring history file
This test, introduced in df86e52, uses a second standby to check that
it is able to remove correctly RECOVERYHISTORY and RECOVERYXLOG at the
end of recovery.  This standby uses the archives of the primary to
restore its contents, with some of the archive's contents coming from
the first standby previously promoted.  In slow environments, it was
possible that the test did not check what it should, as the history file
generated by the promotion of the first standby may not be stored yet on
the archives the second standby feeds on.  So, it could be possible that
the second standby selects an incorrect timeline, without restoring a
history file at all.

This commits adds a wait phase to make sure that the history file
required by the second standby is archived before this cluster is
created.  This relies on poll_query_until() with pg_stat_file() and an
absolute path, something not supported in REL_10_STABLE.

While on it, this adds a new test to check that the history file has
been restored by looking at the logs of the second standby.  This
ensures that a RECOVERYHISTORY, whose removal needs to be checked,
is created in the first place.  This should make the test more robust.

This test has been introduced by df86e52, but it came in light as an
effect of the bug fixed by acf1dd42, where the extra restore_command
calls made the test much slower.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/YlT23IvsXkGuLzFi@paquier.xyz
Backpatch-through: 11
2022-04-18 11:40:22 +09:00
Noah Misch 6ea49fe354 Add a temp-install prerequisite to src/interfaces/ecpg "checktcp".
The target failed, tested $PATH binaries, or tested a stale temporary
installation.  Commit c66b438db6 missed
this.  Back-patch to v10 (all supported versions).
2022-04-16 17:43:58 -07:00
Robert Haas 68e605b9ef Rethink the delay-checkpoint-end mechanism in the back-branches.
The back-patch of commit bbace5697d had
the unfortunate effect of changing the layout of PGPROC in the
back-branches, which could break extensions. This happened because it
changed the delayChkpt from type bool to type int. So, change it back,
and add a new bool delayChkptEnd field instead. The new field should
fall within what used to be padding space within the struct, and so
hopefully won't cause any extensions to break.

Per report from Markus Wanner and discussion with Tom Lane and others.

Patch originally by me, somewhat revised by Markus Wanner per a
suggestion from Michael Paquier. A very similar patch was developed
by Kyotaro Horiguchi, but I failed to see the email in which that was
posted before writing one of my own.

Discussion: http://postgr.es/m/CA+Tgmoao-kUD9c5nG5sub3F7tbo39+cdr8jKaOVEs_1aBWcJ3Q@mail.gmail.com
Discussion: http://postgr.es/m/20220406.164521.17171257901083417.horikyota.ntt@gmail.com
2022-04-14 11:10:13 -04:00
Michael Paquier 5378d55cb2 pageinspect: Fix handling of all-zero pages
Getting from get_raw_page() an all-zero page is considered as a valid
case by the buffer manager and it can happen for example when finding a
corrupted page with zero_damaged_pages enabled (using zero_damaged_pages
to look at corrupted pages happens), or after a crash when a relation
file is extended before any WAL for its new data is generated (before a
vacuum or autovacuum job comes in to do some cleanup).

However, all the functions of pageinspect, as of the index AMs (except
hash that has its own idea of new pages), heap, the FSM or the page
header have never worked with all-zero pages, causing various crashes
when going through the page internals.

This commit changes all the pageinspect functions to be compliant with
all-zero pages, where the choice is made to return NULL or no rows for
SRFs when finding a new page.  get_raw_page() still works the same way,
returning a batch of zeros in the bytea of the page retrieved.  A hard
error could be used but NULL, while more invasive, is useful when
scanning relation files in full to get a batch of results for a single
relation in one query.  Tests are added for all the code paths
impacted.

Reported-by: Daria Lepikhova
Author: Michael Paquier
Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru
Backpatch-through: 10
2022-04-14 15:09:39 +09:00
Tom Lane e0ed20d0b6 Prevent access to no-longer-pinned buffer in heapam_tuple_lock().
heap_fetch() used to have a "keep_buf" parameter that told it to return
ownership of the buffer pin to the caller after finding that the
requested tuple TID exists but is invisible to the specified snapshot.
This was thoughtlessly removed in commit 5db6df0c0, which broke
heapam_tuple_lock() (formerly EvalPlanQualFetch) because that function
needs to do more accesses to the tuple even if it's invisible.  The net
effect is that we would continue to touch the page for a microsecond or
two after releasing pin on the buffer.  Usually no harm would result;
but if a different session decided to defragment the page concurrently,
we could see garbage data and mistakenly conclude that there's no newer
tuple version to chain up to.  (It's hard to say whether this has
happened in the field.  The bug was actually found thanks to a later
change that allowed valgrind to detect accesses to non-pinned buffers.)

The most reasonable way to fix this is to reintroduce keep_buf,
although I made it behave slightly differently: buffer ownership
is passed back only if there is a valid tuple at the requested TID.
In HEAD, we can just add the parameter back to heap_fetch().
To avoid an API break in the back branches, introduce an additional
function heap_fetch_extended() in those branches.

In HEAD there is an additional, less obvious API change: tuple->t_data
will be set to NULL in all cases where buffer ownership is not returned,
in particular when the tuple exists but fails the time qual (and
!keep_buf).  This is to defend against any other callers attempting to
access non-pinned buffers.  We concluded that making that change in back
branches would be more likely to introduce problems than cure any.

In passing, remove a comment about heap_fetch that was obsoleted by
9a8ee1dc6.

Per bug #17462 from Daniil Anisimov.  Back-patch to v12 where the bug
was introduced.

Discussion: https://postgr.es/m/17462-9c98a0f00df9bd36@postgresql.org
2022-04-13 13:35:02 -04:00
Tom Lane 2eb3ec8f49 Doc: tweak textsearch.sgml for SEO purposes.
Google seems to like to return textsearch.html for queries about
GIN and GiST indexes, even though it's not a primary reference
for either.  It seems likely that that's because those keywords
appear in the page title.  Since "GIN and GiST Index Types" is
not a very apposite title for this material anyway, rename the
section in hopes of stopping that.

Also provide explicit links to the GIN and GiST chapters, to help
anyone who finds their way to this page regardless.

Per gripe from Jan Piotrowski.  Back-patch to supported branches.
(Unfortunately Google is likely to continue returning the 9.1
version of this page, but improving that situation is a matter
for the www team.)

Discussion: https://postgr.es/m/164978902252.1276550.9330175733459697101@wrigleys.postgresql.org
2022-04-12 18:21:26 -04:00
David Rowley af3e3a9b4f Docs: avoid confusing use of the word "synchronized"
It's misleading to call the data directory the "synchronized data
directory" when discussing a crash scenario when using pg_rewind's
--no-sync option.  Here we just remove the word "synchronized" to avoid
any possible confusion.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
Backpatch-through: 12, where --no-sync was added
2022-04-13 09:17:56 +12:00
Tom Lane 989d3e4a29 Fix postgres_fdw to check shippability of sort clauses properly.
postgres_fdw would push ORDER BY clauses to the remote side without
verifying that the sort operator is safe to ship.  Moreover, it failed
to print a suitable USING clause if the sort operator isn't default
for the sort expression's type.  The net result of this is that the
remote sort might not have anywhere near the semantics we expect,
which'd be disastrous for locally-performed merge joins in particular.

We addressed similar issues in the context of ORDER BY within an
aggregate function call in commit 7012b132d, but failed to notice
that query-level ORDER BY was broken.  Thus, much of the necessary
logic already existed, but it requires refactoring to be usable
in both cases.

Back-patch to all supported branches.  In HEAD only, remove the
core code's copy of find_em_expr_for_rel, which is no longer used
and really should never have been pushed into equivclass.c in the
first place.

Ronan Dunklau, per report from David Rowley;
reviews by David Rowley, Ranier Vilela, and myself

Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com
2022-03-31 14:29:24 -04:00
Tom Lane fcaf7d725d Add missing newline in one libpq error message.
Oversight in commit a59c79564.  Back-patch, as that was.
Noted by Peter Eisentraut.

Discussion: https://postgr.es/m/7f85ef6d-250b-f5ec-9867-89f0b16d019f@enterprisedb.com
2022-03-31 11:24:26 -04:00
Daniel Gustafsson c3a587994a doc: Fix typo in ANALYZE documentation
Commit 61fa6ca79b accidentally wrote constrast instead of contrast.

Backpatch-through: 10
Discussion: https://postgr.es/m/88903179-5ce2-3d4d-af43-7830372bdcb6@enterprisedb.com
2022-03-31 12:03:33 +02:00
Etsuro Fujita 81fd8085de Fix typo in comment. 2022-03-30 19:00:05 +09:00
Tomas Vondra b36c271913 Document autoanalyze limitations for partitioned tables
When dealing with partitioned tables, counters for partitioned tables
are not updated when modifying child tables. This means autoanalyze may
not update optimizer statistics for the parent relations, which can
result in poor plans for some queries.

It's worth documenting this limitation, so that people are aware of it
and can take steps to mitigate it (e.g. by setting up a script executing
ANALYZE regularly).

Backpatch to v10. Older branches are affected too, of couse, but we no
longer maintain those.

Author: Justin Pryzby
Reviewed-by: Zhihong Yu, Tomas Vondra
Backpatch-through: 10
Discussion: https://postgr.es/m/20210913035409.GA10647%40telsasoft.com
2022-03-28 14:31:21 +02:00
Andres Freund 5ebd262dcb waldump: fix use-after-free in search_directory().
After closedir() dirent->d_name is not valid anymore. As there alerady are a
few places relying on the limited lifetime of pg_waldump, do so here as well,
and just pg_strdup() the string.

The bug was introduced in fc49e24fa6.

Found by UBSan, run locally.

Backpatch: 11-, like fc49e24fa6 itself.
2022-03-27 18:15:15 -07:00
Michael Paquier 5ca2aa2f26 pageinspect: Add more sanity checks to prevent out-of-bound reads
A couple of code paths use the special area on the page passed by the
function caller, expecting to find some data in it.  However, feeding
an incorrect page can lead to out-of-bound reads when trying to access
the page special area (like a heap page that has no special area,
leading PageGetSpecialPointer() to grab a pointer outside the allocated
page).

The functions used for hash and btree indexes have some protection
already against that, while some other functions using a relation OID
as argument would make sure that the access method involved is correct,
but functions taking in input a raw page without knowing the relation
the page is attached to would run into problems.

This commit improves the set of checks used in the code paths of BRIN,
btree (including one check if a leaf page is found with a non-zero
level), GIN and GiST to verify that the page given in input has a
special area size that fits with each access method, which is done
though PageGetSpecialSize(), becore calling PageGetSpecialPointer().

The scope of the checks done is limited to work with pages that one
would pass after getting a block with get_raw_page(), as it is possible
to craft byteas that could bypass existing code paths.  Having too many
checks would also impact the usability of pageinspect, as the existing
code is very useful to look at the content details in a corrupted page,
so the focus is really to avoid out-of-bound reads as this is never a
good thing even with functions whose execution is limited to
superusers.

The safest approach could be to rework the functions so as these fetch a
block using a relation OID and a block number, but there are also cases
where using a raw page is useful.

Tests are added to cover all the code paths that needed such checks, and
an error message for hash indexes is reworded to fit better with what
this commit adds.

Reported-By: Alexander Lakhin
Author: Julien Rouhaud, Michael Paquier
Discussion: https://postgr.es/m/16527-ef7606186f0610a1@postgresql.org
Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru
Backpatch-through: 10
2022-03-27 17:53:59 +09:00
Tom Lane cb8586d051 Suppress compiler warning in relptr_store().
clang 13 with -Wextra warns that "performing pointer subtraction with
a null pointer has undefined behavior" in the places where freepage.c
tries to set a relptr variable to constant NULL.  This appears to be
a compiler bug, but it's unlikely to get fixed instantly.  Fortunately,
we can work around it by introducing an inline support function, which
seems like a good change anyway because it removes the macro's existing
double-evaluation hazard.

Backpatch to v10 where this code was introduced.

Patch by me, based on an idea of Andres Freund's.

Discussion: https://postgr.es/m/48826.1648310694@sss.pgh.pa.us
2022-03-26 14:29:52 -04:00
Tom Lane d09f765b9d Harden TAP tests that intentionally corrupt page checksums.
The previous method for doing that was to write zeroes into a
predetermined set of page locations.  However, there's a roughly
1-in-64K chance that the existing checksum will match by chance,
and yesterday several buildfarm animals started to reproducibly
see that, resulting in test failures because no checksum mismatch
was reported.

Since the checksum includes the page LSN, test success depends on
the length of the installation's WAL history, which is affected by
(at least) the initial catalog contents, the set of locales installed
on the system, and the length of the pathname of the test directory.
Sooner or later we were going to hit a chance match, and today is
that day.

Harden these tests by specifically inverting the checksum field and
leaving all else alone, thereby guaranteeing that the checksum is
incorrect.

In passing, fix places that were using seek() to set up for syswrite(),
a combination that the Perl docs very explicitly warn against.  We've
probably escaped problems because no regular buffered I/O is done on
these filehandles; but if it ever breaks, we wouldn't deserve or get
much sympathy.

Although we've only seen problems in HEAD, now that we recognize the
environmental dependencies it seems like it might be just a matter
of time until someone manages to hit this in back-branch testing.
Hence, back-patch to v11 where we started doing this kind of test.

Discussion: https://postgr.es/m/3192026.1648185780@sss.pgh.pa.us
2022-03-25 14:23:26 -04:00
Robert Haas 3821d66a7b Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.
If TRUNCATE causes some buffers to be invalidated and thus the
checkpoint does not flush them, TRUNCATE must also ensure that the
corresponding files are truncated on disk. Otherwise, a replay
from the checkpoint might find that the buffers exist but have
the wrong contents, which may cause replay to fail.

Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design
suggestion from Heikki Linnakangas, with some changes to the
comments by me. Review of this and a prior patch that approached
the issue differently by Heikki Linnakangas, Andres Freund, Álvaro
Herrera, Masahiko Sawada, and Tom Lane.

Discussion: http://postgr.es/m/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com
2022-03-24 14:38:51 -04:00
Andres Freund 61a007feed Don't try to translate NULL in GetConfigOptionByNum().
Noticed via -fsanitize=undefined. Introduced when a few columns in
GetConfigOptionByNum() / pg_settings started to be translated in 72be8c29a /
PG 12.

Backpatch to all affected branches, for the same reasons as 46ab07ffda.

Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 12-
2022-03-23 13:17:59 -07:00
Andres Freund c5b60a68cc Don't call fwrite() with len == 0 when writing out relcache init file.
Noticed via -fsanitize=undefined.

Backpatch to all branches, for the same reasons as 46ab07ffda.

Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 10-
2022-03-23 13:13:33 -07:00
Andres Freund 6a767bc2e7 configure: check for dlsym instead of dlopen.
When building with sanitizers the sanitizer library provides dlopen, but not
dlsym(), making configure think that -ldl isn't needed. Just checking for
dlsym() ought to suffice, hard to see dlsym() being provided without dlopen()
also being provided.

Backpatch to all branches, for the same reasons as 46ab07ffda.

Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 10-
2022-03-23 12:43:38 -07:00
Alvaro Herrera 8e0c76be19
pg_upgrade: Upgrade an Assert to a real 'if' test
It seems possible for the condition being tested to be true in
production, and nobody would never know (except when some data
eventually becomes corrupt?).

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m//202109040001.zky3wgv2qeqg@alvherre.pgsql
2022-03-23 19:23:51 +01:00
Alvaro Herrera c714ebd0e8
Fix "missing continuation record" after standby promotion
Invalidate abortedRecPtr and missingContrecPtr after a missing
continuation record is successfully skipped on a standby. This fixes a
PANIC caused when a recently promoted standby attempts to write an
OVERWRITE_RECORD with an LSN of the previously read aborted record.

Backpatch to 10 (all stable versions).

Author: Sami Imseih <simseih@amazon.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/44D259DE-7542-49C4-8A52-2AB01534DCA9@amazon.com
2022-03-23 18:22:10 +01:00
Thomas Munro bd19ab257c Try to stabilize vacuum test.
As commits b700f96c and 3414099c did for the reloptions test, make
sure VACUUM can always truncate the table as expected.

Back-patch to 12, where vacuum_truncate arrived.

Discussion: https://postgr.es/m/CAD21AoCNoWjYkdEtr%2BVDoF9v__V905AedKZ9iF%3DArgCtrbxZqw%40mail.gmail.com
2022-03-23 15:08:20 +13:00
Andres Freund 4553b960f3 Add missing dependency of pg_dumpall to WIN32RES.
When cross-building to windows, or building with mingw on windows, the build
could fail with
  x86_64-w64-mingw32-gcc: error: win32ver.o: No such file or director
because pg_dumpall didn't depend on WIN32RES, but it's recipe references
it. The build nevertheless succeeded most of the time, due to
pg_dump/pg_restore having the required dependency, causing win32ver.o to be
built.

Reported-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKGJeekpUPWW6yCVdf9=oBAcCp86RrBivo4Y4cwazAzGPng@mail.gmail.com
Backpatch: 10-, omission present on all live branches
2022-03-22 08:28:53 -07:00
Michael Paquier 199ca68c92 Fix failures in SSL tests caused by out-of-tree keys and certificates
This issue is environment-sensitive, where the SSL tests could fail in
various way by feeding on defaults provided by sslcert, sslkey,
sslrootkey, sslrootcert, sslcrl and sslcrldir coming from a local setup,
as of ~/.postgresql/ by default.  Horiguchi-san has reported two
failures, but more advanced testing from me (aka inclusion of garbage
SSL configuration in ~/.postgresql/ for all the configuration
parameters) has showed dozens of failures that can be triggered in the
whole test suite.

History has showed that we are not good when it comes to address such
issues, fixing them locally like in dd87799, and such problems keep
appearing.  This commit strengthens the entire test suite to put an end
to this set of problems by embedding invalid default values in all the
connection strings used in the tests.  The invalid values are prefixed
in each connection string, relying on the follow-up values passed in the
connection string to enforce any invalid value previously set.  Note
that two tests related to CRLs are required to fail with certain pre-set
configurations, but we can rely on enforcing an empty value instead
after the invalid set of values.

Reported-by: Kyotaro Horiguchi
Reviewed-by: Andrew Dunstan, Daniel Gustafsson, Kyotaro Horiguchi
Discussion: https://postgr.es/m/20220316.163658.1122740600489097632.horikyota.ntt@gmail.com
backpatch-through: 10
2022-03-22 13:21:48 +09:00
Tom Lane 69c88e2fb0 Fix assorted missing logic for GroupingFunc nodes.
The planner needs to treat GroupingFunc like Aggref for many purposes,
in particular with respect to processing of the argument expressions,
which are not to be evaluated at runtime.  A few places hadn't gotten
that memo, notably including subselect.c's processing of outer-level
aggregates.  This resulted in assertion failures or wrong plans for
cases in which a GROUPING() construct references an outer aggregation
level.

Also fix missing special cases for GroupingFunc in cost_qual_eval
(resulting in wrong cost estimates for GROUPING(), although it's
not clear that that would affect plan shapes in practice) and in
ruleutils.c (resulting in excess parentheses in pretty-print mode).

Per bug #17088 from Yaoguang Chen.  Back-patch to all supported
branches.

Richard Guo, Tom Lane

Discussion: https://postgr.es/m/17088-e33882b387de7f5c@postgresql.org
2022-03-21 17:44:29 -04:00
Tom Lane d8d378d510 Fix risk of deadlock failure while dropping a partitioned index.
DROP INDEX needs to lock the index's table before the index itself,
else it will deadlock against ordinary queries that acquire the
relation locks in that order.  This is correctly mechanized for
plain indexes by RangeVarCallbackForDropRelation; but in the case of
a partitioned index, we neglected to lock the child tables in advance
of locking the child indexes.  We can fix that by traversing the
inheritance tree and acquiring the needed locks in RemoveRelations,
after we have acquired our locks on the parent partitioned table and
index.

While at it, do some refactoring to eliminate confusion between
the actual and expected relkind in RangeVarCallbackForDropRelation.
We can save a couple of syscache lookups too, by having that function
pass back info that RemoveRelations will need.

Back-patch to v11 where partitioned indexes were added.

Jimmy Yih, Gaurab Dey, Tom Lane

Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com
2022-03-21 12:22:13 -04:00