Commit Graph

38329 Commits

Author SHA1 Message Date
Tom Lane c47a558528 Ensure ParseTzFile() closes the input file after failing.
We hadn't noticed this because (a) few people feed invalid
timezone abbreviation files to the server, and (b) in typical
scenarios guc.c would throw ereport(ERROR) and then transaction
abort handling would silently clean up the leaked file reference.
However, it was possible to observe file leakage warnings if one
breaks an already-active abbreviation file, because guc.c does
not throw ERROR when loading supposedly-validated settings during
session start or SIGHUP processing.

Report and fix by Kyotaro Horiguchi (cosmetic adjustments by me)

Discussion: https://postgr.es/m/20220530.173740.748502979257582392.horikyota.ntt@gmail.com
2022-05-31 14:47:44 -04:00
Heikki Linnakangas f82595ac90 Fix COPY FROM when database encoding is SQL_ASCII.
In the codepath when no encoding conversion is required, the check for
incomplete character at the end of input incorrectly used server
encoding's max character length, instead of the client's. Usually the
server and client encodings are the same when we're not performing
encoding conversion, but SQL_ASCII is an exception.

In the passing, also fix some outdated comments that still talked about
the old COPY protocol. It was removed in v14.

Per bug #17501 from Vitaly Voronov. Backpatch to v14 where this was
introduced.

Discussion: https://www.postgresql.org/message-id/17501-128b1dd039362ae6@postgresql.org
2022-05-29 23:57:16 +03:00
Michael Paquier fe441a0319 Handle NULL for short descriptions of custom GUC variables
If a short description is specified as NULL in one of the various
DefineCustomXXXVariable() functions available to external modules to
define a custom parameter, SHOW ALL would crash.  This change teaches
SHOW ALL to properly handle NULL short descriptions, as well as any code
paths that manipulate it, to gain in flexibility.  Note that
help_config.c was already able to do that, when describing a set of GUCs
for postgres --describe-config.

Author: Steve Chavez
Reviewed by: Nathan Bossart, Andres Freund, Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/CAGRrpzY6hO-Kmykna_XvsTv8P2DshGiU6G3j8yGao4mk0CqjHA%40mail.gmail.com
Backpatch-through: 10
2022-05-28 12:12:46 +09:00
Tom Lane b4be4a082b Remove misguided SSL key file ownership check in libpq.
Commits a59c79564 et al. tried to sync libpq's SSL key file
permissions checks with what we've used for years in the backend.
We did not intend to create any new failure cases, but it turns out
we did: restricting the key file's ownership breaks cases where the
client is allowed to read a key file despite not having the identical
UID.  In particular a client running as root used to be able to read
someone else's key file; and having seen that I suspect that there are
other, less-dubious use cases that this restriction breaks on some
platforms.

We don't really need an ownership check, since if we can read the key
file despite its having restricted permissions, it must have the right
ownership --- under normal conditions anyway, and the point of this
patch is that any additional corner cases where that works should be
deemed allowable, as they have been historically.  Hence, just drop
the ownership check, and rearrange the permissions check to get rid
of its faulty assumption that geteuid() can't be zero.  (Note that the
comparable backend-side code doesn't have to cater for geteuid() == 0,
since the server rejects that very early on.)

This does have the end result that the permissions safety check used
for a root user's private key file is weaker than that used for
anyone else's.  While odd, root really ought to know what she's doing
with file permissions, so I think this is acceptable.

Per report from Yogendra Suralkar.  Like the previous patch,
back-patch to all supported branches.

Discussion: https://postgr.es/m/MW3PR15MB3931DF96896DC36D21AFD47CA3D39@MW3PR15MB3931.namprd15.prod.outlook.com
2022-05-26 14:14:05 -04:00
Tom Lane 6f7eec1193 Show 'AS "?column?"' explicitly when it's important.
ruleutils.c was coded to suppress the AS label for a SELECT output
expression if the column name is "?column?", which is the parser's
fallback if it can't think of something better.  This is fine, and
avoids ugly clutter, so long as (1) nothing further up in the parse
tree relies on that column name or (2) the same fallback would be
assigned when the rule or view definition is reloaded.  Unfortunately
(2) is far from certain, both because ruleutils.c might print the
expression in a different form from how it was originally written
and because FigureColname's rules might change in future releases.
So we shouldn't rely on that.

Detecting exactly whether there is any outer-level use of a SELECT
column name would be rather expensive.  This patch takes the simpler
approach of just passing down a flag indicating whether there *could*
be any outer use; for example, the output column names of a SubLink
are not referenceable, and we also do not care about the names exposed
by the right-hand side of a setop.  This is sufficient to suppress
unwanted clutter in all but one case in the regression tests.  That
seems like reasonable evidence that it won't be too much in users'
faces, while still fixing the cases we need to fix.

Per bug #17486 from Nicolas Lutic.  This issue is ancient, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/17486-1ad6fd786728b8af@postgresql.org
2022-05-21 14:45:58 -04:00
Alvaro Herrera 58b088a9b3
Fix DDL deparse of CREATE OPERATOR CLASS
When an implicit operator family is created, it wasn't getting reported.
Make it do so.

This has always been missing.  Backpatch to 10.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Leslie LEMAIRE <leslie.lemaire@developpement-durable.gouv.fr>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Michael Paquiër <michael@paquier.xyz>
Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr
2022-05-20 18:52:55 +02:00
Alvaro Herrera 8d9d1286ac
Repurpose PROC_COPYABLE_FLAGS as PROC_XMIN_FLAGS
This is a slight, convenient semantics change from what commit
0f0cfb4940 ("Fix parallel operations that prevent oldest xmin from
advancing") introduced that lets us simplify the coding in the one place
where it is used.

Backpatch to 13.  This is related to commit 6fea65508a ("Tighten
ComputeXidHorizons' handling of walsenders") rewriting the code site
where this is used, which has not yet been backpatched, but it may well
be in the future.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/202204191637.eldwa2exvguw@alvherre.pgsql
2022-05-19 16:20:32 +02:00
David Rowley 3f712ea6dc Fix incorrect comments for Memoize struct
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/0635f5aa-4973-8dc2-4e4e-df9fd5778a65@enterprisedb.com
Backpatch-through: 14, where Memoize was added
2022-05-19 17:14:56 +12:00
Alvaro Herrera e8b93c6e28
Update xml_1.out and xml_2.out
Commit 0fbf011200 should have updated them but didn't.
2022-05-18 23:19:53 +02:00
Alvaro Herrera 94edb85d25
Check column list length in XMLTABLE/JSON_TABLE alias
We weren't checking the length of the column list in the alias clause of
an XMLTABLE or JSON_TABLE function (a "tablefunc" RTE), and it was
possible to make the server crash by passing an overly long one.  Fix it
by throwing an error in that case, like the other places that deal with
alias lists.

In passing, modify the equivalent test used for join RTEs to look like
the other ones, which was different for no apparent reason.

This bug came in when XMLTABLE was born in version 10; backpatch to all
stable versions.

Reported-by: Wang Ke <krking@zju.edu.cn>
Discussion: https://postgr.es/m/17480-1c9d73565bb28e90@postgresql.org
2022-05-18 20:28:31 +02:00
David Rowley 23c2b76a83 Fix incorrect row estimates used for Memoize costing
In order to estimate the cache hit ratio of a Memoize node, one of the
inputs we require is the estimated number of times the Memoize node will
be rescanned.  The higher this number, the large the cache hit ratio is
likely to become.  Unfortunately, the value being passed as the number of
"calls" to the Memoize was incorrectly using the Nested Loop's
outer_path->parent->rows instead of outer_path->rows.  This failed to
account for the fact that the outer_path might be parameterized by some
upper-level Nested Loop.

This problem could lead to Memoize plans appearing more favorable than
they might actually be.  It could also lead to extended executor startup
times when work_mem values were large due to the planner setting overly
large MemoizePath->est_entries resulting in the Memoize hash table being
initially made much larger than might be required.

Fix this simply by passing outer_path->rows rather than
outer_path->parent->rows.  Also, adjust the expected regression test
output for a plan change.

Reported-by: Pavel Stehule
Author: David Rowley
Discussion: https://postgr.es/m/CAFj8pRAMp%3DQsMi6sPQJ4W3hczoFJRvyXHJV3AZAZaMyTVM312Q%40mail.gmail.com
Backpatch-through: 14, where Memoize was introduced
2022-05-16 16:08:37 +12:00
Michael Paquier 6dced63b41 Fix control file update done in restartpoints still running after promotion
If a cluster is promoted (aka the control file shows a state different
than DB_IN_ARCHIVE_RECOVERY) while CreateRestartPoint() is still
processing, this function could miss an update of the control file for
"checkPoint" and "checkPointCopy" but still do the recycling and/or
removal of the past WAL segments, assuming that the to-be-updated LSN
values should be used as reference points for the cleanup.  This causes
a follow-up restart attempting crash recovery to fail with a PANIC on a
missing checkpoint record if the end-of-recovery checkpoint triggered by
the promotion did not complete while the cluster abruptly stopped or
crashed before the completion of this checkpoint.  The PANIC would be
caused by the redo LSN referred in the control file as located in a
segment already gone, recycled by the previous restartpoint with
"checkPoint" out-of-sync in the control file.

This commit fixes the update of the control file during restartpoints so
as "checkPoint" and "checkPointCopy" are updated even if the cluster has
been promoted while a restartpoint is running, to be on par with the set
of WAL segments actually recycled in the end of CreateRestartPoint().

7863ee4 has fixed this problem already on master, but the release timing
of the latest point versions did not let me enough time to study and fix
that on all the stable branches.

Reported-by: Fujii Masao, Rui Zhao
Author: Kyotaro Horiguchi
Reviewed-by: Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/20220316.102444.2193181487576617583.horikyota.ntt@gmail.com
Backpatch-through: 10
2022-05-16 11:26:22 +09:00
Tom Lane ac51c9fba5 Make pull_var_clause() handle GroupingFuncs exactly like Aggrefs.
This follows in the footsteps of commit 2591ee8ec by removing one more
ill-advised shortcut from planning of GroupingFuncs.  It's true that
we don't intend to execute the argument expression(s) at runtime, but
we still have to process any Vars appearing within them, or we risk
failure at setrefs.c time (or more fundamentally, in EXPLAIN trying
to print such an expression).  Vars in upper plan nodes have to have
referents in the next plan level, whether we ever execute 'em or not.

Per bug #17479 from Michael J. Sullivan.  Back-patch to all supported
branches.

Richard Guo

Discussion: https://postgr.es/m/17479-6260deceaf0ad304@postgresql.org
2022-05-12 11:31:46 -04:00
Amit Kapila d6da71fa8f Fix the logical replication timeout during large transactions.
The problem is that we don't send keep-alive messages for a long time
while processing large transactions during logical replication where we
don't send any data of such transactions. This can happen when the table
modified in the transaction is not published or because all the changes
got filtered. We do try to send the keep_alive if necessary at the end of
the transaction (via WalSndWriteData()) but by that time the
subscriber-side can timeout and exit.

To fix this we try to send the keepalive message if required after
processing certain threshold of changes.

Reported-by: Fabrice Chapuis
Author: Wang wei and Amit Kapila
Reviewed By: Masahiko Sawada, Euler Taveira, Hou Zhijie, Hayato Kuroda
Backpatch-through: 10
Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com
2022-05-11 10:51:04 +05:30
Michael Paquier ca9e9b08e4 Improve setup of environment values for commands in MSVC's vcregress.pl
The current setup assumes that commands for lz4, zstd and gzip always
exist by default if not enforced by a user's environment.  However,
vcpkg, as one example, installs libraries but no binaries, so this
default setup to assume that a command should always be present would
cause failures.  This commit improves the detection of such external
commands as follows:
* If a ENV value is available, trust the environment/user and use it.
* If a ENV value is not available, check its execution by looking in the
current PATH, by launching a simple "$command --version" (that should be
portable enough).
** On execution failure, ignore ENV{command}.
** On execution success, set ENV{command} = "$command".

Note that this new rule applies to gzip, lz4 and zstd but not tar that
we assume will always exist.  Those commands are set up in the
environment only when using bincheck and taptest.  The CI includes all
those commands and I have checked that their setup is correct there.  I
have also tested this change in a MSVC environment where we have none of
those commands.

While on it, remove the references to lz4 from the documentation and
vcregress.pl in ~v13.  --with-lz4 has been added in v14~ so there is no
point to have this information in these older branches.

Reported-by: Andrew Dunstan
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/14402151-376b-a57a-6d0c-10ad12608e12@dunslane.net
Backpatch-through: 10
2022-05-11 10:22:29 +09:00
Tom Lane ab2f783921 Fix core dump in transformValuesClause when there are no columns.
The parser code that transformed VALUES from row-oriented to
column-oriented lists failed if there were zero columns.
You can't write that straightforwardly (though probably you
should be able to), but the case can be reached by expanding
a "tab.*" reference to a zero-column table.

Per bug #17477 from Wang Ke.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/17477-0af3c6ac6b0a6ae0@postgresql.org
2022-05-09 14:15:37 -04:00
Tom Lane 9b5797ca54 Revert "Disallow infinite endpoints in generate_series() for timestamps."
This reverts commit eafdf9de06
and its back-branch counterparts.  Corey Huinker pointed out that
we'd discussed this exact change back in 2016 and rejected it,
on the grounds that there's at least one usage pattern with LIMIT
where an infinite endpoint can usefully be used.  Perhaps that
argument needs to be re-litigated, but there's no time left before
our back-branch releases.  To keep our options open, restore the
status quo ante; if we do end up deciding to change things, waiting
one more quarter won't hurt anything.

Rather than just doing a straight revert, I added a new test case
demonstrating the usage with LIMIT.  That'll at least remind us of
the issue if we forget again.

Discussion: https://postgr.es/m/3603504.1652068977@sss.pgh.pa.us
Discussion: https://postgr.es/m/CADkLM=dzw0Pvdqp5yWKxMd+VmNkAMhG=4ku7GnCZxebWnzmz3Q@mail.gmail.com
2022-05-09 11:40:40 -04:00
Noah Misch 677a494789 In REFRESH MATERIALIZED VIEW, set user ID before running user code.
It intended to, but did not, achieve this.  Adopt the new standard of
setting user ID just after locking the relation.  Back-patch to v10 (all
supported versions).

Reviewed by Simon Riggs.  Reported by Alvaro Herrera.

Security: CVE-2022-1552
2022-05-09 08:35:12 -07:00
Noah Misch ab49ce7c34 Make relation-enumerating operations be security-restricted operations.
When a feature enumerates relations and runs functions associated with
all found relations, the feature's user shall not need to trust every
user having permission to create objects.  BRIN-specific functionality
in autovacuum neglected to account for this, as did pg_amcheck and
CLUSTER.  An attacker having permission to create non-temp objects in at
least one schema could execute arbitrary SQL functions under the
identity of the bootstrap superuser.  CREATE INDEX (not a
relation-enumerating operation) and REINDEX protected themselves too
late.  This change extends to the non-enumerating amcheck interface.
Back-patch to v10 (all supported versions).

Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin.
Reported by Alexander Lakhin.

Security: CVE-2022-1552
2022-05-09 08:35:12 -07:00
Peter Eisentraut e5b5a21356 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: b7586f1542a8ffdfd1416e425f55e4e89c9a9505
2022-05-09 12:26:57 +02:00
Andres Freund 55e5a5e0fa Disable 031_recovery_conflict.pl until after minor releases.
f40d362a66 disabled part of 031_recovery_conflict.pl due to instability
that's not trivial to fix in the back branches. That fixed most of the
issues. But there was one more failure (on lapwing / REL_10_STABLE).

That failure looks like it might be caused by a genuine problem. Disable the
test until after the set of releases, to avoid packagers etc potentially
having to fight with a test failure they can't do anything about.

Discussion: https://postgr.es/m/3447060.1652032749@sss.pgh.pa.us
Backpatch: 10-14
2022-05-08 18:06:19 -07:00
Andres Freund f40d362a66 Temporarily skip recovery deadlock test in back branches.
The recovery deadlock test has a timing issue that was fixed in 5136967f1e in
HEAD. Unfortunately the same fix doesn't quite work in the back branches: 1)
adjust_conf() doesn't exist, which is easy enough to work around 2) a restart
cleares the recovery conflict stats < 15.

These issues can be worked around, but given the upcoming set of minor
releases, skip the problematic test for now. The buildfarm doesn't show
failures in other parts of 031_recovery_conflict.pl.

Discussion: https://postgr.es/m/20220506155827.dfnaheq6ufylwrqf@alap3.anarazel.de
Backpatch: 10-14
2022-05-06 09:01:08 -07:00
Andres Freund 7fa95bb0ac Backpatch addition of pump_until() more completely.
In a2ab9c06ea I just backpatched the introduction of pump_until(), without
changing the existing local definitions (as 6da65a3f9a). The necessary
changes seemed more verbose than desirable. However, that leads to warnings,
as I failed to realize...

Backpatch to all versions containing pump_until() calls before
f74496dd61 (there's none in 10).

Discussion: https://postgr.es/m/2808491.1651802860@sss.pgh.pa.us
Discussion: https://postgr.es/m/18b37361-b482-b9d8-f30d-6115cd5ce25c@enterprisedb.com
Backpatch: 11-14
2022-05-06 08:38:19 -07:00
Tom Lane 77ee14ed96 Update time zone data files to tzdata release 2022a.
DST law changes in Palestine.  Historical corrections for
Chile and Ukraine.
2022-05-05 14:55:03 -04:00
Andres Freund 6e2924b577 Revert "Fix timing issue in deadlock recovery conflict test."
This reverts commit 5136967f1e.
2022-05-04 14:20:24 -07:00
Andres Freund 5136967f1e Fix timing issue in deadlock recovery conflict test.
Per buildfarm members longfin and skink.

Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de
Backpatch: 10-
2022-05-04 12:57:15 -07:00
Andres Freund f74496dd61 Backpatch 031_recovery_conflict.pl.
The prior commit showed that the introduction of recovery conflict tests was a
good idea. Without these tests it's hard to know that the fix didn't break
something...

031_recovery_conflict.pl was introduced in 9f8a050f68 and extended in
21e184403b.

Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de
Backpatch: 10-14
2022-05-02 18:26:09 -07:00
Andres Freund 9ab3b2bdbb Fix possibility of self-deadlock in ResolveRecoveryConflictWithBufferPin().
The tests added in 9f8a050f68 failed nearly reliably on FreeBSD in CI, and
occasionally on the buildfarm. That turns out to be caused not by a bug in the
test, but by a longstanding bug in recovery conflict handling.

The standby timeout handler, used by ResolveRecoveryConflictWithBufferPin(),
executed SendRecoveryConflictWithBufferPin() inside a signal handler. A bad
idea, because the deadlock timeout handler (or a spurious latch set) could
have interrupted ProcWaitForSignal(). If unlucky that could cause a
self-deadlock on ProcArrayLock, if the deadlock check is in
SendRecoveryConflictWithBufferPin()->CancelDBBackends().

To fix, set a flag in StandbyTimeoutHandler(), and check the flag in
ResolveRecoveryConflictWithBufferPin().

Subsequently the recovery conflict tests will be backpatched.

Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de
Backpatch: 10-
2022-05-02 18:25:59 -07:00
Andres Freund 5ab8e80148 Backpatch addition of wait_for_log(), pump_until().
These were originally introduced in a2ab9c06ea and a2ab9c06ea, as they are
needed by a about-to-be-backpatched test.

Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de
Backpatch: 10-14
2022-05-02 18:09:42 -07:00
Etsuro Fujita 24c58f7a2a Fix typo in comment. 2022-05-02 16:45:02 +09:00
Etsuro Fujita ebb7902415 Disable asynchronous execution if using gating Result nodes.
mark_async_capable_plan(), which is called from create_append_plan() to
determine whether subplans are async-capable, failed to take into
account that the given subplan created from a given subpath might
include a gating Result node if the subpath is a SubqueryScanPath or
ForeignPath, causing a segmentation fault there when the subplan created
from a SubqueryScanPath includes the Result node, or causing
ExecAsyncRequest() to throw an error about an unrecognized node type
when the subplan created from a ForeignPath includes the Result node,
because in the latter case the Result node was unintentionally
considered as async-capable, but we don't currently support executing
Result nodes asynchronously.  Fix by modifying mark_async_capable_plan()
to disable asynchronous execution in such cases.  Also, adjust code in
the ProjectionPath case in mark_async_capable_plan(), for consistency
with other cases, and adjust/improve comments there.

is_async_capable_path() added in commit 27e1f1456, which was rewritten
to mark_async_capable_plan() in a later commit, has the same issue,
causing the error at execution mentioned above, so back-patch to v14
where the aforesaid commit went in.

Per report from Justin Pryzby.

Etsuro Fujita, reviewed by Zhihong Yu and Justin Pryzby.

Discussion: https://postgr.es/m/20220408124338.GK24419%40telsasoft.com
2022-04-28 15:15:02 +09:00
Andrew Dunstan 71f394667c Inhibit mingw CRT's auto-globbing of command line arguments
For some reason by default the mingw C Runtime takes it upon itself to
expand program arguments that look like shell globbing characters. That
has caused much scratching of heads and mis-attribution of the causes of
some TAP test failures, so stop doing that.

This removes an inconsistency with Windows binaries built with MSVC,
which have no such behaviour.

Per suggestion from Noah Misch.

Backpatch to all live branches.

Discussion: https://postgr.es/m/20220423025927.GA1274057@rfd.leadboat.com
2022-04-25 15:49:35 -04:00
Robert Haas 75a006beef Remove some recently-added pg_dump test cases.
Commit d2d3547979 included a pretty
extensive set of test cases, and some of them don't work on all
of our Windows machines. This happens because IPC::Run expands
its arguments as shell globs on a few machines, but doesn't on most
of the buildfarm. It might be good to fix that problem systematically
somehow, but in the meantime, there are enough test cases for this
commit that it seems OK to just remove the ones that are failing.

Discussion: http://postgr.es/m/3a190754-b2b0-d02b-dcfd-4ec1610ffbcb@dunslane.net
Discussion: http://postgr.es/m/CA+TgmoYRGUcFBy6VgN0+Pn4f6Wv=2H0HZLuPHqSy6VC8Ba7vdg@mail.gmail.com
2022-04-25 09:14:19 -04:00
Tom Lane dff6c77faf Fix incautious CTE matching in rewriteSearchAndCycle().
This function looks for a reference to the recursive WITH CTE,
but it checked only the CTE name not ctelevelsup, so that it could
seize on a lower CTE that happened to have the same name.  This
would result in planner failures later, either weird errors such as
"could not find attribute 2 in subquery targetlist", or crashes
or assertion failures.  The code also merely Assert'ed that it found
a matching entry, which is not guaranteed at all by the parser.

Per bugs #17320 and #17318 from Zhiyong Wu.
Thanks to Kyotaro Horiguchi for investigation.

Discussion: https://postgr.es/m/17320-70e37868182512ab@postgresql.org
Discussion: https://postgr.es/m/17318-2eb65a3a611d2368@postgresql.org
2022-04-23 12:16:12 -04:00
Tom Lane da22ef388a Remove inadequate assertion check in CTE inlining.
inline_cte() expected to find exactly as many references to the
target CTE as its cterefcount indicates.  While that should be
accurate for the tree as emitted by the parser, there are some
optimizations that occur upstream of here that could falsify it,
notably removal of unused subquery output expressions.

Trying to make the accounting 100% accurate seems expensive and
doomed to future breakage.  It's not really worth it, because
all this code is protecting is downstream assumptions that every
referenced CTE has a plan.  Let's convert those assertions to
regular test-and-elog just in case there's some actual problem,
and then drop the failing assertion.

Per report from Tomas Vondra (thanks also to Richard Guo for
analysis).  Back-patch to v12 where the faulty code came in.

Discussion: https://postgr.es/m/29196a1e-ed47-c7ca-9be2-b1c636816183@enterprisedb.com
2022-04-21 17:58:52 -04:00
Andrew Dunstan b235d41d96 Support new perl module namespace in stable branches
Commit b3b4d8e68a moved our perl test modules to a better namespace
structure, but this has made life hard for people wishing to backpatch
improvements in the TAP tests. Here we alleviate much of that difficulty
by implementing the new module names on top of the old modules, mostly
by using a little perl typeglob aliasing magic, so that we don't have a
dual maintenance burden. This should work both for the case where a new
test is backpatched and the case where a fix to an existing test that
uses the new namespace is backpatched.

Reviewed by Michael Paquier

Per complaint from Andres Freund

Discussion: https://postgr.es/m/20220418141530.nfxtkohefvwnzncl@alap3.anarazel.de

Applied to branches 10 through 14
2022-04-21 09:01:26 -04:00
Peter Geoghegan e4521841a1 Fix CLUSTER tuplesorts on abbreviated expressions.
CLUSTER sort won't use the datum1 SortTuple field when clustering
against an index whose leading key is an expression.  This makes it
unsafe to use the abbreviated keys optimization, which was missed by the
logic that sets up SortSupport state.  Affected tuplesorts output tuples
in a completely bogus order as a result (the wrong SortSupport based
comparator was used for the leading attribute).

This issue is similar to the bug fixed on the master branch by recent
commit cc58eecc5d.  But it's a far older issue, that dates back to the
introduction of the abbreviated keys optimization by commit 4ea51cdfe8.

Backpatch to all supported versions.

Author: Peter Geoghegan <pg@bowt.ie>
Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKG+bA+bmwD36_oDxAoLrCwZjVtST2fqe=b4=qZcmU7u89A@mail.gmail.com
Backpatch: 10-
2022-04-20 17:17:41 -07:00
Tom Lane e346329470 Disallow infinite endpoints in generate_series() for timestamps.
Such cases will lead to infinite loops, so they're of no practical
value.  The numeric variant of generate_series() already threw error
for this, so borrow its message wording.

Per report from Richard Wesley.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/91B44E7B-68D5-448F-95C8-B4B3B0F5DEAF@duckdblabs.com
2022-04-20 18:08:24 -04:00
Robert Haas 4a66300acd Allow db.schema.table patterns, but complain about random garbage.
psql, pg_dump, and pg_amcheck share code to process object name
patterns like 'foo*.bar*' to match all tables with names starting in
'bar' that are in schemas starting with 'foo'. Before v14, any number
of extra name parts were silently ignored, so a command line '\d
foo.bar.baz.bletch.quux' was interpreted as '\d bletch.quux'.  In v14,
as a result of commit 2c8726c4b0, we
instead treated this as a request for table quux in a schema named
'foo.bar.baz.bletch'. That caused problems for people like Justin
Pryzby who were accustomed to copying strings of the form
db.schema.table from messages generated by PostgreSQL itself and using
them as arguments to \d.

Accordingly, revise things so that if an object name pattern contains
more parts than we're expecting, we throw an error, unless there's
exactly one extra part and it matches the current database name.
That way, thisdb.myschema.mytable is accepted as meaning just
myschema.mytable, but otherdb.myschema.mytable is an error, and so
is some.random.garbage.myschema.mytable.

Mark Dilger, per report from Justin Pryzby and discussion among
various people.

Discussion: https://www.postgresql.org/message-id/20211013165426.GD27491%40telsasoft.com
2022-04-20 11:39:44 -04:00
Tom Lane 08a9e7a8c7 Fix breakage in AlterFunction().
An ALTER FUNCTION command that tried to update both the function's
proparallel property and its proconfig list failed to do the former,
because it stored the new proparallel value into a tuple that was
no longer the interesting one.  Carelessness in 7aea8e4f2.

(I did not bother with a regression test, because the only likely
future breakage would be for someone to ignore the comment I added
and add some other field update after the heap_modify_tuple step.
A test using existing function properties could not catch that.)

Per report from Bryn Llewellyn.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/8AC9A37F-99BD-446F-A2F7-B89AD0022774@yugabyte.com
2022-04-19 23:03:59 -04:00
Peter Eisentraut 7a8d8219cc Fix extract epoch from interval calculation
The new numeric code for extract epoch from interval accidentally
truncated the DAYS_PER_YEAR value to an integer, leading to results
that mismatched the floating-point interval_part calculations.

The commit a2da77cdb4 that introduced
this actually contains the regression test change that this reverts.
I suppose this was missed at the time.

Reported-by: Joseph Koshakow <koshy44@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/CAAvxfHd5n%3D13NYA2q_tUq%3D3%3DSuWU-CufmTf-Ozj%3DfrEgt7pXwQ%40mail.gmail.com
2022-04-19 21:03:27 +02:00
Amit Kapila c9dea58e27 Fix the check to limit sync workers.
We don't allow to invoke more sync workers once we have reached the sync
worker limit per subscription. But the check to enforce this also doesn't
allow to launch an apply worker if it gets restarted.

This code was introduced by commit de43897122 but we caught the problem
only with the test added by recent commit c91f71b9dc which started failing
occasionally in the buildfarm.

As per buildfarm.
Diagnosed-by: Amit Kapila, Masahiko Sawada, Tomas Vondra
Author: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
	    https://postgr.es/m/f90d2b03-4462-ce95-a524-d91464e797c8@enterprisedb.com
2022-04-19 08:54:37 +05:30
Tom Lane e805735a83 Avoid invalid array reference in transformAlterTableStmt().
Don't try to look at the attidentity field of system attributes,
because they're not there in the TupleDescAttr array.  Sometimes
this is harmless because we accidentally pick up a zero, but
otherwise we'll report "no owned sequence found" from an attempt
to alter a system attribute.  (It seems possible that a SIGSEGV
could occur, too, though I've not seen it in testing.)

It's not in this function's charter to complain that you can't
alter a system column, so instead just hard-wire an assumption
that system attributes aren't identities.  I didn't bother with
a regression test because the appearance of the bug is very
erratic.

Per bug #17465 from Roman Zharkov.  Back-patch to all supported
branches.  (There's not actually a live bug before v12, because
before that get_attidentity() did the right thing anyway.
But for consistency I changed the test in the older branches too.)

Discussion: https://postgr.es/m/17465-f2a554a6cb5740d3@postgresql.org
2022-04-18 12:16:45 -04:00
Michael Paquier 8bcf90c7a6 Fix race in TAP test 002_archiving.pl when restoring history file
This test, introduced in df86e52, uses a second standby to check that
it is able to remove correctly RECOVERYHISTORY and RECOVERYXLOG at the
end of recovery.  This standby uses the archives of the primary to
restore its contents, with some of the archive's contents coming from
the first standby previously promoted.  In slow environments, it was
possible that the test did not check what it should, as the history file
generated by the promotion of the first standby may not be stored yet on
the archives the second standby feeds on.  So, it could be possible that
the second standby selects an incorrect timeline, without restoring a
history file at all.

This commits adds a wait phase to make sure that the history file
required by the second standby is archived before this cluster is
created.  This relies on poll_query_until() with pg_stat_file() and an
absolute path, something not supported in REL_10_STABLE.

While on it, this adds a new test to check that the history file has
been restored by looking at the logs of the second standby.  This
ensures that a RECOVERYHISTORY, whose removal needs to be checked,
is created in the first place.  This should make the test more robust.

This test has been introduced by df86e52, but it came in light as an
effect of the bug fixed by acf1dd42, where the extra restore_command
calls made the test much slower.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/YlT23IvsXkGuLzFi@paquier.xyz
Backpatch-through: 11
2022-04-18 11:40:14 +09:00
Noah Misch acd0eb635e Add a temp-install prerequisite to src/interfaces/ecpg "checktcp".
The target failed, tested $PATH binaries, or tested a stale temporary
installation.  Commit c66b438db6 missed
this.  Back-patch to v10 (all supported versions).
2022-04-16 17:43:57 -07:00
Robert Haas 10520f4346 Rethink the delay-checkpoint-end mechanism in the back-branches.
The back-patch of commit bbace5697d had
the unfortunate effect of changing the layout of PGPROC in the
back-branches, which could break extensions. This happened because it
changed the delayChkpt from type bool to type int. So, change it back,
and add a new bool delayChkptEnd field instead. The new field should
fall within what used to be padding space within the struct, and so
hopefully won't cause any extensions to break.

Per report from Markus Wanner and discussion with Tom Lane and others.

Patch originally by me, somewhat revised by Markus Wanner per a
suggestion from Michael Paquier. A very similar patch was developed
by Kyotaro Horiguchi, but I failed to see the email in which that was
posted before writing one of my own.

Discussion: http://postgr.es/m/CA+Tgmoao-kUD9c5nG5sub3F7tbo39+cdr8jKaOVEs_1aBWcJ3Q@mail.gmail.com
Discussion: http://postgr.es/m/20220406.164521.17171257901083417.horikyota.ntt@gmail.com
2022-04-14 11:10:07 -04:00
Tom Lane c590e514a9 Prevent access to no-longer-pinned buffer in heapam_tuple_lock().
heap_fetch() used to have a "keep_buf" parameter that told it to return
ownership of the buffer pin to the caller after finding that the
requested tuple TID exists but is invisible to the specified snapshot.
This was thoughtlessly removed in commit 5db6df0c0, which broke
heapam_tuple_lock() (formerly EvalPlanQualFetch) because that function
needs to do more accesses to the tuple even if it's invisible.  The net
effect is that we would continue to touch the page for a microsecond or
two after releasing pin on the buffer.  Usually no harm would result;
but if a different session decided to defragment the page concurrently,
we could see garbage data and mistakenly conclude that there's no newer
tuple version to chain up to.  (It's hard to say whether this has
happened in the field.  The bug was actually found thanks to a later
change that allowed valgrind to detect accesses to non-pinned buffers.)

The most reasonable way to fix this is to reintroduce keep_buf,
although I made it behave slightly differently: buffer ownership
is passed back only if there is a valid tuple at the requested TID.
In HEAD, we can just add the parameter back to heap_fetch().
To avoid an API break in the back branches, introduce an additional
function heap_fetch_extended() in those branches.

In HEAD there is an additional, less obvious API change: tuple->t_data
will be set to NULL in all cases where buffer ownership is not returned,
in particular when the tuple exists but fails the time qual (and
!keep_buf).  This is to defend against any other callers attempting to
access non-pinned buffers.  We concluded that making that change in back
branches would be more likely to introduce problems than cure any.

In passing, remove a comment about heap_fetch that was obsoleted by
9a8ee1dc6.

Per bug #17462 from Daniil Anisimov.  Back-patch to v12 where the bug
was introduced.

Discussion: https://postgr.es/m/17462-9c98a0f00df9bd36@postgresql.org
2022-04-13 13:35:02 -04:00
Tom Lane a65747b1c7 Suppress "variable 'pagesaving' set but not used" warning.
With asserts disabled, late-model clang notices that this variable
is incremented but never otherwise read.

Discussion: https://postgr.es/m/3171401.1649275153@sss.pgh.pa.us
2022-04-06 17:03:35 -04:00
Tom Lane 9a7229948c Remove race condition in 022_crash_temp_files.pl test.
It's possible for the query that "waits for restart" to complete a
successful iteration before the postmaster has noticed its SIGKILL'd
child and begun the restart cycle.  (This is a bit hard to believe
perhaps, but it's been seen at least twice in the buildfarm, mainly
on ancient platforms that likely have quirky schedulers.)

To provide a more secure interlock, wait for the other session
we're using to report that it's been forcibly shut down.

Patch by me, based on a suggestion from Andres Freund.
Back-patch to v14 where this test case came in.

Discussion: https://postgr.es/m/1801850.1649047827@sss.pgh.pa.us
2022-04-05 20:44:01 -04:00
Tom Lane 8803df4ea9 Update some tests in 013_crash_restart.pl.
The expected backend message after SIGQUIT changed in commit
7e784d1dc, but we missed updating this test case.  Also, experience
shows that we might sometimes get "could not send data to server"
instead of either of the libpq messages the test is looking for.

Per report from Mark Dilger.  Back-patch to v14 where the
backend message changed.

Discussion: https://postgr.es/m/17BD82D7-49AC-40C9-8204-E7ADD30321A0@enterprisedb.com
2022-04-04 22:10:07 -04:00
Peter Eisentraut d480ae069e Remove obsolete comment
accidentally left behind by 4cb658af70
2022-04-02 07:28:11 +02:00
Peter Eisentraut 7a27892750 libpq: Fix pkg-config without OpenSSL
Do not add OpenSSL dependencies to libpq pkg-config file if OpenSSL is
not enabled.  Oversight in beff361bc1.

Author: Fabrice Fontaine <fontaine.fabrice@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/20220331163759.32665-1-fontaine.fabrice%40gmail.com
2022-04-01 17:12:56 +02:00
Tom Lane 402279afe4 Add missing newline in one libpq error message.
Oversight in commit a59c79564.  Back-patch, as that was.
Noted by Peter Eisentraut.

Discussion: https://postgr.es/m/7f85ef6d-250b-f5ec-9867-89f0b16d019f@enterprisedb.com
2022-03-31 11:24:26 -04:00
Etsuro Fujita 637afee327 Fix typo in comment. 2022-03-30 19:00:02 +09:00
Alvaro Herrera adc943b4e1
Revert "Fix replay of create database records on standby"
This reverts commit 49d9cfc68b.  The approach taken by this patch has
problems, so we'll come up with a radically different fix.

Discussion: https://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=iCFM0SuYqBbqZEamEg@mail.gmail.com
2022-03-29 15:36:21 +02:00
Andres Freund c1a0d7d1c4 Fix NULL input behaviour of pg_stat_get_replication_slot().
pg_stat_get_replication_slot() accidentally was marked as non-strict, crashing
when called with NULL input. As it's already released, introduce an explicit
NULL check in 14, fix the catalog in HEAD.

Bumps catversion in HEAD.

Discussion: https://postgr.es/m/20220326212432.s5n2maw6kugnpyxw@alap3.anarazel.de
Backpatch: 14-, where replication slot stats were introduced
2022-03-27 21:44:39 -07:00
Andres Freund 6839aa7a69 waldump: fix use-after-free in search_directory().
After closedir() dirent->d_name is not valid anymore. As there alerady are a
few places relying on the limited lifetime of pg_waldump, do so here as well,
and just pg_strdup() the string.

The bug was introduced in fc49e24fa6.

Found by UBSan, run locally.

Backpatch: 11-, like fc49e24fa6 itself.
2022-03-27 18:15:10 -07:00
Tom Lane 3f7a59c59b Fix breakage of get_ps_display() in the PS_USE_NONE case.
Commit 8c6d30f21 caused this function to fail to set *displen
in the PS_USE_NONE code path.  If the variable's previous value
had been negative, that'd lead to a memory clobber at some call
sites.  We'd managed not to notice due to very thin test coverage
of such configurations, but this appears to explain buildfarm member
lorikeet's recent struggles.

Credit to Andrew Dunstan for spotting the problem.  Back-patch
to v13 where the bug was introduced.

Discussion: https://postgr.es/m/136102.1648320427@sss.pgh.pa.us
2022-03-27 12:57:52 -04:00
Tom Lane 0144c9c7e7 Suppress compiler warning in relptr_store().
clang 13 with -Wextra warns that "performing pointer subtraction with
a null pointer has undefined behavior" in the places where freepage.c
tries to set a relptr variable to constant NULL.  This appears to be
a compiler bug, but it's unlikely to get fixed instantly.  Fortunately,
we can work around it by introducing an inline support function, which
seems like a good change anyway because it removes the macro's existing
double-evaluation hazard.

Backpatch to v10 where this code was introduced.

Patch by me, based on an idea of Andres Freund's.

Discussion: https://postgr.es/m/48826.1648310694@sss.pgh.pa.us
2022-03-26 14:29:40 -04:00
Tom Lane 579cef5faf Harden TAP tests that intentionally corrupt page checksums.
The previous method for doing that was to write zeroes into a
predetermined set of page locations.  However, there's a roughly
1-in-64K chance that the existing checksum will match by chance,
and yesterday several buildfarm animals started to reproducibly
see that, resulting in test failures because no checksum mismatch
was reported.

Since the checksum includes the page LSN, test success depends on
the length of the installation's WAL history, which is affected by
(at least) the initial catalog contents, the set of locales installed
on the system, and the length of the pathname of the test directory.
Sooner or later we were going to hit a chance match, and today is
that day.

Harden these tests by specifically inverting the checksum field and
leaving all else alone, thereby guaranteeing that the checksum is
incorrect.

In passing, fix places that were using seek() to set up for syswrite(),
a combination that the Perl docs very explicitly warn against.  We've
probably escaped problems because no regular buffered I/O is done on
these filehandles; but if it ever breaks, we wouldn't deserve or get
much sympathy.

Although we've only seen problems in HEAD, now that we recognize the
environmental dependencies it seems like it might be just a matter
of time until someone manages to hit this in back-branch testing.
Hence, back-patch to v11 where we started doing this kind of test.

Discussion: https://postgr.es/m/3192026.1648185780@sss.pgh.pa.us
2022-03-25 14:23:26 -04:00
Alvaro Herrera ffd28516e6
Fix replay of create database records on standby
Crash recovery on standby may encounter missing directories when
replaying create database WAL records.  Prior to this patch, the standby
would fail to recover in such a case.  However, the directories could be
legitimately missing.  Consider a sequence of WAL records as follows:

    CREATE DATABASE
    DROP DATABASE
    DROP TABLESPACE

If, after replaying the last WAL record and removing the tablespace
directory, the standby crashes and has to replay the create database
record again, the crash recovery must be able to move on.

This patch adds a mechanism similar to invalid-page tracking, to keep a
tally of missing directories during crash recovery.  If all the missing
directory references are matched with corresponding drop records at the
end of crash recovery, the standby can safely continue following the
primary.

Backpatch to 13, at least for now.  The bug is older, but fixing it in
older branches requires more careful study of the interactions with
commit e6d8069522, which appeared in 13.

A new TAP test file is added to verify the condition.  However, because
it depends on commit d6d317dbf6, it can only be added to branch
master.  I (Álvaro) manually verified that the code behaves as expected
in branch 14.  It's a bit nervous-making to leave the code uncovered by
tests in older branches, but leaving the bug unfixed is even worse.
Also, the main reason this fix took so long is precisely that we
couldn't agree on a good strategy to approach testing for the bug, so
perhaps this is the best we can do.

Diagnosed-by: Paul Guo <paulguo@gmail.com>
Author: Paul Guo <paulguo@gmail.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
2022-03-25 13:16:21 +01:00
Robert Haas bbace5697d Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.
If TRUNCATE causes some buffers to be invalidated and thus the
checkpoint does not flush them, TRUNCATE must also ensure that the
corresponding files are truncated on disk. Otherwise, a replay
from the checkpoint might find that the buffers exist but have
the wrong contents, which may cause replay to fail.

Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design
suggestion from Heikki Linnakangas, with some changes to the
comments by me. Review of this and a prior patch that approached
the issue differently by Heikki Linnakangas, Andres Freund, Álvaro
Herrera, Masahiko Sawada, and Tom Lane.

Discussion: http://postgr.es/m/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com
2022-03-24 14:32:48 -04:00
Andres Freund 81045e1e1c Don't try to translate NULL in GetConfigOptionByNum().
Noticed via -fsanitize=undefined. Introduced when a few columns in
GetConfigOptionByNum() / pg_settings started to be translated in 72be8c29a /
PG 12.

Backpatch to all affected branches, for the same reasons as 46ab07ffda.

Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 12-
2022-03-23 13:18:02 -07:00
Andres Freund 89a94c24aa Don't call fwrite() with len == 0 when writing out relcache init file.
Noticed via -fsanitize=undefined.

Backpatch to all branches, for the same reasons as 46ab07ffda.

Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 10-
2022-03-23 13:13:18 -07:00
Alvaro Herrera 9814c708c6
pg_upgrade: Upgrade an Assert to a real 'if' test
It seems possible for the condition being tested to be true in
production, and nobody would never know (except when some data
eventually becomes corrupt?).

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m//202109040001.zky3wgv2qeqg@alvherre.pgsql
2022-03-23 19:23:51 +01:00
Alvaro Herrera caaeb88ff7
Fix "missing continuation record" after standby promotion
Invalidate abortedRecPtr and missingContrecPtr after a missing
continuation record is successfully skipped on a standby. This fixes a
PANIC caused when a recently promoted standby attempts to write an
OVERWRITE_RECORD with an LSN of the previously read aborted record.

Backpatch to 10 (all stable versions).

Author: Sami Imseih <simseih@amazon.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/44D259DE-7542-49C4-8A52-2AB01534DCA9@amazon.com
2022-03-23 18:22:10 +01:00
Thomas Munro cd3a5055f9 Try to stabilize vacuum test.
As commits b700f96c and 3414099c did for the reloptions test, make
sure VACUUM can always truncate the table as expected.

Back-patch to 12, where vacuum_truncate arrived.

Discussion: https://postgr.es/m/CAD21AoCNoWjYkdEtr%2BVDoF9v__V905AedKZ9iF%3DArgCtrbxZqw%40mail.gmail.com
2022-03-23 15:06:59 +13:00
Andres Freund 2d608c9607 Add missing dependency of pg_dumpall to WIN32RES.
When cross-building to windows, or building with mingw on windows, the build
could fail with
  x86_64-w64-mingw32-gcc: error: win32ver.o: No such file or director
because pg_dumpall didn't depend on WIN32RES, but it's recipe references
it. The build nevertheless succeeded most of the time, due to
pg_dump/pg_restore having the required dependency, causing win32ver.o to be
built.

Reported-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKGJeekpUPWW6yCVdf9=oBAcCp86RrBivo4Y4cwazAzGPng@mail.gmail.com
Backpatch: 10-, omission present on all live branches
2022-03-22 08:28:51 -07:00
Michael Paquier fdb1be4962 Fix failures in SSL tests caused by out-of-tree keys and certificates
This issue is environment-sensitive, where the SSL tests could fail in
various way by feeding on defaults provided by sslcert, sslkey,
sslrootkey, sslrootcert, sslcrl and sslcrldir coming from a local setup,
as of ~/.postgresql/ by default.  Horiguchi-san has reported two
failures, but more advanced testing from me (aka inclusion of garbage
SSL configuration in ~/.postgresql/ for all the configuration
parameters) has showed dozens of failures that can be triggered in the
whole test suite.

History has showed that we are not good when it comes to address such
issues, fixing them locally like in dd87799, and such problems keep
appearing.  This commit strengthens the entire test suite to put an end
to this set of problems by embedding invalid default values in all the
connection strings used in the tests.  The invalid values are prefixed
in each connection string, relying on the follow-up values passed in the
connection string to enforce any invalid value previously set.  Note
that two tests related to CRLs are required to fail with certain pre-set
configurations, but we can rely on enforcing an empty value instead
after the invalid set of values.

Reported-by: Kyotaro Horiguchi
Reviewed-by: Andrew Dunstan, Daniel Gustafsson, Kyotaro Horiguchi
Discussion: https://postgr.es/m/20220316.163658.1122740600489097632.horikyota.ntt@gmail.com
backpatch-through: 10
2022-03-22 13:21:33 +09:00
Tom Lane 48b6035f0f Fix assorted missing logic for GroupingFunc nodes.
The planner needs to treat GroupingFunc like Aggref for many purposes,
in particular with respect to processing of the argument expressions,
which are not to be evaluated at runtime.  A few places hadn't gotten
that memo, notably including subselect.c's processing of outer-level
aggregates.  This resulted in assertion failures or wrong plans for
cases in which a GROUPING() construct references an outer aggregation
level.

Also fix missing special cases for GroupingFunc in cost_qual_eval
(resulting in wrong cost estimates for GROUPING(), although it's
not clear that that would affect plan shapes in practice) and in
ruleutils.c (resulting in excess parentheses in pretty-print mode).

Per bug #17088 from Yaoguang Chen.  Back-patch to all supported
branches.

Richard Guo, Tom Lane

Discussion: https://postgr.es/m/17088-e33882b387de7f5c@postgresql.org
2022-03-21 17:44:29 -04:00
Tom Lane 05ccf974cd Fix risk of deadlock failure while dropping a partitioned index.
DROP INDEX needs to lock the index's table before the index itself,
else it will deadlock against ordinary queries that acquire the
relation locks in that order.  This is correctly mechanized for
plain indexes by RangeVarCallbackForDropRelation; but in the case of
a partitioned index, we neglected to lock the child tables in advance
of locking the child indexes.  We can fix that by traversing the
inheritance tree and acquiring the needed locks in RemoveRelations,
after we have acquired our locks on the parent partitioned table and
index.

While at it, do some refactoring to eliminate confusion between
the actual and expected relkind in RangeVarCallbackForDropRelation.
We can save a couple of syscache lookups too, by having that function
pass back info that RemoveRelations will need.

Back-patch to v11 where partitioned indexes were added.

Jimmy Yih, Gaurab Dey, Tom Lane

Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com
2022-03-21 12:22:13 -04:00
Tom Lane ae8ec7feba Fix incorrect xmlschema output for types timetz and timestamptz.
The output of table_to_xmlschema() and allied functions includes
a regex describing valid values for these types ... but the regex
was itself invalid, as it failed to escape a literal "+" sign.

Report and fix by Renan Soares Lopes.  Back-patch to all
supported branches.

Discussion: https://postgr.es/m/7f6fabaa-3f8f-49ab-89ca-59fbfe633105@me.com
2022-03-18 16:01:42 -04:00
Tom Lane 1d072bd203 Revert applying column aliases to the output of whole-row Vars.
In commit bf7ca1587, I had the bright idea that we could make the
result of a whole-row Var (that is, foo.*) track any column aliases
that had been applied to the FROM entry the Var refers to.  However,
that's not terribly logically consistent, because now the output of
the Var is no longer of the named composite type that the Var claims
to emit.  bf7ca1587 tried to handle that by changing the output
tuple values to be labeled with a blessed RECORD type, but that's
really pretty disastrous: we can wind up storing such tuples onto
disk, whereupon they're not readable by other sessions.

The only practical fix I can see is to give up on what bf7ca1587
tried to do, and say that the column names of tuples produced by
a whole-row Var are always those of the underlying named composite
type, query aliases or no.  While this introduces some inconsistencies,
it removes others, so it's not that awful in the abstract.  What *is*
kind of awful is to make such a behavioral change in a back-patched
bug fix.  But corrupt data is worse, so back-patched it will be.

(A workaround available to anyone who's unhappy about this is to
introduce an extra level of sub-SELECT, so that the whole-row Var is
referring to the sub-SELECT's output and not to a named table type.
Then the Var is of type RECORD to begin with and there's no issue.)

Per report from Miles Delahunty.  The faulty commit dates to 9.5,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/2950001.1638729947@sss.pgh.pa.us
2022-03-17 18:18:05 -04:00
Tomas Vondra 677a1dc0ca Fix publish_as_relid with multiple publications
Commit 83fd4532a7 allowed publishing of changes via ancestors, for
publications defined with publish_via_partition_root. But the way
the ancestor was determined in get_rel_sync_entry() was incorrect,
simply updating the same variable. So with multiple publications,
replicating different ancestors, the outcome depended on the order
of publications in the list - the value from the last loop was used,
even if it wasn't the top-most ancestor.

This is a probably rare situation, as in most cases publications do
not overlap, so each partition has exactly one candidate ancestor
to replicate as and there's no ambiguity.

Fixed by tracking the "ancestor level" for each publication, and
picking the top-most ancestor. Adds a test case, verifying the
correct ancestor is used for publishing the changes and that this
does not depend on order of publications in the list.

Older releases have another bug in this loop - once all actions are
replicated, the loop is terminated, on the assumption that inspecting
additional publications is unecessary. But that misses the fact that
those additional applications may replicate different ancestors.

Fixed by removal of this break condition. We might still terminate the
loop in some cases (e.g. when replicating all actions and the ancestor
is the partition root).

Backpatch to 13, where publish_via_partition_root was introduced.

Initial report and fix by me, test added by Hou zj. Reviews and
improvements by Amit Kapila.

Author: Tomas Vondra, Hou zj, Amit Kapila
Reviewed-by: Amit Kapila, Hou zj
Discussion: https://postgr.es/m/d26d24dd-2fab-3c48-0162-2b7f84a9c893%40enterprisedb.com
2022-03-16 18:06:27 +01:00
Thomas Munro 26e0079398 Fix race between DROP TABLESPACE and checkpointing.
Commands like ALTER TABLE SET TABLESPACE may leave files for the next
checkpoint to clean up.  If such files are not removed by the time DROP
TABLESPACE is called, we request a checkpoint so that they are deleted.
However, there is presently a window before checkpoint start where new
unlink requests won't be scheduled until the following checkpoint.  This
means that the checkpoint forced by DROP TABLESPACE might not remove the
files we expect it to remove, and the following ERROR will be emitted:

	ERROR:  tablespace "mytblspc" is not empty

To fix, add a call to AbsorbSyncRequests() just before advancing the
unlink cycle counter.  This ensures that any unlink requests forwarded
prior to checkpoint start (i.e., when ckpt_started is incremented) will
be processed by the current checkpoint.  Since AbsorbSyncRequests()
performs memory allocations, it cannot be called within a critical
section, so we also need to move SyncPreCheckpoint() to before
CreateCheckPoint()'s critical section.

This is an old bug, so back-patch to all supported versions.

Author: Nathan Bossart <nathandbossart@gmail.com>
Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220215235845.GA2665318%40nathanxps13
2022-03-16 17:20:50 +13:00
Thomas Munro 1396b5c6ed Fix waiting in RegisterSyncRequest().
If we run out of space in the checkpointer sync request queue (which is
hopefully rare on real systems, but common with very small buffer pool),
we wait for it to drain.  While waiting, we should report that as a wait
event so that users know what is going on, and also handle postmaster
death, since otherwise the loop might never terminate if the
checkpointer has exited.

Back-patch to 12.  Although the problem exists in earlier releases too,
the code is structured differently before 12 so I haven't gone any
further for now, in the absence of field complaints.

Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220226213942.nb7uvb2pamyu26dj%40alap3.anarazel.de
2022-03-16 15:35:42 +13:00
Thomas Munro 78c0f85e43 Wake up for latches in CheckpointWriteDelay().
The checkpointer shouldn't ignore its latch.  Other backends may be
waiting for it to drain the request queue.  Hopefully real systems don't
have a full queue often, but the condition is reached easily when
shared_buffers is small.

This involves defining a new wait event, which will appear in the
pg_stat_activity view often due to spread checkpoints.

Back-patch only to 14.  Even though the problem exists in earlier
branches too, it's hard to hit there.  In 14 we stopped using signal
handlers for latches on Linux, *BSD and macOS, which were previously
hiding this problem by interrupting the sleep (though not reliably, as
the signal could arrive before the sleep begins; precisely the problem
latches address).

Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220226213942.nb7uvb2pamyu26dj%40alap3.anarazel.de
2022-03-16 13:57:07 +13:00
Thomas Munro d9f7ad54e5 Back-patch LLVM 14 API changes.
Since LLVM 14 has stopped changing and is about to be released,
back-patch the following changes from the master branch:

  e6a7600202
  807fee1a39
  a56e7b6601

Back-patch to 11, where LLVM JIT support came in.
2022-03-16 11:42:00 +13:00
Tom Lane 8dcd1c3564 Restore the previous semantics of get_constraint_index().
Commit 8b069ef5d changed this function to look at pg_constraint.conindid
rather than searching pg_depend.  That was a good performance improvement,
but it failed to preserve the exact semantics.  The old code would only
return an index that was "owned by" (internally dependent on) the
specified constraint, whereas the new code will also return indexes that
are just referenced by foreign key constraints.  This confuses ALTER
TABLE, which was implicitly expecting the previous semantics, into
failing with errors like
    ERROR:  relation 146621 has multiple clustered indexes
or
    ERROR:  "pk_attbl" is not an index for table "atref"

We can fix this without reverting the performance improvement by adding
a contype check in get_constraint_index().  Another way could be to
make ALTER TABLE check it, but I'm worried that extension code could
also have subtle dependencies on the old semantics.

Tom Lane and Japin Li, per bug #17409 from Holly Roberts.
Back-patch to v14 where the error crept in.

Discussion: https://postgr.es/m/17409-52871dda8b5741cb@postgresql.org
2022-03-11 13:47:26 -05:00
Noah Misch f60bb3e0a9 Introduce PG_TEST_TIMEOUT_DEFAULT for TAP suite non-elapsing timeouts.
Slow hosts may avoid load-induced, spurious failures by setting
environment variable PG_TEST_TIMEOUT_DEFAULT to some number of seconds
greater than 180.  Developers may see faster failures by setting that
environment variable to some lesser number of seconds.  In tests, write
$PostgreSQL::Test::Utils::timeout_default wherever the convention has
been to write 180.  This change raises the default for some briefer
timeouts.  Back-patch to v10 (all supported versions).

Discussion: https://postgr.es/m/20220218052842.GA3627003@rfd.leadboat.com
2022-03-04 18:53:17 -08:00
Tom Lane a008c03dd7 Fix pg_regress to print the correct postmaster address on Windows.
pg_regress reported "Unix socket" as the default location whenever
HAVE_UNIX_SOCKETS is defined.  However, that's not been accurate
on Windows since 8f3ec75de.  Update this logic to match what libpq
actually does now.

This is just cosmetic, but still it's potentially misleading.
Back-patch to v13 where 8f3ec75de came in.

Discussion: https://postgr.es/m/3894060.1646415641@sss.pgh.pa.us
2022-03-04 13:23:58 -05:00
Tom Lane 5c9d17e94c Fix bogus casting in BlockIdGetBlockNumber().
This macro cast the result to BlockNumber after shifting, not before,
which is the wrong thing.  Per the C spec, the uint16 fields would
promote to int not unsigned int, so that (for 32-bit int) the shift
potentially shifts a nonzero bit into the sign position.  I doubt
there are any production systems where this would actually end with
the wrong answer, but it is undefined behavior per the C spec, and
clang's -fsanitize=undefined option reputedly warns about it on some
platforms.  (I can't reproduce that right now, but the code is
undeniably wrong per spec.)  It's easy to fix by casting to
BlockNumber (uint32) in the proper places.

It's been wrong for ages, so back-patch to all supported branches.

Report and patch by Zhihong Yu (cosmetic tweaking by me)

Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com
2022-03-03 19:03:35 -05:00
Tom Lane b0bc196e52 Clean up assorted failures under clang's -fsanitize=undefined checks.
Most of these are cases where we could call memcpy() or other libc
functions with a NULL pointer and a zero count, which is forbidden
by POSIX even though every production version of libc allows it.
We've fixed such things before in a piecemeal way, but apparently
never made an effort to try to get them all.  I don't claim that
this patch does so either, but it gets every failure I observe in
check-world, using clang 12.0.1 on current RHEL8.

numeric.c has a different issue that the sanitizer doesn't like:
"ln(-1.0)" will compute log10(0) and then try to assign the
resulting -Inf to an integer variable.  We don't actually use the
result in such a case, so there's no live bug.

Back-patch to all supported branches, with the idea that we might
start running a buildfarm member that tests this case.  This includes
back-patching c1132aae3 (Check the size in COPY_POINTER_FIELD),
which previously silenced some of these issues in copyfuncs.c.

Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com
2022-03-03 18:13:24 -05:00
Tom Lane 2a1f84636d Allow root-owned SSL private keys in libpq, not only the backend.
This change makes libpq apply the same private-key-file ownership
and permissions checks that we have used in the backend since commit
9a83564c5.  Namely, that the private key can be owned by either the
current user or root (with different file permissions allowed in the
two cases).  This allows system-wide management of key files, which
is just as sensible on the client side as the server, particularly
when the client is itself some application daemon.

Sync the comments about this between libpq and the backend, too.

Back-patch of a59c79564 and 50f03473e into all supported branches.

David Steele

Discussion: https://postgr.es/m/f4b7bc55-97ac-9e69-7398-335e212f7743@pgmasters.net
2022-03-02 11:57:02 -05:00
Tom Lane ac910bb232 Disallow execution of SPI functions during plperl function compilation.
Perl can be convinced to execute user-defined code during compilation
of a plperl function (or at least a plperlu function).  That's not
such a big problem as long as the activity is confined within the
Perl interpreter, and it's not clear we could do anything about that
anyway.  However, if such code tries to use plperl's SPI functions,
we have a bigger problem.  In the first place, those functions are
likely to crash because current_call_data->prodesc isn't set up yet.
In the second place, because it isn't set up, we lack critical info
such as whether the function is supposed to be read-only.  And in
the third place, this path allows code execution during function
validation, which is strongly discouraged because of the potential
for security exploits.  Hence, reject execution of the SPI functions
until compilation is finished.

While here, add check_spi_usage_allowed() calls to various functions
that hadn't gotten the memo about checking that.  I think that perhaps
plperl_sv_to_literal may have been intentionally omitted on the grounds
that it was safe at the time; but if so, the addition of transforms
functionality changed that.  The others are more recently added and
seem to be flat-out oversights.

Per report from Mark Murawski.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/9acdf918-7fff-4f40-f750-2ffa84f083d2@intellasoft.net
2022-02-25 17:40:43 -05:00
Andres Freund 9ff7fd9063 pg_waldump: Fix error message for WAL files smaller than XLOG_BLCKSZ.
When opening a WAL file smaller than XLOG_BLCKSZ (e.g. 0 bytes long) while
determining the wal_segment_size, pg_waldump checked errno, despite errno not
being set by the short read. Resulting in a bogus error message.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220214.181847.775024684568733277.horikyota.ntt@gmail.com
Backpatch: 11-, the bug was introducedin fc49e24fa
2022-02-25 10:31:16 -08:00
Heikki Linnakangas 7d80e93fb1 Fix data loss on crash after sorted GiST index build.
If a checkpoint happens during the index build, and the system crashes
after the checkpoint and the index build have finished, the data written
to the index before the checkpoint started could be lost. The checkpoint
won't have fsync'd it, and it won't be replayed at crash recovery either.
Fix by calling smgrimmedsync() after the index build, just like in B-tree
index build.

Backpatch to v14 where the sorted GiST index build was introduced.

Reported-by: Melanie Plageman
Discussion: https://www.postgresql.org/message-id/CAAKRu_ZJJynimxKj5xYBSziL62-iEtPE+fx-B=JzR=jUtP92mw@mail.gmail.com
2022-02-24 14:34:09 +02:00
Tom Lane dd7c059791 Re-allow underscore as first character of custom GUC names.
Commit 3db826bd5 intended that valid_custom_variable_name's
rules for valid identifiers match those of scan.l.  However,
I (tgl) had some kind of brain fade and put "_" in the wrong
list.

Fix by Japin Li, per bug #17415 from Daniel Polski.

Discussion: https://postgr.es/m/17415-ebdb683d7e09a51c@postgresql.org
2022-02-23 11:10:46 -05:00
Michael Paquier 627c79a1e8 Add compute_query_id = regress
"regress" is a new mode added to compute_query_id aimed at facilitating
regression testing when a module computing query IDs is loaded into the
backend, like pg_stat_statements.  It works the same way as "auto",
meaning that query IDs are computed if a module enables it, except that
query IDs are hidden in EXPLAIN outputs to ensure regression output
stability.

Like any GUCs of the kind (force_parallel_mode, etc.), this new
configuration can be added to an instance's postgresql.conf, or just
passed down with PGOPTIONS at command level.  compute_query_id uses an
enum for its set of option values, meaning that this addition ensures
ABI compatibility.

Using this new configuration mode allows installcheck-world to pass when
running the tests on an instance with pg_stat_statements enabled,
stabilizing the test output while checking the paths doing query ID
computations.

Reported-by: Anton Melnikov
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/1634283396.372373993@f75.i.mail.ru
Discussion: https://postgr.es/m/YgHlxgc/OimuPYhH@paquier.xyz
Backpatch-through: 14
2022-02-22 10:23:49 +09:00
Andres Freund 7bbfe59941 Fix temporary object cleanup failing due to toast access without snapshot.
When cleaning up temporary objects during process exit the cleanup could fail
with:
  FATAL: cannot fetch toast data without an active snapshot

The bug is caused by RemoveTempRelationsCallback() not setting up a
snapshot. If an object with toasted catalog data needs to be cleaned up,
init_toast_snapshot() could fail with the above error.

Most of the time however the the problem is masked due to cached catalog
snapshots being returned by GetOldestSnapshot(). But dropping an object can
cause catalog invalidations to be emitted. If no further catalog accesses are
necessary between the invalidation processing and the next toast datum
deletion, the bug becomes visible.

It's easy to miss this bug because it typically happens after clients
disconnect and the FATAL error just ends up in the log.

Luckily temporary table cleanup at the next use of the same temporary schema
or during DISCARD ALL does not have the same problem.

Fix the bug by pushing a snapshot in RemoveTempRelationsCallback(). Also add
isolation tests for temporary object cleanup, including objects with toasted
catalog data.

A future HEAD only commit will add more assertions.

Reported-By: Miles Delahunty
Author: Andres Freund
Discussion: https://postgr.es/m/CAOFAq3BU5Mf2TTvu8D9n_ZOoFAeQswuzk7yziAb7xuw_qyw5gw@mail.gmail.com
Backpatch: 10-
2022-02-21 09:57:05 -08:00
Andrew Dunstan 8b5cd373ba
Remove most msys special processing in TAP tests
Following migration of Windows buildfarm members running TAP tests to
use of ucrt64 perl for those tests, special processing for msys perl is
no longer necessary and so is removed.

Backpatch to release 10

Discussion: https://postgr.es/m/c65a8781-77ac-ea95-d185-6db291e1baeb@dunslane.net
2022-02-20 11:48:45 -05:00
Andrew Dunstan 652ff988fb
Remove PostgreSQL::Test::Utils::perl2host completely
Commit f1ac4a74de disabled this processing, and as nothing has broken (as
expected) here we proceed to remove the routine and adjust all the call
sites.

Backpatch to release 10

Discussion: https://postgr.es/m/0ba775a2-8aa0-0d56-d780-69427cf6f33d@dunslane.net
Discussion: https://postgr.es/m/20220125023609.5ohu3nslxgoygihl@alap3.anarazel.de
2022-02-20 08:55:06 -05:00
Tom Lane 2e30d77a19 Suppress warning about stack_base_ptr with late-model GCC.
GCC 12 complains that set_stack_base is storing the address of
a local variable in a long-lived pointer.  This is an entirely
reasonable warning (indeed, it just helped us find a bug);
but that behavior is intentional here.  We can work around it
by using __builtin_frame_address(0) instead of a specific local
variable; that produces an address a dozen or so bytes different,
in my testing, but we don't care about such a small difference.
Maybe someday a compiler lacking that function will start to issue
a similar warning, but we'll worry about that when it happens.

Patch by me, per a suggestion from Andres Freund.  Back-patch to
v12, which is as far back as the patch will go without some pain.
(Recently-established project policy would permit a back-patch as
far as 9.2, but I'm disinclined to expend the work until GCC 12
is much more widespread.)

Discussion: https://postgr.es/m/3773792.1645141467@sss.pgh.pa.us
2022-02-17 22:45:34 -05:00
Amit Kapila 04645bbcae WAL log unchanged toasted replica identity key attributes.
Currently, during UPDATE, the unchanged replica identity key attributes
are not logged separately because they are getting logged as part of the
new tuple. But if they are stored externally then the untoasted values are
not getting logged as part of the new tuple and logical replication won't
be able to replicate such UPDATEs. So we need to log such attributes as
part of the old_key_tuple during UPDATE.

Reported-by: Haiying Tang
Author: Dilip Kumar and Amit Kapila
Reviewed-by: Alvaro Herrera, Haiying Tang, Andres Freund
Backpatch-through: 10
Discussion: https://postgr.es/m/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
2022-02-14 08:07:46 +05:30
Alexander Korotkov c76665edce Fix memory leak in IndexScan node with reordering
Fix ExecReScanIndexScan() to free the referenced tuples while emptying the
priority queue.  Backpatch to all supported versions.

Discussion: https://postgr.es/m/CAHqSB9gECMENBQmpbv5rvmT3HTaORmMK3Ukg73DsX5H7EJV7jw%40mail.gmail.com
Author: Aliaksandr Kalenik
Reviewed-by: Tom Lane, Alexander Korotkov
Backpatch-through: 10
2022-02-14 03:32:31 +03:00
Tom Lane ae27b1acc4 Fix thinko in PQisBusy().
In commit 1f39a1c06 I made PQisBusy consider conn->write_failed, but
that is now looking like complete brain fade.  In the first place, the
logic is quite wrong: it ought to be like "and not" rather than "or".
This meant that once we'd gotten into a write_failed state, PQisBusy
would always return true, probably causing the calling application to
iterate its loop until PQconsumeInput returns a hard failure thanks
to connection loss.  That's not what we want: the intended behavior
is to return an error PGresult, which the application probably has
much cleaner support for.

But in the second place, checking write_failed here seems like the
wrong thing anyway.  The idea of the write_failed mechanism is to
postpone handling of a write failure until we've read all we can from
the server; so that flag should not interfere with input-processing
behavior.  (Compare 7247e243a.)  What we *should* check for is
status = CONNECTION_BAD, ie, socket already closed.  (Most places that
close the socket don't touch asyncStatus, but they do reset status.)
This primarily ensures that if PQisBusy() returns true then there is
an open socket, which is assumed by several call sites in our own
code, and probably other applications too.

While at it, fix a nearby thinko in libpq's my_sock_write: we should
only consult errno for res < 0, not res == 0.  This is harmless since
pqsecure_raw_write would force errno to zero in such a case, but it
still could confuse readers.

Noted by Andres Freund.  Backpatch to v12 where 1f39a1c06 came in.

Discussion: https://postgr.es/m/20220211011025.ek7exh6owpzjyudn@alap3.anarazel.de
2022-02-12 13:23:20 -05:00
Tom Lane 277e744ae1 Don't use_physical_tlist for an IOS with non-returnable columns.
createplan.c tries to save a runtime projection step by specifying
a scan plan node's output as being exactly the table's columns, or
index's columns in the case of an index-only scan, if there is not a
reason to do otherwise.  This logic did not previously pay attention
to whether an index's columns are returnable.  That worked, sort of
accidentally, until commit 9a3ddeb51 taught setrefs.c to reject plans
that try to read a non-returnable column.  I have no desire to loosen
setrefs.c's new check, so instead adjust use_physical_tlist() to not
try to optimize this way when there are non-returnable column(s).

Per report from Ryan Kelly.  Like the previous patch, back-patch
to all supported branches.

Discussion: https://postgr.es/m/CAHUie24ddN+pDNw7fkhNrjrwAX=fXXfGZZEHhRuofV_N_ftaSg@mail.gmail.com
2022-02-11 15:23:52 -05:00
Tom Lane 1e8c5cf7c6 Make pg_ctl stop/restart/promote recheck postmaster aliveness.
"pg_ctl stop/restart" checked that the postmaster PID is valid just
once, as a side-effect of sending the stop signal, and then would
wait-till-timeout for the postmaster.pid file to go away.  This
neglects the case wherein the postmaster dies uncleanly after we
signal it.  Similarly, once "pg_ctl promote" has sent the signal,
it'd wait for the corresponding on-disk state change to occur
even if the postmaster dies.

I'm not sure how we've managed not to notice this problem, but it
seems to explain slow execution of the 017_shm.pl test script on AIX
since commit 4fdbf9af5, which added a speculative "pg_ctl stop" with
the idea of making real sure that the postmaster isn't there.  In the
test steps that kill-9 and then restart the postmaster, it's possible
to get past the initial signal attempt before kill() stops working
for the doomed postmaster.  If that happens, pg_ctl waited till
PGCTLTIMEOUT before giving up ... and the buildfarm's AIX members
have that set very high.

To fix, include a "kill(pid, 0)" test (similar to what
postmaster_is_alive uses) in these wait loops, so that we'll
give up immediately if the postmaster PID disappears.

While here, I chose to refactor those loops out of where they were.
do_stop() and do_restart() can perfectly well share one copy of the
wait-for-stop loop, and it seems desirable to put a similar function
beside that for wait-for-promote.

Back-patch to all supported versions, since pg_ctl's wait logic
is substantially identical in all, and we're seeing the slow test
behavior in all branches.

Discussion: https://postgr.es/m/20220210023537.GA3222837@rfd.leadboat.com
2022-02-10 16:49:39 -05:00
Andrew Dunstan 92f60f536e
Use gendef instead of pexports for building windows .def files
Modern msys systems lack pexports but have gendef instead, so use that.

Discussion: https://postgr.es/m/3ccde7a9-e4f9-e194-30e0-0936e6ad68ba@dunslane.net

Backpatch to release 9.4 to enable building with perl on older branches.
Before that pexports is not used for plperl.
2022-02-10 13:51:19 -05:00
Tom Lane 2e211c1661 Make timeout.c more robust against missed timer interrupts.
Commit 09cf1d522 taught schedule_alarm() to not do anything if
the next requested event is after when we expect the next interrupt
to fire.  However, if somehow an interrupt gets lost, we'll continue
to not do anything indefinitely, even after the "next interrupt" time
is obviously in the past.  Thus, one missed interrupt can break
timeout scheduling for the life of the session.  Michael Harris
reported a scenario where a bug in a user-defined function caused this
to happen, so you don't even need to assume kernel bugs exist to think
this is worth fixing.  We can make things more robust at little cost
by detecting the case where signal_due_at is before "now" and forcing
a new setitimer call to occur.  This isn't a completely bulletproof
fix of course; but in our typical usage pattern where we frequently set
timeouts and clear them before they are reached, the interrupt will
get re-enabled after at most one timeout interval, which with a little
luck will be before we really need it.

While here, let's mark signal_due_at as volatile, since the signal
handler can both examine and set it.  I'm not sure there's any
actual risk given that signal_pending is already volatile, but
it's surely questionable.

Backpatch to v14 where this logic came in.

Michael Harris and Tom Lane

Discussion: https://postgr.es/m/CADofcAWbMrvgwSMqO4iG_iD3E2v8ZUrC-_crB41my=VMM02-CA@mail.gmail.com
2022-02-10 11:52:20 -05:00
Daniel Gustafsson 5f00ef065e Set SNI ClientHello extension to localhost in tests
The connection strings in the SSL client tests were using the host
set up from Cluster.pm which is a temporary pathname. When SNI is
enabled we pass the host to OpenSSL in order to set the server name
indication ClientHello extension via SSL_set_tlsext_host_name.

OpenSSL doesn't validate the hostname apart from checking the max
length, but LibreSSL checks for RFC 5890 conformance which results
in errors during testing as the pathname from Cluster.pm is not a
valid hostname.

Fix by setting the host explicitly to localhost, as that's closer
to the intent of the test.

Backpatch through 14 where SNI support came in.

Reported-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/17391-304f81bcf724b58b@postgresql.org
Backpatch-through: 14
2022-02-10 14:23:36 +01:00
Tom Lane c23461a22a Test honestly for <sys/signalfd.h>.
Commit 6a2a70a02 supposed that any platform having <sys/epoll.h>
would also have <sys/signalfd.h>.  It turns out there are still a
few people using platforms where that's not so, so we'd better make
a separate configure probe for it.  But since it took this long to
notice, I'm content with the decision to not have a separate code
path for epoll-only machines; we'll just fall back to using poll()
for these stragglers.

Per gripe from Gabriela Serventi.  Back-patch to v14 where this
code came in.

Discussion: https://postgr.es/m/CAHOHWE-JjJDfcYuLAAEO7Jk07atFAU47z8TzHzg71gbC0aMy=g@mail.gmail.com
2022-02-09 14:24:55 -05:00
Tom Lane e327291e4a Remove ppport.h's broken re-implementation of eval_pv().
Recent versions of Devel::PPPort try to redefine eval_pv() to
dodge a bug in pre-5.31 Perl versions.  Unfortunately the redefinition
fails on compilers that don't support statements nested within
expressions.  However, we aren't actually interested in this bug fix,
since we always call eval_pv() with croak_on_error = FALSE.
So, until there's an upstream fix for this breakage, just comment
out the macro to revert to the older behavior.

Per report from Wei Sun, as well as previous buildfarm failure
on pademelon (which I'd unfortunately not looked at carefully
enough to understand the cause).  Back-patch to all supported
versions, since we're using the same ppport.h in all.

Discussion: https://postgr.es/m/tencent_2EFCC8BA0107B6EC0F97179E019A8A43C806@qq.com
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pademelon&dt=2022-02-02%2001%3A22%3A58
2022-02-08 19:26:09 -05:00
Peter Eisentraut 3c879e61d9 Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 063c497a909612d444c7c7188482db9aef86200f
2022-02-07 13:31:57 +01:00
Tom Lane d13a838e1c Test, don't just Assert, that mergejoin's inputs are in order.
There are two Asserts in nodeMergejoin.c that are reachable if
the input data is not in the expected order.  This seems way too
fragile.  Alexander Lakhin reported a case where the assertions
could be triggered with misconfigured foreign-table partitions,
and bitter experience with unstable operating system collation
definitions suggests another easy route to hitting them.  Neither
Assert is in a place where we can't afford one more test-and-branch,
so replace 'em with plain test-and-elog logic.

Per bug #17395.  While the reported symptom is relatively recent,
collation changes could happen anytime, so back-patch to all
supported branches.

Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org
2022-02-05 11:59:30 -05:00
Andres Freund 2a3958e4d9 Fix compiler warning in non-assert builds, introduced in f862d57057.
Discussion: https://postgr.es/m/20220203183655.ralgkh54sdcgysmn@alap3.anarazel.de
Backpatch: 14-, like f862d57057
2022-02-03 10:44:38 -08:00
Etsuro Fujita 7b0cec2fa0 Further fix for EvalPlanQual with mix of local and foreign partitions.
We assume that direct-modify ForeignScan nodes cannot be re-evaluated
during EvalPlanQual processing, but the rework for inherited
UPDATE/DELETE in commit 86dc90056 changed things, without considering
that, so that such ForeignScan nodes get called as part of the
EvalPlanQual subtree during EvalPlanQual processing in the case of an
inherited UPDATE/DELETE where the inheritance set contains foreign
target relations.  To avoid re-evaluating such ForeignScan nodes during
EvalPlanQual processing, commit c3928b467 modified nodeForeignscan.c,
but the assumption made there that ExecForeignScan() should never be
called for such ForeignScan nodes during EvalPlanQual processing turned
out to be wrong in some cases, leading to a segmentation fault or a
"cannot re-evaluate a Foreign Update or Delete during EvalPlanQual"
error.

Fix by modifying nodeForeignscan.c further to avoid re-evaluating such
ForeignScan nodes even in ExecForeignScan()/ExecReScanForeignScan()
during EvalPlanQual processing.  Since this makes non-reachable the
test-and-elog added to ForeignNext() by commit c3928b467 that produced
the aforesaid error, convert the test-and-elog to an Assert.

Per bug #17355 from Alexander Lakhin.  Back-patch to v14 where both
commits came in.

Patch by me, reviewed and tested by Alexander Lakhin and Amit Langote.

Discussion: https://postgr.es/m/17355-de8e362eb7001a96@postgresql.org
2022-02-03 15:15:01 +09:00
Tom Lane 00fdfde6dc Revert "plperl: Fix breakage of c89f409749 in back branches."
This reverts commits d81cac47a et al.  We shouldn't need that
hack after the preceding commits.

Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de
2022-01-31 15:07:39 -05:00
Tom Lane 2d44912cf7 plperl: update ppport.h to Perl 5.34.0.
Also apply the changes suggested by running
perl ppport.h --compat-version=5.8.0

And remove some no-longer-required NEED_foo declarations.

Dagfinn Ilmari Mannsåker

Back-patch of commit 05798c9f7 into all supported branches.
At the time we thought this update was mostly cosmetic, but the
lack of it has caused trouble, while the patch itself hasn't.

Discussion: https://postgr.es/m/87y278s6iq.fsf@wibble.ilmari.org
Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de
2022-01-31 15:01:05 -05:00
Andres Freund d81cac47a8 plperl: Fix breakage of c89f409749 in back branches.
ppport.h was only updated in 05798c9f7f (master). Unfortunately my commit
c89f409749 uses PERL_VERSION_LT which came in with that update. Breaking most
buildfarm animals.

I should have noticed that...

We might want to backpatch the ppport update instead, but for now lets get the
buildfarm green again.

Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de
Backpatch: 10-14, master doesn't need it
2022-01-30 18:00:27 -08:00
Andres Freund 8484e38126 plperl: windows: Use Perl_setlocale on 5.28+, fixing compile failure.
For older versions we need our own copy of perl's setlocale(), because it was
not exposed (why we need the setlocale in the first place is explained in
plperl_init_interp) . The copy stopped working in 5.28, as some of the used
macros are not public anymore.  But Perl_setlocale is available in 5.28, so
use that.

Author: Victor Wagner <vitus@wagner.pp.ru>
Reviewed-By: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/20200501134711.08750c5f@antares.wagner.home
Backpatch: all versions
2022-01-30 16:42:42 -08:00
Tom Lane c025067f6d Fix failure to validate the result of select_common_type().
Although select_common_type() has a failure-return convention, an
apparent successful return just provides a type OID that *might* work
as a common supertype; we've not validated that the required casts
actually exist.  In the mainstream use-cases that doesn't matter,
because we'll proceed to invoke coerce_to_common_type() on each input,
which will fail appropriately if the proposed common type doesn't
actually work.  However, a few callers didn't read the (nonexistent)
fine print, and thought that if they got back a nonzero OID then the
coercions were sure to work.

This affects in particular the recently-added "anycompatible"
polymorphic types; we might think that a function/operator using
such types matches cases it really doesn't.  A likely end result
of that is unexpected "ambiguous operator" errors, as for example
in bug #17387 from James Inform.  Another, much older, case is that
the parser might try to transform an "x IN (list)" construct to
a ScalarArrayOpExpr even when the list elements don't actually have
a common supertype.

It doesn't seem desirable to add more checking to select_common_type
itself, as that'd just slow down the mainstream use-cases.  Instead,
write a separate function verify_common_type that performs the
missing checks, and add a call to that where necessary.  Likewise add
verify_common_type_from_oids to go with select_common_type_from_oids.

Back-patch to v13 where the "anycompatible" types came in.  (The
symptom complained of in bug #17387 doesn't appear till v14, but
that's just because we didn't get around to converting || to use
anycompatible till then.)  In principle the "x IN (list)" fix could
go back all the way, but I'm not currently convinced that it makes
much difference in real-world cases, so I won't bother for now.

Discussion: https://postgr.es/m/17387-5dfe54b988444963@postgresql.org
2022-01-29 11:41:18 -05:00
Michael Paquier b30282fccf Fix incorrect memory context switch in COPY TO execution
c532d15 has split the logic of COPY commands into multiple files, one
change being to move the internals of BeginCopy() to BeginCopyTo().
Originally the code was written so as we'd switch back-and-forth between
the current execution memory context and the dedicated memory context
for the COPY command, and this refactoring has introduced an extra
switch to the current memory context from the COPY context once
BeginCopyTo() is done with the past logic coming from BeginCopy().

The code was correctly doing the analyze, rewrite and planning phases in
the COPY context, but it was not assigning "copy_file" (FILE* used when
copying to a source file) and "filename" in the COPY context, making the
COPY status data inconsistent.

Author: Bharath Rupireddy
Reviewed-by: Japin Li
Discussion: https://postgr.es/m/CALj2ACWvVa69foi9jhHFY=2BuHxAoYboyE+vXQTARwxZcJnVrQ@mail.gmail.com
Backpatch-through: 14
2022-01-29 10:23:17 +09:00
Etsuro Fujita d99166ed4f Fix typo in comment. 2022-01-28 15:45:01 +09:00
Fujii Masao 6e7ee55e72 Prevent memory context logging from sending log message to connected client.
When pg_log_backend_memory_contexts() is executed, the target backend
should use LOG_SERVER_ONLY to log its memory contexts, to prevent them
from being sent to its connected client regardless of client_min_messages.
But previously the backend unexpectedly used LOG to log the message
"logging memory contexts of PID %d" and it could be sent to the client.
This is a bug in memory context logging.

To fix the bug, this commit changes that message so that it's logged with
LOG_SERVER_ONLY.

Back-patch to v14 where pg_log_backend_memory_contexts() was added.

Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Atsushi Torikoshi
Discussion: https://postgr.es/m/82c12f36-86f7-5e72-79af-7f5c37f6cad7@oss.nttdata.com
2022-01-28 11:25:45 +09:00
Tomas Vondra fb2f8e534a Fix ordering of XIDs in ProcArrayApplyRecoveryInfo
Commit 8431e296ea reworked ProcArrayApplyRecoveryInfo to sort XIDs
before adding them to KnownAssignedXids. But the XIDs are sorted using
xidComparator, which compares the XIDs simply as uint32 values, not
logically. KnownAssignedXidsAdd() however expects XIDs in logical order,
and calls TransactionIdFollowsOrEquals() to enforce that. If there are
XIDs for which the two orderings disagree, an error is raised and the
recovery fails/restarts.

Hitting this issue is fairly easy - you just need two transactions, one
started before the 4B limit (e.g. XID 4294967290), the other sometime
after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when
compared using xidComparator we try to add them in the opposite order.
Which makes KnownAssignedXidsAdd() fail with an error like this:

  ERROR: out-of-order XID insertion in KnownAssignedXids

This only happens during replica startup, while processing RUNNING_XACTS
records to build the snapshot. Once we reach STANDBY_SNAPSHOT_READY, we
skip these records. So this does not affect already running replicas,
but if you restart (or create) a replica while there are transactions
with XIDs for which the two orderings disagree, you may hit this.

Long-running transactions and frequent replica restarts increase the
likelihood of hitting this issue. Once the replica gets into this state,
it can't be started (even if the old transactions are terminated).

Fixed by sorting the XIDs logically - this is fine because we're dealing
with normal XIDs (because it's XIDs assigned to backends) and from the
same wraparound epoch (otherwise the backends could not be running at
the same time on the primary node). So there are no problems with the
triangle inequality, which is why xidComparator compares raw values.

Investigation and root cause analysis by Abhijit Menon-Sen. Patch by me.

This issue is present in all releases since 9.4, however releases up to
9.6 are EOL already so backpatch to 10 only.

Reviewed-by: Abhijit Menon-Sen
Reviewed-by: Alvaro Herrera
Backpatch-through: 10
Discussion: https://postgr.es/m/36b8a501-5d73-277c-4972-f58a4dce088a%40enterprisedb.com
2022-01-27 20:15:37 +01:00
Andrew Dunstan 999dc1d265
Improve msys2 detection for TAP tests
Perl instances on some msys toolchains (e.g. UCRT64) have their
configured osname set to 'MSWin32' rather than 'msys'.  The test for
the msys2 platform is adjusted accordingly.

Backpatch to release 14.
2022-01-27 08:26:28 -05:00
Noah Misch d94a95cce9 On sparc64+ext4, suppress test failures from known WAL read failure.
Buildfarm members kittiwake, tadarida and snapper began to fail
frequently when commits 3cd9c3b921 and
f47ed79cc8 added tests of concurrency, but
the problem was reachable before those commits.  Back-patch to v10 (all
supported versions).

Discussion: https://postgr.es/m/20220116210241.GC756210@rfd.leadboat.com
2022-01-26 18:06:23 -08:00
Magnus Hagander 4afae689ea Fix pg_hba_file_rules for authentication method cert
For authentication method cert, clientcert=verify-full is implied. But
the pg_hba_file_rules entry would incorrectly show clientcert=verify-ca.

Per bug #17354

Reported-By: Feike Steenbergen
Reviewed-By: Jonathan Katz
Backpatch-through: 12
2022-01-26 09:59:14 +01:00
Tom Lane 75674c7ec1 Revert "graceful shutdown" changes for Windows, in back branches only.
This reverts commits 6051857fc and ed52c3707, but only in the back
branches.  Further testing has shown that while those changes do fix
some things, they also break others; in particular, it looks like
walreceivers fail to detect walsender-initiated connection close
reliably if the walsender shuts down this way.  We'll keep trying to
improve matters in HEAD, but it now seems unwise to push these changes
into stable releases.

Discussion: https://postgr.es/m/CA+hUKG+OeoETZQ=Qw5Ub5h3tmwQhBmDA=nuNO3KG=zWfUypFAw@mail.gmail.com
2022-01-25 12:17:40 -05:00
David Rowley 357ff66153 Consider parallel awareness when removing single-child Appends
8edd0e794 added some code to remove Append and MergeAppend nodes when they
contained a single child node.  As it turned out, this was unsafe to do
when the Append/MergeAppend was parallel_aware and the child node was not.
Removing the Append/MergeAppend, in this case, could lead to the child plan
being called multiple times by parallel workers when it was unsafe to do
so.

Here we fix this by just not removing the Append/MergeAppend when the
parallel_aware flag of the parent and child node don't match.

Reported-by: Yura Sokolov
Bug: #17335
Discussion: https://postgr.es/m/b59605fecb20ba9ea94e70ab60098c237c870628.camel%40postgrespro.ru
Backpatch-through: 12, where 8edd0e794 was first introduced
2022-01-25 21:14:27 +13:00
Tom Lane 1efcc5946d Fix limitations on what SQL commands can be issued to a walsender.
In logical replication mode, a WalSender is supposed to be able
to execute any regular SQL command, as well as the special
replication commands.  Poor design of the replication-command
parser caused it to fail in various cases, notably:

* semicolons embedded in a command, or multiple SQL commands
sent in a single message;

* dollar-quoted literals containing odd numbers of single
or double quote marks;

* commands starting with a comment.

The basic problem here is that we're trying to run repl_scanner.l
across the entire input string even when it's not a replication
command.  Since repl_scanner.l does not understand all of the
token types known to the core lexer, this is doomed to have
failure modes.

We certainly don't want to make repl_scanner.l as big as scan.l,
so instead rejigger stuff so that we only lex the first token of
a non-replication command.  That will usually look like an IDENT
to repl_scanner.l, though a comment would end up getting reported
as a '-' or '/' single-character token.  If the token is a replication
command keyword, we push it back and proceed normally with repl_gram.y
parsing.  Otherwise, we can drop out of exec_replication_command()
without examining the rest of the string.

(It's still theoretically possible for repl_scanner.l to fail on
the first token; but that could only happen if it's an unterminated
single- or double-quoted string, in which case you'd have gotten
largely the same error from the core lexer too.)

In this way, repl_gram.y isn't involved at all in handling general
SQL commands, so we can get rid of the SQLCmd node type.  (In
the back branches, we can't remove it because renumbering enum
NodeTag would be an ABI break; so just leave it sit there unused.)

I failed to resist the temptation to clean up some other sloppy
coding in repl_scanner.l while at it.  The only externally-visible
behavior change from that is it now accepts \r and \f as whitespace,
same as the core lexer.

Per bug #17379 from Greg Rychlewski.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17379-6a5c6cfb3f1f5e77@postgresql.org
2022-01-24 15:33:34 -05:00
Tom Lane ef9706bbc8 Remember to reset yy_start state when firing up repl_scanner.l.
Without this, we get odd behavior when the previous cycle of
lexing exited in a non-default exclusive state.  Every other
copy of this code is aware that it has to do BEGIN(INITIAL),
but repl_scanner.l did not get that memo.

The real-world impact of this is probably limited, since most
replication clients would abandon their connection after getting
a syntax error.  Still, it's a bug.

This mistake is old, so back-patch to all supported branches.

Discussion: https://postgr.es/m/1874781.1643035952@sss.pgh.pa.us
2022-01-24 12:09:46 -05:00
Tom Lane 1042de69db pg_dump: avoid useless query in binary_upgrade_set_type_oids_by_type_oid
Commit 6df7a9698 wrote appendPQExpBuffer where it should have
written printfPQExpBuffer.  This resulted in re-issuing the
previous query along with the desired one, which very accidentally
had no negative consequences except for some wasted cycles.

Back-patch to v14 where that came in.

Discussion: https://postgr.es/m/1714711.1642962663@sss.pgh.pa.us
2022-01-23 13:54:24 -05:00
Tom Lane ef4edf88df Suppress variable-set-but-not-used warning from clang 13.
In the normal configuration where GEQO_DEBUG isn't defined,
recent clang versions have started to complain that geqo_main.c
accumulates the edge_failures count but never does anything
with it.  As a minimal back-patchable fix, insert a void cast
to silence this warning.  (I'd speculated about ripping out the
GEQO_DEBUG logic altogether, but I don't think we'd wish to
back-patch that.)

Per recently-established project policy, this is a candidate
for back-patching into out-of-support branches: it suppresses
an annoying compiler warning but changes no behavior.  Hence,
back-patch all the way to 9.2.

Discussion: https://postgr.es/m/CA+hUKGLTSZQwES8VNPmWO9AO0wSeLt36OCPDAZTccT1h7Q7kTQ@mail.gmail.com
2022-01-23 11:09:19 -05:00
Tomas Vondra 72ac4d71b5 Correct type of front_pathkey to PathKey
In sort_inner_and_outer we iterate a list of PathKey elements, but the
variable is declared as (List *). This mistake is benign, because we
only pass the pointer to lcons() and never dereference it.

This exists since ~2004, but it's confusing. So fix and backpatch to all
supported branches.

Backpatch-through: 10
Discussion: https://postgr.es/m/bf3a6ea1-a7d8-7211-0669-189d5c169374%40enterprisedb.com
2022-01-23 04:05:08 +01:00
Tomas Vondra a192243c75 Check syscache result in AlterStatistics
The syscache lookup may return NULL even for valid OID, for example due
to a concurrent DROP STATISTICS, so a HeapTupleIsValid is necessary.
Without it, it may fail with a segfault.

Reported by Alexander Lakhin, patch by me. Backpatch to 13, where ALTER
STATISTICS ... SET STATISTICS was introduced.

Backpatch-through: 13
Discussion: https://postgr.es/m/17372-bf3b6e947e35ae77%40postgresql.org
2022-01-23 03:18:02 +01:00
Tom Lane 3839e29c58 Flush table's relcache during ALTER TABLE ADD PRIMARY KEY USING INDEX.
Previously, unless we had to add a NOT NULL constraint to the column,
this command resulted in updating only the index's relcache entry.
That's problematic when replication behavior is being driven off the
existence of a primary key: other sessions (and ours too for that
matter) failed to recalculate their opinion of whether the table can
be replicated.  Add a relcache invalidation to fix it.

This has been broken since pg_class.relhaspkey was removed in v11.
Before that, updating the table's relhaspkey value sufficed to cause
a cache flush.  Hence, backpatch to v11.

Report and patch by Hou Zhijie

Discussion: https://postgr.es/m/OS0PR01MB5716EBE01F112C62F8F9B786947B9@OS0PR01MB5716.jpnprd01.prod.outlook.com
2022-01-22 13:32:40 -05:00
Tom Lane f4ebf0dbea Fix race condition in gettext() initialization in libpq and ecpglib.
In libpq and ecpglib, multiple threads can concurrently enter the
initialization logic for message localization.  Since we set the
its-done flag before actually doing the work, it'd be possible
for some threads to reach gettext() before anyone has called
bindtextdomain().  Barring bugs in libintl itself, this would not
result in anything worse than failure to localize some early
messages.  Nonetheless, it's a bug, and an easy one to fix.

Noted while investigating bug #17299 from Clemens Zeidler
(much thanks to Liam Bowen for followup investigation on that).
It currently appears that that actually *is* a bug in libintl itself,
but that doesn't let us off the hook for this bit.

Back-patch to all supported versions.

Discussion: https://postgr.es/m/17299-7270741958c0b1ab@postgresql.org
Discussion: https://postgr.es/m/CAE7q7Eit4Eq2=bxce=Fm8HAStECjaXUE=WBQc-sDDcgJQ7s7eg@mail.gmail.com
2022-01-21 15:36:28 -05:00
Andres Freund 2b7dbe4bd5 fsync pg_logical/mappings in CheckPointLogicalRewriteHeap().
While individual logical rewrite files were synced to disk, the directory was
not. On some filesystems that could lead to loosing directory entries after a
crash.

Reported-By: Tom Lane <tgl@sss.pgh.pa.us>
Author: Nathan Bossart <bossartn@amazon.com>
Discussion: https://postgr.es/m/867F2E29-2782-4869-970E-B984C6D35A8F@amazon.com
Backpatch: 10-
2022-01-21 11:24:12 -08:00
Michael Paquier 84db5169d4 Fix one-off bug causing missing commit timestamps for subtransactions
The logic in charge of writing commit timestamps (enabled with
track_commit_timestamp) for subtransactions had a one-bug bug,
where it would be possible that commit timestamps go missing for the
last subtransaction committed.

While on it, simplify a bit the iteration logic in the loop writing the
commit timestamps, as per suggestions from Kyotaro Horiguchi and Tom
Lane, so as some variable initializations are not part of the loop
itself.

Issue introduced in 73c986a.

Analyzed-by: Alex Kingsborough
Author: Alex Kingsborough, Kyotaro Horiguchi
Discussion: https://postgr.es/m/73A66172-4050-4F2A-B7F1-13508EDA2144@amazon.com
Backpatch-through: 10
2022-01-21 14:54:47 +09:00
Tom Lane cf680bd653 Tighten TAP tests' tracking of postmaster state some more.
Commits 6c4a8903b et al. had a couple of deficiencies:

* The logic I added to Cluster::start to see if a PID file is present
could be fooled by a stale PID file left over from a previous
postmaster.  To fix, if we're not sure whether we expect to find a
running postmaster or not, validate the PID using "kill 0".

* 017_shm.pl has a loop in which it just issues repeated Cluster::start
calls; this will fail if some invocation fails but leaves self->_pid
set.  Per buildfarm results, the above fix is not enough to make this
safe: we might have "validated" a PID for a postmaster that exits
immediately after we look.  Hence, match each failed start call with
a stop call that will get us back to the self->_pid == undef state.
Add a fail_ok option to Cluster::stop to make this work.

Discussion: https://postgr.es/m/CA+hUKGKV6fOHvfiPt8=dOKzvswjAyLoFoJF1iQXMNpi7+hD1JQ@mail.gmail.com
2022-01-20 17:28:07 -05:00
Andrew Dunstan 156a846d93
Allow clean.bat to be run from anywhere
This was omitted from c3879a7b4c which modified the other msvc .bat
files.

Per request from Juan José Santamaría Flecha

Discussion: https://postgr.es/m/CAC+AXB0_fxYGbQoaYjCA8um7TTbOVP4L9aXnVmHwK8WzaT4gdA@mail.gmail.com

Backpatch to all live branches.
2022-01-20 10:20:40 -05:00
Thomas Munro b9dd162205 Try to stabilize reloptions test, again.
Since the test requires reproducible behavior from VACUUM, and since
DISABLE_PAGE_SKIPPING doesn't actually disable all forms of page
skipping, let's use a temporary table to avoid contention.

Back-patch to 12, like commit 3414099c.

Discussion: https://postgr.es/m/20220120052404.sonrhq3f3qgplpzj%40alap3.anarazel.de
2022-01-20 23:27:24 +13:00
Tom Lane b0623625a0 TAP tests: check for postmaster.pid anyway when "pg_ctl start" fails.
"pg_ctl start" might start a new postmaster and then return failure
anyway, for example if PGCTLTIMEOUT is exceeded.  If there is a
postmaster there, it's still incumbent on us to shut it down at
script end, so check for the PID file even though we are about
to fail.

This has been broken all along, so back-patch to all supported branches.

Discussion: https://postgr.es/m/647439.1642622744@sss.pgh.pa.us
2022-01-19 16:29:09 -05:00
Thomas Munro ff4d0d8cc5 Try to stabilize the reloptions test.
Where we test vacuum_truncate's effects, sometimes this is failing to
truncate as expected on the build farm.  That could be explained by page
skipping, so disable it explicitly, with the theory that commit fe246d1c
didn't go far enough.

Back-patch to 12, where the vacuum_truncate tests were added.

Discussion: https://postgr.es/m/CA%2BhUKGLT2UL5_JhmBzUgkdyKfc%3D5J-gJSQJLysMs4rqLUKLAzw%40mail.gmail.com
2022-01-19 07:29:38 +13:00
Tom Lane 3886785b4c Fix psql \d's query for identifying parent triggers.
The original coding (from c33869cc3) failed with "more than one row
returned by a subquery used as an expression" if there were unrelated
triggers of the same tgname on parent partitioned tables.  (That's
possible because statement-level triggers don't get inherited.)  Fix
by applying LIMIT 1 after sorting the candidates by inheritance level.

Also, wrap the subquery in a CASE so that we don't have to execute it at
all when the trigger is visibly non-inherited.  Aside from saving some
cycles, this avoids the need for a confusing and undocumented NULLIF().

While here, tweak the format of the emitted query to look a bit
nicer for "psql -E", and add some explanation of this subquery,
because it badly needs it.

Report and patch by Justin Pryzby (with some editing by me).
Back-patch to v13 where the faulty code came in.

Discussion: https://postgr.es/m/20211217154356.GJ17618@telsasoft.com
2022-01-17 21:18:49 -05:00
Tom Lane 4e8726566e Avoid calling gettext() in signal handlers.
It seems highly unlikely that gettext() can be relied on to be
async-signal-safe.  psql used to understand that, but someone got
it wrong long ago in the src/bin/scripts/ version of handle_sigint,
and then the bad idea was perpetuated when those two versions were
unified into src/fe_utils/cancel.c.

I'm unsure why there have not been field complaints about this
... maybe gettext() is signal-safe once it's translated at least
one message?  But we have no business assuming any such thing.

In cancel.c (v13 and up), I preserved our ability to localize
"Cancel request sent" messages by invoking gettext() before
the signal handler is set up.  In earlier branches I just made
src/bin/scripts/ not localize those messages, as psql did then.

(Just for extra unsafety, the src/bin/scripts/ version was
invoking fprintf() from a signal handler.  Sigh.)

Noted while fixing signal-safety issues in PQcancel() itself.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us
2022-01-17 13:30:04 -05:00
Tom Lane 0509498770 Avoid calling strerror[_r] in PQcancel().
PQcancel() is supposed to be safe to call from a signal handler,
and indeed psql uses it that way.  All of the library functions
it uses are specified to be async-signal-safe by POSIX ...
except for strerror.  Neither plain strerror nor strerror_r
are considered safe.  When this code was written, back in the
dark ages, we probably figured "oh, strerror will just index
into a constant array of strings" ... but in any locale except C,
that's unlikely to be true.  Probably the reason we've not heard
complaints is that (a) this error-handling code is unlikely to be
reached in normal use, and (b) in many scenarios, localized error
strings would already have been loaded, after which maybe it's
safe to call strerror here.  Still, this is clearly unacceptable.

The best we can do without relying on strerror is to print the
decimal value of errno, so make it do that instead.  (This is
probably not much loss of user-friendliness, given that it is
hard to get a failure here.)

Back-patch to all supported branches.

Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us
2022-01-17 12:52:44 -05:00
Tom Lane 17da9d4c28 Teach hash_ok_operator() that record_eq is only sometimes hashable.
The need for this was foreseen long ago, but when record_eq
actually became hashable (in commit 01e658fa7), we missed updating
this spot.

Per bug #17363 from Elvis Pranskevichus.  Back-patch to v14 where
the faulty commit came in.

Discussion: https://postgr.es/m/17363-f6d42fd0d726be02@postgresql.org
2022-01-16 16:39:26 -05:00
Tom Lane d91d4338e0 Fix psql's tab-completion of enum label values.
Since enum labels have to be single-quoted, this part of the
tab completion machinery got side-swiped by commit cd69ec66c.
A side-effect of that commit is that (at least with some versions
of Readline) the text string passed for completion will omit the
leading quote mark of the enum label literal.  Libedit still acts
the same as before, though, so adapt COMPLETE_WITH_ENUM_VALUE so
that it can cope with either convention.

Also, when we fail to find any valid completion, set
rl_completion_suppress_quote = 1.  Otherwise readline will
go ahead and append a closing quote, which is unwanted.

Per report from Peter Eisentraut.  Back-patch to v13 where
cd69ec66c came in.

Discussion: https://postgr.es/m/8ca82d89-ec3d-8b28-8291-500efaf23b25@enterprisedb.com
2022-01-16 14:59:20 -05:00
Tomas Vondra ea212bd95f Build inherited extended stats on partitioned tables
Commit 859b3003de disabled building of extended stats for inheritance
trees, to prevent updating the same catalog row twice. While that
resolved the issue, it also means there are no extended stats for
declaratively partitioned tables, because there are no data in the
non-leaf relations.

That also means declaratively partitioned tables were not affected by
the issue 859b3003de addressed, which means this is a regression
affecting queries that calculate estimates for the whole inheritance
tree as a whole (which includes e.g. GROUP BY queries).

But because partitioned tables are empty, we can invert the condition
and build statistics only for the case with inheritance, without losing
anything. And we can consider them when calculating estimates.

It may be necessary to run ANALYZE on partitioned tables, to collect
proper statistics. For declarative partitioning there should no prior
statistics, and it might take time before autoanalyze is triggered. For
tables partitioned by inheritance the statistics may include data from
child relations (if built 859b3003de), contradicting the current code.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
2022-01-15 19:05:22 +01:00
Tomas Vondra 2cc007fd03 Ignore extended statistics for inheritance trees
Since commit 859b3003de we only build extended statistics for individual
relations, ignoring the child relations. This resolved the issue with
updating catalog tuple twice, but we still tried to use the statistics
when calculating estimates for the whole inheritance tree. When the
relations contain very distinct data, it may produce bogus estimates.

This is roughly the same issue 427c6b5b9 addressed ~15 years ago, and we
fix it the same way - by ignoring extended statistics when calculating
estimates for the inheritance tree as a whole. We still consider
extended statistics when calculating estimates for individual child
relations, of course.

This may result in plan changes due to different estimates, but if the
old statistics were not describing the inheritance tree particularly
well it's quite likely the new plans is actually better.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
2022-01-15 02:26:26 +01:00
Andres Freund dad1539aec Fix possible HOT corruption when RECENTLY_DEAD changes to DEAD while pruning.
Since dc7420c2c9 the horizon used for pruning is determined "lazily". A more
accurate horizon is built on-demand, rather than in GetSnapshotData(). If a
horizon computation is triggered between two HeapTupleSatisfiesVacuum() calls
for the same tuple, the result can change from RECENTLY_DEAD to DEAD.

heap_page_prune() can process the same tid multiple times (once following an
update chain, once "directly"). When the result of HeapTupleSatisfiesVacuum()
of a tuple changes from RECENTLY_DEAD during the first access, to DEAD in the
second, the "tuple is DEAD and doesn't chain to anything else" path in
heap_prune_chain() can end up marking the target of a LP_REDIRECT ItemId
unused.

Initially not easily visible,
Once the target of a LP_REDIRECT ItemId is marked unused, a new tuple version
can reuse it. At that point the corruption may become visible, as index
entries pointing to the "original" redirect item, now point to a unrelated
tuple.

To fix, compute HTSV for all tuples on a page only once. This fixes the entire
class of problems of HTSV changing inside heap_page_prune(). However,
visibility changes can obviously still occur between HTSV checks inside
heap_page_prune() and outside (e.g. in lazy_scan_prune()).

The computation of HTSV is now done in bulk, in heap_page_prune(), rather than
on-demand in heap_prune_chain(). Besides being a bit simpler, it also is
faster: Memory accesses can happen sequentially, rather than in the order of
HOT chains.

There are other causes of HeapTupleSatisfiesVacuum() results changing between
two visibility checks for the same tuple, even before dc7420c2c9. E.g.
HEAPTUPLE_INSERT_IN_PROGRESS can change to HEAPTUPLE_DEAD when a transaction
aborts between the two checks. None of the these other visibility status
changes are known to cause corruption, but heap_page_prune()'s approach makes
it hard to be confident.

A patch implementing a more fundamental redesign of heap_page_prune(), which
fixes this bug and simplifies pruning substantially, has been proposed by
Peter Geoghegan in
https://postgr.es/m/CAH2-WzmNk6V6tqzuuabxoxM8HJRaWU6h12toaS-bqYcLiht16A@mail.gmail.com

However, that redesign is larger change than desirable for backpatching. As
the new design still benefits from the batched visibility determination
introduced in this commit, it makes sense to commit this narrower fix to 14
and master, and then commit Peter's improvement in master.

The precise sequence required to trigger the bug is complicated and hard to do
exercise in an isolation test (until we have wait points). Due to that the
isolation test initially posted at
https://postgr.es/m/20211119003623.d3jusiytzjqwb62p%40alap3.anarazel.de
and updated in
https://postgr.es/m/20211122175914.ayk6gg6nvdwuhrzb%40alap3.anarazel.de
isn't committable.

A followup commit will introduce additional assertions, to detect problems
like this more easily.

Bug: #17255
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Debugged-By: Andres Freund <andres@anarazel.de>
Debugged-By: Peter Geoghegan <pg@bowt.ie>
Author: Andres Freund <andres@andres@anarazel.de>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/20211122175914.ayk6gg6nvdwuhrzb@alap3.anarazel.de
Backpatch: 14-, the oldest branch containing dc7420c2c9
2022-01-14 10:56:12 -08:00
Michael Paquier ad5b6f248a Revert error handling improvements for cryptohashes
This reverts commits ab27df2, af8d530 and 3a0cced, that introduced
pg_cryptohash_error().  In order to make the core code able to pass down
the new error types that this introduced, some of the MD5-related
routines had to be reworked, causing an ABI breakage, but we found that
some external extensions rely on them.  Maintaining compatibility
outweights the error report benefits, so just revert the change in v14.

Reported-by: Laurenz Albe
Discussion: https://postgr.es/m/9f0c0a96d28cf14fc87296bbe67061c14eb53ae8.camel@cybertec.at
2022-01-14 11:25:39 +09:00
Tom Lane 4aee39ddb8 Fix ruleutils.c's dumping of whole-row Vars in more contexts.
Commit 7745bc352 intended to ensure that whole-row Vars would be
printed with "::type" decoration in all contexts where plain
"var.*" notation would result in star-expansion, notably in
ROW() and VALUES() constructs.  However, it missed the case of
INSERT with a single-row VALUES, as reported by Timur Khanjanov.

Nosing around ruleutils.c, I found a second oversight: the
code for RowCompareExpr generates ROW() notation without benefit
of an actual RowExpr, and naturally it wasn't in sync :-(.
(The code for FieldStore also does this, but we don't expect that
to generate strictly parsable SQL anyway, so I left it alone.)

Back-patch to all supported branches.

Discussion: https://postgr.es/m/efaba6f9-4190-56be-8ff2-7a1674f9194f@intrans.baku.az
2022-01-13 17:49:26 -05:00
Michael Paquier 3c1ffd02dd Fix incorrect comments in hmac.c and hmac_openssl.c
Both files referred to pg_hmac_ctx->data, which, I guess, comes from the
early versions of the patch that has resulted in commit e6bdfd9.

Author: Sergey Shinderuk
Discussion: https://postgr.es/m/8cbb56dd-63d6-a581-7a65-25a97ac4be03@postgrespro.ru
Backpatch-through: 14
2022-01-13 09:43:44 +09:00
Peter Geoghegan 41ee68a91f Fix memory leak in indexUnchanged hint mechanism.
Commit 9dc718bd added a "logically unchanged by UPDATE" hinting
mechanism, which is currently used within nbtree indexes only (see
commit d168b666).  This mechanism determined whether or not the incoming
item is a logically unchanged duplicate (a duplicate needed only for
MVCC versioning purposes) once per row updated per non-HOT update.  This
approach led to memory leaks which were noticeable with an UPDATE
statement that updated sufficiently many rows, at least on tables that
happen to have an expression index.

On HEAD, fix the issue by adding a cache to the executor's per-index
IndexInfo struct.

Take a different approach on Postgres 14 to avoid an ABI break: simply
pass down the hint to all indexes unconditionally with non-HOT UPDATEs.
This is deemed acceptable because the hint is currently interpreted
within btinsert() as "perform a bottom-up index deletion pass if and
when the only alternative is splitting the leaf page -- prefer to delete
any LP_DEAD-set items first".  nbtree must always treat the hint as a
noisy signal about what might work, as a strategy of last resort, with
costs imposed on non-HOT updaters.  (The same thing might not be true
within another index AM that applies the hint, which is why the original
behavior is preserved on HEAD.)

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Klaudie Willis <Klaudie.Willis@protonmail.com>
Diagnosed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/261065.1639497535@sss.pgh.pa.us
Backpatch: 14-, where the hinting mechanism was added.
2022-01-12 15:41:02 -08:00
Michael Paquier af8d530e47 Fix comment related to pg_cryptohash_error()
One of the comments introduced in b69aba7 was worded a bit weirdly, so
improve it.

Reported-by: Sergey Shinderuk
Discussion: https://postgr.es/m/71b9a5d2-a3bf-83bc-a243-93dcf0bcfb3b@postgrespro.ru
Backpatch-through: 14
2022-01-12 12:40:04 +09:00
Tom Lane ab27df2490 Clean up error message reported after \password encryption failure.
Experimenting with FIPS mode enabled, I saw

regression=# \password joe
Enter new password for user "joe":
Enter it again:
could not encrypt password: disabled for FIPS
out of memory

because PQencryptPasswordConn was still of the opinion that "out of
memory" is always appropriate to print.

Minor oversight in b69aba745.  Like that one, back-patch to v14.
2022-01-11 12:03:06 -05:00
Michael Paquier 3a0cced86d Improve error handling of cryptohash computations
The existing cryptohash facility was causing problems in some code paths
related to MD5 (frontend and backend) that relied on the fact that the
only type of error that could happen would be an OOM, as the MD5
implementation used in PostgreSQL ~13 (the in-core implementation is
used when compiling with or without OpenSSL in those older versions),
could fail only under this circumstance.

The new cryptohash facilities can fail for reasons other than OOMs, like
attempting MD5 when FIPS is enabled (upstream OpenSSL allows that up to
1.0.2, Fedora and Photon patch OpenSSL 1.1.1 to allow that), so this
would cause incorrect reports to show up.

This commit extends the cryptohash APIs so as callers of those routines
can fetch more context when an error happens, by using a new routine
called pg_cryptohash_error().  The error states are stored within each
implementation's internal context data, so as it is possible to extend
the logic depending on what's suited for an implementation.  The default
implementation requires few error states, but OpenSSL could report
various issues depending on its internal state so more is needed in
cryptohash_openssl.c, and the code is shaped so as we are always able to
grab the necessary information.

The core code is changed to adapt to the new error routine, painting
more "const" across the call stack where the static errors are stored,
particularly in authentication code paths on variables that provide
log details.  This way, any future changes would warn if attempting to
free these strings.  The MD5 authentication code was also a bit blurry
about the handling of "logdetail" (LOG sent to the postmaster), so
improve the comments related that, while on it.

The origin of the problem is 87ae969, that introduced the centralized
cryptohash facility.  Extra changes are done for pgcrypto in v14 for the
non-OpenSSL code path to cope with the improvements done by this
commit.

Reported-by: Michael Mühlbeyer
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/89B7F072-5BBE-4C92-903E-D83E865D9367@trivadis.com
Backpatch-through: 14
2022-01-11 09:55:24 +09:00
Andrew Dunstan 1cd46f168f
Avoid warning about uninitialized value in MSVC python3 tests
Juan José Santamaría Flecha

Backpatch to all live branches
2022-01-10 10:12:23 -05:00
Michael Paquier f5bea83606 Fix issues with describe queries of extended statistics in psql
This addresses some problems in the describe queries used for extended
statistics:
- Two schema qualifications for the text type were missing for \dX.
- The list of extended statistics listed for a table through \d was
ordered based on the object OIDs, but it is more consistent with the
other commands to order by namespace and then by object name.
- A couple of aliases were not used in \d.  These are removed.

This is similar to commits 1f092a3 and 07f8a9e.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220107022235.GA14051@telsasoft.com
Backpatch-through: 14
2022-01-08 16:45:14 +09:00
Andrew Dunstan a7772e8748
Allow MSVC .bat wrappers to be called from anywhere
Instead of using a hardcoded or default path to the perl file the .bat
file is a wrapper for, we use a path that means the file is found in
the same directory as the .bat file.

Patch by Anton Voloshin, slightly tweaked by me.

Backpatch to all live branches

Discussion: https://postgr.es/m/2b7a674b-5fb0-d264-75ef-ecc7a31e54f8@postgrespro.ru
2022-01-07 16:14:04 -05:00
Tom Lane f285d95839 Prevent altering partitioned table's rowtype, if it's used elsewhere.
We disallow altering a column datatype within a regular table,
if the table's rowtype is used as a column type elsewhere,
because we lack code to go around and rewrite the other tables.
This restriction should apply to partitioned tables as well, but it
was not checked because ATRewriteTables and ATPrepAlterColumnType
were not on the same page about who should do it for which relkinds.

Per bug #17351 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17351-6db1870f3f4f612a@postgresql.org
2022-01-06 16:46:46 -05:00
Michael Paquier 5ddfebded4 Reduce relcache access in WAL sender streaming logical changes
get_rel_sync_entry(), which is called each time a change needs to be
logically replicated, is a rather hot code path in the WAL sender
sending logical changes.  This code path was doing a relcache access on
relkind and relpartition for each logical change, but we only need to
know this information when building or re-building the cached
information for a relation.

Some measurements prove that this is noticeable in perf profiles,
particularly when attempting to replicate changes from relations that
are not published as these cause less overhead in the WAL sender,
delaying further the replication of changes for relations that are
published.

Issue introduced in 83fd453.

Author: Hou Zhijie
Reviewed-by: Kyotaro Horiguchi, Euler Taveira
Discussion: https://postgr.es/m/OS0PR01MB5716E863AA9E591C1F010F7A947D9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Backpatch-through: 13
2022-01-05 10:27:47 +09:00
Alvaro Herrera f9db153c28
Fix silly mistake in Assert 2022-01-04 13:21:23 -03:00
Alvaro Herrera f185f35a83
Allow special SKIP LOCKED condition in Assert()
Under concurrency, it is possible for two sessions to be merrily locking
and releasing a tuple and marking it again as HEAP_XMAX_INVALID all the
while a third session attempts to lock it, miserably fails at it, and
then contemplates life, the universe and everything only to eventually
fail an assertion that said bit is not set.  Before SKIP LOCKED that was
indeed a reasonable expectation, but alas! commit df630b0dd5 falsified
it.

This bug is as old as time itself, and even older, if you think time
begins with the oldest supported branch.  Therefore, backpatch to all
supported branches.

Author: Simon Riggs <simon.riggs@enterprisedb.com>
Discussion: https://postgr.es/m/CANbhV-FeEwMnN8yuMyss7if1ZKjOKfjcgqB26n8pqu1e=q0ebg@mail.gmail.com
2022-01-04 13:01:05 -03:00
Tom Lane d228af79d0 Fix index-only scan plans, take 2.
Commit 4ace45677 failed to fix the problem fully, because the
same issue of attempting to fetch a non-returnable index column
can occur when rechecking the indexqual after using a lossy index
operator.  Moreover, it broke EXPLAIN for such indexquals (which
indicates a gap in our test cases :-().

Revert the code changes of 4ace45677 in favor of adding a new field
to struct IndexOnlyScan, containing a version of the indexqual that
can be executed against the index-returned tuple without using any
non-returnable columns.  (The restrictions imposed by check_index_only
guarantee this is possible, although we may have to recompute indexed
expressions.)  Support construction of that during setrefs.c
processing by marking IndexOnlyScan.indextlist entries as resjunk
if they can't be returned, rather than removing them entirely.
(We could alternatively require setrefs.c to look up the IndexOptInfo
again, but abusing resjunk this way seems like a reasonably safe way
to avoid needing to do that.)

This solution isn't great from an API-stability standpoint: if there
are any extensions out there that build IndexOnlyScan structs directly,
they'll be broken in the next minor releases.  However, only a very
invasive extension would be likely to do such a thing.  There's no
change in the Path representation, so typical planner extensions
shouldn't have a problem.

As before, back-patch to all supported branches.

Discussion: https://postgr.es/m/3179992.1641150853@sss.pgh.pa.us
Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org
2022-01-03 15:42:27 -05:00
Tom Lane cabea571d1 Fix index-only scan plans when not all index columns can be returned.
If an index has both returnable and non-returnable columns, and one of
the non-returnable columns is an expression using a Var that is in a
returnable column, then a query returning that expression could result
in an index-only scan plan that attempts to read the non-returnable
column, instead of recomputing the expression from the returnable
column as intended.

To fix, redefine the "indextlist" list of an IndexOnlyScan plan node
as containing null Consts in place of any non-returnable columns.
This solves the problem by preventing setrefs.c from falsely matching
to such entries.  The executor is happy since it only cares about the
exposed types of the entries, and ruleutils.c doesn't care because a
correct plan won't reference those entries.  I considered some other
ways to prevent setrefs.c from doing the wrong thing, but this way
seems good since (a) it allows a very localized fix, (b) it makes
the indextlist structure more compact in many cases, and (c) the
indextlist is now a more faithful representation of what the index AM
will actually produce, viz. nulls for any non-returnable columns.

This is easier to hit since we introduced included columns, but it's
possible to construct failing examples without that, as per the
added regression test.  Hence, back-patch to all supported branches.

Per bug #17350 from Louis Jachiet.

Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org
2022-01-01 16:12:03 -05:00
Daniel Gustafsson cb4f1be43a Revert b2a459edf "Fix GRANTED BY support in REVOKE ROLE statements"
The reverted commit attempted to fix SQL specification compliance for
the cases which 6aaaa76bb left.  This however broke existing behavior
which takes precedence over spec compliance so revert. The introduced
tests are left after the revert since the codepath isn't well covered.
Per bug report 17346. Backpatch down to 14 where it was introduced.

Reported-by: Andrew Bille <andrewbille@gmail.com>
Discussion: https://postgr.es/m/17346-f72b28bd1a341060@postgresql.org
2021-12-30 13:23:47 +01:00
Thomas Munro f7d7ac23d1 Fix overly generic name in with.sql test.
Avoid the name "test".  In the 10 branch, this could clash with
alter_table.sql, as seen in the build farm.  That other instance was
already renamed in later branches by commit 2cf8c7aa, but it's good to
future-proof the name here too.

Back-patch to 10.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGJf4RAXUyAYVUcQawcptX%3DnhEco3SYpuPK5cCbA-F1eLA%40mail.gmail.com
2021-12-30 17:14:49 +13:00
Michael Paquier 420f9ac1b7 Correct comment and some documentation about REPLICA_IDENTITY_INDEX
catalog/pg_class.h was stating that REPLICA_IDENTITY_INDEX with a
dropped index is equivalent to REPLICA_IDENTITY_DEFAULT.  The code tells
a different story, as it is equivalent to REPLICA_IDENTITY_NOTHING.

The behavior exists since the introduction of replica identities, and
fe7fd4e even added tests for this case but I somewhat forgot to fix this
comment.

While on it, this commit reorganizes the documentation about replica
identities on the ALTER TABLE page, and a note is added about the case
of dropped indexes with REPLICA_IDENTITY_INDEX.

Author: Michael Paquier, Wei Wang
Reviewed-by: Euler Taveira
Discussion: https://postgr.es/m/OS3PR01MB6275464AD0A681A0793F56879E759@OS3PR01MB6275.jpnprd01.prod.outlook.com
Backpatch-through: 10
2021-12-22 16:38:38 +09:00
Michael Paquier 8a22a40b2c Remove assertion for ALTER TABLE .. DETACH PARTITION CONCURRENTLY
One code path related to this flavor of ALTER TABLE was checking that
the relation to detach has to be a normal table or a partitioned table,
which would fail if using the command with a different relation kind.

Views, sequences and materialized views cannot be part of a partition
tree, so these would cause the command to fail anyway, but the assertion
was triggered.  Foreign tables can be part of a partition tree, and
again the assertion would have failed.  The simplest solution is just to
remove this assertion, so as we get the same failure as the
non-concurrent code path.

While on it, add a regression test in postgres_fdw for the concurrent
partition detach of a foreign table, as per a suggestion from Alexander
Lakhin.

Issue introduced in 71f4c8c.

Reported-by: Alexander Lakhin
Author: Michael Paquier, Alexander Lakhin
Reviewed-by: Peter Eisentraut, Kyotaro Horiguchi
Discussion: https://postgr.es/m/17339-a9e09aaf38a3457a@postgresql.org
Backpatch-through: 14
2021-12-22 15:38:05 +09:00
Tom Lane f9a8bc9f27 Ensure casting to typmod -1 generates a RelabelType.
Fix the code changed by commit 5c056b0c2 so that we always generate
RelabelType, not something else, for a cast to unspecified typmod.
Otherwise planner optimizations might not happen.

It appears we missed this point because the previous experiments were
done on type numeric: the parser undesirably generates a call on the
numeric() length-coercion function, but then numeric_support()
optimizes that down to a RelabelType, so that everything seems fine.
It misbehaves for types that have a non-optimized length coercion
function, such as bpchar.

Per report from John Naylor.  Back-patch to all supported branches,
as the previous patch eventually was.  Unfortunately, that no longer
includes 9.6 ... we really shouldn't put this type of change into a
nearly-EOL branch.

Discussion: https://postgr.es/m/CAFBsxsEfbFHEkouc+FSj+3K1sHipLPbEC67L0SAe-9-da8QtYg@mail.gmail.com
2021-12-16 15:36:02 -05:00
Michael Paquier 4ddbd7456a Adjust behavior of some env settings for the TAP tests of MSVC
edc2332 has introduced in vcregress.pl some control on the environment
variables LZ4, TAR and GZIP_PROGRAM to allow any TAP tests to be able
use those commands.  This makes the settings more consistent with
src/Makefile.global.in, as the same default gets used for Make and MSVC
builds.

Each parameter can be changed in buildenv.pl, but as a default gets
assigned after loading buldenv.pl, it is not possible to unset any of
these, and using an empty value would not work with "||=" either.  As
some environments may not have a compatible command in their PATH (tar
coming from MinGW is an issue, for one), this could break tests without
an exit path to bypass any failing test.  This commit changes things so
as the default values for LZ4, TAR and GZIP_PROGRAM are assigned before
loading buildenv.pl, not after.  This way, we keep the same amount of
compatibility as a GNU build with the same defaults, and it becomes
possible to unset any of those values.

While on it, this adds some documentation about those three variables in
the section dedicated to the TAP tests for MSVC.

Per discussion with Andrew Dunstan.

Discussion: https://postgr.es/m/YbGYe483803il3X7@paquier.xyz
Backpatch-through: 10
2021-12-15 10:40:08 +09:00
Tom Lane 25c7faa1fe Fix datatype confusion in logtape.c's right_offset().
This could only matter if (a) long is wider than int, and (b) the heap
of free blocks exceeds UINT_MAX entries, which seems pretty unlikely.
Still, it's a theoretical bug, so backpatch to v13 where the typo came
in (in commit c02fdc922).

In passing, also make swap_nodes() use consistent datatypes.

Ma Liangzhu

Discussion: https://postgr.es/m/17336-fc4e522d26a750fd@postgresql.org
2021-12-14 11:46:44 -05:00
Michael Paquier 4be3e005e5 Remove assertion for replication origins in PREPARE TRANSACTION
When using replication origins, pg_replication_origin_xact_setup() is an
optional choice to be able to set a LSN and a timestamp to mark the
origin, which would be additionally added to WAL for transaction commits
or aborts (including 2PC transactions).  An assertion in the code path
of PREPARE TRANSACTION assumed that this data should always be set, so
it would trigger when using replication origins without setting up an
origin LSN.  Some tests are added to cover more this kind of scenario.

Oversight in commit 1eb6d65.

Per discussion with Amit Kapila and Masahiko Sawada.

Discussion: https://postgr.es/m/YbbBfNSvMm5nIINV@paquier.xyz
Backpatch-through: 11
2021-12-14 10:58:25 +09:00
Andres Freund 5b5455b035 isolationtester: append session name to application_name.
When writing / debugging an isolation test it sometimes is useful to see which
session holds what lock etc. To make it easier, both as part of spec files and
interactively, append the session name to application_name. Since b1907d688
application_name already contains the test name, this appends the session's
name to that.

insert-conflict-specconflict did something like this manually, which can now
be removed.

As we have done lately with other test infrastructure improvements, backpatch
this change, to make it easier to backpatch tests.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Michael Paquier <michael@paquier.xyz>
Reviewed-By: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/20211211012052.2blmzcmxnxqawd2z@alap3.anarazel.de
Backpatch: 10-, to make backpatching of tests easier.
2021-12-13 12:02:42 -08:00
Alexander Korotkov 7615edd1db Fix alignment in multirange_get_range() function
The multirange_get_range() function fails when two boundaries of the same
range have different alignments.  Fix that by adding proper pointer alignment.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/17300-dced2d01ddeb1f2f%40postgresql.org
Backpatch-through: 14
2021-12-13 17:20:07 +03:00
Michael Paquier a04fc56d03 Fix some typos with {a,an}
One of the changes impacts the documentation, so backpatch.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pu6+c+r3mY24VT7u+H+E_s6vMr5OdRiZ8NT3EOa-E5Lmw@mail.gmail.com
Backpatch-through: 14
2021-12-09 15:20:45 +09:00
Amit Kapila 614b77d65a Fix double publish of child table's data.
We publish the child table's data twice for a publication that has both
child and parent tables and is published with publish_via_partition_root
as true. This happens because subscribers will initiate synchronization
using both parent and child tables, since it gets both as separate tables
in the initial table list.

Ensure that pg_publication_tables returns only parent tables in such
cases.

Author: Hou Zhijie
Reviewed-by: Greg Nancarrow, Amit Langote, Vignesh C, Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/OS0PR01MB57167F45D481F78CDC5986F794B99@OS0PR01MB5716.jpnprd01.prod.outlook.com
2021-12-09 08:49:50 +05:30
Andrew Dunstan e4d73d089f
Revert "Check that we have a working tar before trying to use it"
This reverts commit f920f7e799.

The patch in effect fixed a problem we didn't have and caused another
instead.

Backpatch to release 14 like original

Discussion: https://postgr.es/m/3655283.1638977975@sss.pgh.pa.us
2021-12-08 16:50:04 -05:00
Andrew Dunstan 90c08ed113
Check that we have a working tar before trying to use it
Issue exposed by commit edc2332550 and the buildfarm.

Backpatch to release 14 where this usage started.
2021-12-08 10:23:48 -05:00
Amit Kapila f2e1730ee9 Fix origin timestamp during decoding of ROLLBACK PREPARED operation.
This happens because we were passing incorrect arguments to
ReorderBufferFinishPrepared().

Author: Masahiko Sawada
Reviewed-by: Vignesh C
Backpatch-through: 14
Discussion: https://postgr.es/m/CAD21AoBqhUqgDZUhUVnnwKRubPDNJ6m6fJDPgok3E5cWJLL+pA@mail.gmail.com
2021-12-08 15:21:12 +05:30
Michael Paquier 64ab21f0e5 Fix corruption of toast indexes with REINDEX CONCURRENTLY
REINDEX CONCURRENTLY run on a toast index or a toast relation could
corrupt the target indexes rebuilt, as a backend running in parallel
that manipulates toast values would directly release the lock on the
toast relation when its local operation is done, rather than releasing
the lock once the transaction that manipulated the toast values
committed.

The fix done here is simple: we now hold a ROW EXCLUSIVE lock on the
toast relation when saving or deleting a toast value until the
transaction working on them is committed, so as a concurrent reindex
happening in parallel would be able to wait for any activity and see any
new rows inserted (or deleted).

An isolation test is added to check after the case fixed here, which is
a bit fancy by design as it relies on allow_system_table_mods to rename
the toast table and its index to fixed names.  This way, it is possible
to reindex them directly without any dependency on the OID of the
underlying relation.  Note that this could not use a DO block either, as
REINDEX CONCURRENTLY cannot be run in a transaction block.  The test is
backpatched down to 13, where it is possible, thanks to c4a7a39, to use
allow_system_table_mods in a test suite.

Reported-by: Alexey Ermakov
Analyzed-by: Andres Freund, Noah Misch
Author: Michael Paquier
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/17268-d2fb426e0895abd4@postgresql.org
Backpatch-through: 12
2021-12-08 11:01:14 +09:00
Andrew Dunstan 93094232c8
Enable settings used in TAP tests for MSVC builds
Certain settings from configuration or the Makefile infrastructure are
used by the TAP tests, but were not being set up by vcregress.pl. This
remedies those omissions. This should increase test coverage, especially
on the buildfarm.

Reviewed by Noah Misch

Discussion: https://postgr.es/m/17093da5-e40d-8335-d53a-2bd803fc38b0@dunslane.net

Backpatch to all live branches.
2021-12-07 15:01:16 -05:00
Tom Lane ea5ecdadf6 On Windows, also call shutdown() while closing the client socket.
Further experimentation shows that commit 6051857fc is not sufficient
when using (some versions of?) OpenSSL.  The reason is obscure, but
calling shutdown(socket, SD_SEND) improves matters.

Per testing by Andrew Dunstan and Alexander Lakhin.
Back-patch as before.

Discussion: https://postgr.es/m/af5e0bf3-6a61-bb97-6cba-061ddf22ff6b@dunslane.net
2021-12-07 13:34:15 -05:00
Tom Lane 4cd2928543 On Windows, close the client socket explicitly during backend shutdown.
It turns out that this is necessary to keep Winsock from dropping any
not-yet-sent data, such as an error message explaining the reason for
process termination.  It's pretty weird that the implicit close done
by the kernel acts differently from an explicit close, but it's hard
to argue with experimental results.

Independently submitted by Alexander Lakhin and Lars Kanis (comments
by me, though).  Back-patch to all supported branches.

Discussion: https://postgr.es/m/90b34057-4176-7bb0-0dbb-9822a5f6425b@greiz-reinsdorf.de
Discussion: https://postgr.es/m/16678-253e48d34dc0c376@postgresql.org
2021-12-02 17:14:56 -05:00
Michael Paquier b6dac98b05 Move into separate file all the SQL queries used in pg_upgrade tests
The existing pg_upgrade/test.sh and the buildfarm code have been holding
the same set of SQL queries when doing cross-version upgrade tests to
adapt the objects created by the regression tests before the upgrade
(mostly, incompatible or non-existing objects need to be dropped from
the origin, perhaps re-created).

This moves all those SQL queries into a new, separate, file with a set
of \if clauses to handle the version checks depending on the old version
of the cluster to-be-upgraded.

The long-term plan is to make the buildfarm code re-use this new SQL
file, so as committers are able to fix any compatibility issues in the
tests of pg_upgrade with a refresh of the core code, without having to
poke at the buildfarm client.  Note that this is only able to handle the
main regression test suite, and that nothing is done yet for contrib
modules yet (these have more issues like their database names).

A backpatch down to 10 is done, adapting the version checks as this
script needs to be only backward-compatible, so as it becomes possible
to clean up a maximum amount of code within the buildfarm client.

Author: Justin Pryzby, Michael Paquier
Discussion: https://postgr.es/m/20201206180248.GI24052@telsasoft.com
Backpatch-through: 10
2021-12-02 10:31:29 +09:00
Tom Lane 8f4b0200e1 Avoid leaking memory during large-scale REASSIGN OWNED BY operations.
The various ALTER OWNER routines tend to leak memory in
CurrentMemoryContext.  That's not a problem when they're only called
once per command; but in this usage where we might be touching many
objects, it can amount to a serious memory leak.  Fix that by running
each call in a short-lived context.

(DROP OWNED BY likely has a similar issue, except that you'll probably
run out of lock table space before noticing.  REASSIGN is worth fixing
since for most non-table object types, it won't take any lock.)

Back-patch to all supported branches.  Unfortunately, in the back
branches this helps to only a limited extent, since the sinval message
queue bloats quite a lot in this usage before commit 3aafc030a,
consuming memory more or less comparable to what's actually leaked.
Still, it's clearly a leak with a simple fix, so we might as well fix it.

Justin Pryzby, per report from Guillaume Lelarge

Discussion: https://postgr.es/m/CAECtzeW2DAoioEGBRjR=CzHP6TdL=yosGku8qZxfX9hhtrBB0Q@mail.gmail.com
2021-12-01 13:44:47 -05:00
Michael Paquier 5550a9c385 Fix compatibility thinko for fstat() on standard streams in win32stat.c
GetFinalPathNameByHandleA() cannot be used in compilation environments
where _WIN32_WINNT < 0x0600, meaning at least Windows XP used by some
buildfarm members under MinGW that Postgres still needs to support.
This was reported as a compilation warning by the buildfarm, but this is
actually worse than the report as the code would have not worked.

Instead, this switches to GetFileInformationByHandle() that is able to
fail for standard streams and succeed for redirected ones, which is what
we are looking for herein the code emulating fstat().  We also know that
it is able to work in all the environments still supported, thanks to
the existing logic of win32stat.c.

Issue introduced by 10260c7, so backpatch down to 14.

Reported-by: Justin Pryzby, via buildfarm member jacana
Author: Michael Paquier
Reviewed-by: Juan José Santamaría Flecha
Discussion: https://postgr.es/m/20211129050122.GK17618@telsasoft.com
Backpatch-through: 14
2021-11-30 09:55:56 +09:00
Alvaro Herrera 1b1e4bfe7d
Harden be-gssapi-common.h for headerscheck
Surround the contents with a test that the feature is enabled by
configure, to silence header checking tools on systems without GSSAPI
installed.

Backpatch to 12, where the file appeared.

Discussion: https://postgr.es/m/202111161709.u3pbx5lxdimt@alvherre.pgsql
2021-11-26 17:00:29 -03:00
Alvaro Herrera d24dac9549
Fix determination of broken LSN in OVERWRITTEN_CONTRECORD
In commit ff9f111bce I mixed up inconsistent definitions of the LSN of
the first record in a page, when the previous record ends exactly at the
page boundary.  The correct LSN is adjusted to skip the WAL page header;
I failed to use that when setting XLogReaderState->overwrittenRecPtr,
so at WAL replay time VerifyOverwriteContrecord would refuse to let
replay continue past that record.

Backpatch to 10.  9.6 also contains this bug, but it's no longer being
maintained.

Discussion: https://postgr.es/m/45597.1637694259@sss.pgh.pa.us
2021-11-26 11:14:27 -03:00
Daniel Gustafsson 371087d006 Fix GRANTED BY support in REVOKE ROLE statements
Commit 6aaaa76bb added support for the GRANTED BY clause in GRANT and
REVOKE statements, but missed adding support for checking the role in
the REVOKE ROLE case. Fix by checking that the parsed role matches the
CURRENT_ROLE/CURRENT_USER requirement, and also add some tests for it.
Backpatch to v14 where GRANTED BY support was introduced.

Discussion: https://postgr.es/m/B7F6699A-A984-4943-B9BF-CEB84C003527@yesql.se
Backpatch-through: 14
2021-11-26 14:02:01 +01:00
Peter Eisentraut 1cc13b83eb Remove unneeded Python includes
Inluding <compile.h> and <eval.h> has not been necessary since Python
2.4, since they are included via <Python.h>.  Morever, <eval.h> is
being removed in Python 3.11.  So remove these includes.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/84884.1637723223%40sss.pgh.pa.us
2021-11-25 14:30:12 +01:00
Michael Paquier e415916e24 Block ALTER TABLE .. DROP NOT NULL on columns in replica identity index
Replica identities that depend directly on an index rely on a set of
properties, one of them being that all the columns defined in this index
have to be marked as NOT NULL.  There was a hole in the logic with ALTER
TABLE DROP NOT NULL, where it was possible to remove the NOT NULL
property of a column part of an index used as replica identity, so block
it to avoid problems with logical decoding down the road.

The same check was already done columns part of a primary key, so the
fix is straight-forward.

Author: Haiying Tang, Hou Zhijie
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/OS0PR01MB6113338C102BEE8B2FFC5BD9FB619@OS0PR01MB6113.jpnprd01.prod.outlook.com
Backpatch-through: 10
2021-11-25 15:05:24 +09:00
Michael Paquier d2198b4593 Fix fstat() emulation on Windows with standard streams
The emulation of fstat() in win32stat.c caused two issues with the
existing in-core callers, failing on EINVAL when using a stream as
argument:
- psql's \copy would crash when using a stream.
- pg_recvlogical would fail with -f -.

The tests in copyselect.sql from the main test suite covers the first
case, and there is a TAP test for the second case.  However, in both
cases, as the standard streams are always redirected, automated tests
did not notice those issues, requiring a terminal on Windows to be
reproducible.

This issue has been introduced in bed9075, and the origin of the problem
is that GetFileInformationByHandle() does not work directly on streams,
so this commit adds an extra code path to emulate and return a set of
stats that match best with the reality.  Note that redirected streams
rely on handles that can be queried with GetFileInformationByHandle(),
but we can rely on GetFinalPathNameByHandleA() to detect this case.

Author: Dmitry Koval, Juan José Santamaría Flecha
Discussion: https://postgr.es/m/17288-6b58a91025a8a8a3@postgresql.org
Backpatch-through: 14
2021-11-25 12:17:05 +09:00
David Rowley c2dc7b9e15 Flush Memoize cache when non-key parameters change, take 2
It's possible that a subplan below a Memoize node contains a parameter
from above the Memoize node.  If this parameter changes then cache entries
may become out-dated due to the new parameter value.

Previously Memoize was mistakenly not aware of this.  We fix this here by
flushing the cache whenever a parameter that's not part of the cache
key changes.

Bug: #17213
Reported by: Elvis Pranskevichus
Author: David Rowley
Discussion: https://postgr.es/m/17213-988ed34b225a2862@postgresql.org
Backpatch-through: 14, where Memoize was added
2021-11-24 23:29:56 +13:00
Michael Paquier 0e681fa458 Add support for Visual Studio 2022 in build scripts
Documentation and any code paths related to VS are updated to keep the
whole consistent.  Similarly to 2017 and 2019, the version of VS and the
version of nmake that we use to determine which code paths to use for
the build are still inconsistent in their own way.

Backpatch down to 10, so as buildfarm members are able to use this new
version of Visual Studio on all the stable branches supported.

Author: Hans Buschmann
Discussion: https://postgr.es/m/1633101364685.39218@nidsa.net
Backpatch-through: 10
2021-11-24 13:03:55 +09:00
David Rowley 7933bc0d13 Revert "Flush Memoize cache when non-key parameters change"
This reverts commit f94edb06ab.
2021-11-24 15:28:34 +13:00
David Rowley f94edb06ab Flush Memoize cache when non-key parameters change
It's possible that a subplan below a Memoize node contains a parameter
from above the Memoize node.  If this parameter changes then cache entries
may become out-dated due to the new parameter value.

Previously Memoize was mistakenly not aware of this.  We fix this here by
flushing the cache whenever a parameter that's not part of the cache
key changes.

Bug: #17213
Reported by: Elvis Pranskevichus
Author: David Rowley
Discussion: https://postgr.es/m/17213-988ed34b225a2862@postgresql.org
Backpatch-through: 14, where Memoize was added
2021-11-24 14:57:07 +13:00
David Rowley 6c32c09777 Allow Memoize to operate in binary comparison mode
Memoize would always use the hash equality operator for the cache key
types to determine if the current set of parameters were the same as some
previously cached set.  Certain types such as floating points where -0.0
and +0.0 differ in their binary representation but are classed as equal by
the hash equality operator may cause problems as unless the join uses the
same operator it's possible that whichever join operator is being used
would be able to distinguish the two values.  In which case we may
accidentally return in the incorrect rows out of the cache.

To fix this here we add a binary mode to Memoize to allow it to the
current set of parameters to previously cached values by comparing
bit-by-bit rather than logically using the hash equality operator.  This
binary mode is always used for LATERAL joins and it's used for normal
joins when any of the join operators are not hashable.

Reported-by: Tom Lane
Author: David Rowley
Discussion: https://postgr.es/m/3004308.1632952496@sss.pgh.pa.us
Backpatch-through: 14, where Memoize was added
2021-11-24 10:07:38 +13:00
Tom Lane 0fdf67476c Adjust pg_dump's priority ordering for casts.
When a stored expression depends on a user-defined cast, the backend
records the dependency as being on the cast's implementation function
--- or indeed, if there's no cast function involved but just
RelabelType or CoerceViaIO, no dependency is recorded at all.  This
is problematic for pg_dump, which is at risk of dumping things in the
wrong order leading to restore failures.  Given the lack of previous
reports, the risk isn't that high, but it can be demonstrated if the
cast is used in some view whose rowtype is then used as an input or
result type for some other function.  (That results in the view
getting hoisted into the functions portion of the dump, ahead of
the cast.)

A logically bulletproof fix for this would require including the
cast's OID in the parsed form of the expression, whence it could be
extracted by dependency.c, and then the stored dependency would force
pg_dump to do the right thing.  Such a change would be fairly invasive,
and certainly not back-patchable.  Moreover, since we'd prefer that
an expression using cast syntax be equal() to one doing the same
thing by explicit function call, the cast OID field would have to
have special ignored-by-comparisons semantics, making things messy.

So, let's instead fix this by a very simple hack in pg_dump: change
the object-type priority order so that casts are initially sorted
before functions, immediately after types.  This fixes the problem
in a fairly direct way for casts that have no implementation function.
For those that do, the implementation function will be hoisted to just
before the cast by the dependency sorting step, so that we still have
a valid dump order.  (I'm not sure that this provides a full guarantee
of no problems; but since it's been like this for many years without
any previous reports, this is probably enough to fix it in practice.)

Per report from Дмитрий Иванов.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAPL5KHoGa3uvyKp6z6m48LwCnTsK+LRQ_mcA4uKGfqAVSEjV_A@mail.gmail.com
2021-11-22 17:16:29 -05:00
Tom Lane aedc4600d8 Fix pg_dump --inserts mode for generated columns with dropped columns.
If a table contains a generated column that's preceded by a dropped
column, dumpTableData_insert failed to account for the dropped
column, and would emit DEFAULT placeholder(s) in the wrong column(s).
This resulted in failures at restore time.  The default COPY code path
did not have this bug, likely explaining why it wasn't noticed sooner.

While we're fixing this, we can be a little smarter about the
situation: (1) avoid unnecessarily fetching the values of generated
columns, (2) omit generated columns from the output, too, if we're
using --column-inserts.  While these modes aren't expected to be
as high-performance as the COPY path, we might as well be as
efficient as we can; it doesn't add much complexity.

Per report from Дмитрий Иванов.
Back-patch to v12 where generated columns came in.

Discussion: https://postgr.es/m/CAPL5KHrkBniyQt5e1rafm5DdXvbgiiqfEQEJ9GjtVzN71Jj5pA@mail.gmail.com
2021-11-22 15:25:48 -05:00
Alvaro Herrera c985a43df3
Add missing words in comment
Reported by Zhihong Yu.

Discussion: https://postgr.es/m/CALNJ-vR6uZivg_XkB1zKjEXeyZDEgoYanFXB-++1kBT9yZQoUw@mail.gmail.com
2021-11-22 12:38:41 -03:00
Tom Lane 3bd7556bbe pg_receivewal, pg_recvlogical: allow canceling initial password prompt.
Previously it was impossible to terminate these programs via control-C
while they were prompting for a password.  We can fix that trivially
for their initial password prompts, by moving setup of the SIGINT
handler from just before to just after their initial GetConnection()
calls.

This fix doesn't permit escaping out of later re-prompts, but those
should be exceedingly rare, since the user's password or the server's
authentication setup would have to have changed meanwhile.  We
considered applying a fix similar to commit 46d665bc2, but that
seemed more complicated than it'd be worth.  Moreover, this way is
back-patchable, which that wasn't.

The misbehavior exists in all supported versions, so back-patch to all.

Tom Lane and Nathan Bossart

Discussion: https://postgr.es/m/747443.1635536754@sss.pgh.pa.us
2021-11-21 14:13:35 -05:00
Tom Lane 6d07cbc509 Fix SP-GiST scan initialization logic for binary-compatible cases.
Commit ac9099fc1 rearranged the logic in spgGetCache() that determines
the index's attType (nominal input data type) and leafType (actual
type stored in leaf index tuples).  Turns out this broke things for
the case where (a) the actual input data type is different from the
nominal type, (b) the opclass's config function leaves leafType
defaulted, and (c) the opclass has no "compress" function.  (b) caused
us to assign the actual input data type as leafType, and then since
that's not attType, we complained that a "compress" function is
required.  For non-polymorphic opclasses, condition (a) arises in
binary-compatible cases, such as using SP-GiST text_ops for a varchar
column, or using any opclass on a domain over its nominal input type.

To fix, use attType for leafType when the index's declared column type
is different from but binary-compatible with attType.  Do this only in
the defaulted-leafType case, to avoid overriding any explicit
selection made by the opclass.

Per bug #17294 from Ilya Anfimov.  Back-patch to v14.

Discussion: https://postgr.es/m/17294-8f6c7962ce877edc@postgresql.org
2021-11-20 14:29:56 -05:00
Amit Kapila ead49ebc07 Fix parallel operations that prevent oldest xmin from advancing.
While determining xid horizons, we skip over backends that are running
Vacuum. We also ignore Create Index Concurrently, or Reindex Concurrently
for the purposes of computing Xmin for Vacuum. But we were not setting the
flags corresponding to these operations when they are performed in
parallel which was preventing Xid horizon from advancing.

The optimization related to skipping Create Index Concurrently, or Reindex
Concurrently operations was implemented in PG-14 but the fix is the same
for the Parallel Vacuum as well so back-patched till PG-13.

Author: Masahiko Sawada
Reviewed-by: Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/CAD21AoCLQqgM1sXh9BrDFq0uzd3RBFKi=Vfo6cjjKODm0Onr5w@mail.gmail.com
2021-11-19 09:14:09 +05:30
Michael Paquier 048f3ee618 Fix quoting of ACL item in table for upgrade binary compatibility checks
Per buildfarm member prion, that runs the regression tests under a role
name that uses a hyphen.  Issue introduced by 835bcba.

Discussion: https://postgr.es/m/YZW4MvzCZ+hQ34vw@paquier.xyz
Backpatch-through: 12
2021-11-18 12:52:56 +09:00
Michael Paquier cf3d79aa31 Add table to regression tests for binary-compatibility checks in pg_upgrade
This commit adds to the main regression test suite a table with all
the in-core data types (some exceptions apply).  This table is not
dropped, so as pg_upgrade would be able to check the binary
compatibility of the types tracked in the table.  If a new type is added
in core, this part of the tests would need a refresh but the tests are
designed to fail if that were to happen.

As this is useful for upgrades and that these rely on the objects
created in the regression test suite of the old version upgraded from,
a backpatch down to 12 is done, which is the last point where a binary
incompatible change has been done (7c15cef).  This will hopefully be
enough to find out if something gets broken during the development of a
new version of Postgres, so as it is possible to take actions in
pg_upgrade itself in this case (like 0ccfc28 for sql_identifier).

An area that is not covered yet is related to external modules, which
may create their own types.  The testing infrastructure of pg_upgrade is
not integrated yet with the external modules stored in core
(src/test/modules/ or contrib/, all use the same database name for their
tests so there would be an overlap).  This could be improved in the
future.

Author: Justin Pryzby
Reviewed-by: Jacob Champion, Peter Eisentraut, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/20201206180248.GI24052@telsasoft.com
Backpatch-through: 12
2021-11-18 10:37:25 +09:00
Tom Lane 53c4a580e4 Clean up error handling in pg_basebackup's walmethods.c.
The error handling here was a mess, as a result of a fundamentally
bad design (relying on errno to keep its value much longer than is
safe to assume) as well as a lot of just plain sloppiness, both as
to noticing errors at all and as to reporting the correct errno.
Moreover, the recent addition of LZ4 compression broke things
completely, because liblz4 doesn't use errno to report errors.

To improve matters, keep the error state in the DirectoryMethodData or
TarMethodData struct, and add a string field so we can handle cases
that don't set errno.  (The tar methods already had a version of this,
but it can be done more efficiently since all these cases use a
constant error string.)  Make the dir and tar methods handle errors
in basically identical ways, which they didn't before.

This requires copying errno into the state struct in a lot of places,
which is a bit tedious, but it has the virtue that we can get rid of
ad-hoc code to save and restore errno in a number of places ... not
to mention that it fixes other places that should've saved/restored
errno but neglected to.

In passing, fix some pointlessly static buffers to be ordinary
local variables.

There remains an issue about exactly how to handle errors from
fsync(), but that seems like material for its own patch.

While the LZ4 problems are new, all the rest of this is fixes for
old bugs, so backpatch to v10 where walmethods.c was introduced.

Patch by me; thanks to Michael Paquier for review.

Discussion: https://postgr.es/m/1343113.1636489231@sss.pgh.pa.us
2021-11-17 14:16:34 -05:00
Tom Lane 6b413b41b4 Handle close() failures more robustly in pg_dump and pg_basebackup.
Coverity complained that applying get_gz_error after a failed gzclose,
as we did in one place in pg_basebackup, is unsafe.  I think it's
right: it's entirely likely that the call is touching freed memory.
Change that to inspect errno, as we do for other gzclose calls.

Also, be careful to initialize errno to zero immediately before any
gzclose() call where we care about the error status.  (There are
some calls where we don't, because we already failed at some previous
step.)  This ensures that we don't get a misleadingly irrelevant
error code if gzclose() fails in a way that doesn't set errno.
We could work harder at that, but it looks to me like all such cases
are basically can't-happen if we're not misusing zlib, so it's
not worth the extra notational cruft that would be required.

Also, fix several places that simply failed to check for close-time
errors at all, mostly at some remove from the close or gzclose itself;
and one place that did check but didn't bother to report the errno.

Back-patch to v12.  These mistakes are older than that, but between
the frontend logging API changes that happened in v12 and the fact
that frontend code can't rely on %m before that, the patch would need
substantial revision to work in older branches.  It doesn't quite
seem worth the trouble given the lack of related field complaints.

Patch by me; thanks to Michael Paquier for review.

Discussion: https://postgr.es/m/1343113.1636489231@sss.pgh.pa.us
2021-11-17 13:08:25 -05:00
Tom Lane 5d5779aeaf Fix display of SQL-standard function's arguments in INSERT/SELECT.
If a SQL-standard function body contains an INSERT ... SELECT statement,
any function parameters referenced within the SELECT were always printed
in $N style, rather than using the parameter name if any.  While not
strictly incorrect, this wasn't the intention, and it's inconsistent
with the way that such parameters would be printed in any other kind
of statement.

The cause is that the recursion to get_query_def from
get_insert_query_def neglected to pass down the context->namespaces
list, passing constant NIL instead.  This is a very ancient oversight,
but AFAICT it had no visible consequences before commit e717a9a18
added an outermost namespace with function parameters.  We don't allow
INSERT ... SELECT as a sub-query, except in a top-level WITH clause,
where it couldn't contain any outer references that might need to access
upper namespaces.  So although that's arguably a bug, I don't see any
point in changing it before v14.

In passing, harden the code added to get_parameter by e717a9a18 so that
it won't crash if a PARAM_EXTERN Param appears in an unexpected place.

Per report from Erki Eessaar.  Code fix by me, regression test case
by Masahiko Sawada.

Discussion: https://postgr.es/m/AM9PR01MB8268347BED344848555167FAFE949@AM9PR01MB8268.eurprd01.prod.exchangelabs.com
2021-11-17 11:31:31 -05:00
Amit Kapila 232fd72a5e Invalidate relcache when changing REPLICA IDENTITY index.
When changing REPLICA IDENTITY INDEX to another one, the target table's
relcache was not being invalidated. This leads to skipping update/delete
operations during apply on the subscriber side as the columns required to
search corresponding rows won't get logged.

Author: Tang Haiying, Hou Zhijie
Reviewed-by: Euler Taveira, Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/OS0PR01MB61133CA11630DAE45BC6AD95FB939@OS0PR01MB6113.jpnprd01.prod.outlook.com
2021-11-16 08:34:24 +05:30
Tom Lane 99389cb66b Make psql's \password default to CURRENT_USER, not PQuser(conn).
The documentation says plainly that \password acts on "the current user"
by default.  What it actually acted on, or tried to, was the username
used to log into the current session.  This is not the same thing if
one has since done SET ROLE or SET SESSION AUTHENTICATION.  Aside from
the possible surprise factor, it's quite likely that the current role
doesn't have permissions to set the password of the original role.

To fix, use "SELECT CURRENT_USER" to get the role name to act on.
(This syntax works with servers at least back to 7.0.)  Also, in
hopes of reducing confusion, include the role name that will be
acted on in the password prompt.

The discrepancy from the documentation makes this a bug, so
back-patch to all supported branches.

Patch by me; thanks to Nathan Bossart for review.

Discussion: https://postgr.es/m/747443.1635536754@sss.pgh.pa.us
2021-11-12 14:55:32 -05:00
Michael Paquier 5f81a480d5 Fix memory overrun when querying pg_stat_slru
pg_stat_get_slru() in pgstatfuncs.c would point to one element after the
end of the array PgStat_SLRUStats when finishing to scan its entries.
This had no direct consequences as no data from the extra memory area
was read, but static analyzers would rightfully complain here.  So let's
be clean.

While on it, this adds one regression test in the area reserved for
system views.

Reported-by: Alexander Kozhemyakin, via AddressSanitizer
Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/17280-37da556e86032070@postgresql.org
Backpatch-through: 13
2021-11-12 21:50:04 +09:00
Noah Misch 675cd765c2 Report any XLogReadRecord() error in XlogReadTwoPhaseData().
Buildfarm members kittiwake and tadarida have witnessed errors at this
site.  The site discarded key facts.  Back-patch to v10 (all supported
versions).

Reviewed by Michael Paquier and Tom Lane.

Discussion: https://postgr.es/m/20211107013157.GB790288@rfd.leadboat.com
2021-11-11 17:10:26 -08:00
Alvaro Herrera 9aa91cb33b
Restore lock level to set vacuum flags
Commit 27838981be mistakenly reduced the lock level from exclusive to
shared that is acquired to set PGPROC->statusFlags; this was reverted
by dcfff74fb1, but failed to do so in one spot.  Fix it.

Backpatch to 14.

Noted by Andres Freund.

Discussion: https://postgr.es/m/20211111020724.ggsfhcq3krq5r4hb@alap3.anarazel.de
2021-11-11 11:03:29 -03:00
Michael Paquier b609db7155 Fix buffer overrun in unicode string normalization with empty input
PostgreSQL 13 and newer versions are directly impacted by that through
the SQL function normalize(), which would cause a call of this function
to write one byte past its allocation if using in input an empty
string after recomposing the string with NFC and NFKC.  Older versions
(v10~v12) are not directly affected by this problem as the only code
path using normalization is SASLprep in SCRAM authentication that
forbids the case of an empty string, but let's make the code more robust
anyway there so as any out-of-core callers of this function are covered.

The solution chosen to fix this issue is simple, with the addition of a
fast-exit path if the decomposed string is found as empty.  This would
only happen for an empty string as at its lowest level a codepoint would
be decomposed as itself if it has no entry in the decomposition table or
if it has a decomposition size of 0.

Some tests are added to cover this issue in v13~.  Note that an empty
string has always been considered as normalized (grammar "IS NF[K]{C,D}
NORMALIZED", through the SQL function is_normalized()) for all the
operations allowed (NFC, NFD, NFKC and NFKD) since this feature has been
introduced as of 2991ac5.  This behavior is unchanged but some tests are
added in v13~ to check after that.

I have also checked "make normalization-check" in src/common/unicode/,
while on it (works in 13~, and breaks in older stable branches
independently of this commit).

The release notes should just mention this commit for v13~.

Reported-by: Matthijs van der Vleuten
Discussion: https://postgr.es/m/17277-0c527a373794e802@postgresql.org
Backpatch-through: 10
2021-11-11 15:01:45 +09:00
Tom Lane 74da4c71d6 Doc: improve protocol spec for logical replication Type messages.
protocol.sgml documented the layout for Type messages, but completely
dropped the ball otherwise, failing to explain what they are, when
they are sent, or what they're good for.  While at it, do a little
copy-editing on the description of Relation messages.

In passing, adjust the comment for apply_handle_type() to make it
clearer that we choose not to do anything when receiving a Type
message, not that we think it has no use whatsoever.

Per question from Stefen Hillman.

Discussion: https://postgr.es/m/CAPgW8pMknK5pup6=T4a_UG=Cz80Rgp=KONqJmTdHfaZb0RvnFg@mail.gmail.com
2021-11-10 13:12:58 -05:00
Tom Lane 3aa858c893 Fix instability in 026_overwrite_contrecord.pl test.
We've seen intermittent failures in this test on slower buildfarm
machines, which I think can be explained by assuming that autovacuum
emitted some additional WAL.  Disable autovacuum to stabilize it.

In passing, use stringwise not numeric comparison to compare
WAL file names.  Doesn't matter at present, but they are
hex strings not decimal ...

Discussion: https://postgr.es/m/1372189.1636499287@sss.pgh.pa.us
2021-11-09 18:40:19 -05:00
Tom Lane 30547d7913 libpq: reject extraneous data after SSL or GSS encryption handshake.
libpq collects up to a bufferload of data whenever it reads data from
the socket.  When SSL or GSS encryption is requested during startup,
any additional data received with the server's yes-or-no reply
remained in the buffer, and would be treated as already-decrypted data
once the encryption handshake completed.  Thus, a man-in-the-middle
with the ability to inject data into the TCP connection could stuff
some cleartext data into the start of a supposedly encryption-protected
database session.

This could probably be abused to inject faked responses to the
client's first few queries, although other details of libpq's behavior
make that harder than it sounds.  A different line of attack is to
exfiltrate the client's password, or other sensitive data that might
be sent early in the session.  That has been shown to be possible with
a server vulnerable to CVE-2021-23214.

To fix, throw a protocol-violation error if the internal buffer
is not empty after the encryption handshake.

Our thanks to Jacob Champion for reporting this problem.

Security: CVE-2021-23222
2021-11-08 11:14:56 -05:00
Tom Lane 9d5a76b8d1 Reject extraneous data after SSL or GSS encryption handshake.
The server collects up to a bufferload of data whenever it reads data
from the client socket.  When SSL or GSS encryption is requested
during startup, any additional data received with the initial
request message remained in the buffer, and would be treated as
already-decrypted data once the encryption handshake completed.
Thus, a man-in-the-middle with the ability to inject data into the
TCP connection could stuff some cleartext data into the start of
a supposedly encryption-protected database session.

This could be abused to send faked SQL commands to the server,
although that would only work if the server did not demand any
authentication data.  (However, a server relying on SSL certificate
authentication might well not do so.)

To fix, throw a protocol-violation error if the internal buffer
is not empty after the encryption handshake.

Our thanks to Jacob Champion for reporting this problem.

Security: CVE-2021-23214
2021-11-08 11:01:43 -05:00
Peter Eisentraut 5a75612022 Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: f54c1d7c2c97bb2a238a149e407023a9bc007b06
2021-11-08 10:06:30 +01:00
David Rowley 1f194ed6c2 Fix incorrect hash equality operator bug in Memoize
In v14, because we don't have a field in RestrictInfo to cache both the
left and right type's hash equality operator, we just restrict the scope
of Memoize to only when the left and right types of a RestrictInfo are the
same.

In master we add another field to RestrictInfo and cache both hash
equality operators.

Reported-by: Jaime Casanova
Author: David Rowley
Discussion: https://postgr.es/m/20210929185544.GB24346%40ahch-to
Backpatch-through: 14
2021-11-08 14:41:13 +13:00
Alexander Korotkov b0f6bd48f3 Reset lastOverflowedXid on standby when needed
Currently, lastOverflowedXid is never reset.  It's just adjusted on new
transactions known to be overflowed.  But if there are no overflowed
transactions for a long time, snapshots could be mistakenly marked as
suboverflowed due to wraparound.

This commit fixes this issue by resetting lastOverflowedXid when needed
altogether with KnownAssignedXids.

Backpatch to all supported versions.

Reported-by: Stan Hu
Discussion: https://postgr.es/m/CAMBWrQ%3DFp5UAsU_nATY7EMY7NHczG4-DTDU%3DmCvBQZAQ6wa2xQ%40mail.gmail.com
Author: Kyotaro Horiguchi, Alexander Korotkov
Reviewed-by: Stan Hu, Simon Riggs, Nikolay Samokhvalov, Andrey Borodin, Dmitry Dolgov
2021-11-06 19:13:53 +03:00
Tomas Vondra f7829feb75 Fix handling of NaN values in BRIN minmax multi
When calculating distance between float4/float8 values, we need to be a
bit more careful about NaN values in order not to trigger assert. We
consider NaN values to be equal (distace 0.0) and in infinite distance
from all other values.

On builds without asserts, this issue is mostly harmless - the ranges
may be merged in less efficient order, but the index is still correct.

Per report from Andreas Seltenreich. Backpatch to 14, where this new
BRIN opclass was introduced.

Reported-by: Andreas Seltenreich
Discussion: https://postgr.es/m/87r1bw9ukm.fsf@credativ.de
2021-11-06 01:53:36 +01:00
Alvaro Herrera 02e20bb2dc
Avoid crash in rare case of concurrent DROP
When a role being dropped contains is referenced by catalog objects that
are concurrently also being dropped, a crash can result while trying to
construct the string that describes the objects.  Suppress that by
ignoring objects whose descriptions are returned as NULL.

The majority of relevant codesites were already cautious about this
already; we had just missed a couple.

This is an old bug, so backpatch all the way back.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17126-21887f04508cb5c8@postgresql.org
2021-11-05 12:29:35 -03:00
Heikki Linnakangas f4e3b62710 Update alternative expected output file.
Previous commit added a test to 'largeobject', but neglected the
alternative expected output file 'largeobject_1.source'. Per failure
on buildfarm animal 'hamerkop'.

Discussion: https://www.postgresql.org/message-id/DBA08346-9962-4706-92D1-230EE5201C10@yesql.se
2021-11-03 19:41:35 +02:00
Heikki Linnakangas 4ebd740cd3 Fix snapshot reference leak if lo_export fails.
If lo_export() fails to open the target file or to write to it, it leaks
the created LargeObjectDesc and its snapshot in the top-transaction
context and resource owner. That's pretty harmless, it's a small leak
after all, but it gives the user a "Snapshot reference leak" warning.

Fix by using a short-lived memory context and no resource owner for
transient LargeObjectDescs that are opened and closed within one function
call. The leak is easiest to reproduce with lo_export() on a directory
that doesn't exist, but in principle the other lo_* functions could also
fail.

Backpatch to all supported versions.

Reported-by: Andrew B
Reviewed-by: Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/32bf767a-2d65-71c4-f170-122f416bab7e@iki.fi
2021-11-03 10:54:33 +02:00
Peter Geoghegan f6162c020c Fix parallel amvacuumcleanup safety bug.
Commit b4af70cb inverted the return value of the function
parallel_processing_is_safe(), but missed the amvacuumcleanup test.
Index AMs that don't support parallel cleanup at all were affected.

The practical consequences of this bug were not very serious.  Hash
indexes are affected, but since they just return the number of blocks
during hashvacuumcleanup anyway, it can't have had much impact.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAD21AoA-Em+aeVPmBbL_s1V-ghsJQSxYL-i3JP8nTfPiD1wjKw@mail.gmail.com
Backpatch: 14-, where commit b4af70cb appears.
2021-11-02 19:52:10 -07:00
Peter Geoghegan 61a86ed55b Don't overlook indexes during parallel VACUUM.
Commit b4af70cb, which simplified state managed by VACUUM, performed
refactoring of parallel VACUUM in passing.  Confusion about the exact
details of the tasks that the leader process is responsible for led to
code that made it possible for parallel VACUUM to miss a subset of the
table's indexes entirely.  Specifically, indexes that fell under the
min_parallel_index_scan_size size cutoff were missed.  These indexes are
supposed to be vacuumed by the leader (alongside any parallel unsafe
indexes), but weren't vacuumed at all.  Affected indexes could easily
end up with duplicate heap TIDs, once heap TIDs were recycled for new
heap tuples.  This had generic symptoms that might be seen with almost
any index corruption involving structural inconsistencies between an
index and its table.

To fix, make sure that the parallel VACUUM leader process performs any
required index vacuuming for indexes that happen to be below the size
cutoff.  Also document the design of parallel VACUUM with these
below-size-cutoff indexes.

It's unclear how many users might be affected by this bug.  There had to
be at least three indexes on the table to hit the bug: a smaller index,
plus at least two additional indexes that themselves exceed the size
cutoff.  Cases with just one additional index would not run into
trouble, since the parallel VACUUM cost model requires two
larger-than-cutoff indexes on the table to apply any parallel
processing.  Note also that autovacuum was not affected, since it never
uses parallel processing.

Test case based on tests from a larger patch to test parallel VACUUM by
Masahiko Sawada.

Many thanks to Kamigishi Rei for her invaluable help with tracking this
problem down.

Author: Peter Geoghegan <pg@bowt.ie>
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-By: Kamigishi Rei <iijima.yun@koumakan.jp>
Reported-By: Andrew Gierth <andrew@tao11.riddles.org.uk>
Diagnosed-By: Andres Freund <andres@anarazel.de>
Bug: #17245
Discussion: https://postgr.es/m/17245-ddf06aaf85735f36@postgresql.org
Discussion: https://postgr.es/m/20211030023740.qbnsl2xaoh2grq3d@alap3.anarazel.de
Backpatch: 14-, where the refactoring commit appears.
2021-11-02 12:06:16 -07:00
Tom Lane 16a56774fa Fix variable lifespan in ExecInitCoerceToDomain().
This undoes a mistake in 1ec7679f1: domainval and domainnull were
meant to live across loop iterations, but they were incorrectly
moved inside the loop.  The effect was only to emit useless extra
EEOP_MAKE_READONLY steps, so it's not a big deal; nonetheless,
back-patch to v13 where the mistake was introduced.

Ranier Vilela

Discussion: https://postgr.es/m/CAEudQAqXuhbkaAp-sGH6dR6Nsq7v28_0TPexHOm6FiDYqwQD-w@mail.gmail.com
2021-11-02 13:36:53 -04:00
Tom Lane 08cfa5981e Avoid O(N^2) behavior in SyncPostCheckpoint().
As in commits 6301c3ada and e9d9ba2a4, avoid doing repetitive
list_delete_first() operations, since that would be expensive when
there are many files waiting to be unlinked.  This is a slightly
larger change than in those cases.  We have to keep the list state
valid for calls to AbsorbSyncRequests(), so it's necessary to invent a
"canceled" field instead of immediately deleting PendingUnlinkEntry
entries.  Also, because we might not be able to process all the
entries, we need a new list primitive list_delete_first_n().

list_delete_first_n() is almost list_copy_tail(), but it modifies the
input List instead of making a new copy.  I found a couple of existing
uses of the latter that could profitably use the new function.  (There
might be more, but the other callers look like they probably shouldn't
overwrite the input List.)

As before, back-patch to v13.

Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com
2021-11-02 11:31:54 -04:00
Tom Lane ad87bf3552 Avoid some other O(N^2) hazards in list manipulation.
In the same spirit as 6301c3ada, fix some more places where we were
using list_delete_first() in a loop and thereby risking O(N^2)
behavior.  It's not clear that the lists manipulated in these spots
can get long enough to be really problematic ... but it's not clear
that they can't, either, and the fixes are simple enough.

As before, back-patch to v13.

Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com
2021-11-01 16:24:40 -04:00
Alvaro Herrera 494ec0037e
Handle XLOG_OVERWRITE_CONTRECORD in DecodeXLogOp
Failing to do so results in inability of logical decoding to process the
WAL stream.  Handle it by doing nothing.

Backpatch all the way back.

Reported-by: Petr Jelínek <petr.jelinek@enterprisedb.com>
2021-11-01 13:07:23 -03:00
Michael Paquier f255de9a45 Preserve opclass parameters across REINDEX CONCURRENTLY
The opclass parameter Datums from the old index are fetched in the same
way as for predicates and expressions, by grabbing them directly from
the system catalogs.  They are then copied into the new IndexInfo that
will be used for the creation of the new copy.

This caused the new index to be rebuilt with default parameters rather
than the ones pre-defined by a user.  The only way to get back a new
index with correct opclass parameters would be to recreate a new index
from scratch.

The issue has been introduced by 911e702.

Author: Michael Paquier
Reviewed-by: Zhihong Yu
Discussion: https://postgr.es/m/YX0CG/QpLXcPr8HJ@paquier.xyz
Backpatch-through: 13
2021-11-01 11:40:22 +09:00
Tom Lane 8424dfced7 Avoid O(N^2) behavior when the standby process releases many locks.
When replaying a transaction that held many exclusive locks on the
primary, a standby server's startup process would expend O(N^2)
effort on manipulating the list of locks.  This code was fine when
written, but commit 1cff1b95a made repetitive list_delete_first()
calls inefficient, as explained in its commit message.  Fix by just
iterating the list normally, and releasing storage only when done.
(This'd be inadequate if we needed to recover from an error occurring
partway through; but we don't.)

Back-patch to v13 where 1cff1b95a came in.

Nathan Bossart

Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com
2021-10-31 15:31:38 -04:00
Peter Geoghegan bd9f4cf0ee Demote pg_unreachable() in heapam to an assertion.
Commit d168b66682, which overhauled index deletion, added a
pg_unreachable() to the end of a sort comparator used when sorting heap
TIDs from an index page.  This allows the compiler to apply
optimizations that assume that the heap TIDs from the index AM must
always be unique.

That doesn't seem like a good idea now, given recent reports of
corruption involving duplicate TIDs in indexes on Postgres 14.  Demote
to an assertion, just in case.

Backpatch: 14-, where index deletion was overhauled.
2021-10-29 10:53:46 -07:00
Tom Lane 0c8a40b391 Update time zone data files to tzdata release 2021e.
DST law changes in Fiji, Jordan, Palestine, and Samoa.  Historical
corrections for Barbados, Cook Islands, Guyana, Niue, Portugal, and
Tonga.

Also, the Pacific/Enderbury zone has been renamed to Pacific/Kanton.
The following zones have been merged into nearby, more-populous zones
whose clocks have agreed since 1970: Africa/Accra, America/Atikokan,
America/Blanc-Sablon, America/Creston, America/Curacao,
America/Nassau, America/Port_of_Spain, Antarctica/DumontDUrville,
and Antarctica/Syowa.
2021-10-29 11:38:32 -04:00
Tom Lane b1f943d2aa Improve contrib/amcheck's tests for CREATE INDEX CONCURRENTLY.
Commits fdd965d07 and 3cd9c3b92 tested CREATE INDEX CONCURRENTLY by
launching two separate pgbench runs concurrently.  This was needed so
that only a single client thread would run CREATE INDEX CONCURRENTLY,
avoiding deadlock between two CICs.  However, there's a better way,
which is to use an advisory lock to prevent concurrent CICs.  That's
better in part because the test code is shorter and more readable, but
mostly because it automatically scales things to launch an appropriate
number of CICs relative to the number of INSERT transactions.
As committed, typically half to three-quarters of the CIC transactions
were pointless because the INSERT transactions had already stopped.

In passing, remove background_pgbench, which was added to support
these tests and isn't needed anymore.  We can always put it back
if we find a use for it later.

Back-patch to v12; older pgbench versions lack the
conditional-execution features needed for this method.

Tom Lane and Andrey Borodin

Discussion: https://postgr.es/m/139687.1635277318@sss.pgh.pa.us
2021-10-28 11:45:14 -04:00
Peter Geoghegan 6cac343396 Fix ordering of items in nbtree error message.
Oversight in commit a5213adf.

Backpatch: 13-, just like commit a5213adf.
2021-10-27 13:09:01 -07:00
Peter Geoghegan d078fe83d5 Further harden nbtree posting split code.
Add more defensive checks around posting list split code.  These should
detect corruption involving duplicate table TIDs earlier and more
reliably than any existing check.

Follow up to commit 8f72bbac.

Discussion: https://postgr.es/m/CAH2-WzkrSY_kjyd1_M5xJK1uM0govJXMxPn8JUSvwcUOiHuWVw@mail.gmail.com
Backpatch: 13-, where nbtree deduplication was introduced.
2021-10-27 12:10:45 -07:00
Magnus Hagander a0b6520ecf Clarify that --system reindexes system catalogs *only*
Make this more clear both in the help message and docs.

Reviewed-By: Michael Paquier
Backpatch-through: 9.6
Discussion: https://postgr.es/m/CABUevEw6Je0WUFTLhPKOk4+BoBuDrE-fKw3N4ckqgDBMFu4paA@mail.gmail.com
2021-10-27 16:28:22 +02:00
Daniel Gustafsson 1ed1f801cd Ensure that slots are zeroed before use
The previous coding relied on the memory for the slots being zeroed
elsewhere, which while it was true in this case is not an contract
which is guaranteed to hold.  Explicitly clear the tts_isnull array
to ensure that the slots are filled from a known state.

Backpatch to v14 where the catalog multi-inserts were introduced.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAJ7c6TP0AowkUgNL6zcAK-s5HYsVHVBRWfu69FRubPpfwZGM9A@mail.gmail.com
Backpatch-through: 14
2021-10-26 10:40:08 +02:00
Amit Kapila a6a0ae127e Revert "Remove unused wait events."
This reverts commit 671eb8f344. The removed
wait events are used by some extensions and removal of these would force a
recompile of those extensions. We don't want that for released branches.

Discussion: https://postgr.es/m/E1mdOBY-0005j2-QL@gemulon.postgresql.org
2021-10-26 08:19:33 +05:30
Thomas Munro 181361a0c2 Reject huge_pages=on if shared_memory_type=sysv.
It doesn't work (it could, but hasn't been implemented).
Back-patch to 12, where shared_memory_type arrived.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/163271880203.22789.1125998876173795966@wrigleys.postgresql.org
2021-10-26 13:09:00 +13:00
Noah Misch a5b9a0000e Fix CREATE INDEX CONCURRENTLY for the newest prepared transactions.
The purpose of commit 8a54e12a38 was to
fix this, and it sufficed when the PREPARE TRANSACTION completed before
the CIC looked for lock conflicts.  Otherwise, things still broke.  As
before, in a cluster having used CIC while having enabled prepared
transactions, queries that use the resulting index can silently fail to
find rows.  It may be necessary to reindex to recover from past
occurrences; REINDEX CONCURRENTLY suffices.  Fix this for future index
builds by making CIC wait for arbitrarily-recent prepared transactions
and for ordinary transactions that may yet PREPARE TRANSACTION.  As part
of that, have PREPARE TRANSACTION transfer locks to its dummy PGPROC
before it calls ProcArrayClearTransaction().  Back-patch to 9.6 (all
supported versions).

Andrey Borodin, reviewed (in earlier versions) by Andres Freund.

Discussion: https://postgr.es/m/01824242-AA92-4FE9-9BA7-AEBAFFEA3D0C@yandex-team.ru
2021-10-23 18:36:42 -07:00
Noah Misch dde966efb2 Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURRENTLY.
CIC and REINDEX CONCURRENTLY assume backends see their catalog changes
no later than each backend's next transaction start.  That failed to
hold when a backend absorbed a relevant invalidation in the middle of
running RelationBuildDesc() on the CIC index.  Queries that use the
resulting index can silently fail to find rows.  Fix this for future
index builds by making RelationBuildDesc() loop until it finishes
without accepting a relevant invalidation.  It may be necessary to
reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices.
Back-patch to 9.6 (all supported versions).

Noah Misch and Andrey Borodin, reviewed (in earlier versions) by Andres
Freund.

Discussion: https://postgr.es/m/20210730022548.GA1940096@gust.leadboat.com
2021-10-23 18:36:42 -07:00
Tom Lane 8cee4be6dc Fix frontend version of sh_error() in simplehash.h.
The code does not expect sh_error() to return, but the patch
that made this header usable in frontend didn't get that memo.

While here, plaster unlikely() on the tests that decide whether
to invoke sh_error(), and add our standard copyright notice.

Noted by Andres Freund.  Back-patch to v13 where this frontend
support came in.

Discussion: https://postgr.es/m/0D54435C-1199-4361-9D74-2FBDCF8EA164@anarazel.de
2021-10-22 16:43:38 -04:00
Tom Lane 3ad2c2455b pg_dump: fix mis-dumping of non-global default privileges.
Non-global default privilege entries should be dumped as-is,
not made relative to the default ACL for their object type.
This would typically only matter if one had revoked some
on-by-default privileges in a global entry, and then wanted
to grant them again in a non-global entry.

Per report from Boris Korzun.  This is an old bug, so back-patch
to all supported branches.

Neil Chen, test case by Masahiko Sawada

Discussion: https://postgr.es/m/111621616618184@mail.yandex.ru
Discussion: https://postgr.es/m/CAA3qoJnr2+1dVJObNtfec=qW4Z0nz=A9+r5bZKoTSy5RDjskMw@mail.gmail.com
2021-10-22 15:22:25 -04:00
Andrew Dunstan 52c0c11367
Add module build directory to the PATH for TAP tests
For non-MSVC builds this is make's $(CURDIR), while for MSVC builds it
is $topdir/$Config/$module. The directory is added as the second element
in the PATH, so that the install location takes precedence, but the
added PATH element takes precedence over the rest of the PATH.

The reason for this is to allow tests to find built products that are
not installed, such as the libpq_pipeline test driver.

The libpq_pipeline test is adjusted to take advantage of this.

Based on a suggestion from Andres Freund.

Backpatch to release 14.

Discussion: https://postgr.es/m/4941f5a5-2d50-1a0e-6701-14c5fefe92d6@dunslane.net
2021-10-22 09:50:16 -04:00
Amit Kapila 9e84f6a721 Back-patch "Add parent table name in an error in reorderbuffer.c."
This was originally done in commit 5e77625b26 for 15 only, as a
troubleshooting aid but multiple people showed interest in back-patching
this.

Author: Jeremy Schneider
Reviewed-by: Amit Kapila
Backpatch-through: 9.6
Discussion: https://postgr.es/m/808ed65b-994c-915a-361c-577f088b837f@amazon.com
2021-10-21 09:24:59 +05:30
Amit Kapila 671eb8f344 Remove unused wait events.
Commit 464824323e introduced the wait events which were neither used by
that commit nor by follow-up commits for that work.

Author: Masahiro Ikeda
Backpatch-through: 14, where it was introduced
Discussion: https://postgr.es/m/ff077840-3ab2-04dd-bbe4-4f5dfd2ad481@oss.nttdata.com
2021-10-21 08:07:08 +05:30
Michael Paquier 5040c96415 Fix corruption of pg_shdepend when copying deps from template database
Using for a new database a template database with shared dependencies
that need to be copied over was causing a corruption of pg_shdepend
because of an off-by-one computation error of the index number used for
the values inserted with a slot.

Issue introduced by e3931d0.  Monitoring the rest of the code, there are
no similar mistakes.

Reported-by: Sven Klemm
Author: Aleksander Alekseev
Reviewed-by: Daniel Gustafsson, Michael Paquier
Discussion: https://postgr.es/m/CAJ7c6TP0AowkUgNL6zcAK-s5HYsVHVBRWfu69FRubPpfwZGM9A@mail.gmail.com
Backpatch-through: 14
2021-10-21 10:39:07 +09:00
Alvaro Herrera 7182788552
Protect against collation variations in test
Discussion: https://postgr.es/m/YW/MYdSRQZtPFBWR@paquier.xyz
2021-10-20 13:05:42 -03:00
Michael Paquier 81aefaea82 Fix build of MSVC with OpenSSL 3.0.0
The build scripts of Visual Studio would fail to detect properly a 3.0.0
build as the check on the second digit was failing.  This is adjusted
where needed, allowing the builds to complete.  Note that the MSIs of
OpenSSL mentioned in the documentation have not changed any library
names for Win32 and Win64, making this change straight-forward.

Reported-by: htalaco, via github
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/YW5XKYkq6k7OtrFq@paquier.xyz
Backpatch-through: 9.6
2021-10-20 16:48:57 +09:00
Alvaro Herrera 3ce3fb2f7d
Ensure correct lock level is used in ALTER ... RENAME
Commit 1b5d797cd4 intended to relax the lock level used to rename
indexes, but inadvertently allowed *any* relation to be renamed with a
lowered lock level, as long as the command is spelled ALTER INDEX.
That's undesirable for other relation types, so retry the operation with
the higher lock if the relation turns out not to be an index.

After this fix, ALTER INDEX <sometable> RENAME will require access
exclusive lock, which it didn't before.

Author: Nathan Bossart <bossartn@amazon.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Onder Kalaci <onderk@microsoft.com>
Discussion: https://postgr.es/m/PH0PR21MB1328189E2821CDEC646F8178D8AE9@PH0PR21MB1328.namprd21.prod.outlook.com
2021-10-19 19:08:45 -03:00
Andres Freund 533315b68f Adapt src/test/ldap/t/001_auth.pl to work with openldap 2.5.
ldapsearch's deprecated -h/-p arguments were removed, need to use -H now -
which has been around for over 20 years.

As perltidy insists on reflowing the parameters anyway, change order and
"phrasing" to yield a less confusing layout (per suggestion from Tom Lane).

Discussion: https://postgr.es/m/20211009233850.wvr6apcrw2ai6cnj@alap3.anarazel.de
Backpatch: 11-, where the tests were added.
2021-10-19 11:15:45 -07:00
Tom Lane 04dae19f4d Fix assignment to array of domain over composite.
An update such as "UPDATE ... SET fld[n].subfld = whatever"
failed if the array elements were domains rather than plain
composites.  That's because isAssignmentIndirectionExpr()
failed to cope with the CoerceToDomain node that would appear
in the expression tree in this case.  The result would typically
be a crash, and even if we accidentally didn't crash, we'd not
correctly preserve other fields of the same array element.

Per report from Onder Kalaci.  Back-patch to v11 where arrays of
domains came in.

Discussion: https://postgr.es/m/PH0PR21MB132823A46AA36F0685B7A29AD8BD9@PH0PR21MB1328.namprd21.prod.outlook.com
2021-10-19 13:54:45 -04:00
Tom Lane f627fd547a Remove bogus assertion in transformExpressionList().
I think when I added this assertion (in commit 8f889b108), I was only
thinking of the use of transformExpressionList at top level of INSERT
and VALUES.  But it's also called by transformRowExpr(), which can
certainly occur in an UPDATE targetlist, so it's inappropriate to
suppose that p_multiassign_exprs must be empty.  Besides, since the
input is not expected to contain ResTargets, there's no reason it
should contain MultiAssignRefs either.  Hence this code need not
be concerned about the state of p_multiassign_exprs, and we should
just drop the assertion.

Per bug #17236 from ocean_li_996.  It's been wrong for years,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/17236-3210de9bcba1d7ca@postgresql.org
2021-10-19 11:35:15 -04:00
Daniel Gustafsson 3e2f32b01d Fix bug in TOC file error message printing
If the blob TOC file cannot be parsed, the error message was failing
to print the filename as the variable holding it was shadowed by the
destination buffer for parsing.  When the filename fails to parse,
the error will print an empty string:

 ./pg_restore -d foo -F d dump
 pg_restore: error: invalid line in large object TOC file "": ..

..instead of the intended error message:

 ./pg_restore -d foo -F d dump
 pg_restore: error: invalid line in large object TOC file "dump/blobs.toc": ..

Fix by renaming both variables as the shared name was too generic to
store either and still convey what the variable held.

Backpatch all the way down to 9.6.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/A2B151F5-B32B-4F2C-BA4A-6870856D9BDE@yesql.se
Backpatch-through: 9.6
2021-10-19 12:59:54 +02:00
Daniel Gustafsson 121be6a665 Fix sscanf limits in pg_basebackup and pg_dump
Make sure that the string parsing is limited by the size of the
destination buffer.

In pg_basebackup the available values sent from the server
is limited to two characters so there was no risk of overflow.

In pg_dump the buffer is bounded by MAXPGPATH, and thus the limit
must be inserted via preprocessor expansion and the buffer increased
by one to account for the terminator. There is no risk of overflow
here, since in this case, the buffer scanned is smaller than the
destination buffer.

Backpatch the pg_basebackup fix to 11 where it was introduced, and
the pg_dump fix all the way down to 9.6.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/B14D3D7B-F98C-4E20-9459-C122C67647FB@yesql.se
Backpatch-through: 11 and 9.6
2021-10-19 12:59:50 +02:00
Michael Paquier b1b797ec71 Block ALTER INDEX/TABLE index_name ALTER COLUMN colname SET (options)
The grammar of this command run on indexes with column names has always
been authorized by the parser, and it has never been documented.

Since 911e702, it is possible to define opclass parameters as of CREATE
INDEX, which actually broke the old case of ALTER INDEX/TABLE where
relation-level parameters n_distinct and n_distinct_inherited could be
defined for an index (see 76a47c0 and its thread where this point has
been touched, still remained unused).  Attempting to do that in v13~
would cause the index to become unusable, as there is a new dedicated
code path to load opclass parameters instead of the relation-level ones
previously available.  Note that it is possible to fix things with a
manual catalog update to bring the relation back online.

This commit disables this command for now as the use of column names for
indexes does not make sense anyway, particularly when it comes to index
expressions where names are automatically computed.  One way to properly
support this case properly in the future would be to use column numbers
when it comes to indexes, in the same way as ALTER INDEX .. ALTER COLUMN
.. SET STATISTICS.

Partitioned indexes were already blocked, but not indexes.  Some tests
are added for both cases.

There was some code in ANALYZE to enforce n_distinct to be used for an
index expression if the parameter was defined, but just remove it for
now until/if there is support for this (note that index-level parameters
never had support in pg_dump either, previously), so this was just dead
code.

Reported-by: Matthijs van der Vleuten
Author: Nathan Bossart, Michael Paquier
Reviewed-by: Vik Fearing, Dilip Kumar
Discussion: https://postgr.es/m/17220-15d684c6c2171a83@postgresql.org
Backpatch-through: 13
2021-10-19 11:04:00 +09:00
Alvaro Herrera 72d0642172
Invalidate partitions of table being attached/detached
Failing to do that, any direct inserts/updates of those partitions
would fail to enforce the correct constraint, that is, one that
considers the new partition constraint of their parent table.

Backpatch to 10.

Reported by: Hou Zhijie <houzj.fnst@fujitsu.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>

Discussion: https://postgr.es/m/OS3PR01MB5718DA1C4609A25186D1FBF194089%40OS3PR01MB5718.jpnprd01.prod.outlook.com
2021-10-18 19:08:25 -03:00
Michael Paquier 5b353aaff6 Reset properly snapshot export state during transaction abort
During a replication slot creation, an ERROR generated in the same
transaction as the one creating a to-be-exported snapshot would have
left the backend in an inconsistent state, as the associated static
export snapshot state was not being reset on transaction abort, but only
on the follow-up command received by the WAL sender that created this
snapshot on replication slot creation.  This would trigger inconsistency
failures if this session tried to export again a snapshot, like during
the creation of a replication slot.

Note that a snapshot export cannot happen in a transaction block, so
there is no need to worry resetting this state for subtransaction
aborts.  Also, this inconsistent state would very unlikely show up to
users.  For example, one case where this could happen is an
out-of-memory error when building the initial snapshot to-be-exported.
Dilip found this problem while poking at a different patch, that caused
an error in this code path for reasons unrelated to HEAD.

Author: Dilip Kumar
Reviewed-by: Michael Paquier, Zhihong Yu
Discussion: https://postgr.es/m/CAFiTN-s0zA1Kj0ozGHwkYkHwa5U0zUE94RSc_g81WrpcETB5=w@mail.gmail.com
Backpatch-through: 9.6
2021-10-18 11:56:48 +09:00
Tom Lane 3e4c8db931 Avoid core dump in pg_dump when dumping from pre-8.3 server.
Commit f0e21f2f6 missed adding a tgisinternal output column
to getTriggers' query for pre-8.3 servers.  Back-patch to v11,
like that commit.
2021-10-16 15:03:05 -04:00
Tom Lane b5152e3ba6 Make pg_dump acquire lock on partitioned tables that are to be dumped.
It was clearly the intent to do so all along, but the original coding
fat-fingered this by checking the wrong array element.  We fixed it
in passing in 403a3d91c, but that later got reverted, and we forgot
to keep this bug fix.

Most of the time this'd be relatively harmless, since once we lock
any of the partitioned table's leaf partitions, that would suffice
to prevent major DDL on the partitioned table itself.  However, a
childless partitioned table would get dumped with no relevant lock
whatsoever, possibly allowing dump failure or inconsistent output.

Unlike 403a3d91c, there are no versioning concerns, since every server
version that has partitioned tables will allow you to lock one.

Back-patch to v10 where partitioned tables were introduced.

Discussion: https://postgr.es/m/1018205.1634346327@sss.pgh.pa.us
2021-10-16 12:24:11 -04:00
Andrew Dunstan c697d8a39b
Fix PostgresNode install_path sanity tests that fail on Windows
Backpatch to 14 where install_path was introduced.
2021-10-15 12:58:17 -04:00
Peter Geoghegan 5863115e4c Remove unstable pg_amcheck tests.
Recent pg_amcheck bugfix commit d2bf06db added a test case that the
buildfarm has shown to be non-portable.  It doesn't particularly seem
worth keeping anyway.  Remove it.

Discussion: https://postgr.es/m/CAH2-Wz=7HKJ9WzAh7+M0JfwJ1yfT9qoE+KPa3P7iGToPOtGhXg@mail.gmail.com
Backpatch: 14-, just like the original commit.
2021-10-14 14:50:25 -07:00
Jeff Davis 0b90f1c4c3 Check criticalSharedRelcachesBuilt in GetSharedSecurityLabel().
An extension may want to call GetSecurityLabel() on a shared object
before the shared relcaches are fully initialized. For instance, a
ClientAuthentication_hook might want to retrieve the security label on
a role.

Discussion: https://postgr.es/m/ecb7af0b26e3be1d96d291c8453a86f1f82d9061.camel@j-davis.com
Backpatch-through: 9.6
2021-10-14 12:24:22 -07:00
Tom Lane fd059ac2e4 Fix planner error with pulling up subquery expressions into function RTEs.
If a function-in-FROM laterally references the output of some sub-SELECT
earlier in the FROM clause, and we are able to flatten that sub-SELECT
into the outer query, the expression(s) copied into the function RTE
missed being processed by eval_const_expressions.  This'd lead to trouble
and probable crashes at execution if such expressions contained
named-argument function call syntax or functions with defaulted arguments.
The bug is masked if the query contains any explicit JOIN syntax, which
may help explain why we'd not noticed.

Per bug #17227 from Bernd Dorn.  This is an oversight in commit 7266d0997,
so back-patch to v13 where that came in.

Discussion: https://postgr.es/m/17227-5a28ed1512189fa4@postgresql.org
2021-10-14 12:43:43 -04:00
Alvaro Herrera 79c7fe1af8
Change recently added test code for stability
The test code added with ff9f111bce fails under valgrind, and probably
other slow cases too, because if (say) autovacuum runs in between and
produces WAL of its own, the large INSERT fails to account for that in
the LSN calculations.  Rewrite to use a DO loop.

Per complaint from Andres Freund

Backpatch to all branches.

Discussion: https://postgr.es/m/20211013180338.5guyqzpkcisqugrl@alap3.anarazel.de
2021-10-13 18:49:27 -03:00
Peter Geoghegan dd58194cf5 pg_amcheck: avoid unhelpful verification attempts.
Avoid calling contrib/amcheck functions with relations that are
unsuitable for checking.  Specifically, don't attempt verification of
temporary relations, or indexes whose pg_index entry indicates that the
index is invalid, or not ready.

These relations are not supported by any of the contrib/amcheck
functions, for reasons that are pretty fundamental.  For example, the
implementation of REINDEX CONCURRENTLY can add its own "transient"
pg_index entries, which has rather unclear implications for the B-Tree
verification functions, at least in the general case -- so they just
treat it as an error.  It falls to the amcheck caller (in this case
pg_amcheck) to deal with the situation at a higher level.

pg_amcheck now simply treats these conditions as additional "visibility
concerns" when it queries system catalogs.  This is a little arbitrary.
It seems to have the least problems among any of the available
alternatives.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Bug: #17212
Discussion: https://postgr.es/m/17212-34dd4a1d6bba98bf@postgresql.org
Backpatch: 14-, where pg_amcheck was introduced.
2021-10-13 14:08:11 -07:00
Michael Paquier 922e15c476 Fix use-after-free with multirange types in CREATE TYPE
The code was freeing the name of the multirange type function stored in
the parse tree but it should not do that.  Event triggers could for
example look at such a corrupted parsed tree with a ddl_command_end
event.

Author: Alex Kozhemyakin, Sergey Shinderuk
Reviewed-by: Peter Eisentraut, Michael Paquier
Discussion: https://postgr.es/m/d5042d46-b9cd-6efb-219a-71ed0cf45bc8@postgrespro.ru
Backpatch-through: 14
2021-10-13 16:38:15 +09:00
Michael Paquier f4e1c8892b Fix tests of pg_upgrade across different major versions
This fixes a set of issues that cause different breakages or annoyances
when using pg_upgrade's test.sh to do upgrades across different major
versions:
- test.sh is completely broken when using v14 as new version because of
the removal of testtablespace/ as Makefile rule.  Older versions of
pg_regress don't support --make-tablespacedir, blocking the creation of
the tablespace.  In order to fix that, it is simple enough to create
those directories in the script itself, but only do that when an old
version is involved.  This fix is needed on HEAD and REL_14_STABLE.
- The script would fail when using PG <= v11 as old version because of
WITH OIDS relations not supported in v12.  In order to fix this, this
steals a method from the buildfarm that uses a DO block to change all
the relations marked as WITH OIDS, allowing pg_upgrade to pass.  This is
more portable than using ALTER TABLE queries on the relations causing
issues.  This is fixed down to v12, and authored originally by Andrew
Dunstan.
- Not using --extra-float-digits=0 with v11 as old version causes
a lot of diffs in the dumps, making the whole unreadable.  This gets
only done when using v11 as old version.  This is fixed down to v12.
The buildfarm code uses that already.

Note that the addition of --wal-segsize and --allow-group-access breaks
the script when using v10 or older at initdb time as these got added in
11.  10 would be EOL'd next year and nobody has complained about those
problems yet, so nothing is done about that.  This means that this
commit fixes upgrade tests using test.sh with v11 as minimum older
version, up to HEAD, and that it is enough to apply this change down to
12.  The old and new dumps still generate diffs, still require manual
checks, and more could be done to reduce the noise, but this allows the
tests to run with a rather minimal amount of them.

I have tested this commit and test.sh with v11 as minimum across all the
branches where this is applied.  Note that this commit has no impact on
the normal pg_upgrade test run with a simple "make check".

Author:  Justin Pryzby, Andrew Dunstan, Michael Paquier
Discussion: https://postgr.es/m/20201206180248.GI24052@telsasoft.com
Backpatch-through: 12
2021-10-13 09:22:23 +09:00
Michael Paquier d834ebcf23 Add more $Test::Builder::Level in the TAP tests
Incrementing the level of the call stack reported is useful for
debugging purposes as it allows to control which part of the test is
exactly failing, especially if a test is structured with subroutines
that call routines from Test::More.

This adds more incrementations of $Test::Builder::Level where debugging
gets improved (for example it does not make sense for some paths like
pg_rewind where long subroutines are used).

A note is added to src/test/perl/README about that, based on a
suggestion from Andrew Dunstan and a wording coming from both of us.

Usage of Test::Builder::Level has spread in 12, so a backpatch down to
this version is done.

Reviewed-by: Andrew Dunstan, Peter Eisentraut, Daniel Gustafsson
Discussion: https://postgr.es/m/YV1CCFwgM1RV1LeS@paquier.xyz
Backpatch-through: 12
2021-10-12 11:16:20 +09:00
Fujii Masao 62e821ad28 Make autovacuum launcher more responsive to pg_log_backend_memory_contexts().
Previously when pg_log_backend_memory_contexts() sent the request to
the autovacuum launcher, it could take more than several seconds to
log its memory contexts. Because the function (HandleAutoVacLauncherInterrupts)
to process any new interrupts that autovacuum launcher received
didn't handle the request for logging of memory contexts. This commit changes
the function so that it handles the request, to make autovacuum launcher
more responsitve to pg_log_backend_memory_contexts().

Back-patch to v14 where pg_log_backend_memory_contexts() was added.

Author: Koyu Tanigawa
Reviewed-by: Bharath Rupireddy, Atsushi Torikoshi
Discussion: https://postgr.es/m/0aae3e074face409b35153451be5cc11@oss.nttdata.com
2021-10-12 09:51:17 +09:00
Tom Lane 2c25db32ee Fix EXPLAIN of SEARCH BREADTH FIRST queries some more.
Commit 3f50b8263 had an oversight: formerly, to deparse expressions
attached to a plan node, it was only necessary to update the
deparse_namespace ancestors list alongside calling set_deparse_plan.
Now it's necessary to update the ancestors list *first*, because
set_deparse_plan consults it, and one call site got that wrong.

This error was masked in most cases because explain.c uses just one
List object for the ancestors list, updating it in-place as the plan
is scanned, so that we accidentally had the right List assigned to
dpns->ancestors before it was needed.  It would fail only if a
WorkTableScan node were the first one that we tried to deparse a
subexpression of.

Per report from Markus Winand.  Like the previous patch,
back-patch to v14.

Discussion: https://postgr.es/m/648B0505-AA57-42C2-A2DA-E551DE46FA15@winand.at
2021-10-11 11:56:52 -04:00
Etsuro Fujita ef2e107f54 Add missing word to comment in joinrels.c.
Author: Amit Langote
Backpatch-through: 13
Discussion: https://postgr.es/m/CA%2BHiwqGQNbtamQ_9DU3osR1XiWR4wxWFZurPmN6zgbdSZDeWmw%40mail.gmail.com
2021-10-07 17:45:01 +09:00
Dean Rasheed 8e26b868d5 Fix corner-case loss of precision in numeric_power().
This fixes a loss of precision that occurs when the first input is
very close to 1, so that its logarithm is very small.

Formerly, during the initial low-precision calculation to estimate the
result weight, the logarithm was computed to a local rscale that was
capped to NUMERIC_MAX_DISPLAY_SCALE (1000). However, the base may be
as close as 1e-16383 to 1, hence its logarithm may be as small as
1e-16383, and so the local rscale needs to be allowed to exceed 16383,
otherwise all precision is lost, leading to a poor choice of rscale
for the full-precision calculation.

Fix this by removing the cap on the local rscale during the initial
low-precision calculation, as we already do in the full-precision
calculation. This doesn't change the fact that the initial calculation
is a low-precision approximation, computing the logarithm to around 8
significant digits, which is very fast, especially when the base is
very close to 1.

Patch by me, reviewed by Alvaro Herrera.

Discussion: https://postgr.es/m/CAEZATCV-Ceu%2BHpRMf416yUe4KKFv%3DtdgXQAe5-7S9tD%3D5E-T1g%40mail.gmail.com
2021-10-06 13:19:25 +01:00
Michael Paquier ae254356f9 Fix warning in TAP test of pg_verifybackup
Oversight in a3fcbcd.

Reported-by: Thomas Munro
Discussion: https://postgr.es/m/CA+hUKGKnajZEwe91OTjro9kQLCMGGFHh2vvFn8tgHgbyn4bF9w@mail.gmail.com
Backpatch-through: 13
2021-10-06 13:28:30 +09:00
Andres Freund c4465cd09e Fix TestLib::slurp_file() with offset on windows.
3c5b0685b9 used setFilePointer() to set the position of the filehandle, but
passed the wrong filehandle, always leaving the position at 0. Instead of just
fixing that, remove use of setFilePointer(), we have a perl fd at this point,
so we can just use perl's seek().

Additionally, the perl filehandle wasn't closed, just the windows filehandle.

Reviewed-By: Andrew Dunstan <andrew@dunslane.net>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20211003173038.64mmhgxctfqn7wl6@alap3.anarazel.de
Backpatch: 9.6-, like 3c5b0685b9
2021-10-04 13:28:48 -07:00
Tom Lane 919c08d909 Update our mapping of Windows time zone names some more.
Per discussion, let's just follow CLDR's default zone mappings
faithfully.  There are two changes here that are clear improvements:

* Mapping "Greenwich Standard Time" to Atlantic/Reykjavik is actually
a better fit than using London, because Iceland hasn't observed DST
since 1968, so this is more nearly what people might expect.

* Since the "Samoa" zone is specified to be UTC+13:00, we must map
it to Pacific/Apia not Pacific/Samoa; the latter refers to American
Samoa which is now on the other side of the date line.

The rest of these changes look like they're choosing the most populous
IANA zone as representative.  Whatever the details, we're just going
to say "if you don't like this mapping, complain to CLDR".

Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us
2021-10-04 14:52:17 -04:00
Michael Paquier 828f7f0009 Fix snapshot builds during promotion of hot standby node with 2PC
Some specific logic is done at the end of recovery when involving 2PC
transactions:
1) Call RecoverPreparedTransactions(), to recover the state of 2PC
transactions into memory (re-acquire locks, etc.).
2) ShutdownRecoveryTransactionEnvironment(), to move back to normal
operations, mainly cleaning up recovery locks and KnownAssignedXids
(including any 2PC transaction tracked previously).
3) Switch XLogCtl->SharedRecoveryState to RECOVERY_STATE_DONE, which is
the tipping point for any process calling RecoveryInProgress() to check
if the cluster is still in recovery or not.

Any snapshot taken between steps 2) and 3) would be empty, causing any
transaction relying on a snapshot at this point to potentially corrupt
data as there could still be some 2PC transactions to track, with
RecentXmin moving backwards on successive calls to GetSnapshotData() in
the same transaction.

As SharedRecoveryState is the point to take into account to know if it
is safe to discard KnownAssignedXids, this commit moves step 2) after
step 3), so as we can never finish with empty snapshots.

This exists since the introduction of hot standby, so backpatch all the
way down.  The window with incorrect snapshots is extremely small, but I
have seen it when running 023_pitr_prepared_xact.pl, as did buildfarm
member fairywren.  Thomas Munro also found it independently.  Special
thanks to Andres Freund for taking the time to analyze this issue.

Reported-by: Thomas Munro, Michael Paquier
Analyzed-by: Andres Freund
Discussion: https://postgr.es/m/20210422203603.fdnh3fu2mmfp2iov@alap3.anarazel.de
Backpatch-through: 9.6
2021-10-04 14:05:48 +09:00
Tom Lane e0eba586b1 Fix checking of query type in plpgsql's RETURN QUERY command.
Prior to v14, we insisted that the query in RETURN QUERY be of a type
that returns tuples.  (For instance, INSERT RETURNING was allowed,
but not plain INSERT.)  That happened indirectly because we opened a
cursor for the query, so spi.c checked SPI_is_cursor_plan().  As a
consequence, the error message wasn't terribly on-point, but at least
it was there.

Commit 2f48ede08 lost this detail.  Instead, plain RETURN QUERY
insisted that the query be a SELECT (by checking for SPI_OK_SELECT)
while RETURN QUERY EXECUTE failed to check the query type at all.
Neither of these changes was intended.

The only convenient place to check this in the EXECUTE case is inside
_SPI_execute_plan, because we haven't done parse analysis until then.
So we need to pass down a flag saying whether to enforce that the
query returns tuples.  Fortunately, we can squeeze another boolean
into struct SPIExecuteOptions without an ABI break, since there's
padding space there.  (It's unlikely that any extensions would
already be using this new struct, but preserving ABI in v14 seems
like a smart idea anyway.)

Within spi.c, it seemed like _SPI_execute_plan's parameter list
was already ridiculously long, and I didn't want to make it longer.
So I thought of passing SPIExecuteOptions down as-is, allowing that
parameter list to become much shorter.  This makes the patch a bit
more invasive than it might otherwise be, but it's all internal to
spi.c, so that seems fine.

Per report from Marc Bachmann.  Back-patch to v14 where the
faulty code came in.

Discussion: https://postgr.es/m/1F2F75F0-27DF-406F-848D-8B50C7EEF06A@gmail.com
2021-10-03 13:21:20 -04:00
Tom Lane fa8db48791 Update our mapping of Windows time zone names using CLDR info.
This corrects a bunch of entries in win32_tzmap[], and adds a few
new ones, based on the CLDR project's windowsZones.xml file.
Non-cosmetic changes fall into four main categories:

* Flat-out errors:

US/Aleutan doesn't exist
America/Salvador doesn't exist
Asia/Baku is wrong for Yerevan
Asia/Dhaka (Bangladesh) is wrong for Astana (Kazakhstan)
Europe/Bucharest is wrong for Chisinau
America/Mexico_City is wrong for Chetumal
America/Buenos_Aires is wrong for Cayenne
America/Caracas has its own zone, so poor fit for La Paz
US/Eastern is wrong for Haiti
US/Eastern is wrong for Indiana (East)
Asia/Karachi is wrong for Tashkent
Etc/UTC+12 doesn't exist
Signs of Etc/GMT zones were backwards

* Judgment calls:

(These changes follow CLDR's choices, except for the first one)

Use Europe/London for "Greenwich Standard Time", since that seems much
more likely than Africa/Casablanca to be what people will think that
zone name means.  CLDR has Atlantic/Reykjavik here, but that's no better.

Asia/Shanghai seems a better fit than Hong Kong for "China Standard
Time".

Europe/Sarajevo is now a link to Belgrade, ie "Central Europe Standard
Time"; so use Warsaw for "Central European Standard Time".

America/Sao_Paulo seems more representative than Araguaina for
"E. South America Standard Time".

Africa/Johannesburg seems more representative than Harare for
"South Africa Standard Time".

* New Windows zone names:

"Israel Standard Time"
"Kaliningrad Standard Time"
"Russia Time Zone N" for various N
"Singapore Standard Time"
"South Sudan Standard Time"
"W. Central Africa Standard Time"
"West Bank Standard Time"
"Yukon Standard Time"

Some of these replace older spellings, but I kept the older spellings
too in case our code runs on a machine with the older data.

* Replace aliases (tzdb Links) with underlying city-named zones:

(This tracks tzdb's longstanding practice, and reduces inconsistency
with the rest of the entries, as well as with CLDR.)

US/Alaska
Asia/Kuwait
Asia/Muscat
Canada/Atlantic
Australia/Canberra
Canada/Saskatchewan
US/Central
US/Eastern
US/Hawaii
US/Mountain
Canada/Newfoundland
US/Pacific

Back-patch to all supported branches, as is our usual practice for
time zone data updates.

Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us
2021-10-02 16:06:09 -04:00
Tom Lane 81464999bc Re-alphabetize the win32_tzmap[] array.
The original intent seems to have been to sort case-insensitively
by the Windows zone name, but various changes over the years did
not get that memo.  This commit just moves a few entries to
restore exact alphabetic order, to ease comparison to the outputs
of processing scripts.

Back-patch to all supported branches, as is our usual practice for
time zone data updates.

Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us
2021-10-02 16:06:09 -04:00
Andres Freund 0baa33da0f Reference test binary using TESTDIR in 001_libpq_pipeline.pl.
The previous approach didn't really work on windows, due to the PATH separator
being ';' not ':'. Instead of making the PATH change more complicated,
reference the binary using the TESTDIR environment.

Reported-By: Andres Freund <andres@anarazel.de>
Suggested-By: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/20210930214040.odkdd42vknvzifm6@alap3.anarazel.de
Backpatch: 14-, where the test was introduced.
2021-10-01 15:30:31 -07:00
Alvaro Herrera 20047609d3
Error out if SKIP LOCKED and WITH TIES are both specified
Both bugs #16676[1] and #17141[2] illustrate that the combination of
SKIP LOCKED and FETCH FIRST WITH TIES break expectations when it comes
to rows returned to other sessions accessing the same row.  Since this
situation is detectable from the syntax and hard to fix otherwise,
forbid for now, with the potential to fix in the future.

[1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org
[2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org

Backpatch-through: 13, where WITH TIES was introduced
Author: David Christensen <david.christensen@crunchydata.com>
Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com
2021-10-01 18:29:18 -03:00
Alvaro Herrera 0ce67bce00
Remove unstable, unnecessary test; fix typo
Commit ff9f111bce added some test code that's unportable and doesn't
add meaningful coverage.  Remove it rather than try and get it to work
everywhere.

While at it, fix a typo in a log message added by the aforementioned
commit.

Backpatch to 14.

Discussion: https://postgr.es/m/3000074.1632947632@sss.pgh.pa.us
2021-10-01 18:03:11 -03:00
Daniel Gustafsson a5e83ad79c Fix memory leak in pg_hmac
The intermittent h buffer was not freed, causing it to leak. Backpatch
through 14 where HMAC was refactored to the current API.

Author: Sergey Shinderuk <s.shinderuk@postgrespro.ru>
Discussion: https://postgr.es/m/af07e620-7e28-a742-4637-2bc44aa7c2be@postgrespro.ru
Backpatch-through: 14
2021-10-01 22:47:05 +02:00
Tom Lane a54509bfd9 Avoid believing incomplete MCV-only stats in get_variable_range().
get_variable_range() would incautiously believe that statistics
containing only an MCV list are sufficient to derive a range estimate.
That's okay for an enum-like column that contains only MCVs, but
otherwise the estimate could be pretty bad.  Make it report that the
range is indeterminate unless the MCVs plus nullfrac account for
the whole table.

I don't think this needs a dedicated test case, since a quick code
coverage check verifies that the existing regression tests traverse
all the alternatives.  There is room to doubt that a future-proof
test case could be built anyway, given that the submitted example
accidentally doesn't fail before v11.

Per bug #17207 from Simon Perepelitsa.  Back-patch to v10.
In principle this has been broken all along, but I'm hesitant to
make such changes in 9.6, since if anyone is unhappy with 9.6.24's
behavior there will be no second chance to fix it.

Discussion: https://postgr.es/m/17207-5265aefa79e333b4@postgresql.org
2021-10-01 14:59:35 -04:00
Tom Lane e6adaa1795 Fix Portal snapshot tracking to handle subtransactions properly.
Commit 84f5c2908 forgot to consider the possibility that
EnsurePortalSnapshotExists could run inside a subtransaction with
lifespan shorter than the Portal's.  In that case, the new active
snapshot would be popped at the end of the subtransaction, leaving
a dangling pointer in the Portal, with mayhem ensuing.

To fix, make sure the ActiveSnapshot stack entry is marked with
the same subtransaction nesting level as the associated Portal.
It's certainly safe to do so since we won't be here at all unless
the stack is empty; hence we can't create an out-of-order stack.

Let's also apply this logic in the case where PortalRunUtility
sets portalSnapshot, just to be sure that path can't cause similar
problems.  It's slightly less clear that that path can't create
an out-of-order stack, so add an assertion guarding it.

Report and patch by Bertrand Drouvot (with kibitzing by me).
Back-patch to v11, like the previous commit.

Discussion: https://postgr.es/m/ff82b8c5-77f4-3fe7-6028-fcf3303e82dd@amazon.com
2021-10-01 11:10:12 -04:00
Tom Lane afc6081f6e Remove gratuitous environment dependency in 002_types.pl test.
Computing related timestamps by subtracting "N days" is sensitive
to the prevailing timezone, since we interpret that as "same local
time on the N'th prior day".  Even though the intervals in question
are only two to four days, through remarkable bad luck they managed
to cross the end of Ramadan in 2014, causing the test's output to
change if timezone is set to Africa/Casablanca.  (Maybe in other
Muslim areas as well; I didn't check.)  There's absolutely no reason
for this test to exercise interval subtraction, so just get rid of
that and use plain timestamptz constants representing the intended
values.

Per report from Andres Freund.  Back-patch to v10 where this test
script came in.

Discussion: https://postgr.es/m/20210930183641.7lh4jhvpipvromca@alap3.anarazel.de
2021-09-30 16:23:21 -04:00
Alvaro Herrera e3731bac52
Repair two portability oversights of new test
First, as pointed out by Tom Lane and Michael Paquier, I failed to
realize that Windows' PostgresNode needs an extra pg_hba.conf line
(added by PostgresNode->set_replication_conf, called internally by
->init() when 'allows_streaming=>1' is given -- but I purposefully
omitted that).  I think a good fix should be to have nodes with only
'has_archiving=>1' set up for replication too, but that's a bigger
discussion.  Fix it by calling ->set_replication_conf, which is not
unprecedented, as pointed out by Andrew Dunstan.

I also forgot to uncomment a ->finish() call for a pumpable IPC::Run
file descriptor.  Apparently this is innocuous in almost all platforms.

Backpatch to 14.  The older branches were added this file too, but not
this particular part of the test.

Discussion: https://postgr.es/m/3000074.1632947632@sss.pgh.pa.us
Discussion: https://postgr.es/m/YVT7qwhR8JmC2kfz@paquier.xyz
2021-09-30 10:01:03 -03:00
Alvaro Herrera 64a8687a68
Fix WAL replay in presence of an incomplete record
Physical replication always ships WAL segment files to replicas once
they are complete.  This is a problem if one WAL record is split across
a segment boundary and the primary server crashes before writing down
the segment with the next portion of the WAL record: WAL writing after
crash recovery would happily resume at the point where the broken record
started, overwriting that record ... but any standby or backup may have
already received a copy of that segment, and they are not rewinding.
This causes standbys to stop following the primary after the latter
crashes:
  LOG:  invalid contrecord length 7262 at A8/D9FFFBC8
because the standby is still trying to read the continuation record
(contrecord) for the original long WAL record, but it is not there and
it will never be.  A workaround is to stop the replica, delete the WAL
file, and restart it -- at which point a fresh copy is brought over from
the primary.  But that's pretty labor intensive, and I bet many users
would just give up and re-clone the standby instead.

A fix for this problem was already attempted in commit 515e3d84a0, but
it only addressed the case for the scenario of WAL archiving, so
streaming replication would still be a problem (as well as other things
such as taking a filesystem-level backup while the server is down after
having crashed), and it had performance scalability problems too; so it
had to be reverted.

This commit fixes the problem using an approach suggested by Andres
Freund, whereby the initial portion(s) of the split-up WAL record are
kept, and a special type of WAL record is written where the contrecord
was lost, so that WAL replay in the replica knows to skip the broken
parts.  With this approach, we can continue to stream/archive segment
files as soon as they are complete, and replay of the broken records
will proceed across the crash point without a hitch.

Because a new type of WAL record is added, users should be careful to
upgrade standbys first, primaries later. Otherwise they risk the standby
being unable to start if the primary happens to write such a record.

A new TAP test that exercises this is added, but the portability of it
is yet to be seen.

This has been wrong since the introduction of physical replication, so
backpatch all the way back.  In stable branches, keep the new
XLogReaderState members at the end of the struct, to avoid an ABI
break.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Nathan Bossart <bossartn@amazon.com>
Discussion: https://postgr.es/m/202108232252.dh7uxf6oxwcy@alvherre.pgsql
2021-09-29 11:41:01 -03:00
Fujii Masao 8231c500ed pgbench: Fix handling of socket errors during benchmark.
Previously socket errors such as invalid socket or socket wait method failures
during benchmark caused pgbench to exit with status 0. Instead, errors during
the run should result in exit status 2.

Back-patch to v12 where pgbench started reporting exit status.

Original complaint and patch by Hayato Kuroda.

Author: Yugo Nagata, Fabien COELHO
Reviewed-by: Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/TYCPR01MB5870057375ACA8A73099C649F5349@TYCPR01MB5870.jpnprd01.prod.outlook.com
2021-09-29 21:49:29 +09:00
Fujii Masao 8021334d37 pgbench: Correct log level of message output when socket wait method fails.
The failure of socket wait method like "select()" doesn't terminate pgbench.
So the log level of error message when that failure happens should be ERROR.
But previously FATAL was used in that case.

Back-patch to v13 where pgbench started using common logging API.

Author: Yugo Nagata, Fabien COELHO
Reviewed-by: Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/20210617005934.8bd37bf72efd5f1b38e6f482@sraoss.co.jp
2021-09-29 21:47:25 +09:00
Michael Paquier 2cf9cf5d7b Clarify use of "statistics objects" in the code
The code inconsistently used "statistic object" or "statistics" where
the correct term, as discussed, is actually "statistics object".  This
improves the state of the code to be more consistent.

While on it, fix an incorrect error message introduced in a4d75c8.  This
error should never happen, as the code states, but it would be
misleading.

Author: Justin Pryzby
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
Backpatch-through: 14
2021-09-29 15:29:45 +09:00
Magnus Hagander febbb2f52c Properly schema-prefix reference to pg_catalog.pg_get_statisticsobjdef_columns
Author: Tatsuro Yamada
Backpatch-through: 14
Discussion: https://www.postgresql.org/message-id/7ad8cd13-db5b-5cf6-8561-dccad1a934cb@nttcom.co.jp
2021-09-28 16:24:24 +02:00
Peter Eisentraut 9f535e4aea Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 941ca560d0b36a8bace8432b06302ca003829f42
2021-09-27 09:22:27 +02:00
Peter Eisentraut e536a26834 Add missing $Test::Builder::Level settings
One of these was accidentally removed by c50624c.  The others are
added by analogy.

Discussion: https://www.postgresql.org/message-id/ae1143fb-455c-c80f-ed66-78d45bd93303@enterprisedb.com
2021-09-23 23:06:55 +02:00
Alexander Korotkov 7186f07189 Split macros from visibilitymap.h into a separate header
That allows to include just visibilitymapdefs.h from file.c, and in turn,
remove include of postgres.h from relcache.h.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/20210913232614.czafiubr435l6egi%40alap3.anarazel.de
Author: Alexander Korotkov
Reviewed-by: Andres Freund, Tom Lane, Alvaro Herrera
Backpatch-through: 13
2021-09-23 19:59:11 +03:00
Tomas Vondra abb2f9144b Release memory allocated by dependency_degree
Calculating degree of a functional dependency may allocate a lot of
memory - we have released mot of the explicitly allocated memory, but
e.g. detoasted varlena values were left behind. That may be an issue,
because we consider a lot of dependencies (all combinations), and the
detoasting may happen for each one again.

Fixed by calling dependency_degree() in a dedicated context, and
resetting it after each call. We only need the calculated dependency
degree, so we don't need to copy anything.

Backpatch to PostgreSQL 10, where extended statistics were introduced.

Backpatch-through: 10
Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
2021-09-23 18:25:37 +02:00
Tomas Vondra bb7628e55e Free memory after building each statistics object
Until now, all extended statistics on a given relation were built in the
same memory context, without resetting. Some of the memory was released
explicitly, but not all of it - for example memory allocated while
detoasting values is hard to free. This is how it worked since extended
statistics were introduced in PostgreSQL 10, but adding support for
extended stats on expressions made the issue somewhat worse as it
increases the number of statistics to build.

Fixed by adding a memory context which gets reset after building each
statistics object (all the statistics kinds included in it). Resetting
it after building each statistics kind would be even better, but it
would require more invasive changes and copying of results, making it
harder to backpatch.

Backpatch to PostgreSQL 10, where extended statistics were introduced.

Author: Justin Pryzby
Reported-by: Justin Pryzby
Reviewed-by: Tomas Vondra
Backpatch-through: 10
Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
2021-09-23 18:22:29 +02:00
Amit Kapila 9eff859326 Invalidate all partitions for a partitioned table in publication.
Updates/Deletes on a partition were allowed even without replica identity
after the parent table was added to a publication. This would later lead
to an error on subscribers. The reason was that we were not invalidating
the partition's relcache and the publication information for partitions
was not getting rebuilt. Similarly, we were not invalidating the
partitions' relcache after dropping a partitioned table from a publication
which will prohibit Updates/Deletes on its partition without replica
identity even without any publication.

Reported-by: Haiying Tang
Author: Hou Zhijie and Vignesh C
Reviewed-by: Vignesh C and Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com
2021-09-22 08:13:37 +05:30
Peter Geoghegan e665129c47 Fix "single value strategy" index deletion issue.
It is not appropriate for deduplication to apply single value strategy
when triggered by a bottom-up index deletion pass.  This wastes cycles
because later bottom-up deletion passes will overinterpret older
duplicate tuples that deduplication actually just skipped over "by
design".  It also makes bottom-up deletion much less effective for low
cardinality indexes that happen to cross a meaningless "index has single
key value per leaf page" threshold.

To fix, slightly narrow the conditions under which deduplication's
single value strategy is considered.  We already avoided the strategy
for a unique index, since our high level goal must just be to buy time
for VACUUM to run (not to buy space).  We'll now also avoid it when we
just had a bottom-up pass that reported failure.  The two cases share
the same high level goal, and already overlapped significantly, so this
approach is quite natural.

Oversight in commit d168b666, which added bottom-up index deletion.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznaOvM+Gyj-JQ0X=JxoMDxctDTYjiEuETdAGbF5EUc3MA@mail.gmail.com
Backpatch: 14-, where bottom-up deletion was introduced.
2021-09-21 18:57:31 -07:00
Michael Paquier 90251ff199 Fix places in TestLib.pm in need of adaptation to the output of Msys perl
Contrary to the output of native perl, Msys perl generates outputs with
CRLFs characters.  There are already places in the TAP code where CRLFs
(\r\n) are automatically converted to LF (\n) on Msys, but we missed a
couple of places when running commands and using their output for
comparison, that would lead to failures.

This problem has been found thanks to the test added in 5adb067 using
TestLib::command_checks_all(), but after a closer look more code paths
were missing a filter.

This is backpatched all the way down to prevent any surprises if a new
test is introduced in stable branches.

Reviewed-by: Andrew Dunstan, Álvaro Herrera
Discussion: https://postgr.es/m/1252480.1631829409@sss.pgh.pa.us
Backpatch-through: 9.6
2021-09-22 08:43:00 +09:00
Tom Lane 2ad5f963e1 Fix misevaluation of STABLE parameters in CALL within plpgsql.
Before commit 84f5c2908, a STABLE function in a plpgsql CALL
statement's argument list would see an up-to-date snapshot,
because exec_stmt_call would push a new snapshot.  I got rid of
that because the possibility of the snapshot disappearing within
COMMIT made it too hard to manage a snapshot across the CALL
statement.  That's fine so far as the procedure itself goes,
but I forgot to think about the possibility of STABLE functions
within the CALL argument list.  As things now stand, those'll
be executed with the Portal's snapshot as ActiveSnapshot,
keeping them from seeing updates more recent than Portal startup.

(VOLATILE functions don't have a problem because they take their
own snapshots; which indeed is also why the procedure itself
doesn't have a problem.  There are no STABLE procedures.)

We can fix this by pushing a new snapshot transiently within
ExecuteCallStmt itself.  Popping the snapshot before we get
into the procedure proper eliminates the management problem.
The possibly-useless extra snapshot-grab is slightly annoying,
but it's no worse than what happened before 84f5c2908.

Per bug #17199 from Alexander Nawratil.  Back-patch to v11,
like the previous patch.

Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org
2021-09-21 19:06:54 -04:00
Alvaro Herrera c1d1ae1db2
Document XLOG_INCLUDE_XID a little better
I noticed that commit 0bead9af48 left this flag undocumented in
XLogSetRecordFlags, which led me to discover that the flag doesn't
actually do what the one comment on it said it does.  Improve the
situation by adding some more comments.

Backpatch to 14, where the aforementioned commit appears.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/202109212119.c3nhfp64t2ql@alvherre.pgsql
2021-09-21 19:47:53 -03:00
Peter Geoghegan 955a6a8934 Remove overzealous index deletion assertion.
A broken HOT chain is not an unexpected condition, even when the offset
number points past the end of the page's line pointer array.
heap_prune_chain() does not (and never has) treated this condition as
unexpected, so derivative code in heap_index_delete_tuples() shouldn't
do so either.

Oversight in commit 4228817449.

The assertion can probably only fail on Postgres 14 and master.  Earlier
releases don't have commit 3c3b8a4b, which taught VACUUM to truncate the
line pointer array of heap pages.  Backpatch all the same, just to be
consistent.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17197-9438f31f46705182@postgresql.org
Backpatch: 12-, just like commit 4228817449.
2021-09-20 14:26:24 -07:00
Peter Eisentraut 3e8525aab8 Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 10b675b81a3a04bac460cb049e0b7b6e17fb4795
2021-09-20 16:23:13 +02:00
Tomas Vondra 6606107715 Disallow extended statistics on system columns
Since introduction of extended statistics, we've disallowed references
to system columns. So for example

    CREATE STATISTICS s ON ctid FROM t;

would fail. But with extended statistics on expressions, it was possible
to work around this limitation quite easily

    CREATE STATISTICS s ON (ctid::text) FROM t;

This is an oversight in a4d75c86bf, fixed by adding a simple check.
Backpatch to PostgreSQL 14, where support for extended statistics on
expressions was introduced.

Backpatch-through: 14
Discussion: https://postgr.es/m/20210816013255.GS10479%40telsasoft.com
2021-09-20 00:45:29 +02:00
Tom Lane 4d5b4483db Fix pull_varnos to cope with translated PlaceHolderVars.
Commit 55dc86eca changed pull_varnos to use (if possible) the associated
ph_eval_at for a PlaceHolderVar.  I missed a fine point though: we might
be looking at a PHV in the quals or tlist of a child appendrel, in which
case we need to compute a ph_eval_at value that's been translated in the
same way that the PHV itself has been (cf. adjust_appendrel_attrs).
Fortunately, enough info is available in the PlaceHolderInfo to make
such translation possible without additional outside data, so we don't
need another round of uglification of planner APIs.  This is a little
bit complicated, but since it's a hard-to-hit corner case, I'm not much
worried about adding cycles here.

Per report from Jaime Casanova.  Back-patch to v12, like the previous
commit.

Discussion: https://postgr.es/m/20210915230959.GB17635@ahch-to
2021-09-17 15:41:16 -04:00
Tom Lane 388726753b Fix EXPLAIN to handle SEARCH BREADTH FIRST queries.
The rewriter transformation for SEARCH BREADTH FIRST produces a
FieldSelect on a Var of type RECORD, where the Var references the
recursive union's worktable output.  EXPLAIN VERBOSE failed to handle
this case, because it only expected such Vars to appear in CteScans
not WorkTableScans.  Fix that, and add some test cases exercising
EXPLAIN on SEARCH and CYCLE queries.

In principle this oversight is an old bug, but it seems that the
case is unreachable without SEARCH BREADTH FIRST, because the
parser fails when attempting to create such a reference manually.
So for today I'll just patch HEAD/v14.  Someday we might find that
the code portion of this patch needs to be back-patched further.

Per report from Atsushi Torikoshi.

Discussion: https://postgr.es/m/5bafa66ad529e11860339565c9e7c166@oss.nttdata.com
2021-09-16 10:45:42 -04:00
Peter Eisentraut f46dc96fcc Message style improvements 2021-09-16 15:36:58 +02:00
Andres Freund 7890a42347 Fix performance regression from session statistics.
Session statistics, as introduced by 960869da08, had several shortcomings:

- an additional GetCurrentTimestamp() call that also impaired the accuracy of
  the data collected

  This can be avoided by passing the current timestamp we already have in
  pgstat_report_stat().

- an additional statistics UDP packet sent every 500ms

  This is solved by adding the new statistics to PgStat_MsgTabstat.
  This is conceptually ugly, because session statistics are not
  table statistics.  But the struct already contains data unrelated
  to tables, so there is not much damage done.

  Connection and disconnection are reported in separate messages, which
  reduces the number of additional messages to two messages per session and a
  slight increase in PgStat_MsgTabstat size (but the same number of table
  stats fit).

- Session time computation could overflow on systems where long is 32 bit.

Reported-By: Andres Freund <andres@anarazel.de>
Author: Andres Freund <andres@anarazel.de>
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/20210801205501.nyxzxoelqoo4x2qc%40alap3.anarazel.de
Backpatch: 14-, where the feature was introduced.
2021-09-16 02:10:57 -07:00
Fujii Masao 92a8d7610e Fix variable shadowing in procarray.c.
ProcArrayGroupClearXid function has a parameter named "proc",
but the same name was used for its local variables. This commit fixes
this variable shadowing, to improve code readability.

Back-patch to all supported versions, to make future back-patching
easy though this patch is classified as refactoring only.

Reported-by: Ranier Vilela
Author: Ranier Vilela, Aleksander Alekseev
https://postgr.es/m/CAEudQAqyoTZC670xWi6w-Oe2_Bk1bfu2JzXz6xRfiOUzm7xbyQ@mail.gmail.com
2021-09-16 13:07:10 +09:00
Fujii Masao fe8821ca7d Use int instead of size_t in procarray.c.
All size_t variables declared in procarray.c are actually int ones.
Let's use int instead of size_t for those variables. Which would
reduce Wsign-compare compiler warnings.

Back-patch to v14 where commit 941697c3c1 added size_t variables
in procarray.c, to make future back-patching easy though
this patch is classified as refactoring only.

Reported-by: Ranier Vilela
Author: Ranier Vilela, Aleksander Alekseev
https://postgr.es/m/CAEudQAqyoTZC670xWi6w-Oe2_Bk1bfu2JzXz6xRfiOUzm7xbyQ@mail.gmail.com
2021-09-16 12:54:15 +09:00
Tom Lane d84d62b622 Disallow LISTEN in background workers.
It's possible to execute user-defined SQL in some background processes;
for example, logical replication workers can fire triggers.  This opens
the possibility that someone would try to execute LISTEN in such a
context.  But since only regular backends ever call
ProcessNotifyInterrupt, no messages would actually be received, and
thus the registered listener would simply prevent the message queue
from being cleaned.  Eventually NOTIFY would stop working, which is bad.

Perhaps someday somebody will invent infrastructure to make listening
in a background worker actually useful.  In the meantime, forbid it.

Back-patch to v13, which is where we introduced the MyBackendType
variable.  It'd be a lot harder to implement the check without that,
and it doesn't seem worth the trouble.

Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
2021-09-15 12:31:56 -04:00
Peter Eisentraut 9b2fd49057 Fix hash_array
Commit 054adca641 neglected to
initialize the type_id field of the synthesized type cache entry, so
it would make a new one on every call.

Also, better use the per-function memory context for this; otherwise
it leaks memory.

Discussion: https://www.postgresql.org/message-id/flat/17158-8a2ba823982537a4%40postgresql.org
2021-09-15 12:15:20 +02:00
Tom Lane 0eff10a008 Send NOTIFY signals during CommitTransaction.
Formerly, we sent signals for outgoing NOTIFY messages within
ProcessCompletedNotifies, which was also responsible for sending
relevant ones of those messages to our connected client.  It therefore
had to run during the main-loop processing that occurs just before
going idle.  This arrangement had two big disadvantages:

* Now that procedures allow intra-command COMMITs, it would be
useful to send NOTIFYs to other sessions immediately at COMMIT
(though, for reasons of wire-protocol stability, we still shouldn't
forward them to our client until end of command).

* Background processes such as replication workers would not send
NOTIFYs at all, since they never execute the client communication
loop.  We've had requests to allow triggers running in replication
workers to send NOTIFYs, so that's a problem.

To fix these things, move transmission of outgoing NOTIFY signals
into AtCommit_Notify, where it will happen during CommitTransaction.
Also move the possible call of asyncQueueAdvanceTail there, to
ensure we don't bloat the async SLRU if a background worker sends
many NOTIFYs with no one listening.

We can also drop the call of asyncQueueReadAllNotifications,
allowing ProcessCompletedNotifies to go away entirely.  That's
because commit 790026972 added a call of ProcessNotifyInterrupt
adjacent to PostgresMain's call of ProcessCompletedNotifies,
and that does its own call of asyncQueueReadAllNotifications,
meaning that we were uselessly doing two such calls (inside two
separate transactions) whenever inbound notify signals coincided
with an outbound notify.  We need only set notifyInterruptPending
to ensure that ProcessNotifyInterrupt runs, and we're done.

The existing documentation suggests that custom background workers
should call ProcessCompletedNotifies if they want to send NOTIFY
messages.  To avoid an ABI break in the back branches, reduce it
to an empty routine rather than removing it entirely.  Removal
will occur in v15.

Although the problems mentioned above have existed for awhile,
I don't feel comfortable back-patching this any further than v13.
There was quite a bit of churn in adjacent code between 12 and 13.
At minimum we'd have to also backpatch 51004c717, and a good deal
of other adjustment would also be needed, so the benefit-to-risk
ratio doesn't look attractive.

Per bug #15293 from Michael Powers (and similar gripes from others).

Artur Zakirov and Tom Lane

Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
2021-09-14 17:18:25 -04:00
Tom Lane 29aa0ce361 Fix planner error with multiple copies of an AlternativeSubPlan.
It's possible for us to copy an AlternativeSubPlan expression node
into multiple places, for example the scan quals of several
partition children.  Then it's possible that we choose a different
one of the alternatives as optimal in each place.  Commit 41efb8340
failed to consider this scenario, so its attempt to remove "unused"
subplans could remove subplans that were still used elsewhere.

Fix by delaying the removal logic until we've examined all the
AlternativeSubPlans in a given query level.  (This does assume that
AlternativeSubPlans couldn't get copied to other query levels, but
for the foreseeable future that's fine; cf qual_is_pushdown_safe.)

Per report from Rajkumar Raghuwanshi.  Back-patch to v14
where the faulty logic came in.

Discussion: https://postgr.es/m/CAKcux6==O3NNZC3bZ2prRYv3cjm3_Zw1GfzmOjEVqYN4jub2+Q@mail.gmail.com
2021-09-14 15:11:21 -04:00
Andres Freund 4e86887e09 jit: Do not try to shut down LLVM state in case of LLVM triggered errors.
If an allocation failed within LLVM it is not safe to call back into LLVM as
LLVM is not generally safe against exceptions / stack-unwinding. Thus errors
while in LLVM code are promoted to FATAL. However llvm_shutdown() did call
back into LLVM even in such cases, while llvm_release_context() was careful
not to do so.

We cannot generally skip shutting down LLVM, as that can break profiling. But
it's OK to do so if there was an error from within LLVM.

Reported-By: Jelte Fennema <Jelte.Fennema@microsoft.com>
Author: Andres Freund <andres@anarazel.de>
Author: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/AM5PR83MB0178C52CCA0A8DEA0207DC14F7FF9@AM5PR83MB0178.EURPRD83.prod.outlook.com
Backpatch: 11-, where jit was introduced
2021-09-13 18:15:28 -07:00
Andres Freund 0d0bbee5e3 Fix potential for compiler warning in GlobalVisTestFor().
In d9d8aa9bb9 I added a defensive NULL assignment to protect against a
not-too-smart compiler warning about unitialized variable use after the
switch. Unfortunately I only did so on master and forgot to adjust that for
14.

Stephen noticed that there actually is a compiler warning :(.

Reported-By: Stephen Frost <sfrost@snowman.net>
Discussion: https://postgr.es/m/20210827224639.GX17906@tamriel.snowman.net
2021-09-13 16:50:10 -07:00
Tom Lane 896a0c44f9 Clear conn->errorMessage at successful completion of PQconnectdb().
Commits ffa2e4670 and 52a10224e caused libpq's connection-establishment
functions to usually leave a nonempty string in the connection's
errorMessage buffer, even after a successful connection.  While that
was intentional on my part, more sober reflection says that it wasn't
a great idea: the string would be a bit confusing.  Also this broke at
least one application that checked for connection success by examining
the errorMessage, instead of using PQstatus() as documented.  Let's
clear the buffer at success exit, restoring the pre-v14 behavior.

Discussion: https://postgr.es/m/4170264.1620321747@sss.pgh.pa.us
2021-09-13 16:53:11 -04:00
Tom Lane 4ffd3fe4d6 Fix EXIT out of outermost block in plpgsql.
Ordinarily, using EXIT this way would draw "control reached end of
function without RETURN".  However, if the function is one where we
don't require an explicit RETURN (such as a DO block), that should
not happen.  It did anyway, because add_dummy_return() neglected to
account for the case.

Per report from Herwig Goemans.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/868ae948-e3ca-c7ec-95a6-83cfc08ef750@gmail.com
2021-09-13 12:42:28 -04:00
Amit Kapila f5e0ff4631 Fix reorder buffer memory accounting for toast changes.
While processing toast changes in logical decoding, we rejigger the
tuple change to point to in-memory toast tuples instead to on-disk toast
tuples. And, to make sure the memory accounting is correct, we were
subtracting the old change size and then after re-computing the new tuple,
re-adding its size at the end. Now, if there is any error before we add
the new size, we will release the changes and that will update the
accounting info (subtracting the size from the counters). And we were
underflowing there which leads to an assertion failure in assert enabled
builds and wrong memory accounting in reorder buffer otherwise.

Author: Bertrand Drouvot
Reviewed-by: Amit Kapila
Backpatch-through: 13, where memory accounting was introduced
Discussion: https://postgr.es/m/92b0ee65-b8bd-e42d-c082-4f3f4bf12d34@amazon.com
2021-09-13 10:35:00 +05:30
Michael Paquier cc057fb315 Fix error handling with threads on OOM in ECPG connection logic
An out-of-memory failure happening when allocating the structures to
store the connection parameter keywords and values would mess up with
the set of connections saved, as on failure the pthread mutex would
still be hold with the new connection object listed but free()'d.

Rather than just unlocking the mutex, which would leave the static list
of connections into an inconsistent state, move the allocation for the
structures of the connection parameters before beginning the test
manipulation.  This ensures that the list of connections and the
connection mutex remain consistent all the time in this code path.

This error is unlikely going to happen, but this could mess up badly
with ECPG clients in surprising ways, so backpatch all the way down.

Reported-by: ryancaicse
Discussion: https://postgr.es/m/17186-b4cfd8f0eb4d1dee@postgresql.org
Backpatch-through: 9.6
2021-09-13 13:24:04 +09:00
Tom Lane b33283cbd3 Make pg_regexec() robust against out-of-range search_start.
If search_start is greater than the length of the string, we should just
return REG_NOMATCH immediately.  (Note that the equality case should
*not* be rejected, since the pattern might be able to match zero
characters.)  This guards various internal assumptions that the min of a
range of string positions is not more than the max.  Violation of those
assumptions could allow an attempt to fetch string[search_start-1],
possibly causing a crash.

Jaime Casanova pointed out that this situation is reachable with the
new regexp_xxx functions that accept a user-specified start position.
I don't believe it's reachable via any in-core call site in v14 and
below.  However, extensions could possibly call pg_regexec with an
out-of-range search_start, so let's back-patch the fix anyway.

Discussion: https://postgr.es/m/20210911180357.GA6870@ahch-to
2021-09-11 15:19:43 -04:00
Tom Lane d844cd75a6 Fix some anomalies with NO SCROLL cursors.
We have long forbidden fetching backwards from a NO SCROLL cursor,
but the prohibition didn't extend to cases in which we rewind the
query altogether and then re-fetch forwards.  I think the reason is
that this logic was mainly meant to protect plan nodes that can't
be run in the reverse direction.  However, re-reading the query output
is problematic if the query is volatile (which includes SELECT FOR
UPDATE, not just queries with volatile functions): the re-read can
produce different results, which confuses the cursor navigation logic
completely.  Another reason for disliking this approach is that some
code paths will either fetch backwards or rewind-and-fetch-forwards
depending on the distance to the target row; so that seemingly
identical use-cases may or may not draw the "cursor can only scan
forward" error.  Hence, let's clean things up by disallowing rewind
as well as fetch-backwards in a NO SCROLL cursor.

Ordinarily we'd only make such a definitional change in HEAD, but
there is a third reason to consider this change now.  Commit ba2c6d6ce
created some new user-visible anomalies for non-scrollable cursors
WITH HOLD, in that navigation in the cursor result got confused if the
cursor had been partially read before committing.  The only good way
to resolve those anomalies is to forbid rewinding such a cursor, which
allows removal of the incorrect cursor state manipulations that
ba2c6d6ce added to PersistHoldablePortal.

To minimize the behavioral change in the back branches (including
v14), refuse to rewind a NO SCROLL cursor only when it has a holdStore,
ie has been held over from a previous transaction due to WITH HOLD.
This should avoid breaking most applications that have been sloppy
about whether to declare cursors as scrollable.  We'll enforce the
prohibition across-the-board beginning in v15.

Back-patch to v11, as ba2c6d6ce was.

Discussion: https://postgr.es/m/3712911.1631207435@sss.pgh.pa.us
2021-09-10 13:18:32 -04:00
Tom Lane b7056c0a25 Avoid fetching from an already-terminated plan.
Some plan node types don't react well to being called again after
they've already returned NULL.  PortalRunSelect() has long dealt
with this by calling the executor with NoMovementScanDirection
if it sees that we've already run the portal to the end.  However,
commit ba2c6d6ce overlooked this point, so that persisting an
already-fully-fetched cursor would fail if it had such a plan.

Per report from Tomas Barton.  Back-patch to v11, as the faulty
commit was.  (I've omitted a test case because the type of plan
that causes a problem isn't all that stable.)

Discussion: https://postgr.es/m/CAPV2KRjd=ErgVGbvO2Ty20tKTEZZr6cYsYLxgN_W3eAo9pf5sw@mail.gmail.com
2021-09-09 13:36:44 -04:00
Fujii Masao b27d0cd314 pgbench: Stop counting skipped transactions as soon as timer is exceeded.
When throttling is used, transactions that lag behind schedule by
more than the latency limit are counted and reported as skipped.
Previously, there was the case where pgbench counted all skipped
transactions even if the timer specified in -T option was exceeded.
This could take a very long time to do that especially when unrealistically
high rate setting in -R option caused quite a lot of transactions that
lagged behind schedule. This could prevent pgbench from ending
immediately, and so pgbench could look like it got stuck to users.

To fix the issue, this commit changes pgbench so that it stops counting
skipped transactions as soon as the timer is exceeded. The timer can
make pgbench end soon even when there are lots of skipped transactions
that have not been counted yet.

Note that there is no guarantee that all skipped transactions are
counted under -T though there is under -t. This is OK in practice
because it's very unlikely to happen with realistic setting. Also this is
not the issue that this commit newly introdues. There used to be
the case where pgbench ended without counting all skipped
transactions since before.

Back-patch to v14. Per discussion, we decided not to bother
back-patch to the stable branches because it's hard to imagine
the issue happens in practice (with realistic setting).

Author: Yugo Nagata, Fabien COELHO
Reviewed-by: Greg Sabino Mullane, Fujii Masao
Discussion: https://postgr.es/m/20210613040151.265ff59d832f835bbcf8b3ba@sraoss.co.jp
2021-09-10 01:28:51 +09:00
Tom Lane 7430c77420 Check for relation length overrun soon enough.
We don't allow relations to exceed 2^32-1 blocks, because block
numbers are 32 bits and the last possible block number is reserved
to mean InvalidBlockNumber.  There is a check for this in mdextend,
but that's really way too late, because the smgr API requires us to
create a buffer for the block-to-be-added, and we do not want to
have any buffer with blocknum InvalidBlockNumber.  (Such a case
can trigger assertions in bufmgr.c, plus I think it might confuse
ReadBuffer's logic for data-past-EOF later on.)  So put the check
into ReadBuffer.

Per report from Christoph Berg.  It's been like this forever,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/YTn1iTkUYBZfcODk@msg.credativ.de
2021-09-09 11:45:48 -04:00
Fujii Masao b5ec22bf5e Fix issue with WAL archiving in standby.
Previously, walreceiver always closed the currently-opened WAL segment
and created its archive notification file, after it finished writing
the current segment up and received any WAL data that should be
written into the next segment. If walreceiver exited just before
any WAL data in the next segment arrived at standby, it did not
create the archive notification file of the current segment
even though that's known completed. This behavior could cause
WAL archiving of the segment to be delayed until subsequent
restartpoints or checkpoints created its notification file.

To fix the issue, this commit changes walreceiver so that it creates
an archive notification file of a current WAL segment immediately
if that's known completed before receiving next WAL data.

Back-patch to all supported branches.

Reported-by: Kyotaro Horiguchi
Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20200630.165503.1465894182551545886.horikyota.ntt@gmail.com
2021-09-09 23:58:05 +09:00
Tom Lane 52c300df32 Avoid useless malloc/free traffic around getFormattedTypeName().
Coverity complained that one caller of getFormattedTypeName() failed
to free the returned string.  Which is true, but rather than fixing
that one, let's get rid of this tedious and error-prone requirement.
Now that getFormattedTypeName() caches its result, strdup'ing that
result and expecting the caller to free it accomplishes little except
to waste cycles.  We do create a leak in the case where getTypes didn't
make a TypeInfo for the type, but that basically shouldn't ever happen.

Back-patch, as commit 6c450a861 was.  This isn't a particularly
interesting bug fix, but the API change seems like a hazard for
future back-patching activity if we don't back-patch it.
2021-09-08 15:09:42 -04:00
Tom Lane 67948a433e Fix misleading comments about TOAST access macros.
Seems to have been my error in commit aeb1631ed.
Noted by Christoph Berg.

Discussion: https://postgr.es/m/YTeLipdnSOg4NNcI@msg.df7cb.de
2021-09-08 14:11:35 -04:00
Tom Lane 03d01d746b Fix rewriter to set hasModifyingCTE correctly on rewritten queries.
If we copy data-modifying CTEs from the original query to a replacement
query (from a DO INSTEAD rule), we must set hasModifyingCTE properly
in the replacement query.  Failure to do this can cause various
unpleasantness, such as unsafe usage of parallel plans.  The code also
neglected to propagate hasRecursive, though that's only cosmetic at
the moment.

A difficulty arises if the rule action is an INSERT...SELECT.  We
attach the original query's RTEs and CTEs to the sub-SELECT Query, but
data-modifying CTEs are only allowed to appear in the topmost Query.
For the moment, throw an error in such cases.  It would probably be
possible to avoid this error by attaching the CTEs to the top INSERT
Query instead; but that would require a bunch of new code to adjust
ctelevelsup references.  Given the narrowness of the use-case, and
the need to back-patch this fix, it does not seem worth the trouble
for now.  We can revisit this if we get field complaints.

Per report from Greg Nancarrow.  Back-patch to all supported branches.
(The test case added here does not fail before v10, but there are
plenty of places checking top-level hasModifyingCTE in 9.6, so I have
no doubt that this code change is necessary there too.)

Greg Nancarrow and Tom Lane

Discussion: https://postgr.es/m/CAJcOf-f68DT=26YAMz_i0+Au3TcLO5oiHY5=fL6Sfuits6r+_w@mail.gmail.com
Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
2021-09-08 12:05:43 -04:00
Peter Eisentraut 054adca641 Disable anonymous record hash support except in special cases
Commit 01e658fa74 added hash support for row types.  This also added
support for hashing anonymous record types, using the same approach
that the type cache uses for comparison support for record types: It
just reports that it works, but it might fail at run time if a
component type doesn't actually support the operation.  We get away
with that for comparison because most types support that.  But some
types don't support hashing, so the current state can result in
failures at run time where the planner chooses hashing over sorting,
whereas that previously worked if only sorting was an option.

We do, however, want the record hashing support for path tracking in
recursive unions, and the SEARCH and CYCLE clauses built on that.  In
that case, hashing is the only plan option.  So enable that, this
commit implements the following approach: The type cache does not
report that hashing is available for the record type.  This undoes
that part of 01e658fa74.  Instead, callers that require hashing no
matter what can override that result themselves.  This patch only
touches the callers to make the aforementioned recursive query cases
work, namely the parse analysis of unions, as well as the hash_array()
function.

Reported-by: Sait Talha Nisanci <sait.nisanci@microsoft.com>
Bug: #17158
Discussion: https://www.postgresql.org/message-id/flat/17158-8a2ba823982537a4%40postgresql.org
2021-09-08 09:55:18 +02:00
Amit Kapila 8db27fbc11 Invalidate relcache for publications defined for all tables.
Updates/Deletes on a relation were allowed even without replica identity
after we define the publication for all tables. This would later lead to
an error on subscribers. The reason was that for such publications we were
not invalidating the relcache and the publication information for
relations was not getting rebuilt. Similarly, we were not invalidating the
relcache after dropping of such publications which will prohibit
Updates/Deletes without replica identity even without any publication.

Author: Vignesh C and Hou Zhijie
Reviewed-by: Hou Zhijie, Kyotaro Horiguchi, Amit Kapila
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/CALDaNm0pF6zeWqCA8TCe2sDuwFAy8fCqba=nHampCKag-qLixg@mail.gmail.com
2021-09-08 12:08:29 +05:30
Magnus Hagander b7fd291042 Consistently use read-only instead of "read only"
This affects one message and some documentation that used the format
"read only", unlike everything else that used read-only.

Backpatch-through: 14
Discussion: https://postgr.es/m/CABUevExuxKwn0YM3+wdSeQSvK6CRrJ-hewocGVX3R4-xVX4eMw@mail.gmail.com
2021-09-07 22:04:45 +02:00
Tom Lane 8b895374cd Finish reverting 3eda9fc09f.
Commit 67c33a114 should have set v14's catversion back to what it was
before 3eda9fc09, to avoid forcing a useless pg_upgrade cycle on users
of 14beta3.  Do that now.

Discussion: https://postgr.es/m/2598498.1630702074@sss.pgh.pa.us
2021-09-07 10:52:25 -04:00
Heikki Linnakangas e66add755d Fix missing words in comment.
Introduced by commit c3928b467a, backpatch to v14 like that one.

Author: Amit Langote
Discussion: https://www.postgresql.org/message-id/CA+HiwqFQgNLS6VGntMcuJV6erBFV425xA6wBVnY=41GK4zC0Bw@mail.gmail.com
2021-09-07 10:30:04 +03:00
Noah Misch 47d54b6ba2 AIX: Fix missing libpq symbols by respecting SHLIB_EXPORTS.
We make each AIX shared library export all globals found in .o files
that originate in the library.  That doesn't include symbols acquired by
-lpgcommon_shlib.  That is good on average, but it became a problem for
libpq when commit e6afa8918c moved five
official libpq API symbols into src/common.  Fix this by implementing
the SHLIB_EXPORTS mechanism for AIX, so affected libraries export the
same symbols that they export on Linux.  This reintroduces symbols
pg_encoding_to_char, pg_utf_mblen, pg_char_to_encoding,
pg_valid_server_encoding, and pg_valid_server_encoding_id.  Back-patch
to v13, where the aforementioned commit first appeared.  While a minor
release is usually the wrong time to add or remove symbol exports in
libpq or libecpg, we should expect users to want each documented symbol.

Tony Reix

Discussion: https://postgr.es/m/PR3PR02MB6396742E2FC3E77D37A920BC86C79@PR3PR02MB6396.eurprd02.prod.outlook.com
2021-09-06 11:28:02 -07:00
Tom Lane 599c73a91a Fix bogus timetz_zone() results for DYNTZ abbreviations.
timetz_zone() delivered completely wrong answers if the zone was
specified by a dynamic TZ abbreviation, because it failed to account
for the difference between the POSIX conventions for field values in
struct pg_tm and the conventions used in PG-specific datetime code.

As a stopgap fix, just adjust the tm_year and tm_mon fields to match
PG conventions.  This is fixed in a different way in HEAD (388e71af8)
but I don't want to back-patch the change of reference point.

Discussion: https://postgr.es/m/CAJ7c6TOMG8zSNEZtCn5SPe+cCk3Lfxb71ZaQwT2F4T7PJ_t=KA@mail.gmail.com
2021-09-06 11:29:52 -04:00
Peter Eisentraut 1e9afc868e Fix pkg-config files for static linking
Since ea53100d5 (PostgreSQL 12), the shipped pkg-config files have
been broken for statically linking libpq because libpgcommon and
libpgport are missing.  This patch adds those two missing private
dependencies (in a non-hardcoded way).

Reported-by: Filip Gospodinov <f@gospodinov.ch>
Discussion: https://www.postgresql.org/message-id/flat/c7108bde-e051-11d5-a234-99beec01ce2a@gospodinov.ch
2021-09-06 09:41:03 +02:00
Tom Lane 718978d9da Further portability tweaks for float4/float8 hash functions.
Attempting to make hashfloat4() look as much as possible like
hashfloat8(), I'd figured I could replace NaNs with get_float4_nan()
before widening to float8.  However, results from protosciurus
and topminnow show that on some platforms that produces a different
bit-pattern from get_float8_nan(), breaking the intent of ce773f230.
Rearrange so that we use the result of get_float8_nan() for all NaN
cases.  As before, back-patch.
2021-09-04 16:29:08 -04:00
Tom Lane 5edbcbd5b9 Minor improvements for psql help output.
Fix alphabetization of the output of "\?", and improve one description.

Update PageOutput counts where needed, fixing breakage from previous
patches.

Haiying Tang (PageOutput fixes by me)

Discussion: https://postgr.es/m/OS0PR01MB61136018064660F095CB57A8FB129@OS0PR01MB6113.jpnprd01.prod.outlook.com
2021-09-04 13:28:16 -04:00
Alvaro Herrera aa8bd0890b
Revert "Avoid creating archive status ".ready" files too early"
This reverts commit 515e3d84a0 and equivalent commits in back
branches.  This solution to the problem has a number of problems, so
we'll try again with a different approach.

Per note from Andres Freund

Discussion: https://postgr.es/m/20210831042949.52eqp5xwbxgrfank@alap3.anarazel.de
2021-09-04 12:14:30 -04:00
Tom Lane 69d670e68e Remove arbitrary MAXPGPATH limit on command lengths in pg_ctl.
Replace fixed-length command buffers with psprintf() calls.  We didn't
have anything as convenient as psprintf() when this code was written,
but now that we do, there's little reason for the limitation to
stand.  Removing it eliminates some corner cases where (for example)
starting the postmaster with a whole lot of options fails.

Most individual file names that pg_ctl deals with are still restricted
to MAXPGPATH, but we've seldom had complaints about that limitation
so long as it only applies to one filename.

Back-patch to all supported branches.

Phil Krylov

Discussion: https://postgr.es/m/567e199c6b97ee19deee600311515b86@krylov.eu
2021-09-03 21:04:44 -04:00
Tom Lane 2cc018ba8f Disallow creating an ICU collation if the DB encoding won't support it.
Previously this was allowed, but the collation effectively vanished
into the ether because of the way lookup_collation() works: you could
not use the collation, nor even drop it.  Seems better to give an
error up front than to leave the user wondering why it doesn't work.

(Because this test is in DefineCollation not CreateCollation, it does
not prevent pg_import_system_collations from creating ICU collations,
regardless of the initially-chosen encoding.)

Per bug #17170 from Andrew Bille.  Back-patch to v10 where ICU support
was added.

Discussion: https://postgr.es/m/17170-95845cf3f0a9c36d@postgresql.org
2021-09-03 16:39:04 -04:00
John Naylor 67c33a114f Set the volatility of the timestamptz version of date_bin() back to immutable
543f36b43d was too hasty in thinking that the volatility of date_bin()
had to match date_trunc(), since only the latter references
session_timezone.

Bump catversion

Per feedback from Aleksander Alekseev
Backpatch to v14, as the former commit was
2021-09-03 13:40:32 -04:00
Tom Lane 08b96a2b52 Fix portability issue in tests from commit ce773f230.
Modern POSIX seems to require strtod() to accept "-NaN", but there's
nothing about NaN in SUSv2, and some of our oldest buildfarm members
don't like it.  Let's try writing it as -'NaN' instead; that seems
to produce the same result, at least on Intel hardware.

Per buildfarm.
2021-09-03 10:01:02 -04:00
Tom Lane 6b54f12332 In count_usable_fds(), duplicate stderr not stdin.
We had a complaint that the postmaster fails to start if the invoking
program closes stdin.  That happens because count_usable_fds expects
to be able to dup(0), and if it can't, we conclude there are no free
FDs and go belly-up.  So far as I can find, though, there is no other
place in the server that touches stdin, and it's not unreasonable to
expect that a daemon wouldn't use that file.

As a simple improvement, let's dup FD 2 (stderr) instead.  Unlike stdin,
it *is* reasonable for us to expect that stderr be open; even if we are
configured not to touch it, common libraries such as libc might try to
write error messages there.

Per gripe from Mario Emmenlauer.  Given the lack of previous complaints,
I'm not excited about pushing this into stable branches, but it seems
OK to squeeze it into v14.

Discussion: https://postgr.es/m/48bafc63-c30f-3962-2ded-f2e985d93e86@emmenlauer.de
2021-09-02 18:53:10 -04:00
Tom Lane 23c6bc581d Fix float4/float8 hash functions to produce uniform results for NaNs.
The IEEE 754 standard allows a wide variety of bit patterns for NaNs,
of which at least two ("NaN" and "-NaN") are pretty easy to produce
from SQL on most machines.  This is problematic because our btree
comparison functions deem all NaNs to be equal, but our float hash
functions know nothing about NaNs and will happily produce varying
hash codes for them.  That causes unexpected results from queries
that hash a column containing different NaN values.  It could also
produce unexpected lookup failures when using a hash index on a
float column, i.e. "WHERE x = 'NaN'" will not find all the rows
it should.

To fix, special-case NaN in the float hash functions, not too much
unlike the existing special case that forces zero and minus zero
to hash the same.  I arranged for the most vanilla sort of NaN
(that coming from the C99 NAN constant) to still have the same
hash code as before, to reduce the risk to existing hash indexes.

I dithered about whether to back-patch this into stable branches,
but ultimately decided to do so.  It's a clear improvement for
queries that hash internally.  If there is anybody who has -NaN
in a hash index, they'd be well advised to re-index after applying
this patch ... but the misbehavior if they don't will not be much
worse than the misbehavior they had before.

Per bug #17172 from Ma Liangzhu.

Discussion: https://postgr.es/m/17172-7505bea9e04e230f@postgresql.org
2021-09-02 17:24:42 -04:00
Tomas Vondra 50ba70a957 Identify simple column references in extended statistics
Until now, when defining extended statistics, everything except a plain
column reference was treated as complex expression. So for example "a"
was a column reference, but "(a)" would be an expression. In most cases
this does not matter much, but there were a couple strange consequences.
For example

    CREATE STATISTICS s ON a FROM t;

would fail, because extended stats require at least two columns. But

    CREATE STATISTICS s ON (a) FROM t;

would succeed, because that requirement does not apply to expressions.
Moreover, that statistics object is useless - the optimizer will always
use the regular statistics collected for attribute "a".

So do a bit more work to identify those expressions referencing a single
column, and translate them to a simple column reference. Backpatch to
14, where support for extended statistics on expressions was introduced.

Reported-by: Justin Pryzby
Backpatch-through: 14
Discussion: https://postgr.es/m/20210816013255.GS10479%40telsasoft.com
2021-09-01 18:08:43 +02:00
Fujii Masao d760d942c7 pgbench: Fix bug in measurement of disconnection delays.
When -C/--connect option is specified, pgbench establishes and closes
the connection for each transaction. In this case pgbench needs to
measure the times taken for all those connections and disconnections,
to include the average connection time in the benchmark result.
But previously pgbench could not measure those disconnection delays.
To fix the bug, this commit makes pgbench measure the disconnection
delays whenever the connection is closed at the end of transaction,
if -C/--connect option is specified.

Back-patch to v14. Per discussion, we concluded not to back-patch to v13
or before because changing that behavior in stable branches would
surprise users rather than providing benefits.

Author: Yugo Nagata
Reviewed-by: Fabien COELHO, Tatsuo Ishii, Asif Rehman, Fujii Masao
Discussion: https://postgr.es/m/20210614151155.a393bc7d8fed183e38c9f52a@sraoss.co.jp
2021-09-01 17:06:11 +09:00
Amit Kapila b7ad093d50 Fix the random test failure in 001_rep_changes.
The check to test whether the subscription workers were restarting after a
change in the subscription was failing. The reason was that the test was
assuming the walsender started before it reaches the 'streaming' state and
the walsender was exiting due to an error before that. Now, the walsender
was erroring out before reaching the 'streaming' state because it tries to
acquire the slot before the previous walsender has exited.

In passing, improve the die messages so that it is easier to investigate
the failures in the future if any.

Reported-by: Michael Paquier, as per buildfarm
Author: Ajin Cherian
Reviewed-by: Masahiko Sawada, Amit Kapila
Backpatch-through: 10, where this test was introduced
Discussion: https://postgr.es/m/YRnhFxa9bo73wfpV@paquier.xyz
2021-09-01 10:00:12 +05:30