Commit Graph

40052 Commits

Author SHA1 Message Date
Tom Lane 5f4a1a0a77 Throw a more on-point error for publications depending on columns.
Same as 42b041243, except that the trouble case is a publication
WHERE clause that depends on a column.

Again reported by Alexander Lakhin.  Back-patch to v15 where
we added publication WHERE clauses.

Discussion: https://postgr.es/m/548a47bc-87ae-b3df-c6a2-60b9966f808b@gmail.com
2024-05-02 17:36:31 -04:00
David Rowley 7e5d20bbd1 Disable run condition optimization for some WindowFuncs
94985c210 added code to detect when WindowFuncs were monotonic and
allowed additional quals to be "pushed down" into the subquery to be
used as WindowClause runConditions in order to short-circuit execution
in nodeWindowAgg.c.

The Node representation of runConditions wasn't well selected and
because we do qual pushdown before planning the subquery, the planning
of the subquery could perform subquery pull-up of nested subqueries.
For WindowFuncs with args, the arguments could be changed after pushing
the qual down to the subquery.

This was made more difficult by the fact that the code duplicated the
WindowFunc inside an OpExpr to include in the WindowClauses runCondition
field.  This could result in duplication of subqueries and a pull-up of
such a subquery could result in another initplan parameter being issued
for the 2nd version of the subplan.  This could result in errors such as:

ERROR:  WindowFunc not found in subplan target lists

Here in the backbranches, we don't have the flexibility to improve the
Node representation to resolve this, so instead we just disable the
runCondition optimization for ntile() unless the argument is a Const,
(v16 only) and likewise for count(expr) (both v15 and v16).  count(*) is
unaffected.  All other window functions which support this optimization
all take zero arguments and therefore are unaffected.

Bug: #18170
Reported-by: Zuming Jiang
Discussion: https://postgr.es/m/18170-f1d17bf9a0d58b24@postgresql.org
Backpatch-through 15 (master will be fixed independently)
2024-05-01 16:35:37 +12:00
Masahiko Sawada faba2f8f35 Fix parallel vacuum buffer usage reporting.
A parallel worker's buffer usage is accumulated to its pgBufferUsage
and then is accumulated into the leader's one at the end of the
parallel vacuum. However, since the leader process used to use
dedicated VacuumPage{Hit, Miss, Dirty} globals for the buffer usage
reporting, the worker's buffer usage was not included, leading to an
incorrect buffer usage report.

To fix the problem, this commit makes vacuum use pgBufferUsage
instruments for buffer usage reporting instead of VacuumPage{Hit,
Miss, Dirty} globals. These global variables are still used by ANALYZE
command and autoanalyze.

This also fixes the buffer usage report of vacuuming on temporary
tables, since the buffers dirtied by MarkLocalBufferDirty() were not
tracked by the VacuumPageDirty variable.

Parallel vacuum was introduced in 13, but the buffer usage reporting
for VACUUM command with the VERBOSE option was implemented in
15. So backpatch to 15.

Reported-by: Anthonin Bonnefoy
Author: Anthonin Bonnefoy
Reviewed-by: Alena Rybakina, Masahiko Sawada
Discussion: https://postgr.es/m/CAO6_XqrQk+QZQcYs_C6nk0cMfHuUWk85vT9CrcA1NffFbAVE2A@mail.gmail.com
Backpatch-through: 15
2024-05-01 12:34:01 +09:00
David Rowley 52f21f9287 Ensure we allocate NAMEDATALEN bytes for names in Index Only Scans
As an optimization, we store "name" columns as cstrings in btree
indexes.

Here we modify it so that Index Only Scans convert these cstrings back
to names with NAMEDATALEN bytes rather than storing the cstring in the
tuple slot, as was happening previously.

Bug: #17855
Reported-by: Alexander Lakhin
Reviewed-by: Alexander Lakhin, Tom Lane
Discussion: https://postgr.es/m/17855-5f523e0f9769a566@postgresql.org
Backpatch-through: 12, all supported versions
2024-05-01 13:22:16 +12:00
Tom Lane bf379b555c Disallow converting a table to a view within an outer SQL command.
We have long disallowed all forms of ALTER TABLE if the table is
already opened by some outer SQL command in the same session.
This has the same purpose as obtaining AccessExclusiveLock, but
since a session's own locks don't conflict the lock only blocks use
of the table by other sessions, not our own.  Without this check,
the ALTER might confuse the outer SQL command since any previous
inspection of the table would potentially become invalid.

However, the RelisBecomingView code path in DefineQueryRewrite never
got that memo, and assumed that AccessExclusiveLock is sufficient
for performing something morally equivalent to a rather invasive
ALTER TABLE.  Unsurprisingly, this can confuse an outer command
that is trying to do something with the table.

This was submitted as a security issue, but the security team
has been unable to identify any consequence worse than a null
pointer dereference (from trying to access rd_tableam methods
that the relation no longer has).  Therefore, in accordance
with our usual policy, it's not security material and should
just be fixed as a routine bug.

Fix by disallowing the operation if the table is open locally,
exactly as ALTER TABLE does it.

Per an anonymous security researcher, via Bundesamt für Sicherheit
in der Informationstechnik.

Patch v12-v15 only.  In v16 and later, we removed this code
altogether (cf. commit b23cd185f), so that there's no issue.
2024-04-30 15:22:55 -04:00
Noah Misch 7c5915c4b1 Close race condition between datfrozen and relfrozen updates.
vac_update_datfrozenxid() did multiple loads of relfrozenxid and
relminmxid from buffer memory, and it assumed each would get the same
value.  Not so if a concurrent vac_update_relstats() did an inplace
update.  Commit 2d2e40e3be fixed the same
kind of bug in vac_truncate_clog().  Today's bug could cause the
rel-level field and XIDs in the rel's rows to precede the db-level
field.  A cluster having such values should VACUUM affected tables.
Back-patch to v12 (all supported versions).

Discussion: https://postgr.es/m/20240423003956.e7.nmisch@google.com
2024-04-29 10:24:59 -07:00
Tom Lane 9b41d1d634 Throw a more on-point error for functions depending on columns.
ALTER COLUMN TYPE wasn't expecting to find any pg_proc objects
depending on the column whose type is to be altered.  That indeed
wasn't possible when this code was written, but it is possible
since we introduced new-style SQL function bodies.

It's about as difficult to fix this case as it is to fix dependent
views, and we've been punting on those for years, so I don't feel
too awful about punting for functions too.  (I sure wouldn't risk
back-patching such code.)  So just throw a more user-facing error.
Also, adjust some of the existing comments to reflect that these
are all pretty much the same issue.

(This patch also fixes it so we will tolerate finding such a
dependency during ALTER COLUMN SET EXPRESSION; in that, we need
not do anything to the function, so no error is wanted.  That
problem is new in HEAD.)

Per bug #18449 from Alexander Lakhin.  Back-patch to v14 where
we added new-style SQL functions.

Discussion: https://postgr.es/m/18449-f8248467aaa294d5@postgresql.org
2024-04-28 14:34:21 -04:00
Tom Lane e6e3ee5b7e Detect more overflows in timestamp[tz]_pl_interval.
In commit 25cd2d640 I (tgl) opined that "The additions of the months
and microseconds fields could also overflow, of course.  However,
I believe we need no additional checks there; the existing range
checks should catch such cases".  This is demonstrably wrong however
for the microseconds field, and given that discovery it seems prudent
to be paranoid about the months addition as well.

Report and patch by Joseph Koshakow.  As before, back-patch to all
supported branches.  (However, the test case doesn't work before
v15 because we didn't allow wider-than-int32 numbers in interval
literals.  A variant test could probably be built that fits within
that restriction, but it didn't seem worth the trouble.)

Discussion: https://postgr.es/m/CAAvxfHf77sRHKoEzUw9_cMYSpbpNS2C+J_+8Dq4+0oi8iKopeA@mail.gmail.com
2024-04-28 13:42:13 -04:00
Amit Kapila 28a8cc457b Fix the missing table sync due to improper invalidation handling.
We missed performing table sync if the invalidation happened while the
non-ready tables list was being prepared. This occurs because the sync
state was set to valid at the end of non-ready table list preparation
irrespective of the invalidations processed while the list is being
prepared.

Fix it by changing the boolean variable to a tri-state enum and by setting
table state to valid only if no invalidations have occurred while the list
is being prepared.

Reprted-by: Alexander Lakhin
Diagnosed-by: Alexander Lakhin
Author: Vignesh C
Reviewed-by: Hou Zhijie, Alexander Lakhin, Ajin Cherian, Amit Kapila
Backpatch-through: 15
Discussion: https://postgr.es/m/711a6afe-edb7-1211-cc27-1bef8239eec7@gmail.com
2024-04-25 10:33:04 +05:30
Tomas Vondra 276b7888f1 createdb: compare strategy case-insensitive
When specifying the createdb strategy, the documentation suggests valid
options are FILE_COPY and WAL_LOG, but the code does case-sensitive
comparison and accepts only "file_copy" and "wal_log" as valid.

Fixed by doing a case-insensitive comparison using pg_strcasecmp(), same
as for other string parameters nearby.

While at it, apply fmtId() to a nearby "locale_provider". This already
did the comparison in case-insensitive way, but the value would not be
double-quoted, confusing the parser and the error message.

Backpatch to 15, where the strategy was introduced.

Backpatch-through: 15
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/90c6913a-1dd2-42b4-8365-ce3b09c39b17@enterprisedb.com
2024-04-21 21:22:11 +02:00
Tom Lane f7e8917481 Fix MSVC recipe for ecpg regression tests, redux.
Forgot to inject -DCMDLINESYM=123 ...

Per buildfarm.

Discussion: https://postgr.es/m/4cc4dc47-ca2b-4129-8784-db69b5f82777@dunslane.net
2024-04-19 01:07:32 -04:00
Tom Lane 1e7b1b026d Fix MSVC recipe for ecpg regression tests.
While back-patching commit 6f0cef935, I forgot that the MSVC
build scripts would also need adjustment in the back branches.
This is a blind attempt at a fix, but it's basically copying
nearby code so I think it will work.

Per buildfarm (via Andrew Dunstan)

Discussion: https://postgr.es/m/4cc4dc47-ca2b-4129-8784-db69b5f82777@dunslane.net
2024-04-18 20:47:37 -04:00
Tom Lane 25f9372172 Fix assorted bugs in ecpg's macro mechanism.
The code associated with EXEC SQL DEFINE was unreadable and full of
bugs, notably:

* It'd attempt to free a non-malloced string if the ecpg program
tries to redefine a macro that was defined on the command line.

* Possible memory stomp if user writes "-D=foo".

* Undef'ing or redefining a macro defined on the command line would
change the state visible to the next file, when multiple files are
specified on the command line.  (While possibly that could have been
an intentional choice, the code clearly intends to revert to the
original macro state; it's just failing to consider this interaction.)

* Missing "break" in defining a new macro meant that redefinition
of an existing name would cause an extra entry to be added to the
definition list.  While not immediately harmful, a subsequent undef
would result in the prior entry becoming visible again.

* The interactions with input buffering are subtle and were entirely
undocumented.

It's not that surprising that we hadn't noticed these bugs,
because there was no test coverage at all of either the -D
command line switch or multiple input files.  This patch adds
such coverage (in a rather hacky way I guess).

In addition to the code bugs, the user documentation was confused
about whether the -D switch defines a C macro or an ecpg one, and
it failed to mention that you can write "-Dsymbol=value".

These problems are old, so back-patch to all supported branches.

Discussion: https://postgr.es/m/998011.1713217712@sss.pgh.pa.us
2024-04-16 12:31:32 -04:00
Tom Lane 5aacfa64e5 Fix generation of EC join conditions at the wrong plan level.
get_baserel_parampathinfo previously assumed without checking that
the results of generate_join_implied_equalities "necessarily satisfy
join_clause_is_movable_into".  This turns out to be wrong in the
presence of outer joins, because the generated clauses could include
Vars that mustn't be evaluated below a relevant outer join.  That
led to applying clauses at the wrong plan level and possibly getting
incorrect query results.  We must check each clause's nullable_relids,
and really the right thing to do is test join_clause_is_movable_into.

However, trying to fix it that way exposes an oversight in
equivclass.c: it wasn't careful about marking join clauses for
appendrel children with the correct clause_relids.  That caused the
modified get_baserel_parampathinfo code to reject some clauses it
still needs to accept.  (See parallel commit for HEAD/v16 for more
commentary about that.)

Per bug #18429 from Benoît Ryder.  This misbehavior existed for
a long time before commit 2489d76c4, so patch v12-v15 this way.

Discussion: https://postgr.es/m/18429-8982d4a348cc86c6@postgresql.org
2024-04-16 11:22:39 -04:00
Tom Lane 09989ba847 Fix type-checking of RECORD-returning functions in FROM, redux.
Commit 2ed8f9a01 intended to institute a policy that if a
RangeTblFunction has a coldeflist, then the function return type is
certainly RECORD, and we should use the coldeflist as the source of
truth about what the columns of the record type are.  When the
original function has been folded to a constant, inspection of the
constant might give a different answer.  This situation will lead to
a tuple-type-mismatch error at execution, but up until that point we
need to consistently believe the coldeflist, or we'll have problems
from different bits of code reaching different conclusions.

expandRTE didn't get that memo though, and would try to produce a
tupdesc based on the constant in this situation, leading to an
assertion failure.  (Desultory testing suggests that non-assert
builds often manage to give the expected error, although I also
saw a "cache lookup failed for type 0" error, and it seems at
least possible that a crash could happen.)

Some other callers of get_expr_result_type and get_expr_result_tupdesc
were also being incautious about this.  While none of them seem to
have actual bugs, they're working harder than necessary in this case,
besides which it seems safest to have an explicit policy of not using
those functions on an RTE with a coldeflist.  Adjust the code
accordingly, and add commentary to funcapi.c about this policy.

Also fix an obsolete comment that claimed "get_expr_result_type()
doesn't know how to extract type info from a RECORD constant".
That hasn't been true since commit d57534740.

Per bug #18422 from Alexander Lakhin.
As with the previous commit, back-patch to all supported branches.

Discussion: https://postgr.es/m/18422-89ca86c8eac5246d@postgresql.org
2024-04-15 12:56:56 -04:00
Tomas Vondra 3cd4135119 Update nbits_set in brin_bloom_union
Properly update the number of bits set in the bitmap after merging the
filters in brin_bloom_union.

This is mostly harmless, as the counter is used only in the output
function, which means pageinspect may show incorrect information about
the BRIN summary. The counter does not affect correctness.

Discovered while adding a regression test comparing indexes built with
and without parallelism. The parallel index builds exercise the union
procedure when merging results from workers, which is otherwise very
hard to do in a test. Which is why this went unnoticed until now.

Backpatch through 14, where the BRIN bloom opclasses were introduced.

Backpatch-through: 14
Discussion: https://postgr.es/m/1df00a66-db5a-4e66-809a-99b386a06d86%40enterprisedb.com
2024-04-14 18:17:29 +02:00
Noah Misch 7c490a18b7 freespace: Don't return blocks past the end of the main fork.
GetPageWithFreeSpace() callers assume the returned block exists in the
main fork, failing with "could not read block" errors if that doesn't
hold.  Make that assumption reliable now.  It hadn't been guaranteed,
due to the weak WAL and data ordering of participating components.  Most
operations on the fsm fork are not WAL-logged.  Relation extension is
not WAL-logged.  Hence, an fsm-fork block on disk can reference a
main-fork block that no WAL record has initialized.  That could happen
after an OS crash, a replica promote, or a PITR restore.  wal_log_hints
makes the trouble easier to hit; a replica promote or PITR ending just
after a relevant fsm-fork FPI_FOR_HINT may yield this broken state.  The
v16 RelationAddBlocks() mechanism also makes the trouble easier to hit,
since it bulk-extends even without extension lock waiters.  Commit
917dc7d239 stopped trouble around
truncation, but vectors involving PageIsNew() pages remained.

This implementation adds a RelationGetNumberOfBlocks() call when the
cached relation size doesn't confirm a block exists.  We've been unable
to identify a benchmark that slows materially, but this may show up as
additional time in lseek().  An alternative without that overhead would
be a new ReadBufferMode such that ReadBufferExtended() returns NULL
after a 0-byte read, with all other errors handled normally.  However,
each GetFreeIndexPage() caller would then need code for the return-NULL
case.  Back-patch to v14, due to earlier versions not caching relation
size and the absence of a pre-v16 problem report.

Ronan Dunklau.  Reported by Ronan Dunklau.

Discussion: https://postgr.es/m/1878547.tdWV9SEqCh%40aivenlaptop
2024-04-13 08:35:20 -07:00
Etsuro Fujita b82dca2a5a Fix WaitEventSet resource leak in WaitLatchOrSocket().
This function would have the same issue we solved in commit 501cfd07d:
If an error is thrown after calling CreateWaitEventSet(), the file
descriptor (on epoll- or kqueue-based systems) or handles (on Windows)
that the WaitEventSet contains are leaked.

Like that commit, use PG_TRY-PG_FINALLY (PG_TRY-PG_CATCH in v12) to make
sure the WaitEventSet is freed properly.

Back-patch to all supported versions, but as we do not have this issue
in HEAD (cf. commit 50c67c201), no need to apply this patch to it.

Discussion: https://postgr.es/m/CAPmGK16MqdDoD8oatp8SQWaEa4vS3nfQqDN_Sj9YRuu5J3Lj9g%40mail.gmail.com
2024-04-11 19:05:02 +09:00
Tom Lane d85db0a8e9 Fix plpgsql's handling of -- comments following expressions.
Up to now, read_sql_construct() has collected all the source text from
the statement or expression's initial token up to the character just
before the "until" token.  It normally tries to strip trailing
whitespace from that, largely for neatness.  If there was a "-- text"
comment after the expression, this resulted in removing the newline
that terminates the comment, which creates a hazard if we try to paste
the collected text into a larger SQL construct without inserting a
newline after it.  In particular this caused our handling of CASE
constructs to fail if there's a comment after a WHEN expression.

Commit 4adead1d2 noticed a similar problem with cursor arguments,
and worked around it through the rather crude hack of suppressing
the whitespace-trimming behavior for those.  Rather than do that
and leave the hazard open for future hackers to trip over, let's
fix it properly.  pl_scanner.c already has enough infrastructure
to report the end location of the expression's last token, so
we can copy up to that location and never collect any trailing
whitespace or comment to begin with.

Erik Wienhold and Tom Lane, per report from Michal Bartak.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAAVzF_FjRoi8fOVuLCZhQJx6HATQ7MKm=aFOHWZODFnLmjX-xA@mail.gmail.com
2024-04-10 15:45:59 -04:00
Thomas Munro 74992929a7 Fix illegal attribute propagation in LLVM JIT.
Commit 72559438 started copying more attributes from AttributeTemplate
to the functions we generate on the fly.  In the case of deform
functions, which return void, this meant that "noundef", from
AttributeTemplate's return value (a Datum) was copied to a void type.
Older LLVM releases were OK with that, but LLVM 18 crashes.

Update our llvm_copy_attributes() function to skip copying the attribute
for the return value, if the target function returns void.

Thanks to Dmitry Dolgov for help chasing this down.

Back-patch to all supported releases, like 72559438.

Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Discussion: https://postgr.es/m/CAFj8pRACpVFr7LMdVYENUkScG5FCYMZDDdSGNU-tch%2Bw98OxYg%40mail.gmail.com
2024-04-10 12:14:04 +12:00
Tom Lane 4f1d33d707 In psql, avoid leaking a PGresult after a query is cancelled.
After a query cancel, the tail end of ExecQueryAndProcessResults
took care to clear any not-yet-read PGresults; but it forgot about
the one it has already read.  There would only be such a result
when handling a multi-command string made with "\;", so that you'd
have to cancel an earlier command in such a string to reach the
bug at all.  Even then, there would only be leakage of a single
PGresult per cancel, so it's not surprising nobody noticed this.
But a leak is a leak.

Noted while re-reviewing 90f517821, but this is independent of that:
it dates to 7844c9918.  Back-patch to v15 where that came in.
2024-04-08 17:00:07 -04:00
Andres Freund dcb7cf945c simplehash: Free collisions array in SH_STAT
While SH_STAT() is only used for debugging, the allocated array can be large,
and therefore should be freed.

It's unclear why coverity started warning now.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Coverity
Discussion: https://postgr.es/m/3005248.1712538233@sss.pgh.pa.us
Backpatch: 12-
2024-04-07 19:09:04 -07:00
Heikki Linnakangas a9c20c85c7 Don't clobber test exit code at cleanup in LDAP/Kerberors tests
If the test script die()d before running the first test, the whole test
was interpreted as SKIPped rather than failed. The PostgreSQL::Cluster
module got this right.

Backpatch to all supported versions.

Discussion: https://www.postgresql.org/message-id/fb898a70-3a88-4629-88e9-f2375020061d@iki.fi
2024-04-07 20:22:28 +03:00
Heikki Linnakangas c8df46b657 Improve check in LDAP test to find the OpenLDAP installation
If the OpenLDAP installation directory is not found, set $setup to 0
so that the LDAP tests are skipped. The macOS checks were already
doing that, but the checks on other OS's were not. While we're at it,
improve the error message when the tests are skipped, to specify
whether the OS is supported at all, or if we just didn't find the
installation directory.

This was accidentally "working" without this, i.e. we were skipping
the tests if the OpenLDAP installation was not found, because of a bug
in the LdapServer test module: the END block clobbered the exit code
so if the script die()s before running the first subtest, the whole
test script was marked as SKIPped. The next commit will fix that bug,
but we need to fix the setup code first.

These checks should probably go into configure/meson, but this is
better than nothing and allows fixing the bug in the END block.

Backpatch to all supported versions.

Discussion: https://www.postgresql.org/message-id/fb898a70-3a88-4629-88e9-f2375020061d@iki.fi
2024-04-07 20:22:24 +03:00
Tom Lane f159f18141 Fix ecpg's mechanism for detecting unsupported cases in the grammar.
ecpg wants to emit a warning if it parses a SQL construct that the
backend can parse but will immediately throw a FEATURE_NOT_SUPPORTED
error for.  The way it was testing for this was to see if the string
ERRCODE_FEATURE_NOT_SUPPORTED appeared anywhere in the gram.y code.
This is, of course, not nearly good enough, as there are plenty of
rules in gram.y that throw that error only conditionally.  There was
a hack dating to 2008 to suppress the warning in one rule that
doesn't even exist anymore, but nothing for other cases we've created
since then.  End result was that you could get "unsupported feature
will be passed to server" warnings while compiling perfectly good SQL
code in ecpg.  Somehow we'd not heard complaints about this, but
it was exposed by the recent addition of an ecpg test for a SQL/JSON
construct.

To fix, suppress the warning if the rule contains any "if" statement.
Manual comparison of gram.y with the generated preproc.y file shows
that the warning is now emitted only in rules where it's sensible.

This problem has existed for a long time, so back-patch to all
supported branches.

Discussion: https://postgr.es/m/603615.1712245382@sss.pgh.pa.us
2024-04-04 15:31:53 -04:00
Etsuro Fujita 3f96d113ff Fix bogus coding in ExecAppendAsyncEventWait().
No configured-by-FDW events would result in "return" directly out of a
PG_TRY block, making the exception stack dangling.  Repair.

Oversight in commit 501cfd07d; back-patch to v14, like that commit, but
as we do not have this issue in HEAD (cf. commit 50c67c201), no need to
apply this patch to it.

In passing, improve a comment about the handling of in-process requests
in a postgres_fdw.c function called from this function.

Alexander Pyhalov, with comment adjustment/improvement by me.

Discussion: https://postgr.es/m/425fa29a429b21b0332737c42a4fdc70%40postgrespro.ru
2024-04-04 17:25:02 +09:00
Alexander Korotkov c2faf48fa3 Fix the parameters order for TableAmRoutine.relation_copy_for_cluster()
Specify OldTable first, NewTable second as used by
table_relation_copy_for_cluster() and as implemented in
heapam_relation_copy_for_cluster().

Backpatch to PostgreSQL 12, where TableAmRoutine was introduced.

Discussion: https://postgr.es/m/ME3P282MB3166860D4911AE82F92DF7C5B63F2%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM
Author: Japin Li
Reviewed-by: Pavel Borisov
Backpatch-through: 12
2024-04-03 21:31:06 +03:00
Tom Lane 4fb56a734d Avoid deadlock during orphan temp table removal.
If temp tables have dependencies (such as sequences) then it's
possible for autovacuum's cleanup of orphan temp tables to deadlock
against an incoming backend that's trying to clean out the temp
namespace for its own use.  That can happen because RemoveTempRelations'
performDeletion call can visit objects within the namespace in
an order different from the order in which a per-table deletion
will visit them.

To fix, observe that performDeletion will begin by taking an exclusive
lock on the temp namespace (even though it won't actually delete it).
So, if we can get a shared lock on the namespace, we can be sure we're
not running concurrently with RemoveTempRelations, while also not
conflicting with ordinary use of the namespace.  This requires
introducing a conditional version of LockDatabaseObject, but that's no
big deal.  (It's surprising we've got along without that this long.)

Report and patch by Mikhail Zhilin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/c43ce028-2bc2-4865-9b89-3f706246eed5@postgrespro.ru
2024-04-02 14:59:04 -04:00
Tom Lane 98e427af9f Avoid "unused variable" warning on non-USE_SSL_ENGINE platforms.
If we are building with openssl but USE_SSL_ENGINE didn't get set,
initialize_SSL's variable "pkey" is declared but used nowhere.
Apparently this combination hasn't been exercised in the buildfarm
before now, because I've not seen this warning before, even though
the code has been like this a long time.  Move the declaration
to silence the warning (and remove its useless initialization).

Per buildfarm member sawshark.  Back-patch to all supported branches.
2024-04-01 19:01:18 -04:00
Tom Lane 08bdf0a479 Avoid possible longjmp-induced logic error in PLy_trigger_build_args.
The "pltargs" variable wasn't marked volatile, which makes it unsafe
to change its value within the PG_TRY block.  It looks like the worst
outcome would be to fail to release a refcount on Py_None during an
(improbable) error exit, which would likely go unnoticed in the field.
Still, it's a bug.  A one-liner fix could be to mark pltargs volatile,
but on the whole it seems cleaner to arrange things so that we don't
change its value within PG_TRY.

Per report from Xing Guo.  This has been there for quite awhile,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/CACpMh+DLrk=fDv07MNpBT4J413fDAm+gmMXgi8cjPONE+jvzuw@mail.gmail.com
2024-04-01 15:15:03 -04:00
Tom Lane 03561a6c7b Fix unnecessary use of moving-aggregate mode with non-moving frame.
When a plain aggregate is used as a window function, and the window
frame start is specified as UNBOUNDED PRECEDING, the frame's head
cannot move so we do not need to use moving-aggregate mode.  The check
for that was put into initialize_peragg(), failing to notice that
ExecInitWindowAgg() calls that function before it's filled in
winstate->frameOptions.  Since makeNode() would have zeroed the field,
this didn't provoke uninitialized-value complaints, nor would the
erroneous decision have resulted in more than a little inefficiency.
Still, it's wrong, so move the initialization of
winstate->frameOptions earlier to make it work properly.

While here, also fix a thinko in a comment.  Both errors crept in in
commit a9d9acbf2 which introduced the moving-aggregate mode.

Spotted by Vallimaharajan G.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/18e7f2a5167.fe36253866818.977923893562469143@zohocorp.com
2024-03-27 13:39:03 -04:00
Tom Lane b48eda4e54 Fix failure of ALTER FOREIGN TABLE SET SCHEMA to move sequences.
Ordinary ALTER TABLE SET SCHEMA will also move any owned sequences
into the new schema.  We failed to do likewise for foreign tables,
because AlterTableNamespaceInternal believed that only certain
relkinds could have indexes, owned sequences, or constraints.
We could simply add foreign tables to that relkind list, but it
seems likely that the same oversight could be made again in
future.  Instead let's remove the relkind filter altogether.
These functions shouldn't cost much when there are no objects
that they need to process, and surely this isn't an especially
performance-critical case anyway.

Per bug #18407 from Vidushi Gupta.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/18407-4fd07373d252c6a0@postgresql.org
2024-03-26 15:28:16 -04:00
Tom Lane 3c3f4fd741 Allow "make check"-style testing to work with musl C library.
The musl dynamic linker saves a pointer to the process' environment
value of LD_LIBRARY_PATH very early in startup.  When we move/clobber
the environment to make more room for ps status strings, we clobber
that value and thereby prevent libraries from being found via
LD_LIBRARY_PATH, which breaks the use of a temporary installation
for testing purposes.  To fix, stop collecting usable space for
ps status if we notice that the variable we are about to clobber
is LD_LIBRARY_PATH.  This will result in some reduction in how long
the ps status can be, but it's only likely to occur in temporary
test contexts, so it doesn't seem like a big problem.  In any case,
we don't have to do it if we see we are on glibc, which surely is
where the majority of our Linux testing is done.

Thomas Munro, Bruce Momjian, and Tom Lane, per report from Wolfgang
Walther.  Back-patch to all supported branches, with the hope that
we'll set up a buildfarm animal to test on this platform.

Discussion: https://postgr.es/m/fddd1cd6-dc16-40a2-9eb5-d7fef2101488@technowledgy.de
2024-03-26 11:44:49 -04:00
Jeff Davis b74e4a08b9 Clarify comment for LogicalTapeSetBlocks().
Discussion: https://postgr.es/m/1229327.1711160246@sss.pgh.pa.us
Backpatch-through: 13
2024-03-25 12:00:42 -07:00
Daniel Gustafsson 12128be623 Fix dumping role comments when using --no-role-passwords
Commit 9a83d56b38 added support for allowing pg_dumpall to dump
roles without including passwords, which accidentally made dumps
omit COMMENTs on roles.  This fixes it by using pg_authid to get
the comment.

Backpatch to all supported versions. Patch simultaneously written
independently by Álvaro and myself.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Bartosz Chroł <bartosz.chrol@handen.pl>
Discussion: https://postgr.es/m/AS8P194MB1271CDA0ADCA7B75FCD8E767F7332@AS8P194MB1271.EURP194.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/CAEP4nAz9V4H41_4ESJd1Gf0v%3DdevkqO1%3Dpo91jUw-GJSx8Hxqg%40mail.gmail.com
Backpatch-through: v12
2024-03-21 23:31:57 +01:00
Alvaro Herrera ea299d7825
Review wording on tablespaces w.r.t. partitioned tables
Remove a redundant comment, and document pg_class.reltablespace properly
in catalogs.sgml.

After commits a36c84c3e4, 87259588d0 and others.

Backpatch to 12.

Discussion: https://postgr.es/m/202403191013.w2kr7wqlamqz@alvherre.pgsql
2024-03-20 15:28:14 +01:00
Heikki Linnakangas d3d95f5839 Fix EXPLAIN Bitmap heap scan to count pages with no visible tuples
Previously, bitmap heap scans only counted lossy and exact pages for
explain when there was at least one visible tuple on the page.

heapam_scan_bitmap_next_block() returned true only if there was a
"valid" page with tuples to be processed. However, the lossy and exact
page counters in EXPLAIN should count the number of pages represented
in a lossy or non-lossy way in the constructed bitmap, regardless of
whether or not the pages ultimately contained visible tuples.

Backpatch to all supported versions.

Author: Melanie Plageman
Discussion: https://www.postgresql.org/message-id/CAAKRu_ZwCwWFeL_H3ia26bP2e7HiKLWt0ZmGXPVwPO6uXq0vaA@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAAKRu_bxrXeZ2rCnY8LyeC2Ls88KpjWrQ%2BopUrXDRXdcfwFZGA@mail.gmail.com
2024-03-18 14:04:17 +02:00
Dean Rasheed 89ee14a2f2 Fix EXPLAIN output for subplans in MERGE.
Given a subplan in a MERGE query, EXPLAIN would sometimes fail to
properly display expressions involving Params referencing variables in
other parts of the plan tree.

This would affect subplans outside the topmost join plan node, for
which expansion of Params would go via the top-level ModifyTable plan
node.  The problem was that "inner_tlist" for the ModifyTable node's
deparse_namespace was set to the join node's targetlist, but
"inner_plan" was set to the ModifyTable node itself, rather than the
join node, leading to incorrect results when descending to the
referenced variable.

Fix and backpatch to v15, where MERGE was introduced.

Discussion: https://postgr.es/m/CAEZATCWAv-sZuH%2BwG5xJ-%2BGt7qGNGX8wUQd3XYydMFDKgRB9nw%40mail.gmail.com
2024-03-17 10:20:20 +00:00
Tom Lane 7c61d23422 Make INSERT-from-multiple-VALUES-rows handle domain target columns.
Commit a3c7a993d fixed some cases involving target columns that are
arrays or composites by applying transformAssignedExpr to the VALUES
entries, and then stripping off any assignment ArrayRefs or
FieldStores that the transformation added.  But I forgot about domains
over arrays or composites :-(.  Such cases would either fail with
surprising complaints about mismatched datatypes, or insert unexpected
coercions that could lead to odd results.  To fix, extend the
stripping logic to get rid of CoerceToDomain if it's atop an ArrayRef
or FieldStore.

While poking at this, I realized that there's a poorly documented and
not-at-all-tested behavior nearby: we coerce each VALUES column to
the domain type separately, and rely on the rewriter to merge those
operations so that the domain constraints are checked only once.
If that merging did not happen, it's entirely possible that we'd get
unexpected domain constraint failures due to checking a
partially-updated container value.  There's no bug there, but while
we're here let's improve the commentary about it and add some test
cases that explicitly exercise that behavior.

Per bug #18393 from Pablo Kharo.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/18393-65fedb1a0de9260d@postgresql.org
2024-03-14 14:57:16 -04:00
Tom Lane 6f66fadad9 Fix confusion about the return rowtype of SQL-language procedures.
There is a very ancient hack in check_sql_fn_retval that allows a
single SELECT targetlist entry of composite type to be taken as
supplying all the output columns of a function returning composite.
(This is grotty and fundamentally ambiguous, but it's really hard
to do nested composite-returning functions without it.)

As far as I know, that doesn't cause any problems in ordinary
functions.  It's disastrous for procedures however.  All procedures
that have any output parameters are labeled with prorettype RECORD,
and the CALL code expects it will get back a record with one column
per output parameter, regardless of whether any of those parameters
is composite.  Doing something else leads to an assertion failure
or core dump.

This is simple enough to fix: we just need to not apply that rule
when considering procedures.  However, that requires adding another
argument to check_sql_fn_retval, which at least in principle might be
getting called by external callers.  Therefore, in the back branches
convert check_sql_fn_retval into an ABI-preserving wrapper around a
new function check_sql_fn_retval_ext.

Per report from Yahor Yuzefovich.  This has been broken since we
implemented procedures, so back-patch to all supported branches.

Discussion: https://postgr.es/m/CABz5gWHSjj2df6uG0NRiDhZ_Uz=Y8t0FJP-_SVSsRsnrQT76Gg@mail.gmail.com
2024-03-12 18:16:10 -04:00
Heikki Linnakangas 4fce5f970d Disconnect if socket cannot be put into non-blocking mode
Commit 387da18874 moved the code to put socket into non-blocking mode
from socket_set_nonblocking() into the one-time initialization
function, pq_init(). In socket_set_nonblocking(), there indeed was a
risk of recursion on failure like the comment said, but in pq_init(),
ERROR or FATAL is fine. There's even another elog(FATAL) just after
this, if setting FD_CLOEXEC fails.

Note that COMMERROR merely logged the error, it did not close the
connection, so if putting the socket to non-blocking mode failed we
would use the connection anyway. You might not immediately notice,
because most socket operations in a regular backend wait for the
socket to become readable/writable anyway. But e.g. replication will
be quite broken.

Backpatch to all supported versions.

Discussion: https://www.postgresql.org/message-id/d40a5cd0-2722-40c5-8755-12e9e811fa3c@iki.fi
2024-03-12 10:18:50 +02:00
David Rowley 74530804fc Fix incorrect accessing of pfree'd memory in Memoize
For pass-by-reference types, the code added in 0b053e78b, which aimed to
resolve a memory leak, was overly aggressive in resetting the per-tuple
memory context which could result in pfree'd memory being accessed
resulting in failing to find previously cached results in the hash
table.

What was happening was prepare_probe_slot() was switching to the
per-tuple memory context and calling ExecEvalExpr().  ExecEvalExpr() may
have required a memory allocation.  Both MemoizeHash_hash() and
MemoizeHash_equal() were aggressively resetting the per-tuple context
and after determining the hash value, the context would have gotten reset
before MemoizeHash_equal() was called.  This could have resulted in
MemoizeHash_equal() looking at pfree'd memory.

This is less likely to have caused issues on a production build as some
other allocation would have had to have reused the pfree'd memory to
overwrite it.  Otherwise, the original contents would have been intact.
However, this clearly caused issues on MEMORY_CONTEXT_CHECKING builds.

Author: Tender Wang, Andrei Lepikhov
Reported-by: Tender Wang (using SQLancer)
Reviewed-by: Andrei Lepikhov, Richard Guo, David Rowley
Discussion: https://postgr.es/m/CAHewXNnT6N6UJkya0z-jLFzVxcwGfeRQSfhiwA+NyLg-x8iGew@mail.gmail.com
Backpatch-through: 14, where Memoize was added
2024-03-11 18:21:23 +13:00
Alexander Korotkov 84788ee5b4 Backpatch missing check_stack_depth() to some recursive functions
Backpatch changes from d57b7cc333, 75bcba6cbd to all supported branches per
proposal of Egor Chindyaskin.

Discussion: https://postgr.es/m/DE5FD776-A8CD-4378-BCFA-3BF30F1F6D60%40mail.ru
2024-03-11 03:06:45 +02:00
Tom Lane 0fe82e45cb Cope with a deficiency in OpenSSL 3.x's error reporting.
In OpenSSL 3.0.0 and later, ERR_reason_error_string randomly refuses
to provide a string for error codes representing system errno values
(e.g., "No such file or directory").  There is a poorly-documented way
to extract the errno from the SSL error code in this case, so do that
and apply strerror, rather than falling back to reporting the error
code's numeric value as we were previously doing.

Problem reported by David Zhang, although this is not his proposed
patch; it's instead based on a suggestion from Heikki Linnakangas.
Back-patch to all supported branches, since any of them are likely
to be used with recent OpenSSL.

Discussion: https://postgr.es/m/b6fb018b-f05c-4afd-abd3-318c649faf18@highgo.ca
2024-03-07 19:37:51 -05:00
Dean Rasheed b5c645d2a2 Fix handling of self-modified tuples in MERGE.
When an UPDATE or DELETE action in MERGE returns TM_SelfModified,
there are 2 possible causes:

1). The target tuple was already updated or deleted by the current
    command. This can happen if the target row joins to more than one
    source row, and the SQL standard explicitly says that this must be
    an error.

2). The target tuple was already updated or deleted by a later command
    in the current transaction. This can happen if the tuple is
    modified by a BEFORE trigger or a volatile function used in the
    query, and should be an error for the same reason that it is in a
    plain UPDATE or DELETE command.

In MERGE's primary error handling block, it failed to check for (2),
causing it to return a misleading error message in such cases.

In the secondary error handling block, following a concurrent update
from another session, it failed to check for (1), causing it to
silently ignore target rows joined to more than one source row,
instead of reporting an error.

Fix this, and add tests for both of these cases.

Per report from Wenjiang Zhang. Back-patch to v15, where MERGE was
introduced.

Discussion: https://postgr.es/m/tencent_41DE0FF443FE14B94A5898D373792109E408%40qq.com
2024-03-07 09:53:31 +00:00
Michael Paquier e82b64a92c Revert "Fix parallel-safety check of expressions and predicate for index builds"
This reverts commit eae7be600b, following a discussion with Tom Lane,
due to concerns that this impacts the decisions made by the planner for
the number of workers spawned based on the inlining and const-folding of
index expressions and predicate for cases that would have worked until
this commit.

Discussion: https://postgr.es/m/162802.1709746091@sss.pgh.pa.us
Backpatch-through: 12
2024-03-07 08:31:04 +09:00
Tom Lane 3b671dcf53 Fix type-checking of RECORD-returning functions in FROM.
In the corner case where a function returning RECORD has been
simplified to a RECORD constant or an inlined ROW() expression,
ExecInitFunctionScan failed to cross-check the function's result
rowtype against the coldeflist provided by the calling query.
That happened because get_expr_result_type is able to extract a
tupdesc from such expressions, which led ExecInitFunctionScan to
ignore the coldeflist.  (Instead, it used the extracted tupdesc
to check the function's output, which of course always succeeds.)

I have not been able to demonstrate any really serious consequences
from this, because if some column of the result is of the wrong
type and is directly referenced by a Var of the calling query,
CheckVarSlotCompatibility will catch it.  However, we definitely do
fail to report the case where the function returns more columns than
the coldeflist expects, and in the converse case where it returns
fewer columns, we get an assert failure (but, seemingly, no worse
results in non-assert builds).

To fix, always build the expected tupdesc from the coldeflist if there
is one, and consult get_expr_result_type only when there isn't one.

Also remove the failing Assert, even though it is no longer reached
after this fix.  It doesn't seem to be adding anything useful, since
later checking will deal with cases with the wrong number of columns.

The only other place I could find that is doing something similar
is inline_set_returning_function.  There's no live bug there because
we cannot be looking at a Const or RowExpr, but for consistency
change that code to agree with ExecInitFunctionScan.

Per report from PetSerAl.  After some debate I've concluded that
this should be back-patched.  There is a small risk that somebody
has been relying on such a case not throwing an error, but I judge
this outweighed by the risk that I've missed some way in which the
failure to cross-check has worse consequences than sketched above.

Discussion: https://postgr.es/m/CAKygsHSerA1eXsJHR9wft3Gn3wfHQ5RfP8XHBzF70_qcrrRvEg@mail.gmail.com
2024-03-06 14:41:13 -05:00
Michael Paquier 50b5913a87 Fix parallel-safety check of expressions and predicate for index builds
As coded, the planner logic that calculates the number of parallel
workers to use for a parallel index build uses expressions and
predicates from the relcache, which are flattened for the planner by
eval_const_expressions().

As reported in the bug, an immutable parallel-unsafe function flattened
in the relcache would become a Const, which would be considered as
parallel-safe, even if the predicate or the expressions including the
function are not safe in parallel workers.  Depending on the expressions
or predicate used, this could cause the parallel build to fail.

Tests are included that check parallel index builds with parallel-unsafe
predicate and expressions.  Two routines are added to lsyscache.h to be
able to retrieve expressions and predicate of an index from its pg_index
data.

Reported-by: Alexander Lakhin
Author: Tender Wang
Reviewed-by: Jian He, Michael Paquier
Discussion: https://postgr.es/m/CAHewXN=UaAaNn9ruHDH3Os8kxLVmtWqbssnf=dZN_s9=evHUFA@mail.gmail.com
Backpatch-through: 12
2024-03-06 17:24:06 +09:00
David Rowley 164fe7a6e1 Fix incorrectly reported stats kind in "can't happen" ERROR
The error message(s) were reporting the stats kind of 'f', which is not
correct as that's for the "dependencies" statistics kind.

Reported-by: Horst Reiterer
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/18375-ba99383eb9062d6a@postgresql.org
Backpatch-through: 12, where MCV extended stats were added.
2024-03-05 16:18:19 +13:00
Daniel Gustafsson 55ea12a282 Fix integer underflow in shared memory debugging
dsa_dump would print a large negative number instead of zero for
segment bin 0.  Fix by explicitly checking for underflow and add
special case for bin 0. Backpatch to all supported versions.

Author: Ian Ilyasov <ianilyasov@outlook.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/GV1P251MB1004E0D09D117D3CECF9256ECD502@GV1P251MB1004.EURP251.PROD.OUTLOOK.COM
Backpatch-through: v12
2024-02-29 12:19:52 +01:00