Commit Graph

54954 Commits

Author SHA1 Message Date
David Rowley 67f3a697ba Fix overly strict Assert in jsonpath code
This was failing for queries which try to get the .type() of a
jpiLikeRegex.  For example:

select jsonb_path_query('["string", "string"]',
                        '($[0] like_regex ".{7}").type()');

Reported-by: Alexander Kozhemyakin
Bug: #18035
Discussion: https://postgr.es/m/18035-64af5cdcb5adf2a9@postgresql.org
Backpatch-through: 12, where SQL/JSON path was added.
2023-08-02 01:40:56 +12:00
Etsuro Fujita d1ef5631e6 Disallow replacing joins with scans in problematic cases.
Commit e7cb7ee14, which introduced the infrastructure for FDWs and
custom scan providers to replace joins with scans, failed to add support
handling of pseudoconstant quals assigned to replaced joins in
createplan.c, leading to an incorrect plan without a gating Result node
when postgres_fdw replaced a join with such a qual.

To fix, we could add the support by 1) modifying the ForeignPath and
CustomPath structs to store the list of RestrictInfo nodes to apply to
the join, as in JoinPaths, if they represent foreign and custom scans
replacing a join with a scan, and by 2) modifying create_scan_plan() in
createplan.c to use that list in that case, instead of the
baserestrictinfo list, to get pseudoconstant quals assigned to the join;
but #1 would cause an ABI break.  So fix by modifying the infrastructure
to just disallow replacing joins with such quals.

Back-patch to all supported branches.

Reported by Nishant Sharma.  Patch by me, reviewed by Nishant Sharma and
Richard Guo.

Discussion: https://postgr.es/m/CADrsxdbcN1vejBaf8a%2BQhrZY5PXL-04mCd4GDu6qm6FigDZd6Q%40mail.gmail.com
2023-07-28 15:45:02 +09:00
Tom Lane 313ceda2fe Raise fixed token-length limit in hba.c.
Historically, hba.c limited tokens in the authentication configuration
files (pg_hba.conf and pg_ident.conf) to less than 256 bytes.  We have
seen a few reports of this limit causing problems; notably, for
moderately-complex LDAP configurations.  Increase the limit to 10240
bytes as a low-risk stop-gap solution.

In v13 and earlier, this also requires raising MAX_LINE, the limit
on overall line length.  I'm hesitant to make this code consume
too much stack space, so I only raised that to 20480 bytes.

Discussion: https://postgr.es/m/1588937.1690221208@sss.pgh.pa.us
2023-07-27 12:07:48 -04:00
Amit Kapila 2864eb977a Fix the display of UNKNOWN message type in apply worker.
We include the message type while displaying an error context in the
apply worker. Now, while retrieving the message type string if the
message type is unknown we throw an error that will hide the original
error. So, instead, we need to simply return the string indicating an
unknown message type.

Reported-by: Ashutosh Bapat
Author: Euler Taveira, Amit Kapila
Reviewed-by: Ashutosh Bapat
Backpatch-through: 15
Discussion: https://postgr.es/m/CAExHW5suAEDW-mBZt_qu4RVxWZ1vL54-L+ci2zreYWebpzxYsA@mail.gmail.com
2023-07-25 08:50:37 +05:30
Alvaro Herrera 3bb8b9342f
Make test_decoding ddl.out shorter
Some of the test_decoding test output was extremely wide, because it
deals with massive toasted values, and the aligned mode causes psql to
produce 200kB of whitespace and dashes. Change to unaligned mode
temporarily to avoid that behavior.

Backpatch to 14, where it applies cleanly.

Discussion: https://postgr.es/m/20230405103953.sxleixp3uz5lazst@alvherre.pgsql
2023-07-24 17:48:06 +02:00
Tom Lane fbaf65cd65 Guard against null plan pointer in CachedPlanIsSimplyValid().
If both the passed-in plan pointer and plansource->gplan are
NULL, CachedPlanIsSimplyValid would think that the plan pointer
is possibly-valid and try to dereference it.  For the one extant
call site in plpgsql, this situation doesn't normally happen
which is why we've not noticed. However, it appears to be possible
if the previous use of the cached plan failed, as per report from
Justin Pryzby.  Add an extra check to prevent crashing.
Back-patch to v13 where this code was added.

Discussion: https://postgr.es/m/ZLlV+STFz1l/WhAQ@telsasoft.com
2023-07-20 14:23:46 -04:00
Tom Lane bc9993a549 Doc: improve description of IN and row-constructor comparisons.
IN and NOT IN work fine on records and arrays, so just say that
they accept "expressions" not "scalar expressions".  I think that
that phrasing was meant to say that they don't work on set-returning
expressions, but that's not the common meaning of "scalar".

Revise the description of row-constructor comparisons to make it
perhaps a bit less confusing.  (This partially reverts some
dubious wording changes made by commit f56651519.)

Per gripe from Ilya Nenashev.  Back-patch to supported branches.
In HEAD and v16, also drop a NOTE about pre-8.2 behavior, which
is hopefully no longer of interest to anybody.

Discussion: https://postgr.es/m/168968062460.632.14303906825812821399@wrigleys.postgresql.org
2023-07-19 11:00:34 -04:00
Tom Lane f9278cb0ae Doc: fix out-of-date example of SPI usage.
The "count" argument of SPI_exec() only limits execution when
the query is actually returning rows.  This was not the case
before PG 9.0, so this example was correct when written; but
we missed updating it in commit 2ddc600f8.  Extend the example
to show the behavior both with and without RETURNING.

While here, improve the commentary and markup for the rest
of the example.

David G. Johnston and Tom Lane, per report from Curt Kolovson.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/CANhYJV6HWtgz_qjx_APfK0PAgLUzY-2vjLuj7i_o=TZF1LAQew@mail.gmail.com
2023-07-18 11:59:39 -04:00
Michael Paquier f6ecd2622c Fix indentation in twophase.c
This has been missed in cb0cca1, noticed before buildfarm member koel
has been able to complain while poking at a different patch.  Like the
other commit, backpatch all the way down to limit the odds of merge
conflicts.

Backpatch-through: 11
2023-07-18 14:04:48 +09:00
Michael Paquier a878eff6b4 Fix recovery of 2PC transaction during crash recovery
A crash in the middle of a checkpoint with some two-phase state data
already flushed to disk by this checkpoint could cause a follow-up crash
recovery to recover twice the same transaction, once from what has been
found in pg_twophase/ at the beginning of recovery and a second time
when replaying its corresponding record.

This would lead to FATAL failures in the startup process during
recovery, where the same transaction would have a state recovered twice
instead of once:
LOG:  recovering prepared transaction 731 from shared memory
LOG:  recovering prepared transaction 731 from shared memory
FATAL:  lock ExclusiveLock on object 731/0/0 is already held

This issue is fixed by skipping the addition of any 2PC state coming
from a record whose equivalent 2PC state file has already been loaded in
TwoPhaseState at the beginning of recovery by restoreTwoPhaseData(),
which is OK as long as the system has not reached a consistent state.

The timing to get a messed up recovery processing is very racy, and
would very unlikely happen.  The thread that has reported the issue has
demonstrated the bug using injection points to force a PANIC in the
middle of a checkpoint.

Issue introduced in 728bd99, so backpatch all the way down.

Reported-by: "suyu.cmj" <mengjuan.cmj@alibaba-inc.com>
Author: "suyu.cmj" <mengjuan.cmj@alibaba-inc.com>
Author: Michael Paquier
Discussion: https://postgr.es/m/109e6994-b971-48cb-84f6-829646f18b4c.mengjuan.cmj@alibaba-inc.com
Backpatch-through: 11
2023-07-18 13:44:29 +09:00
Michael Paquier eb3abec4b6 Add indisreplident to fields refreshed by RelationReloadIndexInfo()
RelationReloadIndexInfo() is a fast-path used for index reloads in the
relation cache, and it has always forgotten about updating
indisreplident, which is something that would happen after an index is
selected for a replica identity.  This can lead to incorrect cache
information provided when executing a command in a transaction context
that updates indisreplident.

None of the code paths currently on HEAD that need to check upon
pg_index.indisreplident fetch its value from the relation cache, always
relying on a fresh copy on the syscache.  Unfortunately, this may not be
the case of out-of-core code, that could see out-of-date value.

Author: Shruthi Gowda
Reviewed-by: Robert Haas, Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/CAASxf_PBcxax0wW-3gErUyftZ0XrCs3Lrpuhq4-Z3Fak1DoW7Q@mail.gmail.com
Backpatch-through: 11
2023-07-14 11:16:06 +09:00
Michael Paquier c0dc97c7bf Fix updates of indisvalid for partitioned indexes
indisvalid is switched to true for partitioned indexes when all its
partitions have valid indexes when attaching a new partition, up to the
top-most parent if all its leaves are themselves valid when dealing with
multiple layers of partitions.

The copy of the tuple from pg_index used to switch indisvalid to true
came from the relation cache, which is incorrect.  Particularly, in the
case reported by Shruthi Gowda, executing a series of commands in a
single transaction would cause the validation of partitioned indexes to
use an incorrect version of a pg_index tuple, as indexes are reloaded
after an invalidation request with RelationReloadIndexInfo(), a much
faster version than a full index cache rebuild.  In this case, the
limited information updated in the cache leads to an incorrect version
of the tuple used.  One of the symptoms reported was the following
error, with a replica identity update, for instance:
"ERROR: attempted to update invisible tuple"

This is incorrect since 8b08f7d, so backpatch all the way down.

Reported-by: Shruthi Gowda
Author: Michael Paquier
Reviewed-by: Shruthi Gowda, Dilip Kumar
Discussion: https://postgr.es/m/CAASxf_PBcxax0wW-3gErUyftZ0XrCs3Lrpuhq4-Z3Fak1DoW7Q@mail.gmail.com
Backpatch-through: 11
2023-07-14 10:13:15 +09:00
Andres Freund f66403749d Handle DROP DATABASE getting interrupted
Until now, when DROP DATABASE got interrupted in the wrong moment, the removal
of the pg_database row would also roll back, even though some irreversible
steps have already been taken. E.g. DropDatabaseBuffers() might have thrown
out dirty buffers, or files could have been unlinked. But we continued to
allow connections to such a corrupted database.

To fix this, mark databases invalid with an in-place update, just before
starting to perform irreversible steps. As we can't add a new column in the
back branches, we use pg_database.datconnlimit = -2 for this purpose.

An invalid database cannot be connected to anymore, but can still be
dropped.

Unfortunately we can't easily add output to psql's \l to indicate that some
database is invalid, it doesn't fit in any of the existing columns.

Add tests verifying that a interrupted DROP DATABASE is handled correctly in
the backend and in various tools.

Reported-by: Evgeny Morozov <postgresql3@realityexists.net>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20230509004637.cgvmfwrbht7xm7p6@awork3.anarazel.de
Discussion: https://postgr.es/m/20230314174521.74jl6ffqsee5mtug@awork3.anarazel.de
Backpatch: 11-, bug present in all supported versions
2023-07-13 13:04:45 -07:00
Andres Freund 82e97b8640 Release lock after encountering bogs row in vac_truncate_clog()
When vac_truncate_clog() encounters bogus datfrozenxid / datminmxid values, it
returns early. Unfortunately, until now, it did not release
WrapLimitsVacuumLock. If the backend later tries to acquire
WrapLimitsVacuumLock, the session / autovacuum worker hangs in an
uncancellable way. Similarly, other sessions will hang waiting for the
lock. However, if the backend holding the lock exited or errored out for some
reason, the lock was released.

The bug was introduced as a side effect of 566372b3d6.

It is interesting that there are no production reports of this problem. That
is likely due to a mix of bugs leading to bogus values having gotten less
common, process exit releasing locks and instances of hangs being hard to
debug for "normal" users.

Discussion: https://postgr.es/m/20230621221208.vhsqgduwfpzwxnpg@awork3.anarazel.de
2023-07-13 13:03:31 -07:00
Tom Lane 5cb4619896 Remove unnecessary pfree() in g_intbig_compress().
GiST compress functions (like all GiST opclass functions) are
supposed to be called in short-lived memory contexts, so that
minor memory leaks in them are not of concern, and indeed
explicit pfree's are likely slightly counterproductive.
But this one in g_intbig_compress() is more than
slightly counterproductive, because it's guarded by
"if (in != DatumGetArrayTypeP(entry->key))" which means
that if this test succeeds, we've detoasted the datum twice.
(And to add insult to injury, the extra detoast result is
leaked.)  Let's just drop the whole stanza, relying on the
GiST temporary context mechanism to clean up in good time.

The analogous bit in g_int_compress() is
       if (r != (ArrayType *) DatumGetPointer(entry->key))
           pfree(r);
which doesn't have the gratuitous-detoast problem so
I left it alone.  Perhaps there is a case for removing
unnecessary pfree's more widely, but I'm not sure if it's
worth the code churn.

The potential extra decompress seems expensive enough to
justify calling this a (minor) performance bug and
back-patching.

Konstantin Knizhnik, Matthias van de Meent, Tom Lane

Discussion: https://postgr.es/m/CAEze2Wi86=DxErfvf+SCB2UKmU2amKOF60BKuJOX=w-RojRn0A@mail.gmail.com
2023-07-13 13:08:17 -04:00
Tom Lane 22447db17c Be more rigorous about local variables in PostgresMain().
Since PostgresMain calls sigsetjmp, any local variables that are not
marked "volatile" have a risk of unspecified behavior.  In practice
this means that when control returns via longjmp, such variables might
get reset to their values as of the time of sigsetjmp, depending on
whether the compiler chose to put them in registers or on the stack.
We were careful about this for "send_ready_for_query", but not the
other local variables.

In the case of the timeout_enabled flags, resetting them to
their initial "false" states is actually good, since we do
"disable_all_timeouts()" in the longjmp cleanup code path.  If that
does not happen, we risk uselessly calling "disable_timeout()" later,
which is harmless but a little bit expensive.  Let's explicitly reset
these flags so that the behavior is correct and platform-independent.
(This change means that we really don't need the new "volatile"
markings after all, but let's install them anyway since any change
in this logic could re-introduce a problem.)

There is no issue for "firstchar" and "input_message" because those
are explicitly reinitialized each time through the query processing
loop.  To make that clearer, move them to be declared inside the loop.
That leaves us with all the function-lifespan locals except the
sigjmp_buf itself marked as volatile, which seems like a good policy
to have going forward.

Because of the possibility of extra disable_timeout() calls, this
seems worth back-patching.

Sergey Shinderuk and Tom Lane

Discussion: https://postgr.es/m/2eda015b-7dff-47fd-d5e2-f1a9899b90a6@postgrespro.ru
2023-07-10 12:14:34 -04:00
Michael Paquier d1e0f408cb Fix ALTER EXTENSION SET SCHEMA with objects outside an extension's schema
As coded, the code would use as a base comparison the namespace OID from
the first object scanned in pg_depend when switching its namespace
dependency entry to the new one, and use it as a base of comparison for
any follow-up checks.  It would also be used as the old namespace OID to
switch *from* for the extension's pg_depend entry.  Hence, if the first
object scanned has a namespace different than the one stored in the
extension, we would finish by:
- Not checking that the extension objects map with the extension's
schema.
- Not switching the extension -> namespace dependency entry to the new
namespace provided by the user, making ALTER EXTENSION ineffective.

This issue exists since this command has been introduced in d9572c4 for
relocatable extension, so backpatch all the way down to 11.  The test
case has been provided by Heikki, that I have tweaked a bit to show the
effects on pg_depend for the extension.

Reported-by: Heikki Linnakangas
Author: Michael Paquier, Heikki Linnakangas
Discussion: https://postgr.es/m/20eea594-a05b-4c31-491b-007b6fceef28@iki.fi
Backpatch-through: 11
2023-07-10 09:40:14 +09:00
David Rowley 3883ef3236 Doc: update old reference to "result cache"
During the PostgreSQL 14 cycle, the Memoize executor node was briefly
called "Result Cache" until it was renamed in 83f4fcc65.  That commit
missed one reference.

Reported-by: Paul A Jungwirth
Packpatch-through: 14, where Memoize was added
Discussion: https://postgr.es/m/CA+renyX=40YXhsfPTzn13oNOPO3TJ12CK9GX-2P2pvnQiScefA@mail.gmail.com
2023-07-09 16:15:25 +12:00
Andrew Dunstan 432fbeaa52 Fix tmpdir issues with commit e213de8e78
Commit e213de8e78 fixed a problem with path lengths to a tempdir on
Windows, but caused problems on at least some Unix systems where the
system tempdir is on a different file system. To work around this, only
used the system temdir for the destination of pg_replslot on Windows,
and otherwise restore the old behaviour.

Backpatch to relase 14 like the previous patch.

Problem exposed by a myriad of buildfarm animals.
2023-07-08 12:37:46 -04:00
Andrew Dunstan 3e859b3d7a Use shorter location for pg_replslot in pg_basebackup test
The symlink to a longer location tripped up some Windows limit on
buildfarm animal fairywren when running with meson, which uses slightly
longer paths.

Backpatch to release 14 to keep the script in sync. Before that the
script skipped all symlink related tests on Windows.
2023-07-08 11:46:09 -04:00
Andres Freund d12d1a9278 Fix type of iterator variable in SH_START_ITERATE
Also add comment to make the reasoning behind the Assert() more explicit (per
Tom).

Reported-by: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQAocXNJ6s1VLz+hMamLAQAiewRoW17OJ6-+9GACKfj6iPQ@mail.gmail.com
Backpatch: 11-
2023-07-06 09:57:31 -07:00
Andrew Dunstan c0cb12f9e7 Skip pg_baseback long filename test if path too long on Windows
On Windows, it's sometimes difficult to create a file with a path longer
than 255 chars, and if it can be created it might not be seen by the
archiver. This can be triggered by the test for tar backups with
filenames greater than 100 bytes. So we skip that test if the path would
exceed 255.

Backpatch to all live branches.

Reviewed by Daniel Gustafsson

Discussion: https://postgr.es/m/666ac55b-3400-fb2c-2cea-0281bf36a53c@dunslane.net
2023-07-06 12:29:13 -04:00
Heikki Linnakangas 25624c5d3f WAL-log the creation of the init fork of unlogged indexes.
We create a file, so we better WAL-log it. In practice, all the
built-in index AMs and all extensions that I'm aware of write a
metapage to the init fork, which is WAL-logged, and replay of the
metapage implicitly creates the fork too. But if ambuildempty() didn't
write any page, we would miss it.

This can be seen with dummy_index_am. Set up replication, create a
'dummy_index_am' index on an unlogged table, and look at the files
created in the replica: the init fork is not created on the
replica. Dummy_index_am doesn't do anything with the relation files,
however, so it doesn't lead to any user-visible errors.

Backpatch to all supported versions.

Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/6e5bbc08-cdfc-b2b3-9e23-1a914b9850a9%40iki.fi
2023-07-06 17:29:12 +03:00
Heikki Linnakangas a5f312c58d Fix MarkGUCPrefixReserved() to check all options.
This bug was only present on v15. MarkGUCPrefixReserved() is new in
v15, and in v16, it was rewritten to use a hash table and the new
implementation did not have this bug.

Author: Karina Litskevich, Ekaterina Sokolova
Discussion: https://www.postgresql.org/message-id/CACiT8ibqyC=_g1n6FXyFJvFW0BEjAH3_5aGqUSFeEp8GpnVrhw@mail.gmail.com
2023-07-06 13:05:25 +03:00
Amit Kapila 907d3dd531 Revert the commits related to allowing page lock to conflict among parallel group members.
This commit reverts the work done by commits 3ba59ccc89 and 72e78d831a.
Those commits were incorrect in asserting that we never acquire any other
heavy-weight lock after acquring page lock other than relation extension
lock. We can acquire a lock on catalogs while doing catalog look up after
acquring page lock.

This won't impact any existing feature but we need to think some other way
to achieve this before parallelizing other write operations or even
improving the parallelism in vacuum (like allowing multiple workers
for an index).

Reported-by: Jaime Casanova
Author: Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/CAJKUy5jffnRKNvRHKQ0LynRb0RJC-o4P8Ku3x9vGAVLwDBWumQ@mail.gmail.com
2023-07-06 08:28:27 +05:30
Heikki Linnakangas fa96a74a0f Fix leak of LLVM "fatal-on-oom" section counter.
llvm_release_context() called llvm_enter_fatal_on_oom(), but was missing
the corresponding llvm_leave_fatal_on_oom() call. As a result, if JIT was
used at all, we were almost always in the "fatal-on-oom" state.

It only makes a difference if you use an extension written in C++, and
run out of memory in a C++ 'new' call. In that case, you would get a
PostgreSQL FATAL error, instead of the default behavior of throwing a
C++ exception.

Back-patch to all supported versions.

Reviewed-by: Daniel Gustafsson
Discussion: https://www.postgresql.org/message-id/54b78cca-bc84-dad8-4a7e-5b56f764fab5@iki.fi
2023-07-05 13:13:30 +03:00
Masahiko Sawada 66f8a13973 pgstat: fix subscription stats entry leak.
Commit 7b64e4b3 taught DropSubscription() to drop stats entry of
subscription that is not associated with a replication slot for apply
worker at DROP SUBSCRIPTION but missed covering the case where the
subscription is not associated with replication slots for both apply
worker and tablesync worker.

Also add a test to verify that the stats for slot-less subscription is
removed at DROP SUBSCRIPTION time.

Backpatch down to 15.

Author: Masahiko Sawada
Reviewed-by: Nathan Bossart, Hayato Kuroda, Melih Mutlu, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoB71zkP7uPT7JDPsZcvp0749ExEQnOJxeNKPDFisHar+w@mail.gmail.com
Backpatch-through: 15
2023-07-05 14:49:53 +09:00
Heikki Linnakangas e24c02e4d2 Ensure that creation of an empty relfile is fsync'd at checkpoint.
If you create a table and don't insert any data into it, the relation file
is never fsync'd. You don't lose data, because an empty table doesn't have
any data to begin with, but if you crash and lose the file, subsequent
operations on the table will fail with "could not open file" error.

To fix, register an fsync request in mdcreate(), like we do for mdwrite().

Per discussion, we probably should also fsync the containing directory
after creating a new file. But that's a separate and much wider issue.

Backpatch to all supported versions.

Reviewed-by: Andres Freund, Thomas Munro
Discussion: https://www.postgresql.org/message-id/d47d8122-415e-425c-d0a2-e0160829702d%40iki.fi
2023-07-04 18:07:27 +03:00
Thomas Munro 9ffb10f183 Re-bin segment when memory pages are freed.
It's OK to be lazy about re-binning memory segments when allocating,
because that can only leave segments in a bin that's too high.  We'll
search higher bins if necessary while allocating next time, and
also eventually re-bin, so no memory can become unreachable that way.

However, when freeing memory, the largest contiguous range of free pages
might go up, so we should re-bin eagerly to make sure we don't leave the
segment in a bin that is too low for get_best_segment() to find.

The re-binning code is moved into a function of its own, so it can be
called whenever free pages are returned to the segment's free page map.

Back-patch to all supported releases.

Author: Dongming Liu <ldming101@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier version)
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CAL1p7e8LzB2LSeAXo2pXCW4%2BRya9s0sJ3G_ReKOU%3DAjSUWjHWQ%40mail.gmail.com
2023-07-04 15:24:42 +12:00
Thomas Munro d34aa0a2f4 Fix race in SSI interaction with gin fast path.
The ginfast.c code previously checked for conflicts in before locking
the relevant buffer, leaving a window where a RW conflict could be
missed.  Re-order.

There was also a place where buffer ID and block number were confused
while trying to predicate-lock a page, noted by visual inspection.

Back-patch to all supported releases.  Fixes one more problem discovered
with the reproducer from bug #17949, in this case when Dmitry tried
other index types.

Reported-by: Artem Anisimov <artem.anisimov.255@gmail.com>
Reported-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/17949-a0f17035294a55e2%40postgresql.org
2023-07-04 09:10:37 +12:00
Thomas Munro ab265e9850 Fix race in SSI interaction with bitmap heap scan.
When performing a bitmap heap scan, we don't want to miss concurrent
writes that occurred after we observed the heap's rs_nblocks, but before
we took predicate locks on index pages.  Therefore, we can't skip
fetching any heap tuples that are referenced by the index, because we
need to test them all with CheckForSerializableConflictOut().  The
old optimization that would ignore any references to blocks >=
rs_nblocks gets in the way of that requirement, because it means that
concurrent writes in that window are ignored.

Removing that optimization shouldn't affect correctness at any isolation
level, because any new tuples shouldn't be visible to an MVCC snapshot.
There also shouldn't be any error-causing references to heap blocks past
the end, because we should have held at least an AccessShareLock on the
table before the index scan.  It can't get smaller while our transaction
is running.  For now, though, we'll keep the optimization at lower
levels to avoid making unnecessary changes in a bug fix.

Back-patch to all supported releases.  In release 11, the code is in a
different place but not fundamentally different.  Fixes one aspect of
bug #17949.

Reported-by: Artem Anisimov <artem.anisimov.255@gmail.com>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/17949-a0f17035294a55e2%40postgresql.org
2023-07-04 09:10:37 +12:00
Thomas Munro 0f275b0ee8 Fix race in SSI interaction with empty btrees.
When predicate-locking btrees, we have a special case for completely
empty btrees, since there is no page to lock.  This was racy, because,
without buffer lock held, a matching key could be inserted between the
_bt_search() and the PredicateLockRelation() calls.

Fix, by rechecking _bt_search() after taking the relation-level SIREAD
lock, if using SERIALIZABLE isolation and an empty btree is discovered.

Back-patch to all supported releases.  Fixes one aspect of bug #17949.

Reported-by: Artem Anisimov <artem.anisimov.255@gmail.com>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/17949-a0f17035294a55e2%40postgresql.org
2023-07-04 09:10:37 +12:00
Tomas Vondra 23ce74811a Remove expensive test of postgres_fdw batch inserts
The test inserted 70k rows into a foreign table, in order to verify
correct behavior with more than 65535 parameters, and was added in
response to a bug report.

However, this is rather expensive, especially when running the tests
under valgrind, CLOBBER_CACHE_ALWAYS etc. It doesn't seem worth it to
keep running the test, so remove it from all branches (14+).

Backpatch-through: 14
Discussion: https://postgr.es/m/2131017.1623451468@sss.pgh.pa.us
2023-07-03 18:38:12 +02:00
Andrew Dunstan 86f23d90e5 Improve pg_basebackup long file name test Windows robustness
Creation of a file with a very long name can create problems on Windows
due to its file path limits. Work around that by creating the file via a
symlink with a shorter name.

Error displayed by buildfarm animal fairywren.o

Backpatch to all live branches
2023-07-03 10:07:15 -04:00
Michael Paquier 4b15868b69 Make PG_TEST_NOCLEAN work for temporary directories in TAP tests
When set, this environment variable was only effective for data
directories but not for all the other temporary files created by
PostgreSQL::Test::Utils.  Keeping the temporary files after a successful
run can be useful for debugging purposes.

The documentation is updated to reflect the new behavior, with contents
available in doc/ since v16 and in src/test/perl/README since v15.

Author: Jacob Champion
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/CAAWbhmgHtDH1SGZ+Fw05CsXtE0mzTmjbuUxLB9mY9iPKgM6cUw@mail.gmail.com
Discussion: https://postgr.es/m/YyPd9unV14SX2bLF@paquier.xyz
Backpatch-through: 11
2023-07-03 10:06:14 +09:00
Thomas Munro f50200c016 Silence "missing contrecord" error.
Commit dd38ff28ad added a new error message "missing contrecord" when
we fail to reassemble a record.  Unfortunately that caused noisy
messages to be logged by pg_waldump at end of segment, and by walsender
when asked to shut down on a segment boundary.

Remove the new error message, so that this condition signals end-of-
WAL without a message.  It's arguably a reportable condition that should
not be silenced while performing crash recovery, but fixing that without
introducing noise in the other cases will require more research.

Back-patch to 15.

Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/6a1df56e-4656-b3ce-4b7a-a9cb41df8189%40enterprisedb.com
2023-07-03 11:22:10 +12:00
Tomas Vondra 7ae4e78689 Fix oversight in handling of modifiedCols since f24523672d
Commit f24523672d fixed a memory leak by moving the modifiedCols bitmap
into the per-row memory context. In the case of AFTER UPDATE triggers,
the bitmap is however referenced from an event kept until the end of the
query, resulting in a use-after-free bug.

Fixed by copying the bitmap into the AfterTriggerEvents memory context,
which is the one where we keep the trigger events. There's only one
place that needs to do the copy, but the memory context may not exist
yet. Doing that in a separate function seems more readable.

Report by Alexander Pyhalov, fix by me. Backpatch to 13, where the
bitmap was added to the event by commit 71d60e2aa0.

Reported-by: Alexander Pyhalov
Backpatch-through: 13
Discussion: https://postgr.es/m/acddb17c89b0d6cb940eaeda18c08bbe@postgrespro.ru
2023-07-02 22:22:50 +02:00
Tomas Vondra 0c5fe4ff6b Fix memory leak in Incremental Sort rescans
The Incremental Sort had a couple issues, resulting in leaking memory
during rescans, possibly triggering OOM. The code had a couple of
related flaws:

1. During rescans, the sort states were reset but then also set to NULL
   (despite the comment saying otherwise). ExecIncrementalSort then
   sees NULL and initializes a new sort state, leaking the memory used
   by the old one.

2. Initializing the sort state also automatically rebuilt the info about
   presorted keys, leaking the already initialized info. presorted_keys
   was also unnecessarily reset to NULL.

Patch by James Coleman, based on patches by Laurenz Albe and Tom Lane.
Backpatch to 13, where Incremental Sort was introduced.

Author: James Coleman, Laurenz Albe, Tom Lane
Reported-by: Laurenz Albe, Zu-Ming Jiang
Backpatch-through: 13
Discussion: https://postgr.es/m/b2bd02dff61af15e3526293e2771f874cf2a3be7.camel%40cybertec.at
Discussion: https://postgr.es/m/db03c582-086d-e7cd-d4a1-3bc722f81765%40inf.ethz.ch
2023-07-02 20:04:40 +02:00
Michael Paquier cb4ac3e568 Fix marking of indisvalid for partitioned indexes at creation
The logic that introduced partitioned indexes missed a few things when
invalidating a partitioned index when these are created, still the code
is written to handle recursions:
1) If created from scratch because a mapping index could not be found,
the new index created could be itself invalid, if for example it was a
partitioned index with one of its leaves invalid.
2) A CCI was missing when indisvalid is set for a parent index, leading
to inconsistent trees when recursing across more than one level for a
partitioned index creation if an invalidation of the parent was
required.

This could lead to the creation of a partition index tree where some of
the partitioned indexes are marked as invalid, but some of the parents
are marked valid, which is not something that should happen (as
validatePartitionedIndex() defines, indisvalid is switched to true for a
partitioned index iff all its partitions are themselves valid).

This patch makes sure that indisvalid is set to false on a partitioned
index if at least one of its partition is invalid.  The flag is set to
true if *all* its partitions are valid.

The regression test added in this commit abuses of a failed concurrent
index creation, marked as invalid, that maps with an index created on
its partitioned table afterwards.

Reported-by: Alexander Lakhin
Reviewed-by: Alexander Lakhin
Discussion: https://postgr.es/m/14987634-43c0-0cb3-e075-94d423607e08@gmail.com
Backpatch-through: 11
2023-06-30 13:54:55 +09:00
Michael Paquier 93401ec02f Fix pg_depend entry to AMs after ALTER TABLE .. SET ACCESS METHOD
ALTER TABLE .. SET ACCESS METHOD was not registering a dependency to the
new access method with the relation altered in its rewrite phase, making
possible the drop of an access method even if there are relations that
depend on it.  During the rewrite, a temporary relation is created to
build the new relation files before swapping the new and old files, and,
while the temporary relation was registering a correct dependency to the
new AM, the old relation did not do that.  A dependency on the access
method is added when the relation files are swapped, which is the point
where pg_class is updated.

Materialized views and tables use the same code path, hence both were
impacted.

Backpatch down to 15, where this command has been introduced.

Reported-by: Alexander Lakhin
Reviewed-by: Nathan Bossart, Andres Freund
Discussion: https://postgr.es/m/18000-9145c25b1af475ca@postgresql.org
Backpatch-through: 15
2023-06-30 07:49:07 +09:00
Tom Lane cc8cca3c2d Fix order of operations in ExecEvalFieldStoreDeForm().
If the given composite datum is toasted out-of-line,
DatumGetHeapTupleHeader will perform database accesses to detoast it.
That can invalidate the result of get_cached_rowtype, as documented
(perhaps not plainly enough) in that function's API spec; which leads
to strange errors or crashes when we try to use the TupleDesc to read
the tuple.  In short then, trying to update a field of a composite
column could fail intermittently if the overall column value is wide
enough to require toasting.

We can fix the bug at no cost by just changing the order of
operations, since we don't need the TupleDesc until after detoasting.
(Other callers of get_cached_rowtype appear to get this right already,
so there's only one bug.)

Note that the added regression test case reveals this bug reliably
only with debug_discard_caches/CLOBBER_CACHE_ALWAYS.

Per bug #17994 from Alexander Lakhin.  Sadly, this patch does not fix
the missing-values issue revealed in the bug discussion; we'll need
some more work to cover that.

Discussion: https://postgr.es/m/17994-5c7100b51b4790e9@postgresql.org
2023-06-29 10:19:10 -04:00
Peter Eisentraut df5dcf41cf Remove inappropriate raw_expression_tree_walker() code
It was walking into the ColumnDef->compression field, which is not a
node but a string.  This code is currently not reachable (because the
compression field is only set in situations that don't go through
raw_expression_tree_walker()), but if it had been, this could have
behaved erratically.
2023-06-29 10:35:35 +02:00
Michael Paquier e886124647 pg_stat_statements: Fix second comment related to entry resets
This should have been part of dc73db6, but it got lost in the mix.
Oversight in 6b4d23f.

Author: Japin Li
Discussion: https://postgr.es/m/MEYP282MB1669FC91C764E277821936D3B624A@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
Backpatch-through: 14
2023-06-29 09:17:30 +09:00
Michael Paquier a757e16b7e pg_stat_statements: Fix incorrect comment with entry resets
Oversight in 6b4d23f.

Author: Japin Li, Richard Guo
Discussion: https://postgr.es/m/MEYP282MB1669FC91C764E277821936D3B624A@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
Backpatch-through: 14
2023-06-29 08:05:06 +09:00
Michael Paquier 7aa17b498b Ignore invalid indexes when enforcing index rules in ALTER TABLE ATTACH PARTITION
A portion of ALTER TABLE .. ATTACH PARTITION is to ensure that the
partition being attached to the partitioned table has a correct set of
indexes, so as there is a consistent index mapping between the
partitioned table and its new-to-be partition.  However, as introduced
in 8b08f7d, the current logic could choose an invalid index as a match,
which is something that can exist when dealing with more than two levels
of partitioning, like attaching a partitioned table (that has
partitions, with an index created by CREATE INDEX ON ONLY) to another
partitioned table.

A partitioned index with indisvalid set to false is equivalent to an
incomplete partition tree, meaning that an invalid partitioned index
does not have indexes defined in all its partitions.  Hence, choosing an
invalid partitioned index can create inconsistent partition index trees,
where the parent attaching to is valid, but its partition may be
invalid.

In the report from Alexander Lakhin, this showed up as an assertion
failure when validating an index.  Without assertions enabled, the
partition index tree would be actually broken, as indisvalid should
be switched to true for a partitioned index once all its partitions are
themselves valid.  With two levels of partitioning, the top partitioned
table used a valid index and was able to link to an invalid index stored
on its partition, itself a partitioned table.

I have studied a few options here (like the possibility to switch
indisvalid to false for the parent), but came down to the conclusion
that we'd better rely on a simple rule: invalid indexes had better never
be chosen, so as the partition attached uses and creates indexes that
the parent expects.  Some regression tests are added to provide some
coverage.  Note that the existing coverage is not impacted.

This is a problem since partitioned indexes exist, so backpatch all the
way down to v11.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/14987634-43c0-0cb3-e075-94d423607e08@gmail.com
Backpatch-through: 11
2023-06-28 15:57:43 +09:00
Heikki Linnakangas 596177a922 Fix comment on clearing padding.
Author: Japin Li
Discussion: https://www.postgresql.org/message-id/MEYP282MB16696317B5DA7D0D92306149B627A@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2023-06-27 10:15:14 +03:00
Tom Lane a77d901714 Check for interrupts and stack overflow in TParserGet().
TParserGet() recurses for some token types, meaning it's possible
to drive it to stack overflow.  Since this is a minority behavior,
I chose to add the check_stack_depth() call to the two places that
recurse rather than doing it during every single call.

While at it, add CHECK_FOR_INTERRUPTS(), because this can run
unpleasantly long for long inputs.

Per bug #17995 from Zuming Jiang.  This is old, so back-patch
to all supported branches.

Discussion: https://postgr.es/m/17995-9f20ff3e6389db4c@postgresql.org
2023-06-24 17:18:08 -04:00
Bruce Momjian f08faee4b9 doc: rename "decades" to be more generic
Reported-by: Michael Paquier

Discussion: https://postgr.es/m/ZJTzwD2rTbHWWQ9g@paquier.xyz

Backpatch-through: 11
2023-06-23 22:50:55 -04:00
Michael Paquier 4fd633df50 Fix incorrect error message in libpq_pipeline
One of the tests for the pipeline mode with portal description expects a
non-NULL PQgetResult, but used an incorrect error message on failure,
telling that PQgetResult being NULL was the expected result.

Author: Jelte Fennema
Discussion: https://postgr.es/m/CAGECzQTkShHecFF+EZrm94Lbsu2ej569T=bz+PjMbw9Aiioxuw@mail.gmail.com
Backpatch-through: 14
2023-06-23 17:50:23 +09:00
Amit Kapila 8aac6140c8 Doc: Clarify the behavior of triggers/rules in a logical subscriber.
By default, triggers and rules do not fire on a logical replication
subscriber based on the "session_replication_role" GUC being set to
"replica". However, the docs in the logical replication section assumed
that the reader understood how this GUC worked. This modifies the docs to
be more explicit and links back to the GUC itself.

Author: Jonathan Katz, Peter Smith
Reviewed-by: Vignesh C, Euler Taveira
Backpatch-through: 11
Discussion: https://postgr.es/m/5bb2c9a2-499f-e1a2-6e33-5ce96b35cc4a@postgresql.org
2023-06-22 12:35:10 +05:30