Commit Graph

24861 Commits

Author SHA1 Message Date
Robert Haas dc21234005 Add support for incremental backup.
To take an incremental backup, you use the new replication command
UPLOAD_MANIFEST to upload the manifest for the prior backup. This
prior backup could either be a full backup or another incremental
backup.  You then use BASE_BACKUP with the INCREMENTAL option to take
the backup.  pg_basebackup now has an --incremental=PATH_TO_MANIFEST
option to trigger this behavior.

An incremental backup is like a regular full backup except that
some relation files are replaced with files with names like
INCREMENTAL.${ORIGINAL_NAME}, and the backup_label file contains
additional lines identifying it as an incremental backup. The new
pg_combinebackup tool can be used to reconstruct a data directory
from a full backup and a series of incremental backups.

Patch by me.  Reviewed by Matthias van de Meent, Dilip Kumar, Jakub
Wartak, Peter Eisentraut, and Álvaro Herrera. Thanks especially to
Jakub for incredibly helpful and extensive testing.

Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com
2023-12-20 09:49:12 -05:00
Robert Haas 174c480508 Add a new WAL summarizer process.
When active, this process writes WAL summary files to
$PGDATA/pg_wal/summaries. Each summary file contains information for a
certain range of LSNs on a certain TLI. For each relation, it stores a
"limit block" which is 0 if a relation is created or destroyed within
a certain range of WAL records, or otherwise the shortest length to
which the relation was truncated during that range of WAL records, or
otherwise InvalidBlockNumber. In addition, it stores a list of blocks
which have been modified during that range of WAL records, but
excluding blocks which were removed by truncation after they were
modified and never subsequently modified again.

In other words, it tells us which blocks need to copied in case of an
incremental backup covering that range of WAL records. But this
doesn't yet add the capability to actually perform an incremental
backup; the next patch will do that.

A new parameter summarize_wal enables or disables this new background
process.  The background process also automatically deletes summary
files that are older than wal_summarize_keep_time, if that parameter
has a non-zero value and the summarizer is configured to run.

Patch by me, with some design help from Dilip Kumar and Andres Freund.
Reviewed by Matthias van de Meent, Dilip Kumar, Jakub Wartak, Peter
Eisentraut, and Álvaro Herrera.

Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com
2023-12-20 08:42:28 -05:00
Jeff Davis 766571be16 Additional write barrier in AdvanceXLInsertBuffer().
First, mark the xlblocks member with InvalidXLogRecPtr, then issue a
write barrier, then initialize it. That ensures that the xlblocks
member doesn't appear valid while the contents are being initialized.

In preparation for reading WAL buffer contents without a lock.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com
Reviewed-by: Andres Freund
2023-12-19 17:35:54 -08:00
Jeff Davis c3a8e2a7cb Use 64-bit atomics for xlblocks array elements.
In preparation for reading the contents of WAL buffers without a
lock. Also, avoids the previously-needed comment in GetXLogBuffer()
explaining why it's safe from torn reads.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com
Reviewed-by: Andres Freund
2023-12-19 17:35:42 -08:00
Michael Paquier 1301c80b21 Remove MSVC scripts
This commit removes all the scripts located in src/tools/msvc/ to build
PostgreSQL with Visual Studio on Windows, meson becoming the recommended
way to achieve that.  The scripts held some information that is still
relevant with meson, information kept and moved to better locations.
Comments that referred directly to the scripts are removed.

All the documentation still relevant that was in install-windows.sgml
has been moved to installation.sgml under a new subsection for Visual.
All the content specific to the scripts is removed.  Some adjustments
for the documentation are planned in a follow-up set of changes.

Author: Michael Paquier
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://postgr.es/m/ZQzp_VMJcerM1Cs_@paquier.xyz
2023-12-20 09:44:37 +09:00
Robert Haas 47f01d727e Fix brown paper bag bug in 5c47c6546c.
The previous logic failed to work for anything other than the first
segment of a relation.

Report by Jakub Wartak. Patch by me.

Discussion: http://postgr.es/m/CAKZiRmwd3KTNMQhm9Bv4oR_1uMehXroO6kGyJQkiw9DfM8cMwQ@mail.gmail.com
2023-12-19 15:00:23 -05:00
Tom Lane 7e1ce2b3de Prevent integer overflow when forming tuple width estimates.
It's at least theoretically possible to overflow int32 when adding up
column width estimates to make a row width estimate.  (The bug example
isn't terribly convincing as a real use-case, but perhaps wide joins
would provide a more plausible route to trouble.)  This'd lead to
assertion failures or silly planner behavior.  To forestall it, make
the relevant functions compute their running sums in int64 arithmetic
and then clamp to int32 range at the end.  We can reasonably assume
that MaxAllocSize is a hard limit on actual tuple width, so clamping
to that is simply a correction for dubious input values, and there's
no need to go as far as widening width variables to int64 everywhere.

Per bug #18247 from RekGRpth.  There've been no reports of this issue
arising in practical cases, so I feel no need to back-patch.

Richard Guo and Tom Lane

Discussion: https://postgr.es/m/18247-11ac477f02954422@postgresql.org
2023-12-19 11:12:16 -05:00
Heikki Linnakangas 3c080fb4fa Simplify newNode() by removing special cases
- Remove MemoryContextAllocZeroAligned(). It was supposed to be a
  faster version of MemoryContextAllocZero(), but modern compilers turn
  the MemSetLoop() into a call to memset() anyway, making it more or
  less identical to MemoryContextAllocZero(). That was the only user of
  MemSetTest, MemSetLoop, so remove those too, as well as palloc0fast().

- Convert newNode() to a static inline function. When this was
  originally originally written, it was written as a macro because
  testing showed that gcc didn't inline the size check as we
  intended. Modern compiler versions do, and now that it just calls
  palloc0() there is no size-check to inline anyway.

One nice effect is that the palloc0() takes one less argument than
MemoryContextAllocZeroAligned(), which saves a few instructions in the
callers of newNode().

Reviewed-by: Peter Eisentraut, Tom Lane, John Naylor, Thomas Munro
Discussion: https://www.postgresql.org/message-id/b51f1fa7-7e6a-4ecc-936d-90a8a1659e7c@iki.fi
2023-12-19 12:11:47 +02:00
Amit Kapila c8bc807cf8 pgoutput: Raise an error for missing protocol version parameter.
Currently, we give a misleading error if the user omits to pass the
required parameter 'proto_version' in SQL API
pg_logical_slot_get_changes() or during START_REPLICATION protocol
message. The error raised is as follows which indicates that the wrong
version is passed.
ERROR:  client sent proto_version=0 but server only supports protocol 1 or higher

Author: Emre Hasegeli
Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/CAE2gYzwdwtUbs-tPSV-QBwgTubiyGD2ZGsSnAVsDfAGGLDrGOA@mail.gmail.com
2023-12-19 09:53:33 +05:30
Tom Lane 8b965c549d compute_bitmap_pages' loop_count parameter should be double not int.
The value was double in the original implementation of this logic.
Commit da08a6598 pulled it out into a subroutine, but carelessly
declared the parameter as int when it should have been double.
On most platforms, the only ill effect would be to clamp the value
to be not more than INT_MAX, which would seldom be exceeded and
probably wouldn't change the estimates too much anyway.  Nonetheless,
it's wrong and can cause complaints from ubsan.

While here, improve the comments and parameter names.

This is an ABI change in a globally exposed subroutine, so
back-patching would create some risk of breaking extensions.
The value of the fix doesn't seem high enough to warrant taking
that risk, so fix in HEAD only.

Per report from Alexander Lakhin.

Discussion: https://postgr.es/m/f5e15fe1-202d-1936-f47c-f0c69a936b72@gmail.com
2023-12-18 12:46:10 -05:00
Nathan Bossart 0d1adae6f7 Micro-optimize datum_to_json_internal() some more.
Commit dc3f9bc549 mainly targeted the JSONTYPE_NUMERIC code path.
This commit applies similar optimizations (e.g., removing
unnecessary runtime calls to strlen() and palloc()) to nearby code.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20231208203708.GA4126315%40nathanxps13
2023-12-18 10:34:33 -06:00
Thomas Munro 4908c58720 Provide vectored variants of smgrread() and smgrwrite().
smgrreadv() and smgrwritev() and their md.c implementations call
FileReadV() and FileWriteV().  A range of disk blocks beginning at
'blocknum' and extending for 'nblocks' can be scattered to or gathered
from multiple buffers with a single system call.  The traditional
smgrread() and smgrwrite() functions are implemented in terms of the new
functions.

Later commits will introduce calls with nblocks > 1, but the following
behavioral changes can be seen already:

* After a short transfer we'll now retry until we eventually read 0
  bytes (= EOF) or get ENOSPC, EDQUOT, EFBIG etc, where previously we
  would infer the reason.  Retrying is consistent with xlog.c's
  treatment of large WAL writes, and arguably also xlog.c and fd.c's
  treatment of EINTR.  Arbitrary short returns for larger transfers have
  been observed on several OSes, and might in theory also happen for
  transient reasons with our own pg_p*v() fallback code.

* After unexpected EOF or -1, the error thrown now talks about
  a range even for the single block case, eg "blocks 42..42".

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com
2023-12-18 15:01:50 +13:00
Michael Paquier 3c9d9acae0 Refactor pgstat_prepare_io_time() with an input argument instead of a GUC
Originally, this routine relied on track_io_timing to check if a time
interval for an I/O operation stored in pg_stat_io should be initialized
or not.  However, the addition of WAL statistics to pg_stat_io requires
that the initialization happens when track_wal_io_timing is enabled,
which is dependent on the code path where the I/O operation happens.

Author: Nazir Bilal Yavuz
Discussion: https://postgr.es/m/CAN55FZ3AiQ+ZMxUuXnBpd0Rrh1YhwJ5FudkHg=JU0P+-W8T4Vg@mail.gmail.com
2023-12-16 20:16:20 +01:00
Alvaro Herrera a6be0600ac
Remove useless LIMIT_OPTION_DEFAULT value from LimitOption
During the development that led to commit 357889eb17, for a time we
had the value LIMIT_OPTION_DEFAULT, which was mostly but not completely
removed later on, before commit.  Complete the removal now.

Author: Zhang Mingli <avamingli@gmail.com>
Discussion: https://postgr.es/m/59d61a1a-3858-475a-964f-24468c97cc67@Spark
2023-12-16 18:20:03 +01:00
Thomas Munro b485ad7f07 Provide multi-block smgrprefetch().
Previously smgrprefetch() could issue POSIX_FADV_WILLNEED advice for a
single block at a time.  Add an nblocks argument so that we can do the
same for a range of blocks.  This usually produces a single system call,
but might need to loop if it crosses a segment boundary.  Initially it
is only called with nblocks == 1, but proposed patches will make wider
calls.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> (earlier version)
Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com
2023-12-16 17:51:21 +13:00
Tom Lane 59bd34c2fa Fix bugs in manipulation of large objects.
In v16 and up (since commit afbfc0298), large object ownership
checking has been broken because object_ownercheck() didn't take care
of the discrepancy between our object-address representation of large
objects (classId == LargeObjectRelationId) and the catalog where their
ownership info is actually stored (LargeObjectMetadataRelationId).
This resulted in failures such as "unrecognized class ID: 2613"
when trying to update blob properties as a non-superuser.

Poking around for related bugs, I found that AlterObjectOwner_internal
would pass the wrong classId to the PostAlterHook in the no-op code
path where the large object already has the desired owner.  Also,
recordExtObjInitPriv checked for the wrong classId; that bug is only
latent because the stanza is dead code anyway, but as long as we're
carrying it around it should be less wrong.  These bugs are quite old.

In HEAD, we can reduce the scope for future bugs of this ilk by
changing AlterObjectOwner_internal's API to let the translation happen
inside that function, rather than requiring callers to know about it.

A more bulletproof fix, perhaps, would be to start using
LargeObjectMetadataRelationId as the dependency and object-address
classId for blobs.  However that has substantial risk of breaking
third-party code; even within our own code, it'd create hassles
for pg_dump which would have to cope with a version-dependent
representation.  For now, keep the status quo.

Discussion: https://postgr.es/m/2650449.1702497209@sss.pgh.pa.us
2023-12-15 13:55:05 -05:00
Michael Paquier 8a7cbfce13 Prevent tuples to be marked as dead in subtransactions on standbys
Dead tuples are ignored and are not marked as dead during recovery, as
it can lead to MVCC issues on a standby because its xmin may not match
with the primary.  This information is tracked by a field called
"xactStartedInRecovery" in the transaction state data, switched on when
starting a transaction in recovery.

Unfortunately, this information was not correctly tracked when starting
a subtransaction, because the transaction state used for the
subtransaction did not update "xactStartedInRecovery" based on the state
of its parent.  This would cause index scans done in subtransactions to
return inconsistent data, depending on how the xmin of the primary
and/or the standby evolved.

This is broken since the introduction of hot standby in efc16ea520, so
backpatch all the way down.

Author: Fei Changhong
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/tencent_C4D907A5093C071A029712E73B43C6512706@qq.com
Backpatch-through: 12
2023-12-12 17:05:18 +01:00
Daniel Gustafsson c5962ad21a Fix typo in comment
Commit 98e675ed7a accidentally mistyped IDENTIFY_SYSTEM as
IDENTIFY_SERVER. Backpatch to all supported branches.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/68138521-5345-8780-4390-1474afdcba1f@gmail.com
2023-12-12 12:16:38 +01:00
Thomas Munro 871fe4917e Provide vectored variants of FileRead() and FileWrite().
FileReadV() and FileWriteV() adapt pg_preadv() and pg_pwritev() for
fd.c's virtual file descriptors.  The simple FileRead() and FileWrite()
functions are now implemented in terms of the vectored functions, to
avoid code duplication, and they are converted back to the corresponding
simple system calls further down (commit 15c9ac36).  Later work will
make more interesting multi-iovec calls.

The traditional behavior of reporting a "fake" ENOSPC error is
simplified.  It's now always set for non-failing writes, for the benefit
of callers that expect to log a meaningful "%m" if they determine that
the write was short.  (Perhaps we should consider getting rid of that
expectation one day.)

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com
2023-12-12 13:12:43 +13:00
Tom Lane 0a5c46a7a4 Be more wary about OpenSSL not setting errno on error.
OpenSSL will sometimes return SSL_ERROR_SYSCALL without having set
errno; this is apparently a reflection of recv(2)'s habit of not
setting errno when reporting EOF.  Ensure that we treat such cases
the same as read EOF.  Previously, we'd frequently report them like
"could not accept SSL connection: Success" which is confusing, or
worse report them with an unrelated errno left over from some
previous syscall.

To fix, ensure that errno is zeroed immediately before the call,
and report its value only when it's not zero afterwards; otherwise
report EOF.

For consistency, I've applied the same coding pattern in libpq's
pqsecure_raw_read().  Bare recv(2) shouldn't really return -1 without
setting errno, but in case it does we might as well cope.

Per report from Andres Freund.  Back-patch to all supported versions.

Discussion: https://postgr.es/m/20231208181451.deqnflwxqoehhxpe@awork3.anarazel.de
2023-12-11 11:51:56 -05:00
Alvaro Herrera d3fe6e90ba
Simplify productions for FORMAT JSON [ ENCODING name ]
This removes the production json_encoding_clause_opt, instead merging
it into json_format_clause.  Also remove the auxiliary
makeJsonEncoding() function.

Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/202312071841.u2gueb5dsrbk%40alvherre.pgsql
2023-12-11 11:55:34 +01:00
Michael Paquier c7a3e6b46d Remove trace_recovery_messages
This GUC was intended as a debugging help in the 9.0 area when hot
standby and streaming replication were being developped, able to offer
more information at LOG level rather than DEBUGn.  There are more tools
available these days that are able to offer rather equivalent
information, like pg_waldump introduced in 9.3.  It is not obvious how
this facility is useful these days, so let's remove it.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/ZXEXEAUVFrvpquSd@paquier.xyz
2023-12-11 11:49:02 +01:00
Amit Kapila 8d7d2197f3 Fix an undetected deadlock due to apply worker.
The apply worker needs to update the state of the subscription tables to
'READY' during the synchronization phase which requires locking the
corresponding subscription. The apply worker also waits for the
subscription tables to reach the 'SYNCDONE' state after holding the locks
on the subscription and the wait is done using WaitLatch. The 'SYNCDONE'
state is changed by tablesync workers again by locking the corresponding
subscription. Both the state updates use AccessShareLock mode to lock the
subscription, so they can't block each other. However, a backend can
simultaneously try to acquire a lock on the same subscription using
AccessExclusiveLock mode to alter the subscription. Now, the backend's
wait on a lock can sneak in between the apply worker and table sync worker
causing deadlock.

In other words, apply_worker waits for tablesync worker which waits for
backend, and backend waits for apply worker. This is not detected by the
deadlock detector because apply worker uses WaitLatch.

The fix is to release existing locks in apply worker before it starts to
wait for tablesync worker to change the state.

Reported-by: Tomas Vondra
Author: Shlok Kyal
Reviewed-by: Amit Kapila, Peter Smith
Backpatch-through: 12
Discussion: https://postgr.es/m/d291bb50-12c4-e8af-2af2-7bb9bb4d8e3e@enterprisedb.com
2023-12-11 08:50:43 +05:30
Peter Geoghegan aa210e0c12 Fix nbtree backward scan race condition comments.
Remove comments that supposed that holding a pin was a useful interlock
for _bt_walk_left().  There are times when _bt_walk_left() doesn't hold
either a lock or a pin on any page, so clearly this can't be true.
_bt_walk_left() is even prepared to deal with concurrent deletion of
both the original page and any pages to its left.

Oversight in commit 2ed5b87f96.
2023-12-08 15:37:53 -08:00
Nathan Bossart dc3f9bc549 Micro-optimize JSONTYPE_NUMERIC code path in json.c.
This commit does the following:

* In datum_to_json_internal(), the call to IsValidJsonNumber() is
  replaced with simplified validation code.  This avoids an extra
  call to strlen() in this path, and it avoids validating the
  entire string (which is okay since we know we're dealing with a
  numeric data type's output).

* In datum_to_json_internal(), the call to escape_json() in the
  JSONTYPE_NUMERIC path is replaced with code that just surrounds
  the string with quotes.  In passing, some other nearby calls to
  appendStringInfo() have been replaced with similar code to avoid
  unnecessary calls to vsnprintf().

* In composite_to_json(), the length of the separator is now
  determined at compile time to avoid unnecessary calls to
  strlen().

On my machine, this speeds up a benchmark for the proposed COPY TO
(FORMAT json) command with many integers by upwards of 20%.  There
are likely other code paths that could be given a similar
treatment, but that is left as a future exercise.

Reviewed-by: Jeff Davis, Tom Lane, David Rowley, John Naylor
Discussion: https://postgr.es/m/20231207231251.GB3359478%40nathanxps13
2023-12-08 13:39:08 -06:00
Jeff Davis 867dd2dc87 Cache opaque handle for GUC option to avoid repeasted lookups.
When setting GUCs from proconfig, performance is important, and hash
lookups in the GUC table are significant.

Per suggestion from Robert Haas.

Discussion: https://postgr.es/m/CA+TgmoYpKxhR3HOD9syK2XwcAUVPa0+ba0XPnwWBcYxtKLkyxA@mail.gmail.com
Reviewed-by: John Naylor
2023-12-08 11:16:01 -08:00
Peter Geoghegan c9c0589fda Optimize nbtree backward scan boundary cases.
Teach _bt_binsrch (and related helper routines like _bt_search and
_bt_compare) about the initial positioning requirements of backward
scans.  Routines like _bt_binsrch already know all about "nextkey"
searches, so it seems natural to teach them about "goback"/backward
searches, too.  These concepts are closely related, and are much easier
to understand when discussed together.

Now that certain implementation details are hidden from _bt_first, it's
straightforward to add a new optimization: backward scans using the <
strategy now avoid extra leaf page accesses in certain "boundary cases".
Consider the following example, which uses the tenk1 table (and its
tenk1_hundred index) from the standard regression tests:

SELECT * FROM tenk1 WHERE hundred < 12 ORDER BY hundred DESC LIMIT 1;

Before this commit, nbtree would scan two leaf pages, even though it was
only really necessary to scan one leaf page.  We'll now descend straight
to the leaf page containing a (12, -inf) high key instead.  The scan
will locate matching non-pivot tuples with "hundred" values starting
from the value 11.  The scan won't waste a page access on the right
sibling leaf page, which cannot possibly contain any matching tuples.

You can think of the optimization added by this commit as disabling an
optimization (the _bt_compare "!pivotsearch" behavior that was added to
Postgres 12 in commit dd299df8) for a small subset of cases where it was
always counterproductive.

Equivalently, you can think of the new optimization as extending the
"pivotsearch" behavior that page deletion by VACUUM has long required
(since the aforementioned Postgres 12 commit went in) to other, similar
cases.  Obviously, this isn't strictly necessary for these new cases
(unlike VACUUM, _bt_first is prepared to move the scan to the left once
on the leaf level), but the underlying principle is the same.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=XPzM8HzaLPq278Vms420mVSHfgs9wi5tjFKHcapZCEw@mail.gmail.com
2023-12-08 11:05:17 -08:00
Tomas Vondra b437571714 Allow parallel CREATE INDEX for BRIN indexes
Allow using multiple worker processes to build BRIN index, which until
now was supported only for BTREE indexes. For large tables this often
results in significant speedup when the build is CPU-bound.

The work is split in a simple way - each worker builds BRIN summaries on
a subset of the table, determined by the regular parallel scan used to
read the data, and feeds them into a shared tuplesort which sorts them
by blkno (start of the range). The leader then reads this sorted stream
of ranges, merges duplicates (which may happen if the parallel scan does
not align with BRIN pages_per_range), and adds the resulting ranges into
the index.

The number of duplicate results produced by workers (requiring merging
in the leader process) should be fairly small, thanks to how parallel
scans assign chunks to workers. The likelihood of duplicate results may
increase for higher pages_per_range values, but then there are fewer
page ranges in total. In any case, we expect the merging to be much
cheaper than summarization, so this should be a win.

Most of the parallelism infrastructure is a simplified copy of the code
used by BTREE indexes, omitting the parts irrelevant for BRIN indexes
(e.g. uniqueness checks).

This also introduces a new index AM flag amcanbuildparallel, determining
whether to attempt to start parallel workers for the index build.

Original patch by me, with reviews and substantial reworks by Matthias
van de Meent, certainly enough to make him a co-author.

Author: Tomas Vondra, Matthias van de Meent
Reviewed-by: Matthias van de Meent
Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com
2023-12-08 18:15:26 +01:00
Tomas Vondra dae761a87e Add empty BRIN ranges during CREATE INDEX
When building BRIN indexes, the brinbuildCallback only advances to the
next page range when seeing a tuple that doesn't belong to the current
one. This means that the index may end up missing ranges at the end of
the table, if those pages do not contain any indexable tuples.

We tend not to have completely empty pages at the end of a relation, but
this also applies to partial indexes, where the tuples may simply not
match the index predicate. This results in inefficient scans using the
affected BRIN index - without the summaries, the page ranges have to be
read and processed, which consumes I/O and possibly also CPU time.

The existing code already added empty ranges for earlier parts of the
table, this commit makes sure we add them for the ranges at the end of
the table too.

Patch by Matthias van de Meent, with review/improvements by me.

Author: Matthias van de Meent
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CAEze2WiMsPZg%3DxkvSF_jt4%3D69k6K7gz5B8V2wY3gCGZ%2B1BzCbQ%40mail.gmail.com
2023-12-08 17:14:32 +01:00
Heikki Linnakangas 44913add91 Remove some unnecessary #includes of postmaster/interrupt.h
Commit 44fc6e259b removed a bunch of references to
SignalHandlerForCrashExit, leaving these #includes unneeded.
2023-12-08 13:19:37 +02:00
Heikki Linnakangas b31ba5310b Rename ShmemVariableCache to TransamVariables
The old name was misleading: It's not a cache, the values kept in the
struct are the authoritative source.

Reviewed-by: Tristan Partin, Richard Guo
Discussion: https://www.postgresql.org/message-id/6537d63d-4bb5-46f8-9b5d-73a8ba4720ab@iki.fi
2023-12-08 09:47:15 +02:00
Heikki Linnakangas 15916ffb04 Initialize ShmemVariableCache like other shmem areas
For sake of consistency.

Reviewed-by: Tristan Partin, Richard Guo
Discussion: https://www.postgresql.org/message-id/6537d63d-4bb5-46f8-9b5d-73a8ba4720ab@iki.fi
2023-12-08 09:46:59 +02:00
Heikki Linnakangas 049ef3398d Don't try to open visibilitymap when analyzing a foreign table
It's harmless, visibilitymap_count() returns 0 if the file doesn't
exist. But it's also very pointless. I noticed this when I added an
assertion in smgropen() that the relnumber is valid.

Discussion: https://www.postgresql.org/message-id/621a52fd-3cd8-4f5d-a561-d510b853bbaf@iki.fi
2023-12-08 09:16:21 +02:00
Thomas Munro cd7f19da34 Fix potential pointer overflow in xlogreader.c.
While checking if a record could fit in the circular WAL decoding
buffer, the coding from commit 3f1ce973 used arithmetic that could
overflow.  64 bit systems were unaffected for various technical reasons,
which probably explains the lack of problem reports.  Likewise for 32
bit systems running known 32 bit kernels.  The systems at risk of
problems appear to be 32 bit processes running on 64 bit kernels, with
unlucky placement in memory.

Per complaint from GCC -fsanitize=undefined -m32, while testing
variations of 039_end_of_wal.pl.

Back-patch to 15.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKH0oRPOX7DhiQ_b51sM8HqcPp2J3WA-Oen%3DdXog%2BAGGQ%40mail.gmail.com
2023-12-08 16:09:03 +13:00
David Rowley d16a0c1e2e Verify that attribute counts match in ExecCopySlot
tts_virtual_copyslot() contained an Assert that checked that the srcslot
contained <= attributes than the dstslot.  This seems to be backwards as
if the srcslot contained fewer attributes then the dstslot could be left
with stale Datum values from the previously stored tuple.  It might make
more sense to allow the source to contain additional attributes and only
copy the leading ones that also exist in the destination, however, that's
not what we're doing here.

Here we just remove the Assert from tts_virtual_copyslot() and add an
Assert to ExecCopySlot() to verify the attribute counts match.  There
does not seem to be any places where the destination contains fewer
attributes, so instead of going to the trouble of making the code
properly handle this, let's just ensure the attribute counts match.  If
this Assert fails then that will indicate that we do have cases that
require us to handle the srcslot with more attributes than the dstslot.
It seems better to only write this code if there's a genuine requirement
for it rather than write it now only to leave it untested.

Thanks to Andres Freund for helping with the analysis of this.

Discussion: https://postgr.es/m/CAApHDvpMAvBL0T+TRORquyx1iqFQKMVTXtqNocOw0Pa2uh1heg@mail.gmail.com
2023-12-07 21:28:24 +13:00
Michael Paquier d43bd090a8 Improve some error messages with invalid indexes for REINDEX CONCURRENTLY
An invalid index is skipped when doing REINDEX CONCURRENTLY at table
level, with INDEX_CORRUPTED used as errcode.  This is confusing,
because an invalid index could exist after an interruption.  The errcode
is switched to OBJECT_NOT_IN_PREREQUISITE_STATE instead, as per a
suggestion from Andres Freund.

While on it, the error messages are reworded, and a hint is added,
telling how to rebuild an invalid index in this case.  This has been
suggested by Noah Misch.

Discussion: https://postgr.es/m/20231118230958.4fm3fhk4ypshxopa@awork3.anarazel.de
2023-12-07 14:27:54 +09:00
Amit Kapila 0bf62460bb Fix issues in binary_upgrade_logical_slot_has_caught_up().
The commit 29d0a77fa6 labelled binary_upgrade_logical_slot_has_caught_up()
as a non-strict function to allow providing a better error message to callers
in case the passed slot_name is NULL. On further discussion, it seems that
it is not helpful to have a different error message for NULL input in this
function, so this patch marks the function as strict.

This patch also removes the explicit permission check to use replication
slots as this function is invoked only by superusers and instead adds an
Assert.

Reported-by: Masahiko Sawada
Author: Hayato Kuroda
Reviewed-by: Vignesh C
Discussion: https://postgr.es/m/CAD21AoDSyiBKkMXBxN_gUayZZUCOgyHnG8Ge8rcPXNP3Tf6B4g@mail.gmail.com
2023-12-07 08:42:48 +05:30
Michael Paquier c426f7c2b3 Fix assertion failure with REINDEX and event triggers
A REINDEX CONCURRENTLY run on a table with no indexes would always pop
the topmost snapshot from the active snapshot stack, making the snapshot
handling inconsistent between the multiple-relation and single-relation
cases.  This commit slightly changes the snapshot stack handling so as a
snapshot is popped only ReindexMultipleInternal() in this case after a
relation has been reindexed, fixing a problem where an event trigger
function may need a snapshot but does not have one.  This also keeps the
places where PopActiveSnapshot() is called closer to each other.

While on it, this expands the existing tests to cover all the cases that
could be faced with REINDEX commands and such event triggers, for one or
more relations, with or without indexes.

This behavior is inconsistent since 5dc92b844e, but we've never had a
need for an active snapshot at the end of a REINDEX until now.

Thanks also to Jian He for the input.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/cb538743-484c-eb6a-a8c5-359980cd3a17@gmail.com
2023-12-07 08:31:02 +09:00
Michael Paquier 7636725b92 Fix compilation on Windows with WAL_DEBUG
This has been broken since b060dbe000 that has reworked the callback
mechanism of XLogReader, most likely unnoticed because any form of
development involving WAL happens on platforms where this compiles fine.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVF14WKQMFwcJ=3okVDhiXpuK5f7YdT+BdYXbbypMHqWA@mail.gmail.com
Backpatch-through: 13
2023-12-06 14:10:39 +09:00
Daniel Gustafsson 4d0cf0b05d Fix indentation
When preparing commit 98e675ed7a I accidentally forgot to run
pgindent, which did produce a diff. Fix by adding the required
whitespace per the koel buildfarm failure.
2023-12-05 15:54:59 +01:00
Daniel Gustafsson 98e675ed7a Fix incorrect error message for IDENTIFY_SYSTEM
Commit 5a991ef869 accidentally reversed the order of the tuples
and fields parameters, making the error message incorrectly refer
to 3 tuples with 1 field when IDENTIFY_SYSTEM returns 1 tuple and
3 or 4 fields. Fix by changing the order of the parameters.  This
also adds a comment describing why we check for < 3 when postgres
since 9.4 has been sending 4 fields.

Backpatch all the way since the bug is almost a decade old.

Author: Tomonari Katsumata <t.katsumata1122@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: #18224
Backpatch-through: v12
2023-12-05 14:30:56 +01:00
Jeff Davis a86c61c9ee Optimize SearchPathCache by saving the last entry.
Repeated lookups are common, so it's worth it to check the last entry
before doing another hash lookup.

Discussion: https://postgr.es/m/04c8592dbd694e4114a3ed87139a7a04e4363030.camel%40j-davis.com
2023-12-04 17:19:16 -08:00
Nathan Bossart b14b1eb4da Teach convert() and friends to avoid copying when possible.
Presently, pg_convert() allocates a new bytea and copies the result
regardless of whether any conversion actually happened.  This
commit adjusts this function to return the source pointer as-is if
no conversion occurred.  This optimization isn't expected to make a
tremendous difference, but it still seems worthwhile to avoid
unnecessary memory allocations.

Author: Yurii Rashkovskii
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CA%2BRLCQyknBPSWXRBQGOi6aYEcdQ9RpH9Kch4GjoeY8dQ3D%2Bvhw%40mail.gmail.com
2023-12-04 11:55:18 -06:00
Heikki Linnakangas e7c6efe305 Remove now-unnecessary Autovacuum[Launcher|Worker]IAm functions
After commit fd5e8b440d, InitProcess() is called later in the
EXEC_BACKEND startup sequence, so it's enough to set the
am_autovacuum_[launcher|worker] variables at the same place as in the
!EXEC_BACKEND case.
2023-12-04 15:34:37 +02:00
Michael Paquier f21848de20 Add support for REINDEX in event triggers
This commit adds support for REINDEX in event triggers, making this
command react for the events ddl_command_start and ddl_command_end.  The
indexes rebuilt are collected with the ReindexStmt emitted by the
caller, for the concurrent and non-concurrent paths.

Thanks to that, it is possible to know a full list of the indexes that a
single REINDEX command has worked on.

Author: Garrett Thornburg, Jian He
Reviewed-by: Jim Jones, Michael Paquier
Discussion: https://postgr.es/m/CAEEqfk5bm32G7sbhzHbES9WejD8O8DCEOaLkxoBP7HNWxjPpvg@mail.gmail.com
2023-12-04 09:53:49 +09:00
Heikki Linnakangas fd5e8b440d Refactor how InitProcess is called
The order of process initialization steps is now more consistent
between !EXEC_BACKEND and EXEC_BACKEND modes. InitProcess() is called
at the same place in either mode. We can now also move the
AttachSharedMemoryStructs() call into InitProcess() itself. This
reduces the number of "#ifdef EXEC_BACKEND" blocks.

Reviewed-by: Tristan Partin, Andres Freund, Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi
2023-12-03 16:39:18 +02:00
Heikki Linnakangas 388491f1e5 Pass BackgroundWorker entry in the parameter file in EXEC_BACKEND mode
The BackgroundWorker struct is now passed the same way as the Port
struct. Seems more consistent.

This makes it possible to move InitProcess later in SubPostmasterMain
(in next commit), as we no longer need to access shared memory to read
background worker entry.

Reviewed-by: Tristan Partin, Andres Freund, Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi
2023-12-03 16:38:54 +02:00
Heikki Linnakangas 69d903367c Refactor CreateSharedMemoryAndSemaphores
For clarity, have separate functions for *creating* the shared memory
and semaphores at postmaster or single-user backend startup, and
for *attaching* to existing shared memory structures in EXEC_BACKEND
case. CreateSharedMemoryAndSemaphores() is now called only at
postmaster startup, and a new AttachSharedMemoryStructs() function is
called at backend startup in EXEC_BACKEND mode.

Reviewed-by: Tristan Partin, Andres Freund
Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi
2023-12-03 16:09:42 +02:00
Heikki Linnakangas b19890d49d Silence Valgrind complaint with EXEC_BACKEND
The padding bytes written to the backend params file were
uninitialized. That's harmless because we don't access the padding
bytes when we read the file back in, but Valgrind doesn't know
that. In any case, clear the padding bytes to make Valgrind happy.

Reported-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/014768ed-8b39-c44f-b07c-098c87b1644c@gmail.com
2023-12-01 22:30:08 +02:00
Peter Eisentraut 0d93ce31a5 pgindent fix
for commit a11c9c42ea
2023-12-01 17:58:18 +01:00
Peter Eisentraut a11c9c42ea Check collation when creating partitioned index
When creating a partitioned index, the partition key must be a subset
of the index's columns.  But this currently doesn't check that the
collations between the partition key and the index definition match.
So you can construct a unique index that fails to enforce uniqueness.
(This would most likely involve a nondeterministic collation, so it
would have to be crafted explicitly and is not something that would
just happen by accident.)

This patch adds the required collation check.  As a result, any
previously allowed unique index that has a collation mismatch would no
longer be allowed to be created.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/3327cb54-f7f1-413b-8fdb-7a9dceebb938%40eisentraut.org
2023-12-01 16:05:41 +01:00
Amit Kapila f66fcc5cd6 Fix an uninitialized access in hash_xlog_squeeze_page().
Commit 861f86beea changed hash_xlog_squeeze_page() to start reading
the write buffer conditionally but forgot to initialize it leading to an
uninitialized access.

Reported-by: Alexander Lakhin
Author: Hayato Kuroda
Reviewed-by: Alexander Lakhin, Amit Kapila
Discussion: http://postgr.es/m/62ed1a9f-746a-8e86-904b-51b9b806a1d9@gmail.com
2023-12-01 10:22:13 +05:30
Thomas Munro 3b51265ee3 Adjust obsolete comment explaining set_stack_base().
Commit 7389aad6 removed the notion of backends started from inside a
signal handler.  A stray comment still referred to them, while
explaining the need for a call to set_stack_base().  That leads to the
question of whether we still need the call in !EXEC_BACKEND builds.
There doesn't seem to be much point in suppressing it now, as it doesn't
hurt and probably helps to measure the stack base from the exact same
place in EXEC_BACKEND and !EXEC_BACKEND builds.

Back-patch to 16.

Reported-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reported-by: Tristan Partin <tristan@neon.tech>
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2BEJHcevNGNOxVWxTNFbuB%3Dvjf4U68%2B85rAC_Sxvy2zEQ%40mail.gmail.com
2023-12-01 15:18:51 +13:00
Heikki Linnakangas f93133a250 Print lwlock stats also for aux processes, when built with LWLOCK_STATS
InitAuxiliaryProcess() closely resembles InitProcess(), but it didn't
call InitLWLockAccess(). But because InitLWLockAccess() is a no-op
unless compiled with LWLOCK_STATS, and everything works even if it's
not called, the only consequence was that the stats were not printed
for aux processes.

This was an oversight in commit 1c6821be31, in version 9.5, so it is
missing in all supported branches. But since it only affects
developers using LWLOCK_STATS and no one has complained, no
backpatching.

Discussion: https://www.postgresql.org/message-id/20231130202648.7k6agmuizdilufnv@awork3.anarazel.de
2023-12-01 01:00:03 +02:00
Alexander Korotkov ae2ccf66a2 Fix typo in 5a1dfde833
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/55d8800f-4a80-5256-1e84-246fbe79acd0@gmail.com
2023-11-30 13:46:23 +02:00
Alexander Korotkov b589f211e0 Fix warning due non-standard inline declaration in 4ed8f0913b
Reported-by: Alexander Lakhin, Tom Lane
Author: Pavel Borisov
Discussion: https://postgr.es/m/55d8800f-4a80-5256-1e84-246fbe79acd0@gmail.com
2023-11-30 11:34:45 +02:00
John Naylor 095d109ccd Remove redundant setting of hashkey after insertion
It's not necessary to fill the key field in most cases, since
hash_search has already done that. Some existing call sites have an
assert or comment that this contract has been fulfilled, but those
are quite old and that practice seems unnecessary here.

While at it, remove a nearby redundant assignment that a smart compiler
will elide anyway.

Zhao Junwang, with some adjustments by me

Reviewed by Nathan Bossart, with additional feedback from Tom Lane

Discussion: http://postgr.es/m/CAEG8a3%2BUPF%3DR2QGPgJMF2mKh8xPd1H2TmfH77zPuVUFdBpiGUA%40mail.gmail.com
2023-11-30 15:25:57 +07:00
Michael Paquier 8d9978a717 Apply quotes more consistently to GUC names in logs
Quotes are applied to GUCs in a very inconsistent way across the code
base, with a mix of double quotes or no quotes used.  This commit
removes double quotes around all the GUC names that are obviously
referred to as parameters with non-English words (use of underscore,
mixed case, etc).

This is the result of a discussion with Álvaro Herrera, Nathan Bossart,
Laurenz Albe, Peter Eisentraut, Tom Lane and Daniel Gustafsson.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2023-11-30 14:11:45 +09:00
Peter Eisentraut 7e5f517799 Improve "user mapping not found" error message
Display the name of the foreign server for which the user mapping was
not found.

Author: Ian Lawrence Barwick <barwick@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/CAB8KJ=jFzNaeyFtLcTZNOc6fd1+F93pGVLFa-wyt31wn7VNxqQ@mail.gmail.com
2023-11-30 05:34:28 +01:00
Alexander Korotkov 5a1dfde833 Make use FullTransactionId in 2PC filenames
Switch from using TransactionId to FullTransactionId in naming of 2PC files.
Transaction state file in the pg_twophase directory now have extra 8 bytes in
the name to address an epoch of a given xid.

Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev
Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov
Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov
Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund
Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev
Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com
2023-11-29 01:43:36 +02:00
Alexander Korotkov 2cdf131c46 Use larger segment file names for pg_notify
This avoids the wraparound in async.c and removes the corresponding code
complexity. The maximum amount of allocated SLRU pages for NOTIFY / LISTEN
queue is now determined by the max_notify_queue_pages GUC. The default
value is 1048576. It allows to consume up to 8 GB of disk space which is
exactly the limit we had previously.

Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev
Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov
Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov
Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund
Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev
Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com
2023-11-29 01:41:48 +02:00
Alexander Korotkov 4ed8f0913b Index SLRUs by 64-bit integers rather than by 32-bit integers
We've had repeated bugs in the area of handling SLRU wraparound in the past,
some of which have caused data loss. Switching to an indexing system for SLRUs
that does not wrap around should allow us to get rid of a whole bunch
of problems and improve the overall reliability of the system.

This particular patch however only changes the indexing and doesn't address
the wraparound per se. This is going to be done in the following patches.

Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev
Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov
Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov
Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund
Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev
Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com
2023-11-29 01:40:56 +02:00
Tom Lane a916b47e23 Clean up usage of bison precedence for non-operator keywords.
Assigning a precedence to a keyword that isn't a kind of expression
operator is rather dangerous, because it might mask grammar
ambiguities that we'd rather know about.  It's much safer to attach
explicit precedences to individual rules, which will affect the
behavior of only that one rule.  Moreover, when we do have to give
a precedence to a non-operator keyword, we should try to give it the
same precedence as IDENT, thereby reducing the risk of surprising
side-effects.

Apply this hard-won knowledge to SET (which I misassigned ages ago
in commit 2647ad658) and some SQL/JSON-related productions
(from commits 6ee30209a, 71bfd1543).

Patch HEAD only, since there's no evidence of actual bugs here.

Discussion: https://postgr.es/m/CADT4RqBPdbsZW7HS1jJP319TMRHs1hzUiP=iRJYR6UqgHCrgNQ@mail.gmail.com
2023-11-28 13:32:15 -05:00
Tom Lane c82207a548 Use BIO_{get,set}_app_data instead of BIO_{get,set}_data.
We should have done it this way all along, but we accidentally got
away with using the wrong BIO field up until OpenSSL 3.2.  There,
the library's BIO routines that we rely on use the "data" field
for their own purposes, and our conflicting use causes assorted
weird behaviors up to and including core dumps when SSL connections
are attempted.  Switch to using the approved field for the purpose,
i.e. app_data.

While at it, remove our configure probes for BIO_get_data as well
as the fallback implementation.  BIO_{get,set}_app_data have been
there since long before any OpenSSL version that we still support,
even in the back branches.

Also, update src/test/ssl/t/001_ssltests.pl to allow for a minor
change in an error message spelling that evidently came in with 3.2.

Tristan Partin and Bo Andreson.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAN55FZ1eDDYsYaL7mv+oSLUij2h_u6hvD4Qmv-7PK7jkji0uyQ@mail.gmail.com
2023-11-28 12:34:03 -05:00
Heikki Linnakangas 10a59925a3 Fix comment about ressortgrouprefs being unique in setop plans.
Author: Richard Guo, Tom Lane
Discussion: https://www.postgresql.org/message-id/CAMbWs49rAfFS-yd7=QxtDUrZDFfRBGy4rGBJNyGDH7=CLipFPg@mail.gmail.com
2023-11-28 14:15:14 +02:00
Heikki Linnakangas 60f227316c Fix assertions with RI triggers in heap_update and heap_delete.
If the tuple being updated is not visible to the crosscheck snapshot,
we return TM_Updated but the assertions would not hold in that case.
Move them to before the cross-check.

Fixes bug #17893. Backpatch to all supported versions.

Author: Alexander Lakhin
Backpatch-through: 12
Discussion: https://www.postgresql.org/message-id/17893-35847009eec517b5%40postgresql.org
2023-11-28 12:00:14 +02:00
David Rowley 930d2b442f Don't use bms_membership() in cases where we don't need to
00b41463c adjusted Bitmapset so that an empty set is always represented
as NULL.  This makes checking for empty sets far cheaper than it used
to be.

There were various places in the code where we'd call bms_membership()
to handle the 3 possible BMS_Membership values.  For the BMS_SINGLETON
case, we'd also call bms_singleton_member() to find the single set member.
This can now be done in a more optimal way by first checking if the set is
NULL and then not bothering with bms_membership() and simply call
bms_get_singleton_member() instead to find the single member.  This
function will return false if there are multiple members in the set.

Here we also tidy up some logic in examine_variable() for the single
member case.  There's now no need to call bms_is_member() as we've
already established that we're working with a singleton Bitmapset, so we
can just check if varRelid matches the singleton member.

Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/CAApHDvqW+CxNPcY245GaWiuqkkqgTudtG2ncGvvSjGn2wdTZLA@mail.gmail.com
2023-11-28 10:41:12 +13:00
Tomas Vondra a82ee7ef3a Check if ii_AmCache is NULL in aminsertcleanup
Fix a bug introduced by c1ec02be1d. It may happen that the executor
opens indexes on the result relation, but no rows end up being inserted.
Then the index_insert_cleanup still gets executed, but passes down NULL
to the AM callback. The AM callback may not expect this, as is the case
of brininsertcleanup, leading to a crash.

Fixed by only calling the cleanup callback if (ii_AmCache != NULL). This
way the AM can simply assume to only see a valid cache.

Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-w9qC-o9hQox9UHvdVZAYTp8OrPQOKtwbvzWaRejTT=Q@mail.gmail.com
2023-11-27 16:53:06 +01:00
Heikki Linnakangas 1f395354d8 Reduce rate of walwriter wakeups due to async commits.
XLogSetAsyncXactLSN(), called at asynchronous commit, would wake up
walwriter every time the LSN advances, but walwriter doesn't actually
do anything unless it has at least 'wal_writer_flush_after' full
blocks of WAL to write. Repeatedly waking up walwriter to do nothing
is a waste of CPU cycles in both walwriter and the backends doing the
wakeups. To fix, apply the same logic in XLogSetAsyncXactLSN() to
decide whether to wake up walwriter, as walwriter uses to determine if
it has any work to do.

In the passing, rename misleadingly named 'flushbytes' local variable
to 'flushblocks'.

Author: Andres Freund, Heikki Linnakangas
Discussion: https://www.postgresql.org/message-id/20231024230929.vsc342baqs7kmbte@awork3.anarazel.de
2023-11-27 17:42:39 +02:00
Amit Kapila 360392fa2a Avoid unconditionally filling in missing values with NULL in pgoutput.
52e4f0cd4 introduced a bug in pgoutput in which missing values in tuples
were incorrectly filled in with NULL. The problem was the use of
CreateTupleDescCopy where CreateTupleDescCopyConstr was required, as the
former drops the constraints in the tuple description (specifically, the
default value constraint) on the floor.

The bug could result in incorrectness when a table replicated via
`REPLICA IDENTITY FULL` underwent a schema change that added a column
with a default value. The problem is that in such cases updates fill NULL
values in old tuples for missing columns for default values. Then on the
subscriber, we failed to find a matching tuple and missed updating the
required row.

Author: Nikhil Benesch
Reviewed-by: Hou Zhijie, Amit Kapila
Backpatch-through: 15
Discussion: http://postgr.es/m/CAPWqQZTEpZQamYsGMn6ZDRvVywwpVPiKH6OY4KSgA+NmeqFNzA@mail.gmail.com
2023-11-27 08:49:55 +05:30
Alexander Korotkov bc3c8db8ae Display length and bounds histograms in pg_stats
Values corresponding to STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM and
STATISTIC_KIND_BOUNDS_HISTOGRAM were not exposed to pg_stats when these
slot kinds were introduced in 918eee0c49.

This commit adds the missing fields to pg_stats.

Catversion is bumped.

Discussion: https://postgr.es/m/flat/b67d8b57-9357-7e82-a2e7-f6ce6eaeec67@postgrespro.ru
Author: Egor Rogov, Soumyadeep Chakraborty
Reviewed-by: Tomas Vondra, Justin Pryzby, Jian He
2023-11-27 01:32:17 +02:00
Tom Lane 3558f120f8 Doc: list AT TIME ZONE and COLLATE in operator precedence table.
These constructs have precedence, but we forgot to list them.
In HEAD, mention AT LOCAL as well as AT TIME ZONE.

Per gripe from Shay Rojansky.

Discussion: https://postgr.es/m/CADT4RqBPdbsZW7HS1jJP319TMRHs1hzUiP=iRJYR6UqgHCrgNQ@mail.gmail.com
2023-11-26 16:40:24 -05:00
Tomas Vondra b2caf7c0e1 Fix brin.c indentation issues introduced by c1ec02be1d
Per buildfarm member koel.
2023-11-26 21:35:32 +01:00
Tomas Vondra c1ec02be1d Reuse BrinDesc and BrinRevmap in brininsert
The brininsert code used to initialize (and destroy) BrinDesc and
BrinRevmap for each tuple, which is not free. This patch initializes
these structures only once, and reuses them for all inserts in the same
command. The data is passed through indexInfo->ii_AmCache.

This also introduces an optional AM callback "aminsertcleanup" that
allows performing custom cleanup in case simply pfree-ing ii_AmCache is
not sufficient (which is the case when the cache contains TupleDesc,
Buffers, and so on).

Author: Soumyadeep Chakraborty
Reviewed-by: Alvaro Herrera, Matthias van de Meent, Tomas Vondra
Discussion: https://postgr.es/m/CAE-ML%2B9r2%3DaO1wwji1sBN9gvPz2xRAtFUGfnffpd0ZqyuzjamA%40mail.gmail.com
2023-11-25 20:27:28 +01:00
Bruce Momjian 8d981341a5 C comment: clarify that WAL files can be _recycled_ or removed
Reported-by: Michael Paquier

Discussion: https://postgr.es/m/CAB7nPqSDdF0heotQU3gsepgqx+9c+6KjLd3R6aNYH7KKfDd2ig@mail.gmail.com

Author: Michael Paquier

Backpatch-through: master
2023-11-25 10:48:18 -05:00
Bruce Momjian 79588d3c8d Use SECS_PER_HOUR macro in tzparser.c, instead of constants
Reported-by: CharSyam

Discussion: https://postgr.es/m/CAMrLSE5j_aWfoBDMrSvk14oBKSy+-2cjzNNH_FciirA7Kwo9TA@mail.gmail.com

Author: CharSyam

Backpatch-through: master
2023-11-24 22:36:23 -05:00
Bruce Momjian 344afc7769 modify segno. for pg_walfile_name() and pg_walfile_name_offset()
Previously these functions returned the previous segment number if the
LSN was on a segment boundary.  We now always return the current segment
number for an LSN.

Docs updated to reflect this change.  Regression tests added, author
Andres Freund.

Also mentioned in thread https://postgr.es/m/flat/20220204225057.GA1535307%40nathanxps13#d964275c9540d8395e138efc0a75f7e8

BACKWARD INCOMPATIBILITY

Reported-by: Kyotaro Horiguchi

Discussion: https://postgr.es/m/20190726.172120.101752680.horikyota.ntt@gmail.com

Co-authored-by: Kyotaro Horiguchi

Backpatch-through: master
2023-11-24 19:44:09 -05:00
Tom Lane d053a879bb Fix timing-dependent failure in GSSAPI data transmission.
When using GSSAPI encryption in non-blocking mode, libpq sometimes
failed with "GSSAPI caller failed to retransmit all data needing
to be retried".  The cause is that pqPutMsgEnd rounds its transmit
request down to an even multiple of 8K, and sometimes that can lead
to not requesting a write of data that was requested to be written
(but reported as not written) earlier.  That can upset pg_GSS_write's
logic for dealing with not-yet-written data, since it's possible
the data in question had already been incorporated into an encrypted
packet that we weren't able to send during the previous call.

We could fix this with a one-or-two-line hack to disable pqPutMsgEnd's
round-down behavior, but that seems like making the caller work around
a behavior that pg_GSS_write shouldn't expose in this way.  Instead,
adjust pg_GSS_write to never report a partial write: it either
reports a complete write, or reflects the failure of the lower-level
pqsecure_raw_write call.  The requirement still exists for the caller
to present at least as much data as on the previous call, but with
the caller-visible write start point not moving there is no temptation
for it to present less.  We lose some ability to reclaim buffer space
early, but I doubt that that will make much difference in practice.

This also gets rid of a rather dubious assumption that "any
interesting failure condition (from pqsecure_raw_write) will recur
on the next try".  We've not seen failure reports traceable to that,
but I've never trusted it particularly and am glad to remove it.

Make the same adjustments to the equivalent backend routine
be_gssapi_write().  It is probable that there's no bug on the backend
side, since we don't have a notion of nonblock mode there; but we
should keep the logic the same to ease future maintenance.

Per bug #18210 from Lars Kanis.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/18210-4c6d0b14627f2eb8@postgresql.org
2023-11-23 13:30:18 -05:00
Heikki Linnakangas 50c67c2019 Use ResourceOwner to track WaitEventSets.
A WaitEventSet holds file descriptors or event handles (on Windows).
If FreeWaitEventSet is not called, those fds or handles are leaked.
Use ResourceOwners to track WaitEventSets, to clean those up
automatically on error.

This was a live bug in async Append nodes, if a FDW's
ForeignAsyncRequest function failed. (In back branches, I will apply a
more localized fix for that based on PG_TRY-PG_FINALLY.)

The added test doesn't check for leaking resources, so it passed even
before this commit. But at least it covers the code path.

In the passing, fix misleading comment on what the 'nevents' argument
to WaitEventSetWait means.

Report by Alexander Lakhin, analysis and suggestion for the fix by
Tom Lane. Fixes bug #17828.

Reviewed-by: Alexander Lakhin, Thomas Munro
Discussion: https://www.postgresql.org/message-id/472235.1678387869@sss.pgh.pa.us
2023-11-23 13:31:36 +02:00
Bruce Momjian 414e75540f C comment: fix typos with unnecessary apostrophes
Reported-by: Vinayak Pokale

Discussion: https://postgr.es/m/CAEySZvh7gPTOqMhuKOBXEt=qF_1BCvFQB4MAJ4yaTPJHxgX_zw@mail.gmail.com

Author: Vinayak Pokale

Backpatch-through: master
2023-11-22 23:41:15 -05:00
Amit Kapila eeb0ebad79 Fix the initial sync tables with no columns.
The copy command formed for initial sync was using parenthesis for tables
with no columns leading to syntax error. This patch avoids adding
parenthesis for such tables.

Reported-by: Justin G
Author: Vignesh C
Reviewed-by: Peter Smith, Amit Kapila
Backpatch-through: 15
Discussion: http://postgr.es/m/18203-df37fe354b626670@postgresql.org
2023-11-22 11:44:14 +05:30
Amit Kapila ff68cc6f3b Stop the search once the slot for replication origin is found.
In replorigin_session_setup(), we were needlessly looping for
max_replication_slots even after finding an existing slot for the origin.
This shouldn't hurt us much except for probably large values of
max_replication_slots.

Author: Antonin Houska
Discussion: http://postgr.es/m/2694.1700471273@antos
2023-11-22 08:39:24 +05:30
Michael Paquier e83aa9f92f Simplify some logic in CreateReplicationSlot()
This refactoring reduces the code in charge of creating replication
slots from two "if" block to a single one, making it slightly cleaner.

This change is possible since 1d04a59be3, that has removed the
intermediate code that existed between the two "if" blocks in charge of
initializing the output message buffer.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+PtnJzqKT41Zt8pChRzba=QgCqjtfYvcf84NMj3VFJoKfw@mail.gmail.com
2023-11-21 13:55:01 +09:00
Amit Kapila 7c3fb505b1 Log messages for replication slot acquisition and release.
This commit log messages (at LOG level when log_replication_commands is
set, otherwise at DEBUG1 level) when walsenders acquire and release
replication slots. These messages help to know the lifetime of a
replication slot - one can know how long a streaming standby, logical
subscriber, or replication slot consumer is down. These messages will be
useful on production servers to debug and analyze inactive replication
slots.

Note that these messages are emitted only for walsenders but not for
backends. This is because walsenders are the ones that typically hold
replication slots for longer durations, unlike backends which hold them
for executing replication related functions.

Author: Bharath Rupireddy
Reviewed-by: Peter Smith, Amit Kapila, Alvaro Herrera
Discussion: http://postgr.es/m/CALj2ACX17G7F-jeLt+7KhJ6YxVeRwR8Zk0rDh4VnT546o0UpTQ@mail.gmail.com
2023-11-21 07:59:53 +05:30
Jeff Davis ad57c2a7c5 Optimize check_search_path() by using SearchPathCache.
A hash lookup is faster than re-validating the string, particularly
because we use SplitIdentifierString() for validation.

Important when search_path changes frequently.

Discussion: https://postgr.es/m/04c8592dbd694e4114a3ed87139a7a04e4363030.camel%40j-davis.com
2023-11-20 15:53:42 -08:00
Jeff Davis 8efa301532 Be more paranoid about OOM in search_path cache.
Recent commit f26c2368dc introduced a search_path cache, but left some
potential out-of-memory hazards. Simplify the code and make it safer
against OOM.

This change reintroduces one list_copy(), losing a small amount of the
performance gained in f26c2368dc. A future change may optimize away
the list_copy() again if it can be done in a safer way.

Discussion: https://postgr.es/m/e6fded24cb8a2c53d4ef069d9f69cc7baaafe9ef.camel@j-davis.com
2023-11-20 15:36:07 -08:00
Michael Paquier 3650e7a393 Prevent overflow for block number in buffile.c
As coded, the start block calculated by BufFileAppend() would overflow
once more than 16k files are used with a default block size.  This issue
existed before b1e5c9fa9a, but there's no reason not to be clean about
it.

Per report from Coverity, with a fix suggested by Tom Lane.
2023-11-20 09:14:53 +09:00
Tomas Vondra 28f84f72fb Lock table in DROP STATISTICS
The DROP STATISTICS code failed to properly lock the table, leading to

  ERROR:  tuple concurrently deleted

when executed concurrently with ANALYZE.

Fixed by modifying RemoveStatisticsById() to acquire the same lock as
ANALYZE. This function is called only by DROP STATISTICS, as ANALYZE
calls RemoveStatisticsDataById() directly.

Reported by Justin Pryzby, fix by me. Backpatch through 12. The code was
like this since it was introduced in 10, but older releases are EOL.

Reported-by: Justin Pryzby
Reviewed-by: Tom Lane
Backpatch-through: 12

Discussion: https://postgr.es/m/ZUuk-8CfbYeq6g_u@pryzbyj2023
2023-11-19 21:03:38 +01:00
Dean Rasheed b218fbb7a3 Guard against overflow in interval_mul() and interval_div().
Commits 146604ec43 and a898b409f6 added overflow checks to
interval_mul(), but not to interval_div(), which contains almost
identical code, and so is susceptible to the same kinds of
overflows. In addition, those checks did not catch all possible
overflow conditions.

Add additional checks to the "cascade down" code in interval_mul(),
and copy all the overflow checks over to the corresponding code in
interval_div(), so that they both generate "interval out of range"
errors, rather than returning bogus results.

Given that these errors are relatively easy to hit, back-patch to all
supported branches.

Per bug #18200 from Alexander Lakhin, and subsequent investigation.

Discussion: https://postgr.es/m/18200-5ea288c7b2d504b1%40postgresql.org
2023-11-18 14:41:20 +00:00
Andres Freund b2e237afdd Release lock on heap buffer before vacuuming FSM
When there are no indexes on a table, we vacuum each heap block after
pruning it and then update the freespace map. Periodically, we also
vacuum the freespace map. This was done while unnecessarily holding a
lock on the heap page. Release the lock before calling
FreeSpaceMapVacuumRange() and, while we're at it, ensure the range
includes the heap block we just vacuumed.

There are no known deadlocks or other similar issues, therefore don't
backpatch. It's certainly not good to do all this work under a lock, but it's
not frequently reached, making it not worth the risk of backpatching.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAAKRu_YiL%3D44GvGnt1dpYouDSSoV7wzxVoXs8m3p311rp-TVQQ%40mail.gmail.com
2023-11-17 12:46:55 -08:00
Tom Lane f7816aec23 Extract column statistics from CTE references, if possible.
examine_simple_variable() left this as an unimplemented case years
ago, with the result that plans for queries involving un-flattened
CTEs might be much stupider than necessary.  It's not hard to extend
the existing logic for RTE_SUBQUERY cases to also be able to drill
down into CTEs, so let's do that.

There was some discussion of whether this patch breaks the idea
of a MATERIALIZED CTE being an optimization fence.  We concluded
it's okay, because we already allow the outer planner level to
see the estimated width and rowcount of the CTE result, and
letting it see column statistics too seems fairly equivalent.
Basically, what we expect of the optimization fence is that the
outer query should not affect the plan chosen for the CTE query.
Once that plan is chosen, it's okay for the outer planner level
to make use of whatever information we have about it.

Jian Guo and Tom Lane, per complaint from Hans Buschmann

Discussion: https://postgr.es/m/4504e67078d648cdac3651b2960da6e7@nidsa.net
2023-11-17 14:36:23 -05:00
Tom Lane 8d5573b92e Don't specify number of dimensions in cases where we don't know it.
A few places in array_in() and plperl would report a misleading value
(always MAXDIM+1) for the number of dimensions in the input, because
we'd error out as soon as that was clearly too large rather than
scanning the entire input.  There doesn't seem to be much value in
offering the true number, at least not enough to justify the extra
complication involved in trying to get it.  So just remove that
parenthetical remark.  We already have other places that do it
like that, anyway.

Per suggestions from Alexander Lakhin and Heikki Linnakangas.

Discussion: https://postgr.es/m/2794005.1683042087@sss.pgh.pa.us
2023-11-17 11:29:46 -05:00
Michael Paquier b1e5c9fa9a Change logtape/tuplestore code to use int64 for block numbers
The code previously relied on "long" as type to track block numbers,
which would be 4 bytes in all Windows builds or any 32-bit builds.  This
limited the code to be able to handle up to 16TB of data with the
default block size of 8kB, like during a CLUSTER.  This code now relies
on a more portable int64, which should be more than enough for at least
the next 20 years to come.

This issue has been reported back in 2017, but nothing was done about it
back then, so here we go now.

Reported-by: Peter Geoghegan
Reviewed-by: Heikki Linnakangas
Discussion: https://postgr.es/m/CAH2-WznCscXnWmnj=STC0aSa7QG+BRedDnZsP=Jo_R9GUZvUrg@mail.gmail.com
2023-11-17 11:20:53 +09:00
Michael Paquier c99c7a4871 Remove NOT_USED BufFileTellBlock() from buffile.c
This routine has been marked as NOT_USED since 20ad43b576 from 2000,
and a patch is planned to switch the logtape/tuplestore APIs to rely on
int64 rather than long for the block nunbers, which is more portable.

Keeping it is more confusing than anything at this stage, so let's get
rid of it entirely.

Thanks for Heikki Linnakangas for the poke on this one.

Discussion: https://postgr.es/m/5047be8c-7ee6-4dd5-af76-6c916c3103b4@iki.fi
2023-11-17 10:46:50 +09:00
Tom Lane 743ddafc71 Ensure we preprocess expressions before checking their volatility.
contain_mutable_functions and contain_volatile_functions give
reliable answers only after expression preprocessing (specifically
eval_const_expressions).  Some places understand this, but some did
not get the memo --- which is not entirely their fault, because the
problem is documented only in places far away from those functions.
Introduce wrapper functions that allow doing the right thing easily,
and add commentary in hopes of preventing future mistakes from
copy-and-paste of code that's only conditionally safe.

Two actual bugs of this ilk are fixed here.  We failed to preprocess
column GENERATED expressions before checking mutability, so that the
code could fail to detect the use of a volatile function
default-argument expression, or it could reject a polymorphic function
that is actually immutable on the datatype of interest.  Likewise,
column DEFAULT expressions weren't preprocessed before determining if
it's safe to apply the attmissingval mechanism.  A false negative
would just result in an unnecessary table rewrite, but a false
positive could allow the attmissingval mechanism to be used in a case
where it should not be, resulting in unexpected initial values in a
new column.

In passing, re-order the steps in ComputePartitionAttrs so that its
checks for invalid column references are done before applying
expression_planner, rather than after.  The previous coding would
not complain if a partition expression contains a disallowed column
reference that gets optimized away by constant folding, which seems
to me to be a behavior we do not want.

Per bug #18097 from Jim Keener.  Back-patch to all supported versions.

Discussion: https://postgr.es/m/18097-ebb179674f22932f@postgresql.org
2023-11-16 10:05:14 -05:00
Michael Paquier 2e8a0edc2a Add target "slru" to pg_stat_reset_shared()
Currently, pg_stat_reset_shared() cannot reset the counters in the view
pg_stat_slru even if it is a type of shared stats.  This patch adds
support for a new value in pg_stat_reset_shared(), called "slru", able
to do that.  Note that pg_stat_reset_shared(NULL) also resets SLRU
counters.

There may be a point in removing pg_stat_reset_slru() that was
introduced in 28cac71bd3 (v13~) as the new option overlaps with this
function, but we would lose the ability to reset individual SLRU
counters.  This is left for future reconsideration.

Author: Atsushi Torikoshi
Discussion: https://postgr.es/m/e3c25d72e81378e7b64f3c52e0306fc9@oss.nttdata.com
2023-11-16 15:41:34 +09:00
Nathan Bossart 6a72c42fd5 Retire MemoryContextResetAndDeleteChildren() macro.
As of commit eaa5808e8e, MemoryContextResetAndDeleteChildren() is
just a backwards compatibility macro for MemoryContextReset().  Now
that some time has passed, this macro seems more likely to create
confusion.

This commit removes the macro and replaces all remaining uses with
calls to MemoryContextReset().  Any third-party code that use this
macro will need to be adjusted to call MemoryContextReset()
instead.  Since the two have behaved the same way since v9.5, such
adjustments won't produce any behavior changes for all
currently-supported versions of PostgreSQL.

Reviewed-by: Amul Sul, Tom Lane, Alvaro Herrera, Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/20231113185950.GA1668018%40nathanxps13
2023-11-15 13:42:30 -06:00
Heikki Linnakangas c21e6e2fd4 Clear CurrentResourceOwner earlier in CommitTransaction.
Alexander reported a crash with repeated create + drop database, after
the ResourceOwner rewrite (commit b8bff07daa). That was fixed by the
previous commit, but it nevertheless seems like a good idea clear
CurrentResourceOwner earlier, because you're not supposed to use it
for anything after we start releasing it.

Reviewed-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/11b70743-c5f3-3910-8e5b-dd6c115ff829%40gmail.com
2023-11-15 11:03:49 +01:00
Heikki Linnakangas a8b330ffb6 Fix dsa.c with different resource owners.
The comments in dsa.c suggested that areas were owned by resource
owners, but it was not in fact tracked explicitly. The DSM attachments
held by the dsa were owned by resource owners, but not the area
itself.  That led to confusion if you used one resource owner to
attach or create the area, but then switched to a different resource
owner before allocating or even just accessing the allocations in the
area with dsa_get_address(). The additional DSM segments associated
with the area would get owned by a different resource owner than the
initial segment.  To fix, add an explicit 'resowner' field to
dsa_area.  It replaces the 'mapping_pinned' flag; resowner == NULL now
indicates that the mapping is pinned.

This is arguably a bug fix, but I'm not backpatching because it
doesn't seem to be a live bug in the back branches. In 'master', it is
a bug because commit b8bff07daa made ResourceOwners more strict so
that you are no longer allowed to remember new resources in a
ResourceOwner after you have started to release it. Merely accessing a
dsa pointer might need to attach a new DSM segment, and before this
commit it was temporarily remembered in the current owner for a very
brief period even if the DSA was pinned. And that could happen in
AtEOXact_PgStat(), which is called after the owner is already released.

Reported-by: Alexander Lakhin
Reviewed-by: Alexander Lakhin, Thomas Munro, Andres Freund
Discussion: https://www.postgresql.org/message-id/11b70743-c5f3-3910-8e5b-dd6c115ff829%40gmail.com
2023-11-15 10:34:28 +01:00
Jeff Davis f26c2368dc Add cache for recomputeNamespacePath().
When search_path is changed to something that was previously set, and
no invalidation happened in between, use the cached list of namespace
OIDs rather than recomputing them. This avoids syscache lookups and
ACL checks.

Important when the search_path changes frequently, such as when set in
proconfig.

An earlier version of this patch was reviewd by Nathan Bossart. This
version simplifies a few things and is safer in case of OOM.

Discussion: https://www.postgresql.org/message-id/abf4ce8804e0e05dff8c1725ae6a8ed28b7d66e0.camel%40j-davis.com
Reviewed-by: Nathan Bossart
2023-11-14 17:53:30 -08:00