Commit Graph

24738 Commits

Author SHA1 Message Date
Tomas Vondra c1ec02be1d Reuse BrinDesc and BrinRevmap in brininsert
The brininsert code used to initialize (and destroy) BrinDesc and
BrinRevmap for each tuple, which is not free. This patch initializes
these structures only once, and reuses them for all inserts in the same
command. The data is passed through indexInfo->ii_AmCache.

This also introduces an optional AM callback "aminsertcleanup" that
allows performing custom cleanup in case simply pfree-ing ii_AmCache is
not sufficient (which is the case when the cache contains TupleDesc,
Buffers, and so on).

Author: Soumyadeep Chakraborty
Reviewed-by: Alvaro Herrera, Matthias van de Meent, Tomas Vondra
Discussion: https://postgr.es/m/CAE-ML%2B9r2%3DaO1wwji1sBN9gvPz2xRAtFUGfnffpd0ZqyuzjamA%40mail.gmail.com
2023-11-25 20:27:28 +01:00
Bruce Momjian 8d981341a5 C comment: clarify that WAL files can be _recycled_ or removed
Reported-by: Michael Paquier

Discussion: https://postgr.es/m/CAB7nPqSDdF0heotQU3gsepgqx+9c+6KjLd3R6aNYH7KKfDd2ig@mail.gmail.com

Author: Michael Paquier

Backpatch-through: master
2023-11-25 10:48:18 -05:00
Bruce Momjian 79588d3c8d Use SECS_PER_HOUR macro in tzparser.c, instead of constants
Reported-by: CharSyam

Discussion: https://postgr.es/m/CAMrLSE5j_aWfoBDMrSvk14oBKSy+-2cjzNNH_FciirA7Kwo9TA@mail.gmail.com

Author: CharSyam

Backpatch-through: master
2023-11-24 22:36:23 -05:00
Bruce Momjian 344afc7769 modify segno. for pg_walfile_name() and pg_walfile_name_offset()
Previously these functions returned the previous segment number if the
LSN was on a segment boundary.  We now always return the current segment
number for an LSN.

Docs updated to reflect this change.  Regression tests added, author
Andres Freund.

Also mentioned in thread https://postgr.es/m/flat/20220204225057.GA1535307%40nathanxps13#d964275c9540d8395e138efc0a75f7e8

BACKWARD INCOMPATIBILITY

Reported-by: Kyotaro Horiguchi

Discussion: https://postgr.es/m/20190726.172120.101752680.horikyota.ntt@gmail.com

Co-authored-by: Kyotaro Horiguchi

Backpatch-through: master
2023-11-24 19:44:09 -05:00
Tom Lane d053a879bb Fix timing-dependent failure in GSSAPI data transmission.
When using GSSAPI encryption in non-blocking mode, libpq sometimes
failed with "GSSAPI caller failed to retransmit all data needing
to be retried".  The cause is that pqPutMsgEnd rounds its transmit
request down to an even multiple of 8K, and sometimes that can lead
to not requesting a write of data that was requested to be written
(but reported as not written) earlier.  That can upset pg_GSS_write's
logic for dealing with not-yet-written data, since it's possible
the data in question had already been incorporated into an encrypted
packet that we weren't able to send during the previous call.

We could fix this with a one-or-two-line hack to disable pqPutMsgEnd's
round-down behavior, but that seems like making the caller work around
a behavior that pg_GSS_write shouldn't expose in this way.  Instead,
adjust pg_GSS_write to never report a partial write: it either
reports a complete write, or reflects the failure of the lower-level
pqsecure_raw_write call.  The requirement still exists for the caller
to present at least as much data as on the previous call, but with
the caller-visible write start point not moving there is no temptation
for it to present less.  We lose some ability to reclaim buffer space
early, but I doubt that that will make much difference in practice.

This also gets rid of a rather dubious assumption that "any
interesting failure condition (from pqsecure_raw_write) will recur
on the next try".  We've not seen failure reports traceable to that,
but I've never trusted it particularly and am glad to remove it.

Make the same adjustments to the equivalent backend routine
be_gssapi_write().  It is probable that there's no bug on the backend
side, since we don't have a notion of nonblock mode there; but we
should keep the logic the same to ease future maintenance.

Per bug #18210 from Lars Kanis.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/18210-4c6d0b14627f2eb8@postgresql.org
2023-11-23 13:30:18 -05:00
Heikki Linnakangas 50c67c2019 Use ResourceOwner to track WaitEventSets.
A WaitEventSet holds file descriptors or event handles (on Windows).
If FreeWaitEventSet is not called, those fds or handles are leaked.
Use ResourceOwners to track WaitEventSets, to clean those up
automatically on error.

This was a live bug in async Append nodes, if a FDW's
ForeignAsyncRequest function failed. (In back branches, I will apply a
more localized fix for that based on PG_TRY-PG_FINALLY.)

The added test doesn't check for leaking resources, so it passed even
before this commit. But at least it covers the code path.

In the passing, fix misleading comment on what the 'nevents' argument
to WaitEventSetWait means.

Report by Alexander Lakhin, analysis and suggestion for the fix by
Tom Lane. Fixes bug #17828.

Reviewed-by: Alexander Lakhin, Thomas Munro
Discussion: https://www.postgresql.org/message-id/472235.1678387869@sss.pgh.pa.us
2023-11-23 13:31:36 +02:00
Bruce Momjian 414e75540f C comment: fix typos with unnecessary apostrophes
Reported-by: Vinayak Pokale

Discussion: https://postgr.es/m/CAEySZvh7gPTOqMhuKOBXEt=qF_1BCvFQB4MAJ4yaTPJHxgX_zw@mail.gmail.com

Author: Vinayak Pokale

Backpatch-through: master
2023-11-22 23:41:15 -05:00
Amit Kapila eeb0ebad79 Fix the initial sync tables with no columns.
The copy command formed for initial sync was using parenthesis for tables
with no columns leading to syntax error. This patch avoids adding
parenthesis for such tables.

Reported-by: Justin G
Author: Vignesh C
Reviewed-by: Peter Smith, Amit Kapila
Backpatch-through: 15
Discussion: http://postgr.es/m/18203-df37fe354b626670@postgresql.org
2023-11-22 11:44:14 +05:30
Amit Kapila ff68cc6f3b Stop the search once the slot for replication origin is found.
In replorigin_session_setup(), we were needlessly looping for
max_replication_slots even after finding an existing slot for the origin.
This shouldn't hurt us much except for probably large values of
max_replication_slots.

Author: Antonin Houska
Discussion: http://postgr.es/m/2694.1700471273@antos
2023-11-22 08:39:24 +05:30
Michael Paquier e83aa9f92f Simplify some logic in CreateReplicationSlot()
This refactoring reduces the code in charge of creating replication
slots from two "if" block to a single one, making it slightly cleaner.

This change is possible since 1d04a59be3, that has removed the
intermediate code that existed between the two "if" blocks in charge of
initializing the output message buffer.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+PtnJzqKT41Zt8pChRzba=QgCqjtfYvcf84NMj3VFJoKfw@mail.gmail.com
2023-11-21 13:55:01 +09:00
Amit Kapila 7c3fb505b1 Log messages for replication slot acquisition and release.
This commit log messages (at LOG level when log_replication_commands is
set, otherwise at DEBUG1 level) when walsenders acquire and release
replication slots. These messages help to know the lifetime of a
replication slot - one can know how long a streaming standby, logical
subscriber, or replication slot consumer is down. These messages will be
useful on production servers to debug and analyze inactive replication
slots.

Note that these messages are emitted only for walsenders but not for
backends. This is because walsenders are the ones that typically hold
replication slots for longer durations, unlike backends which hold them
for executing replication related functions.

Author: Bharath Rupireddy
Reviewed-by: Peter Smith, Amit Kapila, Alvaro Herrera
Discussion: http://postgr.es/m/CALj2ACX17G7F-jeLt+7KhJ6YxVeRwR8Zk0rDh4VnT546o0UpTQ@mail.gmail.com
2023-11-21 07:59:53 +05:30
Jeff Davis ad57c2a7c5 Optimize check_search_path() by using SearchPathCache.
A hash lookup is faster than re-validating the string, particularly
because we use SplitIdentifierString() for validation.

Important when search_path changes frequently.

Discussion: https://postgr.es/m/04c8592dbd694e4114a3ed87139a7a04e4363030.camel%40j-davis.com
2023-11-20 15:53:42 -08:00
Jeff Davis 8efa301532 Be more paranoid about OOM in search_path cache.
Recent commit f26c2368dc introduced a search_path cache, but left some
potential out-of-memory hazards. Simplify the code and make it safer
against OOM.

This change reintroduces one list_copy(), losing a small amount of the
performance gained in f26c2368dc. A future change may optimize away
the list_copy() again if it can be done in a safer way.

Discussion: https://postgr.es/m/e6fded24cb8a2c53d4ef069d9f69cc7baaafe9ef.camel@j-davis.com
2023-11-20 15:36:07 -08:00
Michael Paquier 3650e7a393 Prevent overflow for block number in buffile.c
As coded, the start block calculated by BufFileAppend() would overflow
once more than 16k files are used with a default block size.  This issue
existed before b1e5c9fa9a, but there's no reason not to be clean about
it.

Per report from Coverity, with a fix suggested by Tom Lane.
2023-11-20 09:14:53 +09:00
Tomas Vondra 28f84f72fb Lock table in DROP STATISTICS
The DROP STATISTICS code failed to properly lock the table, leading to

  ERROR:  tuple concurrently deleted

when executed concurrently with ANALYZE.

Fixed by modifying RemoveStatisticsById() to acquire the same lock as
ANALYZE. This function is called only by DROP STATISTICS, as ANALYZE
calls RemoveStatisticsDataById() directly.

Reported by Justin Pryzby, fix by me. Backpatch through 12. The code was
like this since it was introduced in 10, but older releases are EOL.

Reported-by: Justin Pryzby
Reviewed-by: Tom Lane
Backpatch-through: 12

Discussion: https://postgr.es/m/ZUuk-8CfbYeq6g_u@pryzbyj2023
2023-11-19 21:03:38 +01:00
Dean Rasheed b218fbb7a3 Guard against overflow in interval_mul() and interval_div().
Commits 146604ec43 and a898b409f6 added overflow checks to
interval_mul(), but not to interval_div(), which contains almost
identical code, and so is susceptible to the same kinds of
overflows. In addition, those checks did not catch all possible
overflow conditions.

Add additional checks to the "cascade down" code in interval_mul(),
and copy all the overflow checks over to the corresponding code in
interval_div(), so that they both generate "interval out of range"
errors, rather than returning bogus results.

Given that these errors are relatively easy to hit, back-patch to all
supported branches.

Per bug #18200 from Alexander Lakhin, and subsequent investigation.

Discussion: https://postgr.es/m/18200-5ea288c7b2d504b1%40postgresql.org
2023-11-18 14:41:20 +00:00
Andres Freund b2e237afdd Release lock on heap buffer before vacuuming FSM
When there are no indexes on a table, we vacuum each heap block after
pruning it and then update the freespace map. Periodically, we also
vacuum the freespace map. This was done while unnecessarily holding a
lock on the heap page. Release the lock before calling
FreeSpaceMapVacuumRange() and, while we're at it, ensure the range
includes the heap block we just vacuumed.

There are no known deadlocks or other similar issues, therefore don't
backpatch. It's certainly not good to do all this work under a lock, but it's
not frequently reached, making it not worth the risk of backpatching.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAAKRu_YiL%3D44GvGnt1dpYouDSSoV7wzxVoXs8m3p311rp-TVQQ%40mail.gmail.com
2023-11-17 12:46:55 -08:00
Tom Lane f7816aec23 Extract column statistics from CTE references, if possible.
examine_simple_variable() left this as an unimplemented case years
ago, with the result that plans for queries involving un-flattened
CTEs might be much stupider than necessary.  It's not hard to extend
the existing logic for RTE_SUBQUERY cases to also be able to drill
down into CTEs, so let's do that.

There was some discussion of whether this patch breaks the idea
of a MATERIALIZED CTE being an optimization fence.  We concluded
it's okay, because we already allow the outer planner level to
see the estimated width and rowcount of the CTE result, and
letting it see column statistics too seems fairly equivalent.
Basically, what we expect of the optimization fence is that the
outer query should not affect the plan chosen for the CTE query.
Once that plan is chosen, it's okay for the outer planner level
to make use of whatever information we have about it.

Jian Guo and Tom Lane, per complaint from Hans Buschmann

Discussion: https://postgr.es/m/4504e67078d648cdac3651b2960da6e7@nidsa.net
2023-11-17 14:36:23 -05:00
Tom Lane 8d5573b92e Don't specify number of dimensions in cases where we don't know it.
A few places in array_in() and plperl would report a misleading value
(always MAXDIM+1) for the number of dimensions in the input, because
we'd error out as soon as that was clearly too large rather than
scanning the entire input.  There doesn't seem to be much value in
offering the true number, at least not enough to justify the extra
complication involved in trying to get it.  So just remove that
parenthetical remark.  We already have other places that do it
like that, anyway.

Per suggestions from Alexander Lakhin and Heikki Linnakangas.

Discussion: https://postgr.es/m/2794005.1683042087@sss.pgh.pa.us
2023-11-17 11:29:46 -05:00
Michael Paquier b1e5c9fa9a Change logtape/tuplestore code to use int64 for block numbers
The code previously relied on "long" as type to track block numbers,
which would be 4 bytes in all Windows builds or any 32-bit builds.  This
limited the code to be able to handle up to 16TB of data with the
default block size of 8kB, like during a CLUSTER.  This code now relies
on a more portable int64, which should be more than enough for at least
the next 20 years to come.

This issue has been reported back in 2017, but nothing was done about it
back then, so here we go now.

Reported-by: Peter Geoghegan
Reviewed-by: Heikki Linnakangas
Discussion: https://postgr.es/m/CAH2-WznCscXnWmnj=STC0aSa7QG+BRedDnZsP=Jo_R9GUZvUrg@mail.gmail.com
2023-11-17 11:20:53 +09:00
Michael Paquier c99c7a4871 Remove NOT_USED BufFileTellBlock() from buffile.c
This routine has been marked as NOT_USED since 20ad43b576 from 2000,
and a patch is planned to switch the logtape/tuplestore APIs to rely on
int64 rather than long for the block nunbers, which is more portable.

Keeping it is more confusing than anything at this stage, so let's get
rid of it entirely.

Thanks for Heikki Linnakangas for the poke on this one.

Discussion: https://postgr.es/m/5047be8c-7ee6-4dd5-af76-6c916c3103b4@iki.fi
2023-11-17 10:46:50 +09:00
Tom Lane 743ddafc71 Ensure we preprocess expressions before checking their volatility.
contain_mutable_functions and contain_volatile_functions give
reliable answers only after expression preprocessing (specifically
eval_const_expressions).  Some places understand this, but some did
not get the memo --- which is not entirely their fault, because the
problem is documented only in places far away from those functions.
Introduce wrapper functions that allow doing the right thing easily,
and add commentary in hopes of preventing future mistakes from
copy-and-paste of code that's only conditionally safe.

Two actual bugs of this ilk are fixed here.  We failed to preprocess
column GENERATED expressions before checking mutability, so that the
code could fail to detect the use of a volatile function
default-argument expression, or it could reject a polymorphic function
that is actually immutable on the datatype of interest.  Likewise,
column DEFAULT expressions weren't preprocessed before determining if
it's safe to apply the attmissingval mechanism.  A false negative
would just result in an unnecessary table rewrite, but a false
positive could allow the attmissingval mechanism to be used in a case
where it should not be, resulting in unexpected initial values in a
new column.

In passing, re-order the steps in ComputePartitionAttrs so that its
checks for invalid column references are done before applying
expression_planner, rather than after.  The previous coding would
not complain if a partition expression contains a disallowed column
reference that gets optimized away by constant folding, which seems
to me to be a behavior we do not want.

Per bug #18097 from Jim Keener.  Back-patch to all supported versions.

Discussion: https://postgr.es/m/18097-ebb179674f22932f@postgresql.org
2023-11-16 10:05:14 -05:00
Michael Paquier 2e8a0edc2a Add target "slru" to pg_stat_reset_shared()
Currently, pg_stat_reset_shared() cannot reset the counters in the view
pg_stat_slru even if it is a type of shared stats.  This patch adds
support for a new value in pg_stat_reset_shared(), called "slru", able
to do that.  Note that pg_stat_reset_shared(NULL) also resets SLRU
counters.

There may be a point in removing pg_stat_reset_slru() that was
introduced in 28cac71bd3 (v13~) as the new option overlaps with this
function, but we would lose the ability to reset individual SLRU
counters.  This is left for future reconsideration.

Author: Atsushi Torikoshi
Discussion: https://postgr.es/m/e3c25d72e81378e7b64f3c52e0306fc9@oss.nttdata.com
2023-11-16 15:41:34 +09:00
Nathan Bossart 6a72c42fd5 Retire MemoryContextResetAndDeleteChildren() macro.
As of commit eaa5808e8e, MemoryContextResetAndDeleteChildren() is
just a backwards compatibility macro for MemoryContextReset().  Now
that some time has passed, this macro seems more likely to create
confusion.

This commit removes the macro and replaces all remaining uses with
calls to MemoryContextReset().  Any third-party code that use this
macro will need to be adjusted to call MemoryContextReset()
instead.  Since the two have behaved the same way since v9.5, such
adjustments won't produce any behavior changes for all
currently-supported versions of PostgreSQL.

Reviewed-by: Amul Sul, Tom Lane, Alvaro Herrera, Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/20231113185950.GA1668018%40nathanxps13
2023-11-15 13:42:30 -06:00
Heikki Linnakangas c21e6e2fd4 Clear CurrentResourceOwner earlier in CommitTransaction.
Alexander reported a crash with repeated create + drop database, after
the ResourceOwner rewrite (commit b8bff07daa). That was fixed by the
previous commit, but it nevertheless seems like a good idea clear
CurrentResourceOwner earlier, because you're not supposed to use it
for anything after we start releasing it.

Reviewed-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/11b70743-c5f3-3910-8e5b-dd6c115ff829%40gmail.com
2023-11-15 11:03:49 +01:00
Heikki Linnakangas a8b330ffb6 Fix dsa.c with different resource owners.
The comments in dsa.c suggested that areas were owned by resource
owners, but it was not in fact tracked explicitly. The DSM attachments
held by the dsa were owned by resource owners, but not the area
itself.  That led to confusion if you used one resource owner to
attach or create the area, but then switched to a different resource
owner before allocating or even just accessing the allocations in the
area with dsa_get_address(). The additional DSM segments associated
with the area would get owned by a different resource owner than the
initial segment.  To fix, add an explicit 'resowner' field to
dsa_area.  It replaces the 'mapping_pinned' flag; resowner == NULL now
indicates that the mapping is pinned.

This is arguably a bug fix, but I'm not backpatching because it
doesn't seem to be a live bug in the back branches. In 'master', it is
a bug because commit b8bff07daa made ResourceOwners more strict so
that you are no longer allowed to remember new resources in a
ResourceOwner after you have started to release it. Merely accessing a
dsa pointer might need to attach a new DSM segment, and before this
commit it was temporarily remembered in the current owner for a very
brief period even if the DSA was pinned. And that could happen in
AtEOXact_PgStat(), which is called after the owner is already released.

Reported-by: Alexander Lakhin
Reviewed-by: Alexander Lakhin, Thomas Munro, Andres Freund
Discussion: https://www.postgresql.org/message-id/11b70743-c5f3-3910-8e5b-dd6c115ff829%40gmail.com
2023-11-15 10:34:28 +01:00
Jeff Davis f26c2368dc Add cache for recomputeNamespacePath().
When search_path is changed to something that was previously set, and
no invalidation happened in between, use the cached list of namespace
OIDs rather than recomputing them. This avoids syscache lookups and
ACL checks.

Important when the search_path changes frequently, such as when set in
proconfig.

An earlier version of this patch was reviewd by Nathan Bossart. This
version simplifies a few things and is safer in case of OOM.

Discussion: https://www.postgresql.org/message-id/abf4ce8804e0e05dff8c1725ae6a8ed28b7d66e0.camel%40j-davis.com
Reviewed-by: Nathan Bossart
2023-11-14 17:53:30 -08:00
Robert Haas 025584a168 Change how a base backup decides which files have checksums.
Previously, it thought that any plain file located under global, base,
or a tablespace directory had checksums unless it was in a short list
of excluded files. Now, it thinks that files in those directories have
checksums if parse_filename_for_nontemp_relation says that they are
relation files. (Temporary relation files don't matter because they're
excluded from the backup anyway.)

This changes the behavior if you have stray files not managed by
PostgreSQL in the relevant directories. Previously, you'd get some
kind of checksum-related complaint if such files existed, assuming
that the cluster had checksums enabled and that the base backup
wasn't run with NOVERIFY_CHECKSUMS. Now, you won't get those
complaints any more. That seems like an improvement to me, because
those files were presumably not created by PostgreSQL and so there
is no reason to think that they would be checksummed like a
PostgreSQL relation file. (If we want to complain about such files,
we should complain about them existing at all, not just about their
checksums.)

The point of this change is to make the code more consistent.
sendDir() was already calling parse_filename_for_nontemp_relation()
as part of an effort to determine which files to include in the
backup. So, it already had the information about whether a certain
file was a relation file. sendFile() then used a separate method,
embodied in is_checksummed_file(), to make what is essentially
the same determination. It's better not to make the same decision
using two different methods, especially in closely-related code.

Patch by me. Reviewed by Dilip Kumar and Álvaro Herrera. Thanks
also to Jakub Wartak and Peter Eisentraut for comments, suggestions,
and testing on the larger patch set of which this is a part.

Discussion: http://postgr.es/m/CAFiTN-snhaKkWhi2Gz5i3cZeKefun6sYL==wBoqqnTXxX4_mFA@mail.gmail.com
Discussion: http://postgr.es/m/202311141312.u4qx5gtpvfq3@alvherre.pgsql
2023-11-14 10:51:05 -05:00
Dean Rasheed 519fc1bd9e Support +/- infinity in the interval data type.
This adds support for infinity to the interval data type, using the
same input/output representation as the other date/time data types
that support infinity. This allows various arithmetic operations on
infinite dates, timestamps and intervals.

The new values are represented by setting all fields of the interval
to INT32/64_MIN for -infinity, and INT32/64_MAX for +infinity. This
ensures that they compare as less/greater than all other interval
values, without the need for any special-case comparison code.

Note that, since those 2 values were formerly accepted as legal finite
intervals, pg_upgrade and dump/restore from an old database will turn
them from finite to infinite intervals. That seems OK, since those
exact values should be extremely rare in practice, and they are
outside the documented range supported by the interval type, which
gives us a certain amount of leeway.

Bump catalog version.

Joseph Koshakow, Jian He, and Ashutosh Bapat, reviewed by me.

Discussion: https://postgr.es/m/CAAvxfHea4%2BsPybKK7agDYOMo9N-Z3J6ZXf3BOM79pFsFNcRjwA%40mail.gmail.com
2023-11-14 10:58:49 +00:00
Peter Eisentraut 3849fe7c2b Replace Gen_dummy_probes.sed with Gen_dummy_probes.pl
To generate a dummy probes.h file when dtrace is not available, we had
two different scripts: A sed version, which is the original version,
and a Perl version, which was generated by s2p.  This split was
necessary because Perl was not a mandatory build dependency on Unix,
but sed was not guaranteed to be available on Windows.

(The Meson build system used the sed version even on Windows, which
was probably incorrect and probably would have had to be fixed before
elevating that build system from experimental status.)

As of 721856ff24, Perl is a required build dependency, so this split
is no longer necessary.  We can just use the Perl script in all build
environments and remove a whole bunch of infrastructure to keep the
two variants in sync.

The new Gen_dummy_probes.pl is not the version generated by s2p but a
new implementation written by hand by adapting the sed version to Perl
syntax.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/3fd0f1bc-4483-4ba9-8aa0-64765b052039@eisentraut.org
2023-11-14 10:27:10 +01:00
Michael Paquier e5cca6288a Add support for pg_stat_reset_slru without argument
pg_stat_reset_slru currently requires an input argument, either:
- NULL to reset the SLRU counters of everything.
- A specific value to reset a single SLRU cache.

This commit adds support for a new pattern: pg_stat_reset_slru without
any argument works the same way as pg_stat_reset_slru(NULL), relying on
a DEFAULT in the function definition to handle this case.  This makes
the function more consistent with 23c8c0c8f4.

Bump catalog version.

Author: Bharath Rupireddy
Reviewed-by: Atsushi Torikoshi
Discussion: https://postgr.es/m/CALj2ACW1VizYg01EeH_cA-7qA+4NzWVAoZ5Lw9_XYO1RRHAZbA@mail.gmail.com
2023-11-14 09:50:52 +09:00
Tom Lane 83472de606 Improve readability and error detection of array_in().
Rewrite array_in() and its subroutines so that we make only one
pass over the input text, rather than two.  This requires
potentially re-pallocing the working arrays values[] and nulls[]
larger than our initial guess, but that cost will hopefully be made
up by avoiding duplicate parsing.  In any case this coding seems
much clearer and more straightforward than what we had before.

This also fixes array_in() to reject non-rectangular input (that is,
different brace depths in different parts of the input) more reliably
than before, and to give a better error message when it does so.
This is analogous to the plpython and plperl fixes in 0553528e7 and
f47004add.  Like those PLs, we now accept input such as '{{},{}}'
as a valid representation of an empty array, which we did not before.

Additionally, reject explicit array subscripts that are outside the
integer range (previously you just got whatever atoi() converted
them to), and make some other minor improvements in error reporting.

Although this is arguably a bug fix, it's also a behavioral change
that might trip somebody up, so no back-patch.

Tom Lane, Heikki Linnakangas, and Jian He.  Thanks to Alexander Lakhin
for the initial report and for review/testing.

Discussion: https://postgr.es/m/2794005.1683042087@sss.pgh.pa.us
2023-11-13 13:01:51 -05:00
Bruce Momjian acc95f29ef Add error about the use of FREEZE in COPY TO
Also clarify some other error wording.

Reported-by: Kyotaro Horiguchi

Discussion: https://postgr.es/m/20220802.133046.1941977979333284049.horikyota.ntt@gmail.com

Backpatch-through: master
2023-11-13 12:53:03 -05:00
Tom Lane 5c62ecf6ec Don't release index root page pin in ginFindParents().
It's clearly stated in the comments that ginFindParents() must keep
the pin on the index's root page that's associated with the topmost
GinBtreeStack item.  However, the code path for the case that the
desired downlink has been pushed down to the next index level
ignored this proviso, and would release the pin anyway if we were
still examining the root level.  That led to an assertion failure
or "buffer NNNN is not owned by resource owner" error later, when
we try to release the pin again at the end of the insertion.

This is quite hard to reproduce, since it can only happen if an
index root page split occurs concurrently with our own insertion.
Thanks to Jeff Janes for finding a test case that triggers it
often enough to allow investigation.

This has been there since the beginning of GIN, so back-patch
to all supported branches.

Discussion: https://postgr.es/m/CAMkU=1yCAKtv86dMrD__Ja-7KzjE=uMeKX8y__cx5W-OEWy2ow@mail.gmail.com
2023-11-13 11:44:35 -05:00
Etsuro Fujita 06e8e71e7f Remove incorrect file reference in comment.
Commit b7eda3e0e moved XidInMVCCSnapshot() from tqual.c into snapmgr.c,
but follow-up commit c91560def incorrectly updated this reference.  We
could fix it, but as pointed out by Daniel Gustafsson, 1) the reader can
easily find the file that contains the definition of that function, e.g.
by grepping, and 2) this kind of reference is prone to going stale; so
let's just remove it.

Back-patch to all supported branches.

Reviewed by Daniel Gustafsson.

Discussion: https://postgr.es/m/CAPmGK145VdKkPBLWS2urwhgsfidbSexwY-9zCL6xSUJH%2BBTUUg%40mail.gmail.com
2023-11-13 19:05:00 +09:00
Amit Kapila 861f86beea Use REGBUF_NO_CHANGE at one more place in the hash index.
Commit 00d7fb5e2e started to use REGBUF_NO_CHANGE at a few places in the
code where we register the buffer before marking it dirty but missed
updating one of the code flows in the hash index where we free the overflow
page without any live tuples on it.

Author: Amit Kapila and Hayato Kuroda
Discussion: http://postgr.es/m/f045c8f7-ee24-ead6-3679-c04a43d21351@gmail.com
2023-11-13 14:08:26 +05:30
Michael Paquier 7606175991 Extend sendFileWithContent() to handle custom content length in basebackup.c
sendFileWithContent() previously got the content length by using
strlen(), assuming that the content given is always a string.  Some
patches are under discussion to pass binary contents to a base backup
stream, where an arbitrary length needs to be given by the caller
instead.

The patch extends sendFileWithContent() to be able to handle this case,
where len < 0 can be used to indicate an arbitrary length rather than
rely on strlen() for the content length.

A comment in sendFileWithContent() mentioned the backup_label file.
However, this routine is used by more file types, like the tablespace
map, so adjust it in passing.

Author: David Steele
Discussion: https://postgr.es/m/2daf8adc-8db7-4204-a7f2-a7e94e2bfa4b@pgmasters.net
2023-11-13 08:26:44 +09:00
Michael Paquier 23c8c0c8f4 Add ability to reset all shared stats types in pg_stat_reset_shared()
Currently, pg_stat_reset_shared() can use an argument to specify the
target of statistics to reset, doing nothing for NULL as it is strict.

This patch adds to pg_stat_reset_shared() the possibility to reset all
the stats types already handled in this function rather than do nothing
if the argument value given is NULL or if nothing is specified
(proisstrict is switched to false).  Like previously, SLRUs are not
included in what gets reset.

The idea to use NULL or no argument to control if all the shared stats
already covered by this function should be reset has been proposed by
Andres Freund.

Bump catalog version.

Author: Atsushi Torikoshi
Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Bharath Rupireddy,
Matthias van de Meent
Discussion: https://postgr.es/m/4291a55137ddda77cf7cc5f46e846daf@oss.nttdata.com
2023-11-12 16:43:12 +09:00
Alexander Korotkov b7f315c9d7 Fix how SJE checks against PHVs
It seems that a PHV evaluated/needed at or below the self join should not have
a problem if we remove the self join.  But this requires further investigation.
For now, we just do not remove self joins if the rel to be removed is laterally
referenced by PHVs.

Discussion: https://postgr.es/m/CAMbWs4-ns73VF9gi37q61G3dS6Xuos+HtryMaBh37WQn=BsaJw@mail.gmail.com
Author: Richard Guo
2023-11-10 22:46:46 +02:00
Amit Kapila 8bfb231b43 Prohibit max_slot_wal_keep_size to value other than -1 during upgrade.
We don't want existing slots in the old cluster to get invalidated during
the upgrade. During an upgrade, we set this variable to -1 via the command
line in an attempt to prevent such invalidations, but users have ways to
override it. This patch ensures the value is not overridden by the user.

Author: Kyotaro Horiguchi
Reviewed-by: Peter Smith, Alvaro Herrera
Discussion: http://postgr.es/m/20231027.115759.2206827438943188717.horikyota.ntt@gmail.com
2023-11-10 08:45:01 +05:30
Tom Lane 36f5594c0f Fix computation of varnullingrels when const-folding field selection.
We can simplify FieldSelect on a whole-row Var into a plain Var
for the selected field.  However, we should copy the whole-row Var's
varnullingrels when we do so, because the new Var is clearly nullable
by exactly the same rels as the original.  Failure to do this led to
errors like "wrong varnullingrels (b) (expected (b 3)) for Var 2/2".

Richard Guo, per bug #18184 from Marian Krucina.  Back-patch to
v16 where varnullingrels was introduced.

Discussion: https://postgr.es/m/18184-5868dd258782058e@postgresql.org
2023-11-09 15:46:16 -05:00
Alexander Korotkov b44a1708ab Fix the way SJE removes references from PHVs
Add missing replacement of relids in phv->phexpr.  Also, remove extra
replace_relid() over phv->phrels.

Reported-by:  Zuming Jiang
Bug: #18187
Discussion: https://postgr.es/m/flat/18187-831da249cbd2ff8e%40postgresql.org
Author: Richard Guo
Reviewed-by: Andrei Lepikhov
2023-11-09 14:25:13 +02:00
Dean Rasheed 3850d4dec1 Avoid integer overflow hazard in interval_time().
When casting an interval to a time, the original code suffered from
64-bit integer overflow for inputs with a sufficiently large negative
"time" field, leading to bogus results.

Fix by rewriting the algorithm in a simpler form, that more obviously
cannot overflow. While at it, improve the test coverage to include
negative interval inputs.

Discussion: https://postgr.es/m/CAEZATCXoUKHkcuq4q63hkiPsKZJd0kZWzgKtU%2BNT0aU4wbf_Pw%40mail.gmail.com
2023-11-09 12:10:14 +00:00
Dean Rasheed a4f7d33a90 Fix AFTER ROW trigger execution in MERGE cross-partition update.
When executing a MERGE UPDATE action, if the UPDATE is turned into a
cross-partition DELETE then INSERT, do not attempt to invoke AFTER
UPDATE ROW triggers, or any of the other post-update actions in
ExecUpdateEpilogue().

For consistency with a plain UPDATE command, such triggers should not
be fired (and typically fail anyway), and similarly, other post-update
actions, such as WCO/RLS checks should not be executed, and might also
lead to unexpected failures.

Therefore, as with ExecUpdate(), make ExecMergeMatched() return
immediately if ExecUpdateAct() reports that a cross-partition update
was done, to be sure that no further processing is done for that
tuple.

Back-patch to v15, where MERGE was introduced.

Discussion: https://postgr.es/m/CAEZATCWjBgagyNZs02vgDF0DvASYj-iHTFtXG2-nP3orZhmtcw%40mail.gmail.com
2023-11-09 11:23:42 +00:00
David Rowley 10d34fefc2 Ensure we use the correct spelling of "ensure"
We seem to have accidentally used "insure" in a few places.  Correct
that.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv0biqrhA3pMhu40aDsj343mTsD75khKnHsLqR8P04f=Q@mail.gmail.com
Backpatch-through: 12, oldest supported version
2023-11-10 00:15:54 +13:00
Heikki Linnakangas 8f4a1ab471 Fix bug in the new ResourceOwner implementation.
When the hash table is in use, ResoureOwnerSort() moves any elements
from the small fixed-size array to the hash table, and sorts it. When
the hash table is not in use, it sorts the elements in the small
fixed-size array directly. However, ResourceOwnerSort() and
ResourceOwnerReleaseAll() had different idea on when the hash table is
in use: ResourceOwnerSort() checked owner->nhash != 0, and
ResourceOwnerReleaseAll() checked owner->hash != NULL. If the hash
table was allocated but was currently empty, you hit an assertion
failure.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/be58d565-9e95-d417-4e47-f6bd408dea4b@gmail.com
2023-11-09 01:33:14 +02:00
Alvaro Herrera b0f7dd915b
Check stack depth in new recursive functions
Commit b0e96f3119 introduced a bunch of recursive functions, but
failed to make them check for stack depth.  This can cause the backend
to crash when operating on inheritance hierarchies several thousands
deep.  Protect the code by adding the missing stack depth checks.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/b2ac2392-9727-5f76-e890-721ac80c1615@gmail.com
2023-11-08 18:44:54 +01:00
Heikki Linnakangas 954e43564d Use a faster hash function in resource owners.
This buys back some of the performance loss that we otherwise saw from the
previous commit.

Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud
Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://www.postgresql.org/message-id/d746cead-a1ef-7efe-fb47-933311e876a3%40iki.fi
2023-11-08 13:30:52 +02:00
Heikki Linnakangas b8bff07daa Make ResourceOwners more easily extensible.
Instead of having a separate array/hash for each resource kind, use a
single array and hash to hold all kinds of resources. This makes it
possible to introduce new resource "kinds" without having to modify
the ResourceOwnerData struct. In particular, this makes it possible
for extensions to register custom resource kinds.

The old approach was to have a small array of resources of each kind,
and if it fills up, switch to a hash table. The new approach also uses
an array and a hash, but now the array and the hash are used at the
same time. The array is used to hold the recently added resources, and
when it fills up, they are moved to the hash. This keeps the access to
recent entries fast, even when there are a lot of long-held resources.

All the resource-specific ResourceOwnerEnlarge*(),
ResourceOwnerRemember*(), and ResourceOwnerForget*() functions have
been replaced with three generic functions that take resource kind as
argument. For convenience, we still define resource-specific wrapper
macros around the generic functions with the old names, but they are
now defined in the source files that use those resource kinds.

The release callback no longer needs to call ResourceOwnerForget on
the resource being released. ResourceOwnerRelease unregisters the
resource from the owner before calling the callback. That needed some
changes in bufmgr.c and some other files, where releasing the
resources previously always called ResourceOwnerForget.

Each resource kind specifies a release priority, and
ResourceOwnerReleaseAll releases the resources in priority order. To
make that possible, we have to restrict what you can do between
phases. After calling ResourceOwnerRelease(), you are no longer
allowed to remember any more resources in it or to forget any
previously remembered resources by calling ResourceOwnerForget.  There
was one case where that was done previously. At subtransaction commit,
AtEOSubXact_Inval() would handle the invalidation messages and call
RelationFlushRelation(), which temporarily increased the reference
count on the relation being flushed. We now switch to the parent
subtransaction's resource owner before calling AtEOSubXact_Inval(), so
that there is a valid ResourceOwner to temporarily hold that relcache
reference.

Other end-of-xact routines make similar calls to AtEOXact_Inval()
between release phases, but I didn't see any regression test failures
from those, so I'm not sure if they could reach a codepath that needs
remembering extra resources.

There were two exceptions to how the resource leak WARNINGs on commit
were printed previously: llvmjit silently released the context without
printing the warning, and a leaked buffer io triggered a PANIC. Now
everything prints a WARNING, including those cases.

Add tests in src/test/modules/test_resowner.

Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud
Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
2023-11-08 13:30:50 +02:00
Heikki Linnakangas b70c2143bb Move a few ResourceOwnerEnlarge() calls for safety and clarity.
These are functions where a lot of things happen between the
ResourceOwnerEnlarge and ResourceOwnerRemember calls. It's important
that there are no unrelated ResourceOwnerRemember calls in the code in
between, otherwise the reserved entry might be used up by the
intervening ResourceOwnerRemember and not be available at the intended
ResourceOwnerRemember call anymore. I don't see any bugs here, but the
longer the code path between the calls is, the harder it is to verify.

In bufmgr.c, there is a function similar to ResourceOwnerEnlarge,
ReservePrivateRefCountEntry(), to ensure that the private refcount
array has enough space. The ReservePrivateRefCountEntry() calls were
made at different places than the ResourceOwnerEnlargeBuffers()
calls. Move the ResourceOwnerEnlargeBuffers() and
ReservePrivateRefCountEntry() calls together for consistency.

Reviewed-by: Aleksander Alekseev, Michael Paquier, Julien Rouhaud
Reviewed-by: Kyotaro Horiguchi, Hayato Kuroda, Álvaro Herrera, Zhihong Yu
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://www.postgresql.org/message-id/cbfabeb0-cd3c-e951-a572-19b365ed314d%40iki.fi
2023-11-08 13:30:46 +02:00