Commit Graph

25191 Commits

Author SHA1 Message Date
David Rowley 7e0ade0ffe Allow Gather Merge in more cases for parallel DISTINCT
Here we adjust the partial path generation for parallel DISTINCT queries
to add Sort nodes on top of any unsorted partial distinct paths.

This increases the likelihood of the planner pushing a Sort below a Gather
Merge which enables the final phase of the parallel distinct to be
implemented using a Unique node in more cases.

Sorting the partial distinct paths is particularly useful when the
DISTINCT query has an ORDER BY and LIMIT clause as this can allow cheaper
plans by having the workers Hash Aggregate then Sort before feeding the
results into the Gather Merge.  The non-parallel portion of the plan then
becomes very cheap as it leaves only Unique and Limit to do in the leader
process.

Author: Richard Guo
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAMbWs48u9VoVOouJsys1qOaC9WVGVmBa+wT1dx8KvxF5GPzezA@mail.gmail.com
2024-02-03 00:20:18 +13:00
Noah Misch 0b6517a3b7 Sync PG_VERSION file in CREATE DATABASE.
An OS crash could leave PG_VERSION empty or missing.  The same symptom
appeared in a backup by block device snapshot, taken after the next
checkpoint and before the OS flushes the PG_VERSION blocks.  Device
snapshots are not a documented backup method, however.  Back-patch to
v15, where commit 9c08aea6a3 introduced
STRATEGY=WAL_LOG and made it the default.

Discussion: https://postgr.es/m/20240130195003.0a.nmisch@google.com
2024-02-01 13:44:19 -08:00
Noah Misch df220714e5 Handle interleavings between CREATE DATABASE steps and base backup.
Restoring a base backup taken in the middle of CreateDirAndVersionFile()
or write_relmap_file() would lose the function's effects.  The symptom
was absence of the database directory, PG_VERSION file, or
pg_filenode.map.  If missing the directory, recovery would fail.  Either
missing file would not fail recovery but would render the new database
unusable.  Fix CreateDirAndVersionFile() with the transam/README "action
first and then write a WAL entry" strategy.  That has a side benefit of
moving filesystem mutations out of a critical section, reducing the ways
to PANIC.  Fix the write_relmap_file() call with a lock acquisition, so
it interacts with checkpoints like non-CREATE DATABASE calls do.
Back-patch to v15, where commit 9c08aea6a3
introduced STRATEGY=WAL_LOG and made it the default.

Discussion: https://postgr.es/m/20240130195003.0a.nmisch@google.com
2024-02-01 13:44:19 -08:00
Michael Paquier 235c09efbb Fix stats_fetch_consistency with stats for fixed-numbered objects
This impacts the statistics retrieved in transactions for the following
views when updating the value of stats_fetch_consistency, leading to
behaviors contrary to what is documented since 605994651b as an update
of this parameter should discard all statistics snapshot data:
- pg_stat_archiver
- pg_stat_bgwriter
- pg_stat_checkpointer
- pg_stat_io
- pg_stat_slru
- pg_stat_wal

For example, updating stats_fetch_consistency from "snapshot" to "cache"
in a transaction did not re-fetch any fresh data, using data cached from
the time when "snapshot" was in use.

Author: Shinya Kato
Discussion: https://postgr.es/m/d77fc5190d4dbe1738d77231488e768b@oss.nttdata.com
Backpatch-through: 15
2024-02-01 17:12:50 +09:00
Alvaro Herrera 402388946f
Fix copy&paste typo in comment 2024-01-31 23:11:53 +01:00
David Rowley 9d1a5354f5 Fix costing bug in MergeAppend
When building a MergeAppendPath which has child paths that are not
sorted correctly for the MergeAppend's sort order, we apply the cost of
sorting those paths to the MergeAppendPath costs.

Here we fix a bug where the number of tuples specified that needed to be
sorted was effectively pg_class.reltuples rather than the number of
expected row in the subpath.  This effectively penalizes MergeAppend
plans any time any filter is present on the MergeAppend subpath as the
sort cost added is to sort all tuples in the table rather than just the
ones expected the path to return.

This did not affect UNION ALL type queries as the RelOptInfo tuples is
set from the subquery's path rows.  It does affect MergeAppends uses for
inheritance and partitioned tables.

This is a long-standing bug introduced when MergeAppend was first added
in 11cad29c9.  No backpatch as this could result in plan changes.

Author: Alexander Kuzmenkov
Reviewed-by: Ashutosh Bapat, Aleksander Alekseev, David Rowley
Discussion: https://postgr.es/m/CALzhyqyhoXQDR-Usd_0HeWk%3DuqNLzoVeT8KhRoo%3DpV_KzgO3QQ%40mail.gmail.com
2024-02-01 09:48:26 +13:00
Robert Haas ea18eb7d62 Revise pg_walsummary's 002_blocks test to avoid spurious failures.
Analysis of buildfarm results showed that the code that was intended
to wait for the inserts performed by this test to complete did not
actually do so. Try to make that logic more robust.

Improve error checking elsewhere in the script, too, so that we
don't miss things like poll_query_until failing.

Along the way, fix a bit of pgindent damage introduced by commit
5ddf997347, which aimed to help us
debug the failures that this commit is trying to fix. It's making
the buildfarm sad.

Discussion: http://postgr.es/m/CA+TgmobWFb8NqyfC31YnKAbZiXf9tLuwmyuvx=iYMXMniPQ4nw@mail.gmail.com
2024-01-31 10:12:53 -05:00
Heikki Linnakangas 21d9c3ee4e Give SMgrRelation pointers a well-defined lifetime.
After calling smgropen(), it was not clear how long you could continue
to use the result, because various code paths including cache
invalidation could call smgrclose(), which freed the memory.

Guarantee that the object won't be destroyed until the end of the
current transaction, or in recovery, the commit/abort record that
destroys the underlying storage.

smgrclose() is now just an alias for smgrrelease(). It closes files
and forgets all state except the rlocator, but keeps the SMgrRelation
object valid.

A new smgrdestroy() function is used by rare places that know there
should be no other references to the SMgrRelation.

The short version:

 * smgrclose() is now just an alias for smgrrelease(). It releases
   resources, but doesn't destroy until EOX
 * smgrdestroy() now frees memory, and should rarely be used.

Existing code should be unaffected, but it is now possible for code that
has an SMgrRelation object to use it repeatedly during a transaction as
long as the storage hasn't been physically dropped.  Such code would
normally hold a lock on the relation.

This also replaces the "ownership" mechanism of SMgrRelations with a
pin counter.  An SMgrRelation can now be "pinned", which prevents it
from being destroyed at end of transaction.  There can be multiple pins
on the same SMgrRelation.  In practice, the pin mechanism is only used
by the relcache, so there cannot be more than one pin on the same
SMgrRelation.  Except with swap_relation_files XXX

Author: Thomas Munro, Heikki Linnakangas
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGJ8NTvqLHz6dqbQnt2c8XCki4r2QvXjBQcXpVwxTY_pvA@mail.gmail.com
2024-01-31 12:31:02 +02:00
Heikki Linnakangas 6a8ffe812d Remove some obsolete smgrcloseall() calls.
Before the advent of PROCSIGNAL_BARRIER_SMGRRELEASE, we didn't have a
comprehensive way to deal with Windows file handles that get in the way
of unlinking directories.  We had smgrcloseall() calls in a few places
to try to mitigate.

It's still a good idea for bgwriter and checkpointer to do that once per
checkpoint so they don't accumulate unbounded SMgrRelation objects, but
there is no longer any reason to close them at other random places such
as the error path, and the explanation as given in the comments is now
obsolete.

Author: Thomas Munro
Reviewed-by: Heikki Linnakangas, Robert Haas
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGJ8NTvqLHz6dqbQnt2c8XCki4r2QvXjBQcXpVwxTY_pvA@mail.gmail.com
2024-01-31 11:40:29 +02:00
David Rowley b588cad688 Consider the "LIMIT 1" optimization with parallel DISTINCT
Similar to what was done in 5543677ec for non-parallel DISTINCT, apply
the same optimization when the distinct_pathkeys are empty for the
partial paths too.

This can be faster than the non-parallel version when the first row
matching the WHERE clause of the query takes a while to find.  Parallel
workers could speed that process up considerably.

Author: Richard Guo
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAMbWs49JC0qvfUbzs-TVzgMpSSBiMJ_6sN=BaA9iohBgYkr=LA@mail.gmail.com
2024-01-31 17:22:02 +13:00
Michael Paquier 3e91dba8b0 Fix various issues with ALTER TEXT SEARCH CONFIGURATION
This commit addresses a set of issues when changing token type mappings
in a text search configuration when using duplicated token names:
- ADD MAPPING would fail on insertion because of a constraint failure
after inserting the same mapping.
- ALTER MAPPING with an "overridden" configuration failed with "tuple
already updated by self" when the token mappings are removed.
- DROP MAPPING failed with "tuple already updated by self", like
previously, but in a different code path.

The code is refactored so the token names (with their numbers) are
handled as a List with unique members rather than an array with numbers,
ensuring that no duplicates mess up with the catalog inserts, updates
and deletes.  The list is generated by getTokenTypes(), with the same
error handling as previously while duplicated tokens are discarded from
the list used to work on the catalogs.

Regression tests are expanded to cover much more ground for the cases
fixed by this commit, as there was no coverage for the code touched in
this commit.  A bit more is done regarding the fact that a token name
not supported by a configuration's parser should result in an error even
if IF EXISTS is used in a DROP MAPPING clause.  This is implied in the
code but there was no coverage for that, and it was very easy to miss.

These issues exist since at least their introduction in core with
140d4ebcb4, so backpatch all the way down.

Reported-by: Alexander Lakhin
Author: Tender Wang, Michael Paquier
Discussion: https://postgr.es/m/18310-1eb233c5908189c8@postgresql.org
Backpatch-through: 12
2024-01-31 13:15:21 +09:00
David Rowley 8ee9c25087 Simplify partial path generation in GROUP BY/ORDER BY
Here we consolidate the generation of partial sort and partial incremental
sort paths in a similar way to what was done in 4a29eabd1.  Since the cost
penalty for incremental sort was removed by that commit, there's no
point in creating a sort path on the cheapest partial path if an
incremental sort could be done instead.

This has the added benefit of reducing the amount of code required to
build these paths.

Author: Richard Guo
Reviewed-by: Etsuro Fujita, Shubham Khanna, David Rowley
Discussion: https://postgr.es/m/CAMbWs49PaKxBZU9cN7k3DKB7id+YfGfOfS9H_Fo5tkqPMt=fDg@mail.gmail.com
2024-01-31 10:10:59 +13:00
Alvaro Herrera 7b745d85b8
Split use of SerialSLRULock, creating SerialControlLock
predicate.c has been using SerialSLRULock (the control lock for its SLRU
structure) to coordinate access to SerialControlData, another of its
numerous shared memory structures; this is unnecessary and confuses
further SLRU scalability work.  Create a separate LWLock to cover
SerialControlData.

Extracted from a larger patch from the same author, and some additional
changes by Álvaro.

Author: Dilip Kumar <dilip.kumar@enterprisedb.com>
Discussion: https://postgr.es/m/CAFiTN-vzDvNz=ExGXz6gdyjtzGixKSqs0mKHMmaQ8sOSEFZ33A@mail.gmail.com
2024-01-30 18:11:17 +01:00
Amit Kapila 776621a5e4 Add a failover option to subscriptions.
This commit introduces a new subscription option named 'failover', which
provides users with the ability to set the failover property of the
replication slot on the publisher when creating or altering a
subscription.

This uses the replication commands introduced by commit 7329240437 to
enable the failover option for a logical replication slot.

If the failover option is set to true, the associated replication slots
(i.e. the main slot and the table sync slots) in the upstream database are
enabled to be synchronized to the standbys. Note that the capability to
sync the replication slots will be added in subsequent commits.

Thanks to Masahiko Sawada for the design inputs.

Author: Shveta Malik, Hou Zhijie, Ajin Cherian
Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-01-30 16:49:28 +05:30
Peter Eisentraut 4c48c0fe56 Fix incorrect format placeholders for Oid 2024-01-30 09:11:41 +01:00
David Rowley 57f59396bb Delay build of Memoize hash table until executor run
Previously this hash table was built during executor startup.  This
could cause long delays in EXPLAIN (without ANALYZE) when the planner
opts to use a large Memoize hash table.

No backpatch for now due to lack of complaints.

Author: David Rowley
Discussion: https://postgr.es/m/CAApHDvoJktJ5XL=Kjh2a2TFr64R-7eQZV-+jcJrUwoES2GLiWg@mail.gmail.com
2024-01-30 12:37:03 +13:00
David Rowley c85977d8fe Doc: mention foreign keys can reference unique indexes
We seem to have only documented a foreign key can reference the columns of
a primary key or unique constraint.  Here we adjust the documentation
to mention columns in a non-partial unique index can be mentioned too.

The header comment for transformFkeyCheckAttrs() also didn't mention
unique indexes, so fix that too.  In passing make that header comment
reflect reality in the various other aspects where it deviated from it.

Bug: 18295
Reported-by: Gilles PARC
Author: Laurenz Albe, David Rowley
Discussion: https://www.postgresql.org/message-id/18295-0ed0fac5c9f7b17b%40postgresql.org
Backpatch-through: 12
2024-01-30 10:15:17 +13:00
Tom Lane 400928b83b Fix incompatibilities with libxml2 >= 2.12.0.
libxml2 changed the required signature of error handler callbacks
to make the passed xmlError struct "const".  This is causing build
failures on buildfarm member caiman, and no doubt will start showing
up in the field quite soon.  Add a version check to adjust the
declaration of xml_errorHandler() according to LIBXML_VERSION.

2.12.x also produces deprecation warnings for contrib/xml2/xpath.c's
assignment to xmlLoadExtDtdDefaultValue.  I see no good reason for
that to still be there, seeing that we disabled external DTDs (at a
lower level) years ago for security reasons.  Let's just remove it.

Back-patch to all supported branches, since they might all get built
with newer libxml2 once it gets a bit more popular.  (The back
branches produce another deprecation warning about xpath.c's use of
xmlSubstituteEntitiesDefault().  We ought to consider whether to
back-patch all or part of commit 65c5864d7 to silence that.  It's
less urgent though, since it won't break the buildfarm.)

Discussion: https://postgr.es/m/1389505.1706382262@sss.pgh.pa.us
2024-01-29 12:06:13 -05:00
Alvaro Herrera 5de890e361
Add EXPLAIN (MEMORY) to report planner memory consumption
This adds a new "Memory:" line under the "Planning:" group (which
currently only has "Buffers:") when the MEMORY option is specified.

In order to make the reporting reasonably accurate, we create a separate
memory context for planner activities, to be used only when this option
is given.  The total amount of memory allocated by that context is
reported as "allocated"; we subtract memory in the context's freelists
from that and report that result as "used".  We use
MemoryContextStatsInternal() to obtain the quantities.

The code structure to show buffer usage during planning was not in
amazing shape, so I (Álvaro) modified the patch a bit to clean that up
in passing.

Author: Ashutosh Bapat
Reviewed-by: David Rowley, Andrey Lepikhov, Jian He, Andy Fan
Discussion: https://www.postgresql.org/message-id/CAExHW5sZA=5LJ_ZPpRO-w09ck8z9p7eaYAqq3Ks9GDfhrxeWBw@mail.gmail.com
2024-01-29 17:53:03 +01:00
Heikki Linnakangas 6a1ea02c49 Fix locking when fixing an incomplete split of a GIN internal page
ginFinishSplit() expects the caller to hold an exclusive lock on the
buffer, but when finishing an earlier "leftover" incomplete split of
an internal page, the caller held a shared lock. That caused an
assertion failure in MarkBufferDirty(). Without assertions, it could
lead to corruption if two backends tried to complete the split at the
same time.

On master, add a test case using the new injection point facility.

Report and analysis by Fei Changhong. Backpatch the fix to all
supported versions.

Reviewed-by: Fei Changhong, Michael Paquier
Discussion: https://www.postgresql.org/message-id/tencent_A3CE810F59132D8E230475A5F0F7A08C8307@qq.com
2024-01-29 13:46:22 +02:00
Peter Eisentraut 54fac0e505 Remove make function vpathsearch
This function served to support having prebuilt files in the source
tree for vpath builds.  This is no longer possible (since
721856ff24); all built files are now always in the build tree.  The
invocations of this function are no longer required.
2024-01-29 07:24:59 +01:00
Amit Kapila a9a47fb6d9 Fix comments in ReplicationSlotAcquire().
They were incorrectly referring to a slot parameter in
ReplicationSlotAcquire() which is not passed to the API.

Author: Wang Wei
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/OS3PR01MB6275E3CE4DC15FF8B8B80D3A9E7A2@OS3PR01MB6275.jpnprd01.prod.outlook.com
2024-01-29 10:12:58 +05:30
Amit Kapila 7329240437 Allow setting failover property in the replication command.
This commit implements a new replication command called
ALTER_REPLICATION_SLOT and a corresponding walreceiver API function named
walrcv_alter_slot. Additionally, the CREATE_REPLICATION_SLOT command has
been extended to support the failover option.

These new additions allow the modification of the failover property of a
replication slot on the publisher. A subsequent commit will make use of
these commands in subscription commands and will add the tests as well to
cover the functionality added/changed by this commit.

Author: Hou Zhijie, Shveta Malik
Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda, Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-01-29 09:37:23 +05:30
Masahiko Sawada 08e6344fd6 Remove ReorderBufferTupleBuf structure.
Since commit a4ccc1cef, the 'node' and 'alloc_tuple_size' fields of
the ReorderBufferTupleBuf structure are no longer used. This leaves
only the 'tuple' field in the structure. Since keeping a single-field
structure makes little sense, the ReorderBufferTupleBuf is removed
entirely. The code is refactored accordingly.

No back-patching since these are ABI changes in an exposed structure
and functions, and there would be some risk of breaking extensions.

Author: Aleksander Alekseev
Reviewed-by: Amit Kapila, Masahiko Sawada, Reid Thompson
Discussion: https://postgr.es/m/CAD21AoCvnuxiXXfRecp7g9+CeC35POQfhuQeJFr7_9u_Q5jc_Q@mail.gmail.com
2024-01-29 10:37:16 +09:00
Michael Paquier 50b797dc99 Fix DROP ROLE when specifying duplicated roles
This commit fixes failures with "tuple already updated by self" when
listing twice the same role and in a DROP ROLE query.

This is an oversight in 6566133c5f, that has introduced a two-phase
logic in DropRole() where dependencies of all the roles to drop are
removed in a first phase, with the roles themselves removed from
pg_authid in a second phase.

The code is simplified to not rely on a List of ObjectAddress built in
the first phase used to remove the pg_authid entries in the second
phase, switching to a list of OIDs.  Duplicated OIDs can be simply
avoided in the first phase thanks to that.  Using ObjectAddress was not
necessary for the roles as they are not used for anything specific to
dependency.c, building all the ObjectAddress in the List with
AuthIdRelationId as class ID.

In 15 and older versions, where a single phase is used, DROP ROLE with
duplicated role names would fail on "role \"blah\" does not exist" for
the second entry after the CCI() done by the first deletion.  This is
not really incorrect, but it does not seem worth changing based on a
lack of complaints.

Reported-by: Alexander Lakhin
Reviewed-by: Tender Wang
Discussion: https://postgr.es/m/18310-1eb233c5908189c8@postgresql.org
Backpatch-through: 16
2024-01-29 08:05:59 +09:00
Tom Lane 5e444a2526 Compare varnullingrels too in assign_param_for_var().
Oversight in 2489d76c4.  Preliminary analysis suggests that the
problem may be unreachable --- but if we did have instances of
the same column with different varnullingrels, we'd surely need
to treat them as different Params.

Discussion: https://postgr.es/m/412552.1706203379@sss.pgh.pa.us
2024-01-26 15:54:17 -05:00
Tom Lane 25cd2d6402 Detect Julian-date overflow in timestamp[tz]_pl_interval.
We perform addition of the days field of an interval via
arithmetic on the Julian-date representation of the timestamp's date.
This step is subject to int32 overflow, and we also should not let
the Julian date become very negative, for fear of weird results from
j2date.  (In the timestamptz case, allow a Julian date of -1 to pass,
since it might convert back to zero after timezone rotation.)

The additions of the months and microseconds fields could also
overflow, of course.  However, I believe we need no additional
checks there; the existing range checks should catch such cases.
The difficulty here is that j2date's magic modular arithmetic could
produce something that looks like it's in-range.

Per bug #18313 from Christian Maurer.  This has been wrong for
a long time, so back-patch to all supported branches.

Discussion: https://postgr.es/m/18313-64d2c8952d81e84b@postgresql.org
2024-01-26 13:39:45 -05:00
Robert Haas 5ddf997347 Temporary patch to help debug pg_walsummary test failures.
The tests in 002_blocks.pl are failing in the buildfarm from time to
time, but we don't know how to reproduce the failure elsewhere. The
most obvious explanation seems to be the unexpected disappearance of a
WAL summary file, so bump up the logging level in
RemoveWalSummaryIfOlderThan to try to help us spot such problems, and
print the cutoff time in addition to the removed filename. Also
adjust 002_blocks.pl to dump out a directory listing of the relevant
directory at various points.

This patch should be reverted once we sort out what's happening here.

Patch by me, reviewed by Nathan Bossart, who also reported the issue.

Discussion: http://postgr.es/m/20240124170846.GA2643050@nathanxps13
2024-01-26 13:25:19 -05:00
Robert Haas 5eafacd279 Combine FSM updates for prune and no-prune cases.
lazy_scan_prune() and lazy_scan_noprune() update the freespace map
with identical conditions; combine them. This consolidation is easier
now that cb970240f1 moved visibility map
updates into lazy_scan_prune().

While combining the FSM updates, simplify the logic for calling
lazy_scan_new_or_empty() and lazy_scan_noprune().

Also update a few comemnts in this part of the code to make them,
hopefully, clearer.

Melanie Plageman and Robert Haas

Discussion: https://postgr.es/m/CA%2BTgmoaLTvipm%3Dxx4rJLr07m908PCu%3DQH3uCjD1UOn8YaEuO2g%40mail.gmail.com
2024-01-26 11:40:16 -05:00
Peter Eisentraut f7cf9494ba Split some code out from MergeAttributes()
- Separate function to merge a child attribute into matching inherited
  attribute: The logic to merge a child attribute into matching
  inherited attribute in MergeAttribute() is only applicable to
  regular inheritance child.  The code is isolated and coherent enough
  that it can be separated into a function of its own.

- Separate function to merge next parent attribute: Partitions inherit
  from only a single parent.  The logic to merge an attribute from the
  next parent into the corresponding attribute inherited from previous
  parents in MergeAttribute() is only applicable to regular
  inheritance children.  This code is isolated enough that it can be
  separate into a function by itself.

These separations makes MergeAttribute() more readable by making it
easier to follow high level logic without getting entangled into
details.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2024-01-26 13:52:05 +01:00
Alvaro Herrera 8c9da1441d
Make spelling of cancelled/cancellation consistent
This fixes places where words derived from cancel were not using their
common en-US ugly^H^H^H^Hspelling.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKG+Lrq+ty6yWXF5572qNQ8KwxGwG5n4fsEcCUap685nWvQ@mail.gmail.com
2024-01-26 12:38:15 +01:00
Peter Eisentraut 64444ce071 MergeAttributes code deduplication
The code handling NOT NULL constraints is duplicated in blocks merging
the attribute definition incrementally.  Deduplicate that code.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2024-01-26 11:05:10 +01:00
Michael Paquier f2bf8fb048 Reindex toast before its main relation in reindex_relation()
This commit changes the order of reindex on a relation so as a toast
relation is processed before its main relation.

The original order, where a rebuild was first done for the indexes on
the main table, could be a problem in the event of a corruption of a
toast index, because, as scans of a toast index may be required to
rebuild the indexes on the main relation, this could lead to failures
with REINDEX TABLE without being able to fix anything.

Rebuilding corrupted toast indexes before this change was possible but
troublesome, as it was necessary to issue a REINDEX on the toast
relation first, followed by a REINDEX on the main relation.  Changing
the order of these operations should make things easier when rebuilding
corrupted indexes, as toast indexes would be rebuilt before they are
used for the indexes on the main relation.

Per request from Richard Vesely.

Author: Gurjeet Singh
Reviewed-by: Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/18016-2bd9b549b1fe49b3@postgresql.org
2024-01-26 17:39:58 +09:00
David Rowley bc397e5cdb De-dupicate Memoize cache keys
It was possible when determining the cache keys for a Memoize path that
if the same expr appeared twice in the parameterized path's ppi_clauses
and/or in the Nested Loop's inner relation's lateral_vars.  If this
happened the Memoize node's cache keys would contain duplicates.  This
isn't a problem for correctness, all it means is that the cache lookups
will be suboptimal due to having redundant work to do on every hash table
lookup and insert.

Here we adjust paraminfo_get_equal_hashops() to look for duplicates and
ignore them when we find them.

Author: David Rowley
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/422277.1706207562%40sss.pgh.pa.us
2024-01-26 20:51:36 +13:00
Michael Paquier bd5760df38 Fix comment in index.c
Extracted from a larger patch by the same author.

Author: Gurjeet Singh
Discussion: https://postgr.es/m/CABwTF4WX=m5pQvKXvLFJoEH=hSd6O=iZSqxVqHKjFm+iL-AO=w@mail.gmail.com
2024-01-26 14:08:04 +09:00
David Rowley 2cca95e175 Improve NestLoopParam generation for lateral subqueries
It was possible in cases where we had a LATERAL joined subquery that
when the same Var is mentioned in both the lateral references and in the
outer Vars of the scan clauses that the given Var wouldn't be assigned
to the same NestLoopParam.

This could cause issues in Memoize as the cache key would reference the
Var for the scan clauses but when the parameter for the lateral references
changed some code in Memoize would see that some other parameter had
changed that's not part of the cache key and end up purging the entire
cache as a result, thinking the cache had become stale.  This could
result in a Nested Loop -> Memoize plan being quite inefficient as, in
the worst case, the cache purging could result in never getting a cache
hit.  In no cases could this problem lead to incorrect query results.

Here we switch the order of operations so that we create NestLoopParam
for the lateral references first before doing replace_nestloop_params().
replace_nestloop_params() will find and reuse the existing NestLoopParam
in cases where the Var exists in both locations.

Author: Richard Guo
Reviewed-by: Tom Lane, David Rowley
Discussion: https://postgr.es/m/CAMbWs48XHJEK1Q1CzAQ7L9sTANTs9W1cepXu8%3DKc0quUL%2Btg4Q%40mail.gmail.com
2024-01-26 16:18:58 +13:00
Michael Paquier f2743a7d70 Revert "Add support for parsing of large XML data (>= 10MB)"
This reverts commit 2197d06224, following a discussion over a Coverity
report where issues like the "Billion laugh attack" could cause the
backend to waste CPU and memory even if a client applied checks on the
size of the data given in input, and libxml2 does not offer guarantees
that input limits are respected under XML_PARSE_HUGE.

Discussion: https://postgr.es/m/ZbHlgrPLtBZyr_QW@paquier.xyz
2024-01-26 10:15:32 +09:00
Heikki Linnakangas 376c216138 Update comment, generation mem contexts have a "keeper" block
The keeper block was introduced in commit 1b0d9aa4f7, but it forgot
to update this comment.
2024-01-26 01:04:58 +02:00
Tom Lane 8ba6fdf905 Support TZ and OF format codes in to_timestamp().
Formerly, these were only supported in to_char(), but there seems
little reason for that restriction.  We should at least have enough
support to permit round-tripping the output of to_char().

In that spirit, TZ accepts either zone abbreviations or numeric
(HH or HH:MM) offsets, which are the cases that to_char() can output.
In an ideal world we'd make it take full zone names too, but
that seems like it'd introduce an unreasonable amount of ambiguity,
since the rules for POSIX-spec zone names are so lax.

OF is a subset of this, accepting only HH or HH:MM.

One small benefit of this improvement is that we can simplify
jsonpath's executeDateTimeMethod function, which no longer needs
to consider the HH and HH:MM cases separately.  Moreover, letting
it accept zone abbreviations means it will accept "Z" to mean UTC,
which is emitted by JSON.stringify() for example.

Patch by me, reviewed by Aleksander Alekseev and Daniel Gustafsson

Discussion: https://postgr.es/m/1681086.1686673242@sss.pgh.pa.us
2024-01-25 17:47:08 -05:00
Andrew Dunstan 06a66d87db Clean up a bug in sql/json items commit 66ea94e8e6
Remove a buggy and unnecessary test, along with an unnecessary pstrdup()
and a line of dead code.

Per report, diagnosis and fix from Tom Lane

Discussion: https://postgr.es/m/439811.1706211069@sss.pgh.pa.us
2024-01-25 16:25:11 -05:00
Andrew Dunstan 66ea94e8e6 Implement various jsonpath methods
This commit implements ithe jsonpath .bigint(), .boolean(),
.date(), .decimal([precision [, scale]]), .integer(), .number(),
.string(), .time(), .time_tz(), .timestamp(), and .timestamp_tz()
methods.

.bigint() converts the given JSON string or a numeric value to
the bigint type representation.

.boolean() converts the given JSON string, numeric, or boolean
value to the boolean type representation.  In the numeric case, only
integers are allowed. We use the parse_bool() backend function
to convert a string to a bool.

.decimal([precision [, scale]]) converts the given JSON string
or a numeric value to the numeric type representation.  If precision
and scale are provided for .decimal(), then it is converted to the
equivalent numeric typmod and applied to the numeric number.

.integer() and .number() convert the given JSON string or a
numeric value to the int4 and numeric type representation.

.string() uses the datatype's output function to convert numeric
and various date/time types to the string representation.

The JSON string representing a valid date/time is converted to the
specific date or time type representation using jsonpath .date(),
.time(), .time_tz(), .timestamp(), .timestamp_tz() methods.  The
changes use the infrastructure of the .datetime() method and perform
the datatype conversion as appropriate.  Unlike the .datetime()
method, none of these methods accept a format template and use ISO
DateTime format instead.  However, except for .date(), the
date/time related methods take an optional precision to adjust the
fractional seconds.

Jeevan Chalke, reviewed by Peter Eisentraut and Andrew Dunstan.
2024-01-25 10:15:43 -05:00
Peter Eisentraut 924d046dcf Add a const decoration
Useful for a subsequent patch.

Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2024-01-25 13:34:49 +01:00
Alvaro Herrera 55627ba2d3
Remove dummy_spinlock
It's been unused since 1b468a131b (2015).
2024-01-25 11:43:47 +01:00
Peter Eisentraut 4d969b2f85 MergeAttributes: convert pg_attribute back to ColumnDef before comparing
MergeAttributes() has a loop to merge multiple inheritance parents
into a column column definition.  The merged-so-far definition is
stored in a ColumnDef node.  If we have to merge columns from multiple
inheritance parents (if the name matches), then we have to check
whether various column properties (type, collation, etc.) match.  The
current code extracts the pg_attribute value of the
currently-considered inheritance parent and compares it against the
merged-so-far ColumnDef value.  If the currently considered column
doesn't match any previously inherited column, we make a new ColumnDef
node from the pg_attribute information and add it to the column list.

This patch rearranges this so that we create the ColumnDef node first
in either case.  Then the code that checks the column properties for
compatibility compares ColumnDef against ColumnDef (instead of
ColumnDef against pg_attribute).  This makes the code more symmetric
and easier to follow.  Also, later in MergeAttributes(), there is a
similar but separate loop that merges the new local column definition
with the combination of the inheritance parents established in the
first loop.  That comparison is already ColumnDef-vs-ColumnDef.  With
this change, both of these can use similar-looking logic.  (A future
project might be to extract these two sets of code into a common
routine that encodes all the knowledge of whether two column
definitions are compatible.  But this isn't currently straightforward
because we want to give different error messages in the two cases.)
Furthermore, by avoiding the use of Form_pg_attribute here, we make it
easier to make changes in the pg_attribute layout without having to
worry about the local needs of tablecmds.c.

Because MergeAttributes() is hugely long, it's sometimes hard to know
where in the function you are currently looking.  To help with that, I
also renamed some variables to make it clearer where you are currently
looking.  The first look is "prev" vs. "new", the second loop is "inh"
vs. "new".

Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2024-01-25 11:30:01 +01:00
Alvaro Herrera 46778187f5
Fix s_lock_test compile
This is a mostly unused tool, but I discovered while nosing around the
Makefile that it hasn't been kept in line with other changes.  Fix it.

Backpatching doesn't appear to be necessary.

Discussion: https://postgr.es/m/202401241114.ied53jcich72@alvherre.pgsql
2024-01-25 11:17:33 +01:00
Amit Langote fba2112b15 Silence compiler warning introduced in 1edb3b491b
Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs48qEoe9Du5tuUxrkGQ6VC9oy+tQOORQ6jpob14-E1Z+jg@mail.gmail.com
2024-01-25 17:12:18 +09:00
Michael Paquier 1d35f705e1 Add more LOG messages when starting and ending recovery from a backup
Three LOG messages are added in the recovery code paths, providing
information that can be useful to track corruption issues depending on
the state of the cluster, telling that:
- Recovery has started from a backup_label.
- Recovery is restarting from a backup start LSN, without a
backup_label.
- Recovery has completed from a backup.

Author: Andres Freund
Reviewed-by: David Steele, Laurenz Albe, Michael Paquier
Discussion: https://postgr.es/m/20231117041811.vz4vgkthwjnwp2pp@awork3.anarazel.de
2024-01-25 17:07:56 +09:00
Amit Kapila c393308b69 Allow to enable failover property for replication slots via SQL API.
This commit adds the failover property to the replication slot. The
failover property indicates whether the slot will be synced to the standby
servers, enabling the resumption of corresponding logical replication
after failover. But note that this commit does not yet include the
capability to sync the replication slot; the subsequent commits will add
that capability.

A new optional parameter 'failover' is added to the
pg_create_logical_replication_slot() function. We will also enable to set
'failover' option for slots via the subscription commands in the
subsequent commits.

The value of the 'failover' flag is displayed as part of
pg_replication_slots view.

Author: Hou Zhijie, Shveta Malik, Ajin Cherian
Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda, Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-01-25 12:15:46 +05:30
Fujii Masao a044e61f1b Remove redundant HandleWalWriterInterrupts().
Because of commit 1bdd54e662, the code of HandleWalWriterInterrupts()
became the same as HandleMainLoopInterrupts(). So this commit removes
HandleWalWriterInterrupts() and makes walwriter use
HandleMainLoopInterrupts() for improved code simplicity.

Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Nathan Bossart
Discussion: https://postgr.es/m/CAHGQGwHUtwCsB4DnqFLiMiVzjcA=zmeCKf9_pgQM-yJopydatw@mail.gmail.com
2024-01-25 12:50:08 +09:00
Thomas Munro 820b5af73d jit: Require at least LLVM 10.
Remove support for older LLVM versions.  The default on common software
distributions will be at least LLVM 10 when PostgreSQL 17 ships.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com
2024-01-25 15:42:34 +13:00
Masahiko Sawada 729439607a Add progress reporting of skipped tuples during COPY FROM.
9e2d870119 enabled the COPY command to skip malformed data, however
there was no visibility into how many tuples were actually skipped
during the COPY FROM.

This commit adds a new "tuples_skipped" column to
pg_stat_progress_copy view to report the number of tuples that were
skipped because they contain malformed data.

Bump catalog version.

Author: Atsushi Torikoshi
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/d12fd8c99adcae2744212cb23feff6ed%40oss.nttdata.com
2024-01-25 10:57:41 +09:00
Thomas Munro d282e88e50 Track LLVM 18 changes.
A function was given a newly standard name from C++20 in LLVM 16.  Then
LLVM 18 added a deprecation warning for the old name, and it is about to
ship, so it's time to adjust that.

Back-patch to all supported releases.

Discussion: https://www.postgresql.org/message-id/CA+hUKGLbuVhH6mqS8z+FwAn4=5dHs0bAWmEMZ3B+iYHWKC4-ZA@mail.gmail.com
2024-01-25 13:44:54 +13:00
Peter Eisentraut 46a0cd4cef Add temporal PRIMARY KEY and UNIQUE constraints
Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints.
These are backed by GiST indexes instead of B-tree indexes, since they
are essentially exclusion constraints with = for the scalar parts of
the key and && for the temporal part.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-01-24 16:34:37 +01:00
Alvaro Herrera 74a7306310
Improve notation of BuiltinTrancheNames
Use C99 designated initializer syntax for array elements, instead of
writing the position in a comment.  This is less verbose and much more
readable.  Akin to cc15059634.

One disadvantage is that the BuiltinTrancheNames array now has a hole of
51 NULLs -- previously, the array elements were shifted 51 elements
 downward to avoid this.  This can be fixed by merging the
IndividualLWLockNames array into BuiltinTrancheNames, which would occupy
those 51 pointers, but because it requires some arguably ugly Meson
hackery, it's left for later.

Discussion: https://postgr.es/m/202401231025.gbv4nnte5fmm@alvherre.pgsql
2024-01-24 15:01:30 +01:00
Amit Langote faa2b953ba Refactor code used by jsonpath executor to fetch variables
Currently, getJsonPathVariable() directly extracts a named
variable/key from the source Jsonb value.  This commit puts that
logic into a callback function called by getJsonPathVariable().
Other implementations of the callback may accept different forms
of the source value(s), for example, a List of values passed from
outside jsonpath_exec.c.

Extracted from a much larger patch to add SQL/JSON query functions.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund,
Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers,
Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby,
Álvaro Herrera, Jian He, Peter Eisentraut

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-01-24 15:04:33 +09:00
Amit Langote 1edb3b491b Adjust populate_record_field() to handle errors softly
This adds a Node *escontext parameter to it and a bunch of functions
downstream to it, replacing any ereport()s in that path by either
errsave() or ereturn() as appropriate.  This also adds code to those
functions where necessary to return early upon encountering a soft
error.

The changes here are mainly intended to suppress errors in the
functions of jsonfuncs.c.  Functions in any external modules, such as
arrayfuncs.c, that those functions may in turn call are not changed
here based on the assumption that the various checks in jsonfuncs.c
functions should ensure that only values that are structurally valid
get passed to the functions in those external modules.  An exception
is made for domain_check() to allow handling domain constraint
violation errors softly.

For testing, this adds a function jsonb_populate_record_valid(),
which returns true if jsonb_populate_record() would finish without
causing an error for the provided JSON object, false otherwise.  Note
that jsonb_populate_record() internally calls populate_record(),
which in turn uses populate_record_field().

Extracted from a much larger patch to add SQL/JSON query functions.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund,
Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers,
Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby,
Álvaro Herrera, Jian He, Peter Eisentraut

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-01-24 15:04:33 +09:00
Amit Langote aaaf9449ec Add soft error handling to some expression nodes
This adjusts the code for CoerceViaIO and CoerceToDomain expression
nodes to handle errors softly.

For CoerceViaIo, this adds a new ExprEvalStep opcode
EEOP_IOCOERCE_SAFE, which is implemented in the new accompanying
function ExecEvalCoerceViaIOSafe().  The only difference from
EEOP_IOCOERCE's inline implementation is that the input function
receives an ErrorSaveContext via the function's
FunctionCallInfo.context, which it can use to handle errors softly.

For CoerceToDomain, this simply entails replacing the ereport() in
ExecEvalConstraintNotNull() and ExecEvalConstraintCheck() by
errsave() passing it the ErrorSaveContext passed in the expression's
ExprEvalStep.

In both cases, the ErrorSaveContext to be used is passed by setting
ExprState.escontext to point to it before calling ExecInitExprRec()
on the expression tree whose errors are to be handled softly.

Note that there's no functional change as of this commit as no call
site of ExecInitExprRec() has been changed.  This is intended for
implementing new SQL/JSON expression nodes in future commits.

Extracted from a much larger patch to add SQL/JSON query functions.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund,
Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers,
Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby,
Álvaro Herrera, Jian He, Peter Eisentraut

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-01-24 15:04:33 +09:00
Michael Paquier bb812ab091 Fix ALTER TABLE .. ADD COLUMN with complex inheritance trees
This command, when used to add a column on a parent table with a complex
inheritance tree, tried to update multiple times the same tuple in
pg_attribute for a child table when incrementing attinhcount, causing
failures with "tuple already updated by self" because of a missing
CommandCounterIncrement() between two updates.

This exists for a rather long time, so backpatch all the way down.

Reported-by: Alexander Lakhin
Author: Tender Wang
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/18297-b04cd83a55b51e35@postgresql.org
Backpatch-through: 12
2024-01-24 14:20:01 +09:00
Heikki Linnakangas 21ef4d4d89 Revert "libpqwalreceiver: Convert to libpq-be-fe-helpers.h"
This reverts commit 728f86fec6.

The signal handling was a few bricks shy of a load in that commit,
which made the walreceiver non-responsive to SIGTERM while it was
waiting for the connection to be established. That prevented a standby
from being promoted.

Since it was non-essential refactoring, let's revert it to make v16
work the same as earlier releases. I reverted it in 'master' too, to
keep the branches in sync. The refactoring was a good idea as such,
but it needs a bit more work. Once we have developed a complete patch
with this issue fixed, let's re-apply that to 'master'.

Reported-by: Kyotaro Horiguchi
Backpatch-through: 16
Discussion: https://www.postgresql.org/message-id/20231231.200741.1078989336605759878.horikyota.ntt@gmail.com
2024-01-23 10:38:07 +02:00
Peter Eisentraut 9b1a6f50b9 Generate syscache info from catalog files
Add a new genbki macros MAKE_SYSCACHE that specifies the syscache ID
macro, the underlying index, and the number of buckets.  From that, we
can generate the existing tables in syscache.h and syscache.c via
genbki.pl.

Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/75ae5875-3abc-dafc-8aec-73247ed41cde@eisentraut.org
2024-01-23 07:31:06 +01:00
David Rowley b262ad440e Add better handling of redundant IS [NOT] NULL quals
Until now PostgreSQL has not been very smart about optimizing away IS
NOT NULL base quals on columns defined as NOT NULL.  The evaluation of
these needless quals adds overhead.  Ordinarily, anyone who came
complaining about that would likely just have been told to not include
the qual in their query if it's not required.  However, a recent bug
report indicates this might not always be possible.

Bug 17540 highlighted that when we optimize Min/Max aggregates the IS NOT
NULL qual that the planner adds to make the rewritten plan ignore NULLs
can cause issues with poor index choice.  That particular case
demonstrated that other quals, especially ones where no statistics are
available to allow the planner a chance at estimating an approximate
selectivity for can result in poor index choice due to cheap startup paths
being prefered with LIMIT 1.

Here we take generic approach to fixing this by having the planner check
for NOT NULL columns and just have the planner remove these quals (when
they're not needed) for all queries, not just when optimizing Min/Max
aggregates.

Additionally, here we also detect IS NULL quals on a NOT NULL column and
transform that into a gating qual so that we don't have to perform the
scan at all.  This also works for join relations when the Var is not
nullable by any outer join.

This also helps with the self-join removal work as it must replace
strict join quals with IS NOT NULL quals to ensure equivalence with the
original query.

Author: David Rowley, Richard Guo, Andy Fan
Reviewed-by: Richard Guo, David Rowley
Discussion: https://postgr.es/m/CAApHDvqg6XZDhYRPz0zgOcevSMo0d3vxA9DvHrZtKfqO30WTnw@mail.gmail.com
Discussion: https://postgr.es/m/17540-7aa1855ad5ec18b4%40postgresql.org
2024-01-23 18:09:18 +13:00
Nathan Bossart 4372adfa24 Fix possible NULL pointer dereference in GetNamedDSMSegment().
GetNamedDSMSegment() doesn't check whether dsm_attach() returns
NULL, which creates the possibility of a NULL pointer dereference
soon after.  To fix, emit an ERROR if dsm_attach() returns NULL.
This shouldn't happen, but it would be nice to avoid a segfault if
it does.  In passing, tidy up the surrounding code.

Reported-by: Tom Lane
Reviewed-by: Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/3348869.1705854106%40sss.pgh.pa.us
2024-01-22 20:44:38 -06:00
Michael Paquier cdd863480c Fix ERROR message in injection_point.c
This commit fixes an error message that failed to show the correct
function and library names when a function cannot be loaded.

While on it, adjust the call to load_external_function() so as this
ERROR can be reached, by making load_external_function() return NULL
rather than fail if a function cannot be found for a given injection
point.

Thinkos in d86d20f0ba.
2024-01-23 10:45:00 +09:00
Heikki Linnakangas 0eb23285a2 Fix two memcpy() bugs in the new injection point code
1. The memcpy()s in InjectionPointAttach() would copy garbage from
beyond the end of input string to the buffer in shared memory. You
won't usually notice, but if there is not enough valid mapped memory
beyond the end of the string, the read of unmapped memory will
segfault. This was flagged by the Cirrus CI build with address
sanitizer enabled.

2. The memcpy() in injection_point_cache_add() failed to copy the NULL
terminator.

Discussion: https://www.postgresql.org/message-id/0615a424-b726-4157-afa7-4245629f9512%40iki.fi
2024-01-22 20:55:45 +02:00
David Rowley 2bcf0785cd Re-disallow Memoize for parameterized nested loops with join filters
This was previously fixed in 9e215378d but got broken again as a result
of 2489d76c4.  It seems that commit causes ppi_clauses to contain
duplicate clauses and it's no longer safe to check the list_length of
that list to determine if there are join conditions other than what's
mentioned in ppi_clauses.

Here we adjust the check to count the distinct rinfo_serial mentioned in
ppi_clauses.  We expect that extra->restrictlist won't have duplicate
rinfo_serials.

Reported-by: Amadeo Gallardo
Author: Richard Guo
Discussion: https://postgr.es/m/CADFREbW-BLJd7-a5J%2B5wjVumeFG1ByXiSOFzMtkmY_SDWckTxw%40mail.gmail.com
Backpatch-through: 16, where 2489d76c4 was introduced.
2024-01-22 22:45:02 +13:00
Michael Paquier b199eb89c6 Fix some typos
Author: Yongtao Huang
Discussion: https://postgr.es/m/CAOe1Go1F99o5JsphtXdDC5bxm7AzetU8q3AxLh4AAVGKu1AzEQ@mail.gmail.com
2024-01-22 13:55:25 +09:00
Michael Paquier d86d20f0ba Add backend support for injection points
Injection points are a new facility that makes possible for developers
to run custom code in pre-defined code paths.  Its goal is to provide
ways to design and run advanced tests, for cases like:
- Race conditions, where processes need to do actions in a controlled
ordered manner.
- Forcing a state, like an ERROR, FATAL or even PANIC for OOM, to force
recovery, etc.
- Arbitrary sleeps.

This implements some basics, and there are plans to extend it more in
the future depending on what's required.  Hence, this commit adds a set
of routines in the backend that allows developers to attach, detach and
run injection points:
- A code path calling an injection point can be declared with the macro
INJECTION_POINT(name).
- InjectionPointAttach() and InjectionPointDetach() to respectively
attach and detach a callback to/from an injection point.  An injection
point name is registered in a shmem hash table with a library name and a
function name, which will be used to load the callback attached to an
injection point when its code path is run.

Injection point names are just strings, so as an injection point can be
declared and run by out-of-core extensions and modules, with callbacks
defined in external libraries.

This facility is hidden behind a dedicated switch for ./configure and
meson, disabled by default.

Note that backends use a local cache to store callbacks already loaded,
cleaning up their cache if a callback has found to be removed on a
best-effort basis.  This could be refined further but any tests but what
we have here was fine with the tests I've written while implementing
these backend APIs.

Author: Michael Paquier, with doc suggestions from Ashutosh Bapat.
Reviewed-by: Ashutosh Bapat, Nathan Bossart, Álvaro Herrera, Dilip
Kumar, Amul Sul, Nazir Bilal Yavuz
Discussion: https://postgr.es/m/ZTiV8tn_MIb_H2rE@paquier.xyz
2024-01-22 10:15:50 +09:00
Alexander Korotkov 0452b461bc Explore alternative orderings of group-by pathkeys during optimization.
When evaluating a query with a multi-column GROUP BY clause, we can minimize
sort operations or avoid them if we synchronize the order of GROUP BY clauses
with the ORDER BY sort clause or sort order, which comes from the underlying
query tree. Grouping does not imply any ordering, so we can compare
the keys in arbitrary order, and a Hash Agg leverages this. But for Group Agg,
we simply compared keys in the order specified in the query. This commit
explores alternative ordering of the keys, trying to find a cheaper one.

The ordering of group keys may interact with other parts of the query, some of
which may not be known while planning the grouping. For example, there may be
an explicit ORDER BY clause or some other ordering-dependent operation higher up
in the query, and using the same ordering may allow using either incremental
sort or even eliminating the sort entirely.

The patch always keeps the ordering specified in the query, assuming the user
might have additional insights.

This introduces a new GUC enable_group_by_reordering so that the optimization
may be disabled if needed.

Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Author: Andrei Lepikhov, Teodor Sigaev
Reviewed-by: Tomas Vondra, Claudio Freire, Gavin Flower, Dmitry Dolgov
Reviewed-by: Robert Haas, Pavel Borisov, David Rowley, Zhihong Yu
Reviewed-by: Tom Lane, Alexander Korotkov, Richard Guo, Alena Rybakina
2024-01-21 22:21:36 +02:00
Alexander Korotkov 7ab80ac1ca Generalize the common code of adding sort before processing of grouping
Extract the repetitive code pattern into a new function make_ordered_path().

Discussion: https://postgr.es/m/CAPpHfdtzaVa7S4onKy3YvttF2rrH5hQNHx9HtcSTLbpjx%2BMJ%2Bw%40mail.gmail.com
Author: Andrei Lepikhov
2024-01-21 22:19:47 +02:00
Tom Lane 58447e3189 Add hint about not qualifying UPDATE...SET target with relation name.
Target columns in UPDATE ... SET must not be qualified with the target
table; we disallow this because it'd create ambiguity about which name
is the column name in case of field-qualified names.  However, newbies
have been seen to expect that they could qualify a target name just
like other names.  The error message when they do is confusing:
"column "foo" of relation "foo" does not exist".  To improve matters,
issue a HINT if the invalid name is qualified and matches the
relation's alias.

James Coleman (editorialized a bit by me)

Discussion: https://postgr.es/m/CAAaqYe8S2Qa060UV-YF5GoSd5PkEhLV94x-fEi3=TOtpaXCV+w@mail.gmail.com
2024-01-20 17:54:14 -05:00
Tom Lane 075df6b208 Add planner support functions for range operators <@ and @>.
These support functions will transform expressions with constant
range values into direct comparisons on the range bound values,
which are frequently better-optimizable.  The transformation is
skipped however if it would require double evaluation of a
volatile or expensive element expression.

Along the way, add the range opfamily OID to range typcache entries,
since load_rangetype_info has to compute that anyway and it seems
silly to duplicate the work later.

Kim Johan Andersson and Jian He, reviewed by Laurenz Albe

Discussion: https://postgr.es/m/94f64d1f-b8c0-b0c5-98bc-0793a34e0851@kimmet.dk
2024-01-20 13:57:54 -05:00
Nathan Bossart 8b2bcf3f28 Introduce the dynamic shared memory registry.
Presently, the most straightforward way for a shared library to use
shared memory is to request it at server startup via a
shmem_request_hook, which requires specifying the library in
shared_preload_libraries.  Alternatively, the library can create a
dynamic shared memory (DSM) segment, but absent a shared location
to store the segment's handle, other backends cannot use it.  This
commit introduces a registry for DSM segments so that these other
backends can look up existing segments with a library-specified
string.  This allows libraries to easily use shared memory without
needing to request it at server startup.

The registry is accessed via the new GetNamedDSMSegment() function.
This function handles allocating the segment and initializing it
via a provided callback.  If another backend already created and
initialized the segment, it simply attaches the segment.
GetNamedDSMSegment() locks the registry appropriately to ensure
that only one backend initializes the segment and that all other
backends just attach it.

The registry itself is comprised of a dshash table that stores the
DSM segment handles keyed by a library-specified string.

Reviewed-by: Michael Paquier, Andrei Lepikhov, Nikita Malakhov, Robert Haas, Bharath Rupireddy, Zhang Mingli, Amul Sul
Discussion: https://postgr.es/m/20231205034647.GA2705267%40nathanxps13
2024-01-19 14:24:36 -06:00
Alexander Korotkov 448a3331d9 Fix name collision in c64086b79d
Reported-by: Erik Rijkers, Tom Lane
Discussion: https://postgr.es/m/E1rQqeS-002A0s-Qm%40gemulon.postgresql.org
2024-01-19 18:17:13 +02:00
Alexander Korotkov c64086b79d Reorder actions in ProcArrayApplyRecoveryInfo()
Since 5a1dfde833, 2PC filenames use FullTransactionId.  Thus, it needs to
convert TransactionId to FullTransactionId in StandbyTransactionIdIsPrepared()
using TransamVariables->nextXid.  However, ProcArrayApplyRecoveryInfo()
first releases locks with usage StandbyTransactionIdIsPrepared(), then advances
TransamVariables->nextXid.  This sequence of actions could cause errors.
This commit makes ProcArrayApplyRecoveryInfo() advance
TransamVariables->nextXid before releasing locks.

Reported-by: Thomas Munro, Michael Paquier
Discussion: https://postgr.es/m/CA%2BhUKGLj_ve1_pNAnxwYU9rDcv7GOhsYXJt7jMKSA%3D5-6ss-Cw%40mail.gmail.com
Discussion: https://postgr.es/m/Zadp9f4E1MYvMJqe%40paquier.xyz
2024-01-19 17:19:17 +02:00
Peter Eisentraut 6db4598fcb Add stratnum GiST support function
This is support function 12 for the GiST AM and translates
"well-known" RT*StrategyNumber values into whatever strategy number is
used by the opclass (since no particular numbers are actually
required).  We will use this to support temporal PRIMARY
KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality.

This commit adds two implementations, one for internal GiST opclasses
(just an identity function) and another for btree_gist opclasses.  It
updates btree_gist from 1.7 to 1.8, adding the support function for
all its opclasses.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-01-19 15:42:13 +01:00
Alexander Korotkov b725b7eec4 Rename COPY option from SAVE_ERROR_TO to ON_ERROR
The option names now are "stop" (default) and "ignore".  The future options
could be "file 'filename.log'" and "table 'tablename'".

Discussion: https://postgr.es/m/20240117.164859.2242646601795501168.horikyota.ntt%40gmail.com
Author: Jian He
Reviewed-by: Atsushi Torikoshi
2024-01-19 15:15:51 +02:00
John Naylor 0aba255440 Add optimized C string hashing
Given an already-initialized hash state and a NUL-terminated string,
accumulate the hash of the string into the hash state and return the
length for the caller to (optionally) save for the finalizer. This
avoids a strlen call.

If the string pointer is aligned, we can use a word-at-a-time
algorithm for NUL lookahead. The aligned case is only used on 64-bit
platforms, since it's not worth the extra complexity for 32-bit.

Handling the tail of the string after finishing the word-wise loop
was inspired by NetBSD's strlen(), but no code was taken since that
is written in assembly language.

As demonstration, use this in the search path cache. This brings the
general case performance closer to the special case optimization done
in commit a86c61c9ee. There are other places that could benefit, but
that is left for future work.

Jeff Davis and John Naylor
Reviewed by Heikki Linnakangas, Jian He, Junwang Zhao

Discussion: https://postgr.es/m/3820f030fd008ff14134b3e9ce5cc6dd623ed479.camel%40j-davis.com
Discussion: https://postgr.es/m/b40292c99e623defe5eadedab1d438cf51a4107c.camel%40j-davis.com
2024-01-19 12:56:15 +07:00
Michael Paquier 3ada0d2cae Fix incorrect placeholder in walreceiver.c
Author: Yongtao Huang
Discussion: https://postgr.es/m/CAOe1Go3H7CgrSceO+HBhnoptk-mJhii-YT8D19CikKintjwumQ@mail.gmail.com
2024-01-19 13:20:49 +09:00
Nathan Bossart d891dcc065 Improve some documentation about the bootstrap superuser.
This commit adds some notes about the inability to remove superuser
privileges from the bootstrap superuser.  This has been blocked
since commit e530be2c5c, but it wasn't intended be a supported
feature before that, either.

In passing, change "bootstrap user" to "bootstrap superuser" in a
couple places.

Author: Yurii Rashkovskii
Reviewed-by: Vignesh C, David G. Johnston
Discussion: https://postgr.es/m/CA%2BRLCQzSx_eTC2Fch0EzeNHD3zFUcPvBYOoB%2BpPScFLch1DEQw%40mail.gmail.com
2024-01-18 21:39:51 -06:00
David Rowley 4b31063643 Fix broken Bitmapset optimization in DiscreteKnapsack()
Some code in DiscreteKnapsack() attempted to zero all words in a
Bitmapset by performing bms_del_members() to delete all the members from
itself before replacing those members with members from another set.
When that code was written, this was a valid way to manipulate the set
in such a way to save from palloc having to be called to allocate a new
Bitmapset.  However, 00b41463c modified Bitmapsets so that an empty set is
*always* represented as NULL and this breaks the optimization as the
Bitmapset code will always pfree the memory when the set becomes empty.

Since DiscreteKnapsack() has been coded to avoid as many unneeded
allocations as possible, it seems risky to not fix this.  Here we add
bms_replace_members() to effectively perform an in-place copy of another
set, reusing the memory of the existing set, when possible.

This got broken in v16, but no backpatch for now as there've been no
complaints.

Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/CAApHDvoTCBkBU2PJghNOFUiO0q=QP4WAWHi5sJP6_4=b2WodrA@mail.gmail.com
2024-01-19 10:44:36 +13:00
Robert Haas e313a61137 Remove LVPagePruneState.
Commit cb970240f1 moved some code from
lazy_scan_heap() to lazy_scan_prune(), and now some things that used to
need to be passed back and forth are completely local to lazy_scan_prune().
Hence, this struct is mostly obsolete.  The only thing that still
needs to be passed back to the caller is has_lpdead_items, and that's
also passed back by lazy_scan_noprune(), so do it the same way in both
cases.

Melanie Plageman, reviewed and slightly revised by me.

Discussion: http://postgr.es/m/CAAKRu_aM=OL85AOr-80wBsCr=vLVzhnaavqkVPRkFBtD0zsuLQ@mail.gmail.com
2024-01-18 15:17:09 -05:00
Robert Haas cb970240f1 Move VM update code from lazy_scan_heap() to lazy_scan_prune().
Most of the output parameters of lazy_scan_prune() were being
used to update the VM in lazy_scan_heap(). Moving that code into
lazy_scan_prune() simplifies lazy_scan_heap() and requires less
communication between the two.

This change permits some further code simplification, but that
is left for a separate commit.

Melanie Plageman, reviewed by me.

Discussion: http://postgr.es/m/CAAKRu_aM=OL85AOr-80wBsCr=vLVzhnaavqkVPRkFBtD0zsuLQ@mail.gmail.com
2024-01-18 14:44:57 -05:00
Robert Haas c120550edb Optimize vacuuming of relations with no indexes.
If there are no indexes on a relation, items can be marked LP_UNUSED
instead of LP_DEAD when pruning. This significantly reduces WAL
volume, since we no longer need to emit one WAL record for pruning
and a second to change the LP_DEAD line pointers thus created to
LP_UNUSED.

Melanie Plageman, reviewed by Andres Freund, Peter Geoghegan, and me

Discussion: https://postgr.es/m/CAAKRu_bgvb_k0gKOXWzNKWHt560R0smrGe3E8zewKPs8fiMKkw%40mail.gmail.com
2024-01-18 10:03:42 -05:00
Peter Eisentraut ed1e0a6512 Error message capitalisation
per style guidelines

Author: Peter Smith <peter.b.smith@fujitsu.com>
Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPtzstExQ4%3DvFH%2BWzZ4g4xEx2JA%3DqxussxOdxVEwJce6bw%40mail.gmail.com
2024-01-18 09:35:12 +01:00
Michael Paquier 0ae3b46621 Improve handling of dropped partitioned indexes for REINDEX INDEX
A REINDEX INDEX done on a partitioned index builds a list of the indexes
to work on before processing its partitions in individual transactions.
When combined with a DROP of the partitioned index, there was a window
where it was possible to see some unexpected "could not open relation
with OID", synonym of relation lookup error.  The code was robust enough
to handle the case where the parent relation is missing, but not the
case where an index would be gone missing.

This is similar to 1d65416661.

Support for REINDEX on partitioned relations has been introduced in
a6642b3ae0, so backpatch down to 14.

Author: Fei Changhong
Discussion: https://postgr.es/m/tencent_6A52106095ACDE55333E3AD33F304C0C3909@qq.com
Backpatch-through: 14
2024-01-18 16:30:11 +09:00
Michael Paquier 8013850c85 Add try_index_open(), conditional variant of index_open()
try_index_open() is able to open an index if its relkind fits, except
that it would return NULL instead of generated an error if the relation
does not exist.  This new routine will be used by an upcoming patch to
make REINDEX on partitioned relations more robust when an index in a
partition tree is dropped.

Extracted from a larger patch by the same author.

Author: Fei Changhong
Discussion: https://postgr.es/m/tencent_6A52106095ACDE55333E3AD33F304C0C3909@qq.com
Backpatch-through: 14
2024-01-18 15:04:24 +09:00
Alexander Korotkov 58fbbc9d68 Fix spelling in notice
Reported-by: Atsushi Torikoshi
Discussion: https://postgr.es/m/762d7dd4d5aa9e5ecffec2ae6a255a28%40oss.nttdata.com
2024-01-17 22:59:09 +02:00
Heikki Linnakangas 2b53a462cf Fix incorrect comment on how BackendStatusArray is indexed
The comment was copy-pasted from the call to ProcSignalInit() in
AuxiliaryProcessMain(), which uses a similar scheme of having reserved
slots for aux processes after MaxBackends slots for backends. However,
ProcSignalInit() indexing starts from 1, whereas BackendStatusArray
starts from 0. The code is correct, but the comment was wrong.

Discussion: https://www.postgresql.org/message-id/f3ecd4cb-85ee-4e54-8278-5fabfb3a4ed0@iki.fi
Backpatch-through: v14
2024-01-17 15:44:10 +02:00
Daniel Gustafsson 7cfa154d15 Close socket in case of errors in setting non-blocking
If configuring the newly created socket non-blocking fails we
error out and return INVALID_SOCKET, but the socket that had
been created wasn't closed. Fix by issuing closesocket in the
errorpath.

Backpatch to all supported branches.

Author: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CAEudQApmU5CrKefH85VbNYE2y8H=-qqEJbg6RAPU65+vCe+89A@mail.gmail.com
Backpatch-through: v12
2024-01-17 11:24:11 +01:00
Michael Paquier 44ad5129ce Fix description of DecodeInsert() in decode.c
This incorrectly referred to deletes.

Author: Yongtao Huang
Reviewed-by: Richard Guo
Description: https://postgr.es/m/CAOe1Go0Czgvo9eiDqeFpaABwJu=gBK6qjrYzZGZLn=tKDX8AUw@mail.gmail.com
2024-01-17 17:03:02 +09:00
Michael Paquier 2197d06224 Add support for parsing of large XML data (>= 10MB)
This commit adds XML_PARSE_HUGE to the libxml2 functions used in core
for the parsing of XML objects, raising up the original limit of 10MB
supported by libxml2.

In most code paths of upstream, XML_MAX_TEXT_LENGTH (10^7) is the
historical limit that gets upgraded to XML_MAX_HUGE_LENGTH (10^9) once
XML_PARSE_HUGE is given to the parser calls.  These are still limited by
any palloc() calls for text, up to 1GB.

This offers the possibility to handle within the backend XML objects
larger than 10MB in general, with also a higher depth limit.  This
change affects the contrib module xml2, the xml data type and SQL/XML.

Author: Dmitry Koval
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/18274-98d16bc03520665f@postgresql.org
2024-01-17 14:03:55 +09:00
Alexander Korotkov 8ad1f7db87 Fix format specifier for NOTICE in copyfrom.c
It's incorrect to use %lz for 64-bit numbers on 32-bit machine.
2024-01-17 00:59:44 +02:00
Alexander Korotkov 9e2d870119 Add new COPY option SAVE_ERROR_TO
Currently, when source data contains unexpected data regarding data type or
range, the entire COPY fails. However, in some cases, such data can be ignored
and just copying normal data is preferable.

This commit adds a new option SAVE_ERROR_TO, which specifies where to save the
error information. When this option is specified, COPY skips soft errors and
continues copying.

Currently, SAVE_ERROR_TO only supports "none". This indicates error information
is not saved and COPY just skips the unexpected data and continues running.

Later works are expected to add more choices, such as 'log' and 'table'.

Author: Damir Belyalov, Atsushi Torikoshi, Alex Shulgin, Jian He
Discussion: https://postgr.es/m/87k31ftoe0.fsf_-_%40commandprompt.com
Reviewed-by: Pavel Stehule, Andres Freund, Tom Lane, Daniel Gustafsson,
Reviewed-by: Alena Rybakina, Andy Fan, Andrei Lepikhov, Masahiko Sawada
Reviewed-by: Vignesh C, Atsushi Torikoshi
2024-01-16 23:08:53 +02:00
David Rowley c7e5e994b2 Fix REALLOCATE_BITMAPSETS code
7d58f2342 added a compile-time option to have bitmapset.c reallocate the
set before returning when a set is modified.  That commit failed to do
its job in various cases and returned the input set when it shouldn't
have in these cases.  Here we fix those missing cases.

This commit also adds some documentation about what
REALLOCATE_BITMAPSETS is for.  This is important as future functions
that go inside bitmapset.c need to know if they need to do anything
special when this compile-time option is defined.

Also, between 71a3e8c43 and 7d58f2342 some Asserts seem to have become
duplicated.  Tidy these up.  Rather than having the Assert check each
aspect of what makes a set invalid, here we introduce a helper function
which returns false when a set is invalid and have the Asserts use this
instead.

Also, make a pass on improving the comments in bitmapset.c.  Various
comments mentioned the input sets being "recycled".  This could be
interpreted to mean that the output set will always point to the same
memory as the given input parameter.  Here we try to make it clear that
this must not be relied upon and that callers must ensure that all
references to a given set are updated on each modification.

In passing, improve comments for bms_union(), bms_intersect() and
bms_difference() to detail what they do.  I (David) have too often had to
remind myself by reading the code each time to find out if I need, for
example, to use bms_union() or bms_join().  I also removed some
low-value comments that were trying to convey information about "these
operations" without mentioning which operations it was talking about.
It seems better to document these things in the function header comment
instead.

Author: Richard Guo, David Rowley
Discussion: https://postgr.es/m/CAMbWs4-djy9qYux2gZrtmxA0StrYXJjvB-oqLxn-d7J88t=PQQ@mail.gmail.com
2024-01-17 09:30:21 +13:00
Robert Haas 45d395cd75 Be more consistent about whether to update the FSM while vacuuming.
Previously, when lazy_scan_noprune() was called and returned true, we would
update the FSM immediately if the relation had no indexes or if the page
contained no dead items. On the other hand, when lazy_scan_prune() was
called, we would update the FSM if either of those things was true or
if index vacuuming was disabled. Eliminate that behavioral difference by
considering vacrel->do_index_vacuuming in both cases.

Also, make lazy_scan_heap() responsible for deciding whether to update
the FSM, instead of doing it inside lazy_scan_noprune(). This is
more consistent with the lazy_scan_prune() case. lazy_scan_noprune()
still needs an output parameter for whether there are LP_DEAD items
on the page, but the real decision-making now happens in the caller.

Patch by me, reviewed by Peter Geoghegan and Melanie Plageman.

Discussion: http://postgr.es/m/CA+TgmoaOzvN1TcHd9iej=PR3fY40En1USxzOnXSR2CxCLaRM0g@mail.gmail.com
2024-01-16 14:16:57 -05:00
Peter Eisentraut 6995863157 Support identity columns in partitioned tables
Previously, identity columns were disallowed on partitioned tables.
(The reason was mainly that no one had gotten around to working
through all the details to make it work.)  This makes it work now.

Some details on the behavior:

* A newly created partition inherits identity property

  The partitions of a partitioned table are integral part of the
  partitioned table.  A partition inherits identity columns from the
  partitioned table.  An identity column of a partition shares the
  identity space with the corresponding column of the partitioned
  table.  In other words, the same identity column across all
  partitions of a partitioned table share the same identity space.
  This is effected by sharing the same underlying sequence.

  When INSERTing directly into a partition, the sequence associated
  with the topmost partitioned table is used to calculate the value of
  the corresponding identity column.

  In regular inheritance, identity columns and their properties in a
  child table are independent of those in its parent tables.  A child
  table does not inherit identity columns or their properties
  automatically from the parent.  (This is unchanged.)

* Attached partition inherits identity column

  A table being attached as a partition inherits the identity property
  from the partitioned table.  This should be fine since we expect
  that the partition table's column has the same type as the
  partitioned table's corresponding column.  If the table being
  attached is a partitioned table, the identity properties are
  propagated down its partition hierarchy.

  An identity column in the partitioned table is also marked as NOT
  NULL.  The corresponding column in the partition needs to be marked
  as NOT NULL for the attach to succeed.

* Drop identity property when detaching partition

  A partition's identity column shares the identity space
  (i.e. underlying sequence) as the corresponding column of the
  partitioned table.  If a partition is detached it can longer share
  the identity space as before.  Hence the identity columns of the
  partition being detached loose their identity property.

  When identity of a column of a regular table is dropped it retains
  the NOT NULL constraint that came with the identity property.
  Similarly the columns of the partition being detached retain the NOT
  NULL constraints that came with identity property, even though the
  identity property itself is lost.

  The sequence associated with the identity property is linked to the
  partitioned table (and not the partition being detached).  That
  sequence is not dropped as part of detach operation.

* Partitions with their own identity columns are not allowed.

* The usual ALTER operations (add identity column, add identity
  property to existing column, alter properties of an indentity
  column, drop identity property) are supported for partitioned
  tables.  Changing a column only in a partitioned table or a
  partition is not allowed; the change needs to be applied to the
  whole partition hierarchy.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/CAExHW5uOykuTC+C6R1yDSp=o8Q83jr8xJdZxgPkxfZ1Ue5RRGg@mail.gmail.com
2024-01-16 17:24:52 +01:00
Alvaro Herrera 5850253973
struct XmlTableRoutine: use C99 designated initializers
As in c27f8621ee et al.

Not as critical as other cases we've handled, but I figure if we're
going to add JsonbTableRoutine using TableFuncRoutine, this makes it
easier to jump around the code.
2024-01-16 12:48:30 +01:00
Peter Eisentraut d22d98c341 Assert that partition inherits from only one parent in MergeAttributes()
A partition inherits only from one partitioned table and thus inherits
a column definition only once.  Assert the same in MergeAttributes()
and simplify a condition accordingly.

Similar definition exists about line 3068 in the same function.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAExHW5uOykuTC+C6R1yDSp=o8Q83jr8xJdZxgPkxfZ1Ue5RRGg@mail.gmail.com
2024-01-16 08:57:35 +01:00
Alexander Korotkov fe093994db Fix 'negative bitmapset member' error
When removing a useless join, we'd remove PHVs that are not used at join
partner rels or above the join.  A PHV that references the join's relid
in ph_eval_at is logically "above" the join and thus should not be
removed.  We have the following check for that:

    !bms_is_member(ojrelid, phinfo->ph_eval_at)

However, in the case of SJE removing a useless inner join, 'ojrelid' is
set to -1, which would trigger the "negative bitmapset member not
allowed" error in bms_is_member().

Fix it by skipping examining ojrelid for inner joins in this check.

Reported-by: Zuming Jiang
Bug: #18260
Discussion: https://postgr.es/m/18260-1b6a0c4ae311b837%40postgresql.org
Author: Richard Guo
Reviewed-by: Andrei Lepikhov
2024-01-15 17:45:16 +02:00
Alvaro Herrera aa817c7496
Avoid useless ReplicationOriginExitCleanup locking
When session_replication_state is NULL, we can know there's nothing to
do with no lock acquisition.  Do that.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CALj2ACX+YaeRU5xJqR4C7kLsTO_F7DBRNF8WgeHvJZcKtNuK_A@mail.gmail.com
2024-01-15 13:02:03 +01:00
Tom Lane 6545ba96cb Prevent access to an unpinned buffer in BEFORE ROW UPDATE triggers.
When ExecBRUpdateTriggers switches to a new target tuple as a result
of the EvalPlanQual logic, it must form a new proposed update tuple.
Since commit 86dc90056, that tuple (the result of
ExecGetUpdateNewTuple) has been a virtual tuple that might contain
pointers to by-ref fields of the new target tuple (in "oldslot").
However, immediately after that we materialize oldslot, causing it to
drop its buffer pin, whereupon the by-ref pointers are unsafe to use.
This is a live bug only when the new target tuple is in a different
page than the original target tuple, since we do still hold a pin on
the original one.  (Before 86dc90056, there was no bug because the
EPQ plantree would hold a pin on the new target tuple; but now that's
not assured.)  To fix, forcibly materialize the new tuple before we
materialize oldslot.  This costs nothing since we would have done that
shortly anyway.

The real-world impact of this is probably minimal.  A visible failure
could occur if the new target tuple's buffer were recycled for some
other page in the short interval before we materialize newslot within
the trigger-calling loop; but that's quite unlikely given that we'd
just touched that page.  There's a larger hazard that some other
process could prune and repack that page within the window.  We have
lock on the new target tuple, but that wouldn't prevent it being moved
on the page.

Alexander Lakhin and Tom Lane, per bug #17798 from Alexander Lakhin.
Back-patch to v14 where 86dc90056 came in.

Discussion: https://postgr.es/m/17798-0907404928dcf0dd@postgresql.org
2024-01-14 12:38:41 -05:00
Peter Eisentraut e3aa802e6f Remove useless Assert
It's already included in the subsequent intVal() call.  Fixup for
4f622503d6.
2024-01-14 07:29:45 +01:00
Tom Lane 96c019ffa3 Re-pgindent catcache.c after previous commit.
Discussion: https://postgr.es/m/1393953.1698353013@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAGjhLkOoBEC9mLsnB42d3CO1vcMx71MLSEuigeABbQ8oRdA6gw@mail.gmail.com
2024-01-13 13:54:11 -05:00
Tom Lane ad98fb1422 Cope with catcache entries becoming stale during detoasting.
We've long had a policy that any toasted fields in a catalog tuple
should be pulled in-line before entering the tuple in a catalog cache.
However, that requires access to the catalog's toast table, and we'll
typically do AcceptInvalidationMessages while opening the toast table.
So it's possible that the catalog tuple is outdated by the time we
finish detoasting it.  Since no cache entry exists yet, we can't
mark the entry stale during AcceptInvalidationMessages, and instead
we'll press forward and build an apparently-valid cache entry.  The
upshot is that we have a race condition whereby an out-of-date entry
could be made in a backend's catalog cache, and persist there
indefinitely causing indeterminate misbehavior.

To fix, use the existing systable_recheck_tuple code to recheck
whether the catalog tuple is still up-to-date after we finish
detoasting it.  If not, loop around and restart the process of
searching the catalog and constructing cache entries from the top.
The case is rare enough that this shouldn't create any meaningful
performance penalty, even in the SearchCatCacheList case where
we need to tear down and reconstruct the whole list.

Indeed, the case is so rare that AFAICT it doesn't occur during
our regression tests, and there doesn't seem to be any easy way
to build a test that would exercise it reliably.  To allow
testing of the retry code paths, add logic (in USE_ASSERT_CHECKING
builds only) that randomly pretends that the recheck failed about
one time out of a thousand.  This is enough to ensure that we'll
pass through the retry paths during most regression test runs.

By adding an extra level of looping, this commit creates a need
to reindent most of SearchCatCacheMiss and SearchCatCacheList.
I'll do that separately, to allow putting those changes in
.git-blame-ignore-revs.

Patch by me; thanks to Alexander Lakhin for having built a test
case to prove the bug is real, and to Xiaoran Wang for review.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/1393953.1698353013@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAGjhLkOoBEC9mLsnB42d3CO1vcMx71MLSEuigeABbQ8oRdA6gw@mail.gmail.com
2024-01-13 13:46:27 -05:00
Peter Eisentraut 4f622503d6 Make attstattarget nullable
This changes the pg_attribute field attstattarget into a nullable
field in the variable-length part of the row.  If no value is set by
the user for attstattarget, it is now null instead of previously -1.
This saves space in pg_attribute and tuple descriptors for most
practical scenarios.  (ATTRIBUTE_FIXED_PART_SIZE is reduced from 108
to 104.)  Also, null is the semantically more correct value.

The ANALYZE code internally continues to represent the default
statistics target by -1, so that that code can avoid having to deal
with null values.  But that is now contained to the ANALYZE code.
Only the DDL code deals with attstattarget possibly null.

For system columns, the field is now always null.  The ANALYZE code
skips system columns anyway.

To set a column's statistics target to the default value, the new
command form ALTER TABLE ... SET STATISTICS DEFAULT can be used.  (SET
STATISTICS -1 still works.)

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/4da8d211-d54d-44b9-9847-f2a9f1184c76@eisentraut.org
2024-01-13 18:14:53 +01:00
Jeff Davis 45da69371e Fix memory leak in connection string validation.
Introduced in commit c3afe8cf5a.

Discussion: https://postgr.es/m/066a65233d3cb4ea27a9e0778d2f1d0dc764b222.camel@j-davis.com
Reviewed-by: Nathan Bossart, Tom Lane
Backpatch-through: 16
2024-01-12 21:40:23 -08:00
Jeff Davis 5c31669058 Re-validate connection string in libpqrcv_connect().
A superuser may create a subscription with password_required=true, but
which uses a connection string without a password.

Previously, if the owner of such a subscription was changed to a
non-superuser, the non-superuser was able to utilize a password from
another source (like a password file or the PGPASSWORD environment
variable), which should not have been allowed.

This commit adds a step to re-validate the connection string before
connecting.

Reported-by: Jeff Davis
Author: Vignesh C
Reviewed-by: Peter Smith, Robert Haas, Amit Kapila
Discussion: https://www.postgresql.org/message-id/flat/e5892973ae2a80a1a3e0266806640dae3c428100.camel%40j-davis.com
Backpatch-through: 16
2024-01-12 13:41:36 -08:00
Peter Eisentraut a1604237a6 Refactor ATExecAddColumn() to use BuildDescForRelation()
BuildDescForRelation() has all the knowledge for converting a
ColumnDef into pg_attribute/tuple descriptor.  ATExecAddColumn() can
make use of that, instead of duplicating all that logic.  We just pass
a one-element list of ColumnDef and we get back exactly the data
structure we need.  Note that we don't even need to touch
BuildDescForRelation() to make this work.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2024-01-12 16:11:29 +01:00
Michael Paquier e72a37528d Refactor code checking for file existence
jit.c and dfgr.c had a copy of the same code to check if a file exists
or not, with a twist: jit.c did not check for EACCES when failing the
stat() call for the path whose existence is tested.  This refactored
routine will be used by an upcoming patch.

Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/ZTiV8tn_MIb_H2rE@paquier.xyz
2024-01-12 12:04:51 +09:00
Michael Paquier 08c3ad27eb Rework how logirep launchers are stopped during pg_upgrade
This is a rework of 7021d3b176, where we relied on forcing
max_logical_replication_workers to 0 in the postgres command.  This
commit now prevents logical replication launchers to start using -b and
a backend-side check based on IsBinaryUpgrade, effective when upgrading
from 17 and newer versions.

This commit improves the comments explaining why this restriction is
necessary.

This discussion was on hold until we were sure how to add support for
subscribers in pg_upgrade, something now done thanks to 9a17be1e24.

Reviewed-by: Álvaro Herrera, Amit Kapila, Tom Lane
Discussion: https://postgr.es/m/ZU2TeVkUg5qEi7Oy@paquier.xyz
2024-01-12 08:23:07 +09:00
Tom Lane 29f114b6ff Allow subquery pullup to wrap a PlaceHolderVar in another one.
The code for wrapping subquery output expressions in PlaceHolderVars
believed that if the expression already was a PlaceHolderVar, it was
never necessary to wrap that in another one.  That's wrong if the
expression is underneath an outer join and involves a lateral
reference to outside that scope: failing to add an additional PHV
risks evaluating the expression at the wrong place and hence not
forcing it to null when the outer join should do so.  This is an
oversight in commit 9e7e29c75, which added logic to forcibly wrap
lateral-reference Vars in PlaceHolderVars, but didn't see that the
adjacent case for PlaceHolderVars needed the same treatment.

The test case we have for this doesn't fail before 4be058fe9, but now
that I see the problem I wonder if it is possible to demonstrate
related errors before that.  That's moot though, since all such
branches are out of support.

Per bug #18284 from Holger Reise.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/18284-47505a20c23647f8@postgresql.org
2024-01-11 15:28:22 -05:00
Robert Haas e2d5b3b9b6 Remove hastup from LVPagePruneState.
Instead, just have lazy_scan_prune() and lazy_scan_noprune() update
LVRelState->nonempty_pages directly. This makes the two functions
more similar and also removes makes lazy_scan_noprune need one fewer
output parameters.

Melanie Plageman, reviewed by Andres Freund, Michael Paquier, and me

Discussion: http://postgr.es/m/CAAKRu_btji_wQdg=ok-5E4v_bGVxKYnnFFe7RA6Frc1EcOwtSg@mail.gmail.com
2024-01-11 13:30:12 -05:00
Robert Haas 5faffa0434 Reindent after commit d9ef650fca. 2024-01-11 13:24:35 -05:00
Robert Haas d9ef650fca Add new function pg_get_wal_summarizer_state().
This makes it possible to access information about the progress
of WAL summarization from SQL. The previously-added functions
pg_available_wal_summaries() and pg_wal_summary_contents() only
examine on-disk state, but this function exposes information from
the server's shared memory.

Discussion: http://postgr.es/m/CA+Tgmobvqqj-DW9F7uUzT-cQqs6wcVb-Xhs=w=hzJnXSE-kRGw@mail.gmail.com
2024-01-11 12:41:18 -05:00
Tom Lane 3d185cfc09 Restore initdb's old behavior of always setting the lc_xxx GUCs.
In commit 3e51b278d I (tgl) caused initdb to leave lc_messages and
other lc_xxx GUCs commented-out in the installed postgresql.conf file
if they were going to be set to 'C'.  This was a hack for cosmetic
purposes, and it was buggy because lc_messages' wired-in default is
not 'C' but '' (empty string).  That led to --no-locale not having
the expected effect, since the postmaster would then obtain
lc_messages from its startup environment.

Let's just revert to the prior behavior of always de-commenting the
lc_xxx entries; the argument for changing that longstanding behavior
was weak in the first place.

Also, fix postgresql.conf.sample's erroneous claim that the default
value of lc_messages is 'C'.  I suspect that was what misled me into
making this mistake in the first place.

Report and patch by Kyotaro Horiguchi.  Back-patch to v16 where
the problem was introduced.

Discussion: https://postgr.es/m/20231122.162700.1995154567625541112.horikyota.ntt@gmail.com
2024-01-10 18:09:29 -05:00
Tom Lane add673b897 Fix Asserts in calc_non_nestloop_required_outer().
These were not testing the same thing as the comparable Assert
in calc_nestloop_required_outer(), because we neglected to map
the given Paths' relids to top-level relids.  When considering
a partition child join the latter is the correct thing to do.

This oversight is old, but since it's only an overly-weak Assert
check there doesn't seem to be much value in back-patching.

Richard Guo (with cosmetic changes and comment updates by me)

Discussion: https://postgr.es/m/CAMbWs49sqbe9GBZ8sy8dSfKRNURgicR85HX8vgzcgQsPF0XY1w@mail.gmail.com
2024-01-10 13:51:36 -05:00
Tom Lane d641b827af Handle WindowClause.runCondition in tree walker/mutator functions.
Commit 9d9c02ccd, which added the notion of a "run condition" for
window functions, neglected to teach nodeFuncs.c to process the new
field.  Remarkably, that doesn't seem to have had any ill effects
before we invented Var.varnullingrels, but now it can cause visible
failures in join-removal scenarios.

I have no faith that there's not reachable problems in v15 too,
so back-patch the code change to v15 where 9d9c02ccd came in.
The test case seems irrelevant to v15, though.

Per bug #18277 from Zuming Jiang.  Diagnosis and patch by
Richard Guo.

Discussion: https://postgr.es/m/18277-089ead83b329a2fd@postgresql.org
2024-01-10 13:36:33 -05:00
Nathan Bossart 5b1b9bce84 Cross-check lists of predefined LWLocks.
Both lwlocknames.txt and wait_event_names.txt contain a list of all
the predefined LWLocks, i.e., those with predefined positions
within MainLWLockArray.  It is easy to miss one or the other,
especially since the list in wait_event_names.txt omits the "Lock"
suffix from all the LWLock wait events.  This commit adds a cross-
check of these lists to the script that generates lwlocknames.h.
If the lists do not match exactly, building will fail.

Suggested-by: Robert Haas
Reviewed-by: Robert Haas, Michael Paquier, Bertrand Drouvot
Discussion: https://postgr.es/m/20240102173120.GA1061678%40nathanxps13
2024-01-09 11:05:19 -06:00
Alexander Korotkov 028b15405b An addition to 8c441c0827
Given that now SJE doesn't work with result relation, turn a code dealing with
that into an assert that it shouldn't happen.
2024-01-09 10:12:14 +02:00
Alexander Korotkov 8c441c0827 Forbid SJE with result relation
The target relation for INSERT/UPDATE/DELETE/MERGE has a different behavior
than other relations in EvalPlanQual() and RETURNING clause.  This is why we
forbid target relation to be either source or target relation in SJE.
It's not clear if we could ever support this.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/b9e8f460-f9a6-0e9b-e8ba-60d59f0bc22c%40gmail.com
2024-01-09 10:01:22 +02:00
Alexander Korotkov 30b4955a46 Fix misuse of RelOptInfo.unique_for_rels cache by SJE
When SJE uses RelOptInfo.unique_for_rels cache, it passes filtered quals to
innerrel_is_unique_ext().  That might lead to an invalid match to cache entries
made by previous non self-join checking calls.  Add UniqueRelInfo.self_join
flag to prevent such cases.  Also, fix that SJE should require a strict match
of outerrelids to make sure UniqueRelInfo.extra_clauses are valid.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/4788f781-31bd-9796-d7d6-588a751c8787%40gmail.com
2024-01-09 00:09:06 +02:00
Noah Misch d3c5f37dd5 Make dblink interruptible, via new libpqsrv APIs.
This replaces dblink's blocking libpq calls, allowing cancellation and
allowing DROP DATABASE (of a database not involved in the query).  Apart
from explicit dblink_cancel_query() calls, dblink still doesn't cancel
the remote side.  The replacement for the blocking calls consists of
new, general-purpose query execution wrappers in the libpqsrv facility.
Out-of-tree extensions should adopt these.  Use them in postgres_fdw,
replacing a local implementation from which the libpqsrv implementation
derives.  This is a bug fix for dblink.  Code inspection identified the
bug at least thirteen years ago, but user complaints have not appeared.
Hence, no back-patch for now.

Discussion: https://postgr.es/m/20231122012945.74@rfd.leadboat.com
2024-01-08 11:39:56 -08:00
Alexander Korotkov 6d94cc6ca1 Fix indentation in ExecParallelHashIncreaseNumBatches()
Backpatch-through: 12
2024-01-08 20:00:20 +02:00
Tom Lane 89b69db82a Allow examine_simple_variable() to work on INSERT RETURNING Vars.
Since commit 599b33b94, this function assumed that every RTE_RELATION
RangeTblEntry would have an associated RelOptInfo.  But that's not so:
we only build RelOptInfos for relations that are scanned by the query.
In particular the target of an INSERT won't have one, so that Vars
appearing in an INSERT ... RETURNING list will not have an associated
RelOptInfo.  This apparently wasn't a problem before commit f7816aec2
taught examine_simple_variable() to drill down into CTEs containing
INSERT RETURNING, but it is now.

To fix, add a fallback code path that gets the userid to use directly
from the RTEPermissionInfo associated with the RTE.  (Sadly, we must
have two code paths, because not every RTE has a RTEPermissionInfo
either.)

Per report from Alexander Lakhin.  No back-patch, since the case is
apparently unreachable before f7816aec2.

Discussion: https://postgr.es/m/608a4886-6c60-0f9e-97d5-591256bd4150@gmail.com
2024-01-08 11:48:44 -05:00
Alexander Korotkov 2a67b5a60e Fix oversized memory allocation in Parallel Hash Join
During the calculations of the maximum for the number of buckets, take into
account that later we round that to the next power of 2.

Reported-by: Karen Talarico
Bug: #16925
Discussion: https://postgr.es/m/16925-ec96d83529d0d629%40postgresql.org
Author: Thomas Munro, Andrei Lepikhov, Alexander Korotkov
Reviewed-by: Alena Rybakina
Backpatch-through: 12
2024-01-07 09:10:19 +02:00
Alexander Korotkov 5ef34a8fc3 Fix the issue that SJE mistakenly omits qual clauses
When the SJE code handles the transfer of qual clauses from the removed
relation to the remaining one, it replaces the Vars of the removed
relation with the Vars of the remaining relation for each clause, and
then reintegrates these clauses into the appropriate restriction or join
clause lists, while attempting to avoid duplicates.

However, the code compares RestrictInfo->clause to determine if two
clauses are duplicates.  This is just flat wrong.  Two RestrictInfos
with the same clause can have different required_relids,
incompatible_relids, is_pushed_down, and so on.  This can cause qual
clauses to be mistakenly omitted, leading to wrong results.

This patch fixes it by comparing the entire RestrictInfos not just their
clauses ignoring 'rinfo_serial' field (otherwise almost all RestrictInfos will
be unique).  Making 'rinfo_serial' equal_ignore would break other code.  This
is why this commit implements our own comparison function for checking the
equality of RestrictInfos.

Reported-by: Zuming Jiang
Bug: #18261
Discussion: https://postgr.es/m/18261-2a75d748c928609b%40postgresql.org
Author: Richard Guo
2024-01-06 14:10:00 +02:00
Andrew Dunstan dbad1c53e9 Add copyright notices to a few perl scripts that don't have them 2024-01-05 13:15:50 +00:00
Michael Paquier 9cd0d77dfc Fix corruption of local buffer state during extend of temp relation
A typo has been introduced by 31966b151e when updating the state of a
local buffer when a temporary relation is extended, for the case of a
block included in the relation range extended, when it is already found
in the hash table holding the local buffers.  In this case, BM_VALID
should be cleared, but the buffer state was changed so as BM_VALID
remained while clearing the other flags.

As reported on the thread, it was possible to corrupt the state of the
local buffers on ENOSPC, but the states would be corrupted on any kind
of ERROR during the relation extend (like partial writes or some other
errno).

Reported-by: Alexander Lakhin
Author: Tender Wang
Reviewed-by: Richard Guo, Alexander Lakhin, Michael Paquier
Discussion: https://postgr.es/m/18259-6e256429825dd435@postgresql.org
Backpatch-through: 16
2024-01-05 20:08:34 +09:00
Tom Lane 9391f71523 Teach estimate_array_length() to use statistics where available.
If we have DECHIST statistics about the argument expression, use
the average number of distinct elements as the array length estimate.
(It'd be better to use the average total number of elements, but
that is not currently calculated by compute_array_stats(), and
it's unclear that it'd be worth extra effort to get.)

To do this, we have to change the signature of estimate_array_length
to pass the "root" pointer.  While at it, also change its result
type to "double".  That's probably not really necessary, but it
avoids any risk of overflow of the value extracted from DECHIST.
All existing callers are going to use the result in a "double"
calculation anyway.

Paul Jungwirth, reviewed by Jian He and myself

Discussion: https://postgr.es/m/CA+renyUnM2d+SmrxKpDuAdpiq6FOM=FByvi6aS6yi__qyf6j9A@mail.gmail.com
2024-01-04 18:36:19 -05:00
Nathan Bossart 14dd0f27d7 Add macros for looping through a List without a ListCell.
Many foreach loops only use the ListCell pointer to retrieve the
content of the cell, like so:

    ListCell   *lc;

    foreach(lc, mylist)
    {
        int         myint = lfirst_int(lc);

        ...
    }

This commit adds a few convenience macros that automatically
declare the loop variable and retrieve the current cell's contents.
This allows us to rewrite the previous loop like this:

    foreach_int(myint, mylist)
    {
        ...
    }

This commit also adjusts a few existing loops in order to add
coverage for the new/adjusted macros.  There is presently no plan
to bulk update all foreach loops, as that could introduce a
significant amount of back-patching pain.  Instead, these macros
are primarily intended for use in new code.

Author: Jelte Fennema-Nio
Reviewed-by: David Rowley, Alvaro Herrera, Vignesh C, Tom Lane
Discussion: https://postgr.es/m/CAGECzQSwXKnxGwW1_Q5JE%2B8Ja20kyAbhBHO04vVrQsLcDciwXA%40mail.gmail.com
2024-01-04 16:09:34 -06:00
Peter Eisentraut 5d06e99a3c ALTER TABLE command to change generation expression
This adds a new ALTER TABLE subcommand ALTER COLUMN ... SET EXPRESSION
that changes the generation expression of a generated column.

The syntax is not standard but was adapted from other SQL
implementations.

This command causes a table rewrite, using the usual ALTER TABLE
mechanisms.  The implementation is similar to and makes use of some of
the infrastructure of the SET DATA TYPE subcommand (for example,
rebuilding constraints and indexes afterwards).  The new command
requires a new pass in AlterTablePass, and the ADD COLUMN pass had to
be moved earlier so that combinations of ADD COLUMN and SET EXPRESSION
can work.

Author: Amul Sul <sulamul@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b94yyJeGA-5M951_Lr+KfZokOp-2kXicpmEhi5FXhBeTog@mail.gmail.com
2024-01-04 16:28:54 +01:00
David Rowley ae69c4fcf1 Fix use of incorrect TupleTableSlot in DISTINCT aggregates
1349d2790 added code to allow DISTINCT and ORDER BY aggregates to work
more efficiently by using presorted input.  That commit added some code
that made use of the AggState's tmpcontext and adjusted the
ecxt_outertuple and ecxt_innertuple slots before checking if the current
row is distinct from the previously seen row.  That code forgot to set the
TupleTableSlots back to what they were originally, which could result in
errors such as:

ERROR:  attribute 1 of type record has wrong type

This only affects aggregate functions which have multiple arguments when
DISTINCT is used.  For example: string_agg(DISTINCT col, ', ')

Thanks to Tom Lane for identifying the breaking commit.

Bug: #18264
Reported-by: Vojtěch Beneš
Discussion: https://postgr.es/m/18264-e363593d7e9feb7d@postgresql.org
Backpatch-through: 16, where 1349d2790 was added
2024-01-04 20:38:25 +13:00
Amit Kapila 007693f2a3 Track conflict_reason in pg_replication_slots.
This patch changes the existing 'conflicting' field to 'conflict_reason'
in pg_replication_slots. This new field indicates the reason for the
logical slot's conflict with recovery. It is always NULL for physical
slots, as well as for logical slots which are not invalidated. The
non-NULL values indicate that the slot is marked as invalidated. Possible
values are:

wal_removed = required WAL has been removed.
rows_removed = required rows have been removed.
wal_level_insufficient = the primary doesn't have a wal_level sufficient
to perform logical decoding.

The existing users of 'conflicting' column can get the same answer by
using 'conflict_reason' IS NOT NULL.

Author: Shveta Malik
Reviewed-by: Amit Kapila, Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/ZYOE8IguqTbp-seF@paquier.xyz
2024-01-04 08:26:25 +05:30
Bruce Momjian 29275b1d17 Update copyright for 2024
Reported-by: Michael Paquier

Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz

Backpatch-through: 12
2024-01-03 20:49:05 -05:00
Tom Lane aaf09c5923 Avoid masking EOF (no-password-supplied) conditions in auth.c.
CheckPWChallengeAuth() would return STATUS_ERROR if the user does not
exist or has no password assigned, even if the client disconnected
without responding to the password challenge (as libpq often will,
for example).  We should return STATUS_EOF in that case, and the
lower-level functions do, but this code level got it wrong since the
refactoring done in 7ac955b34.  This breaks the intent of not logging
anything for EOF cases (cf. comments in auth_failed()) and might
also confuse users of ClientAuthentication_hook.

Per report from Liu Lang.  Back-patch to all supported versions.

Discussion: https://postgr.es/m/b725238c-539d-cb09-2bff-b5e6cb2c069c@esgyn.cn
2024-01-03 17:40:41 -05:00
Peter Eisentraut 59fd390d5e Second attempt at organizing jsonpath operators and methods
Second attempt at 283a95da92.  Since we can't reorder the enum values
of JsonPathItemType, instead reorder the switch cases where they are
used to generally follow the order of the enum values, for better
maintainability.
2024-01-03 21:56:41 +01:00
Peter Eisentraut 0958f8f6bf Revert "Reorganise jsonpath operators and methods"
This reverts commit 283a95da92.

The reordering of JsonPathItemType affects the binary on-disk
compatibility of the jsonpath type, so we must not change it.  Revert
for now and consider.
2024-01-03 21:02:49 +01:00
Robert Haas dffde5bf16 Fix defects in PrepareForIncrementalBackup.
Swap the arguments to TimestampDifferenceMilliseconds so that we get
a positive answer instead of zero.

Then use the result of that computation instead of ignoring it.

Per reports from Alexander Lakhin.

Discussion: http://postgr.es/m/8b686764-7ac1-74c3-70f9-b64685a2535f@gmail.com
2024-01-03 09:59:46 -05:00
Peter Eisentraut 283a95da92 Reorganise jsonpath operators and methods
Various jsonpath operators and methods add various keywords, switch
cases, and documentation entries in some order.  However, they are not
consistent; reorder them for better maintainability or readability.

Author: Jeevan Chalke <jeevan.chalke@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAM2+6=XjTyqrrqHAOj80r0wVQxJSxc0iyib9bPC55uFO9VKatg@mail.gmail.com
2024-01-03 11:25:33 +01:00
Peter Eisentraut c1b9e1e56d Add numeric_int8_opt_error() to optionally suppress errors
This matches the existing numeric_int4_opt_error() (see commit
16d489b0fe).  It will be used by a future JSON-related patch, which
wants to report errors in its own way and thus does not want the
internal functions to throw any error.

Author: Jeevan Chalke <jeevan.chalke@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAM2+6=XjTyqrrqHAOj80r0wVQxJSxc0iyib9bPC55uFO9VKatg@mail.gmail.com
2024-01-03 10:05:35 +01:00
Peter Eisentraut d4e66a39eb Refactor: separate function to find all objects depending on a column
Move code from ATExecAlterColumnType() that finds the all the objects
that depend on the column to a separate function.  A future patch will
reuse this code.

Author: Amul Sul <sulamul@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b94yyJeGA-5M951_Lr+KfZokOp-2kXicpmEhi5FXhBeTog@mail.gmail.com
2024-01-03 08:50:37 +01:00
Tom Lane ae6bc39317 Minor fixes for search path cache code.
Avoid leaving a dangling pointer in the unlikely event that
nsphash_create fails.  Improve comments, and fix formatting by
adding typedefs.list entries.

Discussion: https://postgr.es/m/3972900.1704145107@sss.pgh.pa.us
2024-01-02 14:57:21 -05:00
Robert Haas 371b07e894 Remove Lock suffix from WALSummarizerLock in wait_event_names.txt
Nathan Bossart

Discussion: https://postgr.es/m/20240102173120.GA1061678@nathanxps13
2024-01-02 13:17:23 -05:00
Robert Haas 5bc7b33b4e jsonpath_exec: fix typo "absense" -> "absence"
Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna.

Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
2024-01-02 12:27:38 -05:00
Robert Haas e62e73f3a2 gist: fix typo "split(t)ed" -> "split"
Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna.

Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
2024-01-02 12:24:28 -05:00
Robert Haas 591cf626e7 tsquery: fix typo "rewrited" -> "rewritten"
Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna.

Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
2024-01-02 12:23:36 -05:00
Robert Haas 0d9937d118 Fix typos in comments and in one isolation test.
Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna. Some subtractions
by me.

Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
2024-01-02 12:05:41 -05:00
Robert Haas 5c430f9dc5 Add WALSummarizerLock to wait_event_names.txt
Per report from Nathan Bossart.

Discussion: http://postgr.es/m/20231227153647.GA601861@nathanxps13
2024-01-02 10:31:49 -05:00
Alexander Korotkov a7928a57b9 Replace the relid in some missing fields during SJE
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/a89f480f-8143-0965-f22d-0a892777f501%40gmail.com
Author: Andrei Lepikhov
2024-01-02 09:48:50 +02:00
Amit Kapila 9a17be1e24 Allow upgrades to preserve the full subscription's state.
This feature will allow us to replicate the changes on subscriber nodes
after the upgrade.

Previously, only the subscription metadata information was preserved.
Without the list of relations and their state, it's not possible to
re-enable the subscriptions without missing some records as the list of
relations can only be refreshed after enabling the subscription (and
therefore starting the apply worker).  Even if we added a way to refresh
the subscription while enabling a publication, we still wouldn't know
which relations are new on the publication side, and therefore should be
fully synced, and which shouldn't.

To preserve the subscription relations, this patch teaches pg_dump to
restore the content of pg_subscription_rel from the old cluster by using
binary_upgrade_add_sub_rel_state SQL function. This is supported only
in binary upgrade mode.

The subscription's replication origin is needed to ensure that we don't
replicate anything twice.

To preserve the replication origins, this patch teaches pg_dump to update
the replication origin along with creating a subscription by using
binary_upgrade_replorigin_advance SQL function to restore the
underlying replication origin remote LSN. This is supported only in
binary upgrade mode.

pg_upgrade will check that all the subscription relations are in 'i'
(init) or in 'r' (ready) state and will error out if that's not the case,
logging the reason for the failure. This helps to avoid the risk of any
dangling slot or origin after the upgrade.

Author: Vignesh C, Julien Rouhaud, Shlok Kyal
Reviewed-by: Peter Smith, Masahiko Sawada, Michael Paquier, Amit Kapila, Hayato Kuroda
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
2024-01-02 08:08:46 +05:30
Peter Eisentraut cea89c93a1 Turn AT_PASS_* macros into an enum
This make this code simpler and easier to follow.  Also, patches that
want to change the passes won't have to renumber the whole list.

Reviewed-by: Amul Sul <sulamul@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b94yyJeGA-5M951_Lr+KfZokOp-2kXicpmEhi5FXhBeTog@mail.gmail.com
2024-01-01 19:17:18 +01:00
Tomas Vondra cb44a8345e Fix parallel BRIN builds with synchronized scans
The brinbuildCallbackParallel callback used by parallel BRIN builds did
not consider that the parallel table scans may be synchronized, starting
from an arbitrary block and then wrap around.

If this happened and the scan actually did wrap around, tuples from the
beginning of the table were added to the last range produced by the same
worker. The index would be missing range at the beginning of the table,
while the last range would be too wide. This would not produce incorrect
query results, but it'd be less efficient.

Fixed by checking for both past and future ranges in the callback. The
worker may produce multiple summaries for the same page range, but the
leader will merge them as if the summaries came from different workers.

Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com
2023-12-30 23:17:01 +01:00
Tomas Vondra 6c63bcbf3c Minor cleanup of the BRIN parallel build code
Commit b437571714 added support for parallel builds for BRIN indexes,
using code similar to BTREE parallel builds, and also a new tuplesort
variant. This commit simplifies the new code in two ways:

* The "spool" grouping tuplesort and the heap/index is not necessary.
  The heap/index are available as separate arguments, causing confusion.
  So remove the spool, and use the tuplesort directly.

* The new tuplesort variant does not need the heap/index, as it sorts
  simply by the range block number, without accessing the tuple data.
  So simplify that too.

Initial report and patch by Ranier Vilela, further cleanup by me.

Author: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQAqD7f2i4iyEaAz-5o-bf6zXVX-AkNUBm-YjUXEemaEh6A%40mail.gmail.com
2023-12-30 23:15:04 +01:00
Heikki Linnakangas 5632d6e18a Don't send "cannot connect" errors on invalid startup packet
Commit 16671ba6e7 moved the code that sends "sorry, too many clients
already" and other such messages, but it had the effect that we would
send that error even if the the startup packet processing failed, e.g.
because the client sent an invalid startup packet. That was not
intentional.

Spotted while reading the code again.
2023-12-30 22:18:54 +02:00
Peter Eisentraut a740b213d4 Add GUC backtrace_on_internal_error
When enabled (default off), this logs a backtrace anytime elog() or an
equivalent ereport() for internal errors is called.

This is not well covered by the existing backtrace_functions, because
there are many equally-worded low-level errors in many functions.  And
if you find out where the error is, then you need to manually rewrite
the elog() to ereport() to attach the errbacktrace(), which is
annoying.  Having a backtrace automatically on every elog() call could
be very helpful during development for various kinds of common errors
from palloc, syscache, node support, etc.

Discussion: https://www.postgresql.org/message-id/flat/ba76c6bc-f03f-4285-bf16-47759cfcab9e@eisentraut.org
2023-12-30 11:43:57 +01:00
Peter Eisentraut c538592959 Make all Perl warnings fatal
There are a lot of Perl scripts in the tree, mostly code generation
and TAP tests.  Occasionally, these scripts produce warnings.  These
are probably always mistakes on the developer side (true positives).
Typical examples are warnings from genbki.pl or related when you make
a mess in the catalog files during development, or warnings from tests
when they massage a config file that looks different on different
hosts, or mistakes during merges (e.g., duplicate subroutine
definitions), or just mistakes that weren't noticed because there is a
lot of output in a verbose build.

This changes all warnings into fatal errors, by replacing

    use warnings;

by

    use warnings FATAL => 'all';

in all Perl files.

Discussion: https://www.postgresql.org/message-id/flat/06f899fd-1826-05ab-42d6-adeb1fd5e200%40eisentraut.org
2023-12-29 18:20:00 +01:00
Peter Eisentraut 541e8f14a1 Fix variable name and comment
Should match the name of the related GUC variable.

Discussion: https://www.postgresql.org/message-id/da4a680a-5d8a-4663-a5c8-a3ccbf23394a@eisentraut.org
2023-12-28 17:25:47 +01:00
Tom Lane 58054de2d0 Improve the implementation of information_schema._pg_expandarray().
This function was originally coded with a handmade expansion
of the array subscripts.  We can do it a little faster and far
more legibly today, by using unnest() WITH ORDINALITY.

While at it, let's apply the rowcount estimation support that exists
for the underlying unnest() function: reduce the default ROWS estimate
to 100 and attach array_unnest_support.  I'm not sure that
array_unnest_support can do anything useful today with the call sites
that exist in information_schema, but it can't hurt, and the existing
default rowcount of 1000 is surely much too high for any of these
cases.

The psql.sql regression script is using _pg_expandarray() as a
test case for \sf+.  While we could keep doing so, the new one-line
function body makes a poor test case for \sf+ row-numbering, so
switch it to print another information_schema function.

Discussion: https://postgr.es/m/1424303.1703355485@sss.pgh.pa.us
2023-12-27 15:55:46 -05:00
Peter Eisentraut 390408ec08 Fix incorrect format placeholders 2023-12-27 17:39:10 +01:00
Tom Lane 98c6231d19 Fix incorrect data type choices in some read and write calls.
Recently-introduced code in reconstruct.c was using "unsigned"
to store the result of read(), pg_pread(), or write().  This is
completely bogus: it breaks subsequent tests for the result being
negative, as we're being reminded of by a chorus of buildfarm
warnings.  Switch to "int" as was doubtless intended.  (There are
several other uses of "unsigned" in this file that also look poorly
chosen to me, but for now I'm just trying to clean up the buildfarm.)

A larger problem is that "int" is not necessarily wide enough to hold
the result: per POSIX, all these functions return ssize_t.  In places
where the requested read or write length clearly fits in int, that's
academic.  It may be academic anyway as long as we constrain
individual data files to 1GB, since even a readv or writev-like
operation would then not be responsible for transferring more than
1GB.  Nonetheless it seems like trouble waiting to happen, so I made
a pass over readv and writev calls and fixed the result variables
where that seemed appropriate.  We might want to think about changing
some of the fd.c functions to return ssize_t too, for future-proofing;
but I didn't tackle that here.

Discussion: https://postgr.es/m/1672202.1703441340@sss.pgh.pa.us
2023-12-27 11:02:53 -05:00
Robert Haas da083b20f6 Initialize variable to placate compiler.
I don't think there's a real problem here, because if we reach
the loop over 'tles' then we will either find at least one
TimeLineHistoryEntry such that oldest_segno != 0, in which case
unsummarized_lsn will be initialized, or else unsummarized_tli
will remain 0 and an error will occur before unsummarized_lsn
is used for anything. But some compilers are complainining, as
reported on list by Nathan Bossart and off-list by Andrew Dunstan.

Discussion: http://postgr.es/m/20231223215147.GA69623@nathanxps13
2023-12-27 08:45:23 -05:00
Alexander Korotkov 7e6fb5da41 Improvements and fixes for e0b1ee17dc
e0b1ee17dc introduced optimization for matching B-tree scan keys required for
the directional scan.  However, it incorrectly assumed that all keys required
for opposite direction scan are satisfied by _bt_first().  It has been
illustrated that with multiple scan keys over the same column, a lesser one
(according to the scan direction) could win leaving the other one unsatisfied.

Instead of relying on _bt_first() this commit introduces code that memorizes
whether there was at least one match on the page.  If that's true we know that
keys required for opposite-direction scan are satisfied as soon as
corresponding values are not NULLs.

Also, this commit simplifies the description for the optimization of keys
required for the current direction scan.  Now the flag used for this is named
continuescanPrechecked and means exactly that *continuescan flag is known
to be true for the last item on the page.

Reported-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wzn0LeLcb1PdBnK0xisz8NpHkxRrMr3NWJ%2BKOK-WZ%2BQtTQ%40mail.gmail.com
Reviewed-by: Pavel Borisov
2023-12-27 14:35:08 +02:00
Alexander Korotkov 06b10f80ba Remove BTScanOpaqueData.firstPage
It's not necessary to keep the firstPage flag as a field of BTScanOpaqueData.
This commit makes it an argument of the _bt_readpage() function.  We can easily
distinguish first-time and repeated calls (within the scan) of this function.

Reported-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wzk4SOsw%2BtHuTFiz8U9Jqj-R77rYPkhWKODCBb1mdHACXA%40mail.gmail.com
Reviewed-by: Pavel Borisov
2023-12-27 14:21:49 +02:00
John Naylor 7d7ef075d2 Fix typo and case in messages
Follow up to dc2123400

Kyotaro Horiguchi

Discussion: https://postgr.es/m/20231222.154939.1509525390095583358.horikyota.ntt@gmail.com
Discussion: https://postgr.es/m/20231225.145124.1745560266993421173.horikyota.ntt@gmail.com
2023-12-27 13:31:54 +07:00
Alexander Korotkov e0477837ce Make replace_relid() leave argument unmodified
There are a lot of situations when we share the same pointer to a Bitmapset
structure across different places.  In order to evade undesirable side effects
replace_relid() function should always return a copy.

Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_wJthNtYBL%2BSsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw%40mail.gmail.com
Reviewed-by: Richard Guo, Andres Freund, Ashutosh Bapat, Andrei Lepikhov
2023-12-27 03:57:57 +02:00
Alexander Korotkov 7d58f2342b REALLOCATE_BITMAPSETS manual compile-time option
This option forces each bitmapset modification to reallocate bitmapset.  This
is useful for debugging hangling pointers to bitmapset's.

Discussion: https://postgr.es/m/CAMbWs4_wJthNtYBL%2BSsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw%40mail.gmail.com
Reviewed-by: Richard Guo, Andres Freund, Ashutosh Bapat, Andrei Lepikhov
2023-12-27 03:57:57 +02:00
Alexander Korotkov 71a3e8c43b Add asserts to bimapset manipulation functions
New asserts validate that arguments are really bitmapsets.  This should help
to early detect accesses to dangling pointers.

Discussion: https://postgr.es/m/CAMbWs4_wJthNtYBL%2BSsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw%40mail.gmail.com
Reviewed-by: Richard Guo, Andres Freund, Ashutosh Bapat, Andrei Lepikhov
2023-12-27 03:57:57 +02:00
Tom Lane 059de3ca47 Fix failure to verify PGC_[SU_]BACKEND GUCs in pg_file_settings view.
set_config_option() bails out early if it detects that the option to
be set is PGC_BACKEND or PGC_SU_BACKEND class and we're reading the
config file in a postmaster child; we don't want to apply any new
value in such a case.  That's fine as far as it goes, but it fails
to consider the requirements of the pg_file_settings view: for that,
we need to check validity of the value even though we have no
intention to apply it.  Because we didn't, even very silly values
for affected GUCs would be reported as valid by the view.  There
are only half a dozen such GUCs, which perhaps explains why this
got overlooked for so long.

Fix by continuing when changeVal is false; this parallels the logic
in some other early-exit paths.

Also, the check added by commit 924bcf4f1 to prevent GUC changes in
parallel workers seems a few bricks shy of a load: it's evidently
assuming that ereport(elevel, ...) won't return.  Make sure we
bail out if it does.  The lack of trouble reports suggests that
this is only a latent bug, i.e. parallel workers don't actually
reach here with elevel < ERROR.  (Per the code coverage report,
we never reach here at all in the regression suite.)  But we clearly
don't want to risk proceeding if that does happen.

Per report from Rıdvan Korkmaz.  These are ancient bugs, so back-patch
to all supported branches.

Discussion: https://postgr.es/m/2089235.1703617353@sss.pgh.pa.us
2023-12-26 17:57:48 -05:00
Peter Eisentraut d327f418d1 Remove unused macro
Usage was removed in 6c5576075b but the definition was not removed.
2023-12-26 20:13:11 +01:00
Alexander Korotkov 0a93f803f4 Fix a comment for remove_self_joins_recurse()
Discussion: https://postgr.es/m/18187-831da249cbd2ff8e%40postgresql.org
Author: Richard Guo
Reviewed-by: Andrei Lepikhov
2023-12-25 01:33:34 +02:00
Alexander Korotkov b5fb6736ed Don't constrain self-join removal due to PHVs
Self-join removal appears to be safe to apply with placeholder variables
as long as we handle PlaceHolderVar in replace_varno_walker() and replace
relid in phinfo->ph_lateral.

Discussion: https://postgr.es/m/18187-831da249cbd2ff8e%40postgresql.org
Author: Richard Guo
Reviewed-by: Andrei Lepikhov
2023-12-25 01:33:26 +02:00
Alexander Korotkov 8a8ed916f7 Handle PlaceHolderVar case in replace_varno_walker
This commit also retires sje_walker.  This increases the generalty of replacing
varno in the parse tree and simplifies the code.

Discussion: https://postgr.es/m/18187-831da249cbd2ff8e%40postgresql.org
Author: Richard Guo
Reviewed-by: Andrei Lepikhov
2023-12-25 01:33:08 +02:00
Alexander Korotkov 12915a58ee Enhance checkpointer restartpoint statistics
Bhis commit introduces enhancements to the pg_stat_checkpointer view by adding
three new columns: restartpoints_timed, restartpoints_req, and
restartpoints_done. These additions aim to improve the visibility and
monitoring of restartpoint processes on replicas.

Previously, it was challenging to differentiate between successful and failed
restartpoint requests. This limitation arises because restartpoints on replicas
are dependent on checkpoint records from the primary, and cannot occur more
frequently than these checkpoints.

The new columns allow for clear distinction and tracking of restartpoint
requests, their triggers, and successful completions.  This enhancement aids
database administrators and developers in better understanding and diagnosing
issues related to restartpoint behavior, particularly in scenarios where
restartpoint requests may fail.

System catalog is changed.  Catversion is bumped.

Discussion: https://postgr.es/m/99b2ccd1-a77a-962a-0837-191cdf56c2b9%40inbox.ru
Author: Anton A. Melnikov
Reviewed-by: Kyotaro Horiguchi, Alexander Korotkov
2023-12-25 01:12:36 +02:00
Peter Eisentraut 3e2e0d5ad7 Set all variable-length fields of pg_attribute to null on column drop
When a column is dropped, the fields attacl, attoptions, and
attfdwoptions were kept unchanged.  This is probably harmless, but it
seems wasteful, and leaves potentially dangling data lying around (for
example, attacl could contain references to users that are later also
dropped).

Change this to set those fields to null when a column is marked as
dropped.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/249d819d-1763-4580-8110-0bf91a0f08b7@eisentraut.org
2023-12-22 21:44:55 +01:00
Robert Haas ffc6ab9b56 Replace nonsense comment with a relevant one.
Per report from Alexander Lakhin.

Discussion: http://postgr.es/m/061cccf7-0cac-804f-4c2a-9d6da8e3848b@gmail.com
2023-12-21 16:03:03 -05:00
Robert Haas 49f2194ed5 Fix numerous typos in incremental backup commits.
Apparently, spell check would have been a really good idea.

Alexander Lakhin, with a few additions as per an off-list report
from Andres Freund.

Discussion: http://postgr.es/m/f08f7c60-1ad3-0b57-d580-54b11f07cddf@gmail.com
2023-12-21 15:36:17 -05:00
Tom Lane 903737c5bf Avoid trying to fetch metapage of an SPGist partitioned index.
This is necessary when spgcanreturn() is invoked on a partitioned
index, and the failure might be reachable in other scenarios as
well.  The rest of what spgGetCache() does is perfectly sensible
for a partitioned index, so we should allow it to go through.

I think the main takeaway from this is that we lack sufficient test
coverage for non-btree partitioned indexes.  Therefore, I added
simple test cases for brin and gin as well as spgist (hash and
gist AMs were covered already in indexing.sql).

Per bug #18256 from Alexander Lakhin.  Although the known test case
only fails since v16 (3c569049b), I've got no faith at all that there
aren't other ways to reach this problem; so back-patch to all
supported branches.

Discussion: https://postgr.es/m/18256-0b0e1b6e4a620f1b@postgresql.org
2023-12-21 12:43:36 -05:00
Dean Rasheed a0ff37173d Fix BEFORE ROW trigger handling in cross-partition MERGE update.
Fix a bug during MERGE if a cross-partition update is attempted on a
partitioned table with a BEFORE DELETE ROW trigger that returns NULL,
to prevent the update. This would cause an error to be thrown, or an
assert failure in an assert-enabled build.

This was an oversight in 9321c79c86, which failed to properly
distinguish a DELETE prevented by a trigger from one prevented by a
concurrent update. Fix by having ExecDelete() return the TM_Result
status to ExecCrossPartitionUpdate(), so that it can distinguish the
two cases, and make ExecCrossPartitionUpdate() return the TM_Result
status to ExecUpdateAct(), so that it can return the correct status
from a concurrent update.

In addition, ensure that the command tag is correctly updated by
having ExecMergeMatched() pass canSetTag to ExecUpdateAct(), rather
than passing false, so that it updates the command tag if it does a
cross-partition update, making this code path in ExecMergeMatched()
consistent with ExecUpdate().

Per bug #18238 from Alexander Lakhin. Back-patch to v15, where MERGE
was introduced.

Dean Rasheed, reviewed by Richard Guo and Jian He.

Discussion: https://postgr.es/m/18238-2f2bdc7f720180b9%40postgresql.org
2023-12-21 12:55:22 +00:00
Peter Eisentraut e557db106e Fix prologue of get_partition_ancestors()
The callers of this function assume that the first Oid in the list
returned by this function corresponds to the immediate parent and the
last on corresponds to the topmost parent.  Make that explicit in the
function prologue.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAExHW5vCbATEmht861=G-BFPHNwLUqyeGa_=8-xibJ6Q1UxAeA@mail.gmail.com
2023-12-21 11:42:12 +01:00
Masahiko Sawada bf6260b39d Show isCatalogRel in several rmgr descriptions.
Commit 6af179395 added isCatalogRel field to some WAL record types,
but this field was not shown in the rmgr descriptions. This commit
changes the several rmgr descriptions to display the isCatalogRel
field.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier, Masahiko Sawada
Discussion: https://postgr.es/m/957dc8f9-2a02-4640-9c01-9dcbf97c4187%40gmail.com
2023-12-21 10:09:38 +09:00
Robert Haas dc21234005 Add support for incremental backup.
To take an incremental backup, you use the new replication command
UPLOAD_MANIFEST to upload the manifest for the prior backup. This
prior backup could either be a full backup or another incremental
backup.  You then use BASE_BACKUP with the INCREMENTAL option to take
the backup.  pg_basebackup now has an --incremental=PATH_TO_MANIFEST
option to trigger this behavior.

An incremental backup is like a regular full backup except that
some relation files are replaced with files with names like
INCREMENTAL.${ORIGINAL_NAME}, and the backup_label file contains
additional lines identifying it as an incremental backup. The new
pg_combinebackup tool can be used to reconstruct a data directory
from a full backup and a series of incremental backups.

Patch by me.  Reviewed by Matthias van de Meent, Dilip Kumar, Jakub
Wartak, Peter Eisentraut, and Álvaro Herrera. Thanks especially to
Jakub for incredibly helpful and extensive testing.

Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com
2023-12-20 09:49:12 -05:00
Robert Haas 174c480508 Add a new WAL summarizer process.
When active, this process writes WAL summary files to
$PGDATA/pg_wal/summaries. Each summary file contains information for a
certain range of LSNs on a certain TLI. For each relation, it stores a
"limit block" which is 0 if a relation is created or destroyed within
a certain range of WAL records, or otherwise the shortest length to
which the relation was truncated during that range of WAL records, or
otherwise InvalidBlockNumber. In addition, it stores a list of blocks
which have been modified during that range of WAL records, but
excluding blocks which were removed by truncation after they were
modified and never subsequently modified again.

In other words, it tells us which blocks need to copied in case of an
incremental backup covering that range of WAL records. But this
doesn't yet add the capability to actually perform an incremental
backup; the next patch will do that.

A new parameter summarize_wal enables or disables this new background
process.  The background process also automatically deletes summary
files that are older than wal_summarize_keep_time, if that parameter
has a non-zero value and the summarizer is configured to run.

Patch by me, with some design help from Dilip Kumar and Andres Freund.
Reviewed by Matthias van de Meent, Dilip Kumar, Jakub Wartak, Peter
Eisentraut, and Álvaro Herrera.

Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com
2023-12-20 08:42:28 -05:00
Jeff Davis 766571be16 Additional write barrier in AdvanceXLInsertBuffer().
First, mark the xlblocks member with InvalidXLogRecPtr, then issue a
write barrier, then initialize it. That ensures that the xlblocks
member doesn't appear valid while the contents are being initialized.

In preparation for reading WAL buffer contents without a lock.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com
Reviewed-by: Andres Freund
2023-12-19 17:35:54 -08:00
Jeff Davis c3a8e2a7cb Use 64-bit atomics for xlblocks array elements.
In preparation for reading the contents of WAL buffers without a
lock. Also, avoids the previously-needed comment in GetXLogBuffer()
explaining why it's safe from torn reads.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com
Reviewed-by: Andres Freund
2023-12-19 17:35:42 -08:00
Michael Paquier 1301c80b21 Remove MSVC scripts
This commit removes all the scripts located in src/tools/msvc/ to build
PostgreSQL with Visual Studio on Windows, meson becoming the recommended
way to achieve that.  The scripts held some information that is still
relevant with meson, information kept and moved to better locations.
Comments that referred directly to the scripts are removed.

All the documentation still relevant that was in install-windows.sgml
has been moved to installation.sgml under a new subsection for Visual.
All the content specific to the scripts is removed.  Some adjustments
for the documentation are planned in a follow-up set of changes.

Author: Michael Paquier
Reviewed-by: Peter Eisentraut, Andres Freund
Discussion: https://postgr.es/m/ZQzp_VMJcerM1Cs_@paquier.xyz
2023-12-20 09:44:37 +09:00
Robert Haas 47f01d727e Fix brown paper bag bug in 5c47c6546c.
The previous logic failed to work for anything other than the first
segment of a relation.

Report by Jakub Wartak. Patch by me.

Discussion: http://postgr.es/m/CAKZiRmwd3KTNMQhm9Bv4oR_1uMehXroO6kGyJQkiw9DfM8cMwQ@mail.gmail.com
2023-12-19 15:00:23 -05:00
Tom Lane 7e1ce2b3de Prevent integer overflow when forming tuple width estimates.
It's at least theoretically possible to overflow int32 when adding up
column width estimates to make a row width estimate.  (The bug example
isn't terribly convincing as a real use-case, but perhaps wide joins
would provide a more plausible route to trouble.)  This'd lead to
assertion failures or silly planner behavior.  To forestall it, make
the relevant functions compute their running sums in int64 arithmetic
and then clamp to int32 range at the end.  We can reasonably assume
that MaxAllocSize is a hard limit on actual tuple width, so clamping
to that is simply a correction for dubious input values, and there's
no need to go as far as widening width variables to int64 everywhere.

Per bug #18247 from RekGRpth.  There've been no reports of this issue
arising in practical cases, so I feel no need to back-patch.

Richard Guo and Tom Lane

Discussion: https://postgr.es/m/18247-11ac477f02954422@postgresql.org
2023-12-19 11:12:16 -05:00
Heikki Linnakangas 3c080fb4fa Simplify newNode() by removing special cases
- Remove MemoryContextAllocZeroAligned(). It was supposed to be a
  faster version of MemoryContextAllocZero(), but modern compilers turn
  the MemSetLoop() into a call to memset() anyway, making it more or
  less identical to MemoryContextAllocZero(). That was the only user of
  MemSetTest, MemSetLoop, so remove those too, as well as palloc0fast().

- Convert newNode() to a static inline function. When this was
  originally originally written, it was written as a macro because
  testing showed that gcc didn't inline the size check as we
  intended. Modern compiler versions do, and now that it just calls
  palloc0() there is no size-check to inline anyway.

One nice effect is that the palloc0() takes one less argument than
MemoryContextAllocZeroAligned(), which saves a few instructions in the
callers of newNode().

Reviewed-by: Peter Eisentraut, Tom Lane, John Naylor, Thomas Munro
Discussion: https://www.postgresql.org/message-id/b51f1fa7-7e6a-4ecc-936d-90a8a1659e7c@iki.fi
2023-12-19 12:11:47 +02:00
Amit Kapila c8bc807cf8 pgoutput: Raise an error for missing protocol version parameter.
Currently, we give a misleading error if the user omits to pass the
required parameter 'proto_version' in SQL API
pg_logical_slot_get_changes() or during START_REPLICATION protocol
message. The error raised is as follows which indicates that the wrong
version is passed.
ERROR:  client sent proto_version=0 but server only supports protocol 1 or higher

Author: Emre Hasegeli
Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/CAE2gYzwdwtUbs-tPSV-QBwgTubiyGD2ZGsSnAVsDfAGGLDrGOA@mail.gmail.com
2023-12-19 09:53:33 +05:30
Tom Lane 8b965c549d compute_bitmap_pages' loop_count parameter should be double not int.
The value was double in the original implementation of this logic.
Commit da08a6598 pulled it out into a subroutine, but carelessly
declared the parameter as int when it should have been double.
On most platforms, the only ill effect would be to clamp the value
to be not more than INT_MAX, which would seldom be exceeded and
probably wouldn't change the estimates too much anyway.  Nonetheless,
it's wrong and can cause complaints from ubsan.

While here, improve the comments and parameter names.

This is an ABI change in a globally exposed subroutine, so
back-patching would create some risk of breaking extensions.
The value of the fix doesn't seem high enough to warrant taking
that risk, so fix in HEAD only.

Per report from Alexander Lakhin.

Discussion: https://postgr.es/m/f5e15fe1-202d-1936-f47c-f0c69a936b72@gmail.com
2023-12-18 12:46:10 -05:00
Nathan Bossart 0d1adae6f7 Micro-optimize datum_to_json_internal() some more.
Commit dc3f9bc549 mainly targeted the JSONTYPE_NUMERIC code path.
This commit applies similar optimizations (e.g., removing
unnecessary runtime calls to strlen() and palloc()) to nearby code.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20231208203708.GA4126315%40nathanxps13
2023-12-18 10:34:33 -06:00
Thomas Munro 4908c58720 Provide vectored variants of smgrread() and smgrwrite().
smgrreadv() and smgrwritev() and their md.c implementations call
FileReadV() and FileWriteV().  A range of disk blocks beginning at
'blocknum' and extending for 'nblocks' can be scattered to or gathered
from multiple buffers with a single system call.  The traditional
smgrread() and smgrwrite() functions are implemented in terms of the new
functions.

Later commits will introduce calls with nblocks > 1, but the following
behavioral changes can be seen already:

* After a short transfer we'll now retry until we eventually read 0
  bytes (= EOF) or get ENOSPC, EDQUOT, EFBIG etc, where previously we
  would infer the reason.  Retrying is consistent with xlog.c's
  treatment of large WAL writes, and arguably also xlog.c and fd.c's
  treatment of EINTR.  Arbitrary short returns for larger transfers have
  been observed on several OSes, and might in theory also happen for
  transient reasons with our own pg_p*v() fallback code.

* After unexpected EOF or -1, the error thrown now talks about
  a range even for the single block case, eg "blocks 42..42".

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com
2023-12-18 15:01:50 +13:00
Michael Paquier 3c9d9acae0 Refactor pgstat_prepare_io_time() with an input argument instead of a GUC
Originally, this routine relied on track_io_timing to check if a time
interval for an I/O operation stored in pg_stat_io should be initialized
or not.  However, the addition of WAL statistics to pg_stat_io requires
that the initialization happens when track_wal_io_timing is enabled,
which is dependent on the code path where the I/O operation happens.

Author: Nazir Bilal Yavuz
Discussion: https://postgr.es/m/CAN55FZ3AiQ+ZMxUuXnBpd0Rrh1YhwJ5FudkHg=JU0P+-W8T4Vg@mail.gmail.com
2023-12-16 20:16:20 +01:00
Alvaro Herrera a6be0600ac
Remove useless LIMIT_OPTION_DEFAULT value from LimitOption
During the development that led to commit 357889eb17, for a time we
had the value LIMIT_OPTION_DEFAULT, which was mostly but not completely
removed later on, before commit.  Complete the removal now.

Author: Zhang Mingli <avamingli@gmail.com>
Discussion: https://postgr.es/m/59d61a1a-3858-475a-964f-24468c97cc67@Spark
2023-12-16 18:20:03 +01:00
Thomas Munro b485ad7f07 Provide multi-block smgrprefetch().
Previously smgrprefetch() could issue POSIX_FADV_WILLNEED advice for a
single block at a time.  Add an nblocks argument so that we can do the
same for a range of blocks.  This usually produces a single system call,
but might need to loop if it crosses a segment boundary.  Initially it
is only called with nblocks == 1, but proposed patches will make wider
calls.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> (earlier version)
Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com
2023-12-16 17:51:21 +13:00
Tom Lane 59bd34c2fa Fix bugs in manipulation of large objects.
In v16 and up (since commit afbfc0298), large object ownership
checking has been broken because object_ownercheck() didn't take care
of the discrepancy between our object-address representation of large
objects (classId == LargeObjectRelationId) and the catalog where their
ownership info is actually stored (LargeObjectMetadataRelationId).
This resulted in failures such as "unrecognized class ID: 2613"
when trying to update blob properties as a non-superuser.

Poking around for related bugs, I found that AlterObjectOwner_internal
would pass the wrong classId to the PostAlterHook in the no-op code
path where the large object already has the desired owner.  Also,
recordExtObjInitPriv checked for the wrong classId; that bug is only
latent because the stanza is dead code anyway, but as long as we're
carrying it around it should be less wrong.  These bugs are quite old.

In HEAD, we can reduce the scope for future bugs of this ilk by
changing AlterObjectOwner_internal's API to let the translation happen
inside that function, rather than requiring callers to know about it.

A more bulletproof fix, perhaps, would be to start using
LargeObjectMetadataRelationId as the dependency and object-address
classId for blobs.  However that has substantial risk of breaking
third-party code; even within our own code, it'd create hassles
for pg_dump which would have to cope with a version-dependent
representation.  For now, keep the status quo.

Discussion: https://postgr.es/m/2650449.1702497209@sss.pgh.pa.us
2023-12-15 13:55:05 -05:00
Michael Paquier 8a7cbfce13 Prevent tuples to be marked as dead in subtransactions on standbys
Dead tuples are ignored and are not marked as dead during recovery, as
it can lead to MVCC issues on a standby because its xmin may not match
with the primary.  This information is tracked by a field called
"xactStartedInRecovery" in the transaction state data, switched on when
starting a transaction in recovery.

Unfortunately, this information was not correctly tracked when starting
a subtransaction, because the transaction state used for the
subtransaction did not update "xactStartedInRecovery" based on the state
of its parent.  This would cause index scans done in subtransactions to
return inconsistent data, depending on how the xmin of the primary
and/or the standby evolved.

This is broken since the introduction of hot standby in efc16ea520, so
backpatch all the way down.

Author: Fei Changhong
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/tencent_C4D907A5093C071A029712E73B43C6512706@qq.com
Backpatch-through: 12
2023-12-12 17:05:18 +01:00
Daniel Gustafsson c5962ad21a Fix typo in comment
Commit 98e675ed7a accidentally mistyped IDENTIFY_SYSTEM as
IDENTIFY_SERVER. Backpatch to all supported branches.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/68138521-5345-8780-4390-1474afdcba1f@gmail.com
2023-12-12 12:16:38 +01:00
Thomas Munro 871fe4917e Provide vectored variants of FileRead() and FileWrite().
FileReadV() and FileWriteV() adapt pg_preadv() and pg_pwritev() for
fd.c's virtual file descriptors.  The simple FileRead() and FileWrite()
functions are now implemented in terms of the vectored functions, to
avoid code duplication, and they are converted back to the corresponding
simple system calls further down (commit 15c9ac36).  Later work will
make more interesting multi-iovec calls.

The traditional behavior of reporting a "fake" ENOSPC error is
simplified.  It's now always set for non-failing writes, for the benefit
of callers that expect to log a meaningful "%m" if they determine that
the write was short.  (Perhaps we should consider getting rid of that
expectation one day.)

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com
2023-12-12 13:12:43 +13:00
Tom Lane 0a5c46a7a4 Be more wary about OpenSSL not setting errno on error.
OpenSSL will sometimes return SSL_ERROR_SYSCALL without having set
errno; this is apparently a reflection of recv(2)'s habit of not
setting errno when reporting EOF.  Ensure that we treat such cases
the same as read EOF.  Previously, we'd frequently report them like
"could not accept SSL connection: Success" which is confusing, or
worse report them with an unrelated errno left over from some
previous syscall.

To fix, ensure that errno is zeroed immediately before the call,
and report its value only when it's not zero afterwards; otherwise
report EOF.

For consistency, I've applied the same coding pattern in libpq's
pqsecure_raw_read().  Bare recv(2) shouldn't really return -1 without
setting errno, but in case it does we might as well cope.

Per report from Andres Freund.  Back-patch to all supported versions.

Discussion: https://postgr.es/m/20231208181451.deqnflwxqoehhxpe@awork3.anarazel.de
2023-12-11 11:51:56 -05:00
Alvaro Herrera d3fe6e90ba
Simplify productions for FORMAT JSON [ ENCODING name ]
This removes the production json_encoding_clause_opt, instead merging
it into json_format_clause.  Also remove the auxiliary
makeJsonEncoding() function.

Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/202312071841.u2gueb5dsrbk%40alvherre.pgsql
2023-12-11 11:55:34 +01:00
Michael Paquier c7a3e6b46d Remove trace_recovery_messages
This GUC was intended as a debugging help in the 9.0 area when hot
standby and streaming replication were being developped, able to offer
more information at LOG level rather than DEBUGn.  There are more tools
available these days that are able to offer rather equivalent
information, like pg_waldump introduced in 9.3.  It is not obvious how
this facility is useful these days, so let's remove it.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/ZXEXEAUVFrvpquSd@paquier.xyz
2023-12-11 11:49:02 +01:00
Amit Kapila 8d7d2197f3 Fix an undetected deadlock due to apply worker.
The apply worker needs to update the state of the subscription tables to
'READY' during the synchronization phase which requires locking the
corresponding subscription. The apply worker also waits for the
subscription tables to reach the 'SYNCDONE' state after holding the locks
on the subscription and the wait is done using WaitLatch. The 'SYNCDONE'
state is changed by tablesync workers again by locking the corresponding
subscription. Both the state updates use AccessShareLock mode to lock the
subscription, so they can't block each other. However, a backend can
simultaneously try to acquire a lock on the same subscription using
AccessExclusiveLock mode to alter the subscription. Now, the backend's
wait on a lock can sneak in between the apply worker and table sync worker
causing deadlock.

In other words, apply_worker waits for tablesync worker which waits for
backend, and backend waits for apply worker. This is not detected by the
deadlock detector because apply worker uses WaitLatch.

The fix is to release existing locks in apply worker before it starts to
wait for tablesync worker to change the state.

Reported-by: Tomas Vondra
Author: Shlok Kyal
Reviewed-by: Amit Kapila, Peter Smith
Backpatch-through: 12
Discussion: https://postgr.es/m/d291bb50-12c4-e8af-2af2-7bb9bb4d8e3e@enterprisedb.com
2023-12-11 08:50:43 +05:30
Peter Geoghegan aa210e0c12 Fix nbtree backward scan race condition comments.
Remove comments that supposed that holding a pin was a useful interlock
for _bt_walk_left().  There are times when _bt_walk_left() doesn't hold
either a lock or a pin on any page, so clearly this can't be true.
_bt_walk_left() is even prepared to deal with concurrent deletion of
both the original page and any pages to its left.

Oversight in commit 2ed5b87f96.
2023-12-08 15:37:53 -08:00
Nathan Bossart dc3f9bc549 Micro-optimize JSONTYPE_NUMERIC code path in json.c.
This commit does the following:

* In datum_to_json_internal(), the call to IsValidJsonNumber() is
  replaced with simplified validation code.  This avoids an extra
  call to strlen() in this path, and it avoids validating the
  entire string (which is okay since we know we're dealing with a
  numeric data type's output).

* In datum_to_json_internal(), the call to escape_json() in the
  JSONTYPE_NUMERIC path is replaced with code that just surrounds
  the string with quotes.  In passing, some other nearby calls to
  appendStringInfo() have been replaced with similar code to avoid
  unnecessary calls to vsnprintf().

* In composite_to_json(), the length of the separator is now
  determined at compile time to avoid unnecessary calls to
  strlen().

On my machine, this speeds up a benchmark for the proposed COPY TO
(FORMAT json) command with many integers by upwards of 20%.  There
are likely other code paths that could be given a similar
treatment, but that is left as a future exercise.

Reviewed-by: Jeff Davis, Tom Lane, David Rowley, John Naylor
Discussion: https://postgr.es/m/20231207231251.GB3359478%40nathanxps13
2023-12-08 13:39:08 -06:00
Jeff Davis 867dd2dc87 Cache opaque handle for GUC option to avoid repeasted lookups.
When setting GUCs from proconfig, performance is important, and hash
lookups in the GUC table are significant.

Per suggestion from Robert Haas.

Discussion: https://postgr.es/m/CA+TgmoYpKxhR3HOD9syK2XwcAUVPa0+ba0XPnwWBcYxtKLkyxA@mail.gmail.com
Reviewed-by: John Naylor
2023-12-08 11:16:01 -08:00
Peter Geoghegan c9c0589fda Optimize nbtree backward scan boundary cases.
Teach _bt_binsrch (and related helper routines like _bt_search and
_bt_compare) about the initial positioning requirements of backward
scans.  Routines like _bt_binsrch already know all about "nextkey"
searches, so it seems natural to teach them about "goback"/backward
searches, too.  These concepts are closely related, and are much easier
to understand when discussed together.

Now that certain implementation details are hidden from _bt_first, it's
straightforward to add a new optimization: backward scans using the <
strategy now avoid extra leaf page accesses in certain "boundary cases".
Consider the following example, which uses the tenk1 table (and its
tenk1_hundred index) from the standard regression tests:

SELECT * FROM tenk1 WHERE hundred < 12 ORDER BY hundred DESC LIMIT 1;

Before this commit, nbtree would scan two leaf pages, even though it was
only really necessary to scan one leaf page.  We'll now descend straight
to the leaf page containing a (12, -inf) high key instead.  The scan
will locate matching non-pivot tuples with "hundred" values starting
from the value 11.  The scan won't waste a page access on the right
sibling leaf page, which cannot possibly contain any matching tuples.

You can think of the optimization added by this commit as disabling an
optimization (the _bt_compare "!pivotsearch" behavior that was added to
Postgres 12 in commit dd299df8) for a small subset of cases where it was
always counterproductive.

Equivalently, you can think of the new optimization as extending the
"pivotsearch" behavior that page deletion by VACUUM has long required
(since the aforementioned Postgres 12 commit went in) to other, similar
cases.  Obviously, this isn't strictly necessary for these new cases
(unlike VACUUM, _bt_first is prepared to move the scan to the left once
on the leaf level), but the underlying principle is the same.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=XPzM8HzaLPq278Vms420mVSHfgs9wi5tjFKHcapZCEw@mail.gmail.com
2023-12-08 11:05:17 -08:00
Tomas Vondra b437571714 Allow parallel CREATE INDEX for BRIN indexes
Allow using multiple worker processes to build BRIN index, which until
now was supported only for BTREE indexes. For large tables this often
results in significant speedup when the build is CPU-bound.

The work is split in a simple way - each worker builds BRIN summaries on
a subset of the table, determined by the regular parallel scan used to
read the data, and feeds them into a shared tuplesort which sorts them
by blkno (start of the range). The leader then reads this sorted stream
of ranges, merges duplicates (which may happen if the parallel scan does
not align with BRIN pages_per_range), and adds the resulting ranges into
the index.

The number of duplicate results produced by workers (requiring merging
in the leader process) should be fairly small, thanks to how parallel
scans assign chunks to workers. The likelihood of duplicate results may
increase for higher pages_per_range values, but then there are fewer
page ranges in total. In any case, we expect the merging to be much
cheaper than summarization, so this should be a win.

Most of the parallelism infrastructure is a simplified copy of the code
used by BTREE indexes, omitting the parts irrelevant for BRIN indexes
(e.g. uniqueness checks).

This also introduces a new index AM flag amcanbuildparallel, determining
whether to attempt to start parallel workers for the index build.

Original patch by me, with reviews and substantial reworks by Matthias
van de Meent, certainly enough to make him a co-author.

Author: Tomas Vondra, Matthias van de Meent
Reviewed-by: Matthias van de Meent
Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com
2023-12-08 18:15:26 +01:00
Tomas Vondra dae761a87e Add empty BRIN ranges during CREATE INDEX
When building BRIN indexes, the brinbuildCallback only advances to the
next page range when seeing a tuple that doesn't belong to the current
one. This means that the index may end up missing ranges at the end of
the table, if those pages do not contain any indexable tuples.

We tend not to have completely empty pages at the end of a relation, but
this also applies to partial indexes, where the tuples may simply not
match the index predicate. This results in inefficient scans using the
affected BRIN index - without the summaries, the page ranges have to be
read and processed, which consumes I/O and possibly also CPU time.

The existing code already added empty ranges for earlier parts of the
table, this commit makes sure we add them for the ranges at the end of
the table too.

Patch by Matthias van de Meent, with review/improvements by me.

Author: Matthias van de Meent
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CAEze2WiMsPZg%3DxkvSF_jt4%3D69k6K7gz5B8V2wY3gCGZ%2B1BzCbQ%40mail.gmail.com
2023-12-08 17:14:32 +01:00
Heikki Linnakangas 44913add91 Remove some unnecessary #includes of postmaster/interrupt.h
Commit 44fc6e259b removed a bunch of references to
SignalHandlerForCrashExit, leaving these #includes unneeded.
2023-12-08 13:19:37 +02:00
Heikki Linnakangas b31ba5310b Rename ShmemVariableCache to TransamVariables
The old name was misleading: It's not a cache, the values kept in the
struct are the authoritative source.

Reviewed-by: Tristan Partin, Richard Guo
Discussion: https://www.postgresql.org/message-id/6537d63d-4bb5-46f8-9b5d-73a8ba4720ab@iki.fi
2023-12-08 09:47:15 +02:00
Heikki Linnakangas 15916ffb04 Initialize ShmemVariableCache like other shmem areas
For sake of consistency.

Reviewed-by: Tristan Partin, Richard Guo
Discussion: https://www.postgresql.org/message-id/6537d63d-4bb5-46f8-9b5d-73a8ba4720ab@iki.fi
2023-12-08 09:46:59 +02:00
Heikki Linnakangas 049ef3398d Don't try to open visibilitymap when analyzing a foreign table
It's harmless, visibilitymap_count() returns 0 if the file doesn't
exist. But it's also very pointless. I noticed this when I added an
assertion in smgropen() that the relnumber is valid.

Discussion: https://www.postgresql.org/message-id/621a52fd-3cd8-4f5d-a561-d510b853bbaf@iki.fi
2023-12-08 09:16:21 +02:00
Thomas Munro cd7f19da34 Fix potential pointer overflow in xlogreader.c.
While checking if a record could fit in the circular WAL decoding
buffer, the coding from commit 3f1ce973 used arithmetic that could
overflow.  64 bit systems were unaffected for various technical reasons,
which probably explains the lack of problem reports.  Likewise for 32
bit systems running known 32 bit kernels.  The systems at risk of
problems appear to be 32 bit processes running on 64 bit kernels, with
unlucky placement in memory.

Per complaint from GCC -fsanitize=undefined -m32, while testing
variations of 039_end_of_wal.pl.

Back-patch to 15.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKH0oRPOX7DhiQ_b51sM8HqcPp2J3WA-Oen%3DdXog%2BAGGQ%40mail.gmail.com
2023-12-08 16:09:03 +13:00
David Rowley d16a0c1e2e Verify that attribute counts match in ExecCopySlot
tts_virtual_copyslot() contained an Assert that checked that the srcslot
contained <= attributes than the dstslot.  This seems to be backwards as
if the srcslot contained fewer attributes then the dstslot could be left
with stale Datum values from the previously stored tuple.  It might make
more sense to allow the source to contain additional attributes and only
copy the leading ones that also exist in the destination, however, that's
not what we're doing here.

Here we just remove the Assert from tts_virtual_copyslot() and add an
Assert to ExecCopySlot() to verify the attribute counts match.  There
does not seem to be any places where the destination contains fewer
attributes, so instead of going to the trouble of making the code
properly handle this, let's just ensure the attribute counts match.  If
this Assert fails then that will indicate that we do have cases that
require us to handle the srcslot with more attributes than the dstslot.
It seems better to only write this code if there's a genuine requirement
for it rather than write it now only to leave it untested.

Thanks to Andres Freund for helping with the analysis of this.

Discussion: https://postgr.es/m/CAApHDvpMAvBL0T+TRORquyx1iqFQKMVTXtqNocOw0Pa2uh1heg@mail.gmail.com
2023-12-07 21:28:24 +13:00
Michael Paquier d43bd090a8 Improve some error messages with invalid indexes for REINDEX CONCURRENTLY
An invalid index is skipped when doing REINDEX CONCURRENTLY at table
level, with INDEX_CORRUPTED used as errcode.  This is confusing,
because an invalid index could exist after an interruption.  The errcode
is switched to OBJECT_NOT_IN_PREREQUISITE_STATE instead, as per a
suggestion from Andres Freund.

While on it, the error messages are reworded, and a hint is added,
telling how to rebuild an invalid index in this case.  This has been
suggested by Noah Misch.

Discussion: https://postgr.es/m/20231118230958.4fm3fhk4ypshxopa@awork3.anarazel.de
2023-12-07 14:27:54 +09:00
Amit Kapila 0bf62460bb Fix issues in binary_upgrade_logical_slot_has_caught_up().
The commit 29d0a77fa6 labelled binary_upgrade_logical_slot_has_caught_up()
as a non-strict function to allow providing a better error message to callers
in case the passed slot_name is NULL. On further discussion, it seems that
it is not helpful to have a different error message for NULL input in this
function, so this patch marks the function as strict.

This patch also removes the explicit permission check to use replication
slots as this function is invoked only by superusers and instead adds an
Assert.

Reported-by: Masahiko Sawada
Author: Hayato Kuroda
Reviewed-by: Vignesh C
Discussion: https://postgr.es/m/CAD21AoDSyiBKkMXBxN_gUayZZUCOgyHnG8Ge8rcPXNP3Tf6B4g@mail.gmail.com
2023-12-07 08:42:48 +05:30
Michael Paquier c426f7c2b3 Fix assertion failure with REINDEX and event triggers
A REINDEX CONCURRENTLY run on a table with no indexes would always pop
the topmost snapshot from the active snapshot stack, making the snapshot
handling inconsistent between the multiple-relation and single-relation
cases.  This commit slightly changes the snapshot stack handling so as a
snapshot is popped only ReindexMultipleInternal() in this case after a
relation has been reindexed, fixing a problem where an event trigger
function may need a snapshot but does not have one.  This also keeps the
places where PopActiveSnapshot() is called closer to each other.

While on it, this expands the existing tests to cover all the cases that
could be faced with REINDEX commands and such event triggers, for one or
more relations, with or without indexes.

This behavior is inconsistent since 5dc92b844e, but we've never had a
need for an active snapshot at the end of a REINDEX until now.

Thanks also to Jian He for the input.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/cb538743-484c-eb6a-a8c5-359980cd3a17@gmail.com
2023-12-07 08:31:02 +09:00
Michael Paquier 7636725b92 Fix compilation on Windows with WAL_DEBUG
This has been broken since b060dbe000 that has reworked the callback
mechanism of XLogReader, most likely unnoticed because any form of
development involving WAL happens on platforms where this compiles fine.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVF14WKQMFwcJ=3okVDhiXpuK5f7YdT+BdYXbbypMHqWA@mail.gmail.com
Backpatch-through: 13
2023-12-06 14:10:39 +09:00
Daniel Gustafsson 4d0cf0b05d Fix indentation
When preparing commit 98e675ed7a I accidentally forgot to run
pgindent, which did produce a diff. Fix by adding the required
whitespace per the koel buildfarm failure.
2023-12-05 15:54:59 +01:00
Daniel Gustafsson 98e675ed7a Fix incorrect error message for IDENTIFY_SYSTEM
Commit 5a991ef869 accidentally reversed the order of the tuples
and fields parameters, making the error message incorrectly refer
to 3 tuples with 1 field when IDENTIFY_SYSTEM returns 1 tuple and
3 or 4 fields. Fix by changing the order of the parameters.  This
also adds a comment describing why we check for < 3 when postgres
since 9.4 has been sending 4 fields.

Backpatch all the way since the bug is almost a decade old.

Author: Tomonari Katsumata <t.katsumata1122@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: #18224
Backpatch-through: v12
2023-12-05 14:30:56 +01:00
Jeff Davis a86c61c9ee Optimize SearchPathCache by saving the last entry.
Repeated lookups are common, so it's worth it to check the last entry
before doing another hash lookup.

Discussion: https://postgr.es/m/04c8592dbd694e4114a3ed87139a7a04e4363030.camel%40j-davis.com
2023-12-04 17:19:16 -08:00
Nathan Bossart b14b1eb4da Teach convert() and friends to avoid copying when possible.
Presently, pg_convert() allocates a new bytea and copies the result
regardless of whether any conversion actually happened.  This
commit adjusts this function to return the source pointer as-is if
no conversion occurred.  This optimization isn't expected to make a
tremendous difference, but it still seems worthwhile to avoid
unnecessary memory allocations.

Author: Yurii Rashkovskii
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CA%2BRLCQyknBPSWXRBQGOi6aYEcdQ9RpH9Kch4GjoeY8dQ3D%2Bvhw%40mail.gmail.com
2023-12-04 11:55:18 -06:00
Heikki Linnakangas e7c6efe305 Remove now-unnecessary Autovacuum[Launcher|Worker]IAm functions
After commit fd5e8b440d, InitProcess() is called later in the
EXEC_BACKEND startup sequence, so it's enough to set the
am_autovacuum_[launcher|worker] variables at the same place as in the
!EXEC_BACKEND case.
2023-12-04 15:34:37 +02:00
Michael Paquier f21848de20 Add support for REINDEX in event triggers
This commit adds support for REINDEX in event triggers, making this
command react for the events ddl_command_start and ddl_command_end.  The
indexes rebuilt are collected with the ReindexStmt emitted by the
caller, for the concurrent and non-concurrent paths.

Thanks to that, it is possible to know a full list of the indexes that a
single REINDEX command has worked on.

Author: Garrett Thornburg, Jian He
Reviewed-by: Jim Jones, Michael Paquier
Discussion: https://postgr.es/m/CAEEqfk5bm32G7sbhzHbES9WejD8O8DCEOaLkxoBP7HNWxjPpvg@mail.gmail.com
2023-12-04 09:53:49 +09:00
Heikki Linnakangas fd5e8b440d Refactor how InitProcess is called
The order of process initialization steps is now more consistent
between !EXEC_BACKEND and EXEC_BACKEND modes. InitProcess() is called
at the same place in either mode. We can now also move the
AttachSharedMemoryStructs() call into InitProcess() itself. This
reduces the number of "#ifdef EXEC_BACKEND" blocks.

Reviewed-by: Tristan Partin, Andres Freund, Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi
2023-12-03 16:39:18 +02:00
Heikki Linnakangas 388491f1e5 Pass BackgroundWorker entry in the parameter file in EXEC_BACKEND mode
The BackgroundWorker struct is now passed the same way as the Port
struct. Seems more consistent.

This makes it possible to move InitProcess later in SubPostmasterMain
(in next commit), as we no longer need to access shared memory to read
background worker entry.

Reviewed-by: Tristan Partin, Andres Freund, Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi
2023-12-03 16:38:54 +02:00
Heikki Linnakangas 69d903367c Refactor CreateSharedMemoryAndSemaphores
For clarity, have separate functions for *creating* the shared memory
and semaphores at postmaster or single-user backend startup, and
for *attaching* to existing shared memory structures in EXEC_BACKEND
case. CreateSharedMemoryAndSemaphores() is now called only at
postmaster startup, and a new AttachSharedMemoryStructs() function is
called at backend startup in EXEC_BACKEND mode.

Reviewed-by: Tristan Partin, Andres Freund
Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi
2023-12-03 16:09:42 +02:00
Heikki Linnakangas b19890d49d Silence Valgrind complaint with EXEC_BACKEND
The padding bytes written to the backend params file were
uninitialized. That's harmless because we don't access the padding
bytes when we read the file back in, but Valgrind doesn't know
that. In any case, clear the padding bytes to make Valgrind happy.

Reported-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/014768ed-8b39-c44f-b07c-098c87b1644c@gmail.com
2023-12-01 22:30:08 +02:00
Peter Eisentraut 0d93ce31a5 pgindent fix
for commit a11c9c42ea
2023-12-01 17:58:18 +01:00
Peter Eisentraut a11c9c42ea Check collation when creating partitioned index
When creating a partitioned index, the partition key must be a subset
of the index's columns.  But this currently doesn't check that the
collations between the partition key and the index definition match.
So you can construct a unique index that fails to enforce uniqueness.
(This would most likely involve a nondeterministic collation, so it
would have to be crafted explicitly and is not something that would
just happen by accident.)

This patch adds the required collation check.  As a result, any
previously allowed unique index that has a collation mismatch would no
longer be allowed to be created.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/3327cb54-f7f1-413b-8fdb-7a9dceebb938%40eisentraut.org
2023-12-01 16:05:41 +01:00
Amit Kapila f66fcc5cd6 Fix an uninitialized access in hash_xlog_squeeze_page().
Commit 861f86beea changed hash_xlog_squeeze_page() to start reading
the write buffer conditionally but forgot to initialize it leading to an
uninitialized access.

Reported-by: Alexander Lakhin
Author: Hayato Kuroda
Reviewed-by: Alexander Lakhin, Amit Kapila
Discussion: http://postgr.es/m/62ed1a9f-746a-8e86-904b-51b9b806a1d9@gmail.com
2023-12-01 10:22:13 +05:30
Thomas Munro 3b51265ee3 Adjust obsolete comment explaining set_stack_base().
Commit 7389aad6 removed the notion of backends started from inside a
signal handler.  A stray comment still referred to them, while
explaining the need for a call to set_stack_base().  That leads to the
question of whether we still need the call in !EXEC_BACKEND builds.
There doesn't seem to be much point in suppressing it now, as it doesn't
hurt and probably helps to measure the stack base from the exact same
place in EXEC_BACKEND and !EXEC_BACKEND builds.

Back-patch to 16.

Reported-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reported-by: Tristan Partin <tristan@neon.tech>
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2BEJHcevNGNOxVWxTNFbuB%3Dvjf4U68%2B85rAC_Sxvy2zEQ%40mail.gmail.com
2023-12-01 15:18:51 +13:00
Heikki Linnakangas f93133a250 Print lwlock stats also for aux processes, when built with LWLOCK_STATS
InitAuxiliaryProcess() closely resembles InitProcess(), but it didn't
call InitLWLockAccess(). But because InitLWLockAccess() is a no-op
unless compiled with LWLOCK_STATS, and everything works even if it's
not called, the only consequence was that the stats were not printed
for aux processes.

This was an oversight in commit 1c6821be31, in version 9.5, so it is
missing in all supported branches. But since it only affects
developers using LWLOCK_STATS and no one has complained, no
backpatching.

Discussion: https://www.postgresql.org/message-id/20231130202648.7k6agmuizdilufnv@awork3.anarazel.de
2023-12-01 01:00:03 +02:00
Alexander Korotkov ae2ccf66a2 Fix typo in 5a1dfde833
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/55d8800f-4a80-5256-1e84-246fbe79acd0@gmail.com
2023-11-30 13:46:23 +02:00
Alexander Korotkov b589f211e0 Fix warning due non-standard inline declaration in 4ed8f0913b
Reported-by: Alexander Lakhin, Tom Lane
Author: Pavel Borisov
Discussion: https://postgr.es/m/55d8800f-4a80-5256-1e84-246fbe79acd0@gmail.com
2023-11-30 11:34:45 +02:00
John Naylor 095d109ccd Remove redundant setting of hashkey after insertion
It's not necessary to fill the key field in most cases, since
hash_search has already done that. Some existing call sites have an
assert or comment that this contract has been fulfilled, but those
are quite old and that practice seems unnecessary here.

While at it, remove a nearby redundant assignment that a smart compiler
will elide anyway.

Zhao Junwang, with some adjustments by me

Reviewed by Nathan Bossart, with additional feedback from Tom Lane

Discussion: http://postgr.es/m/CAEG8a3%2BUPF%3DR2QGPgJMF2mKh8xPd1H2TmfH77zPuVUFdBpiGUA%40mail.gmail.com
2023-11-30 15:25:57 +07:00
Michael Paquier 8d9978a717 Apply quotes more consistently to GUC names in logs
Quotes are applied to GUCs in a very inconsistent way across the code
base, with a mix of double quotes or no quotes used.  This commit
removes double quotes around all the GUC names that are obviously
referred to as parameters with non-English words (use of underscore,
mixed case, etc).

This is the result of a discussion with Álvaro Herrera, Nathan Bossart,
Laurenz Albe, Peter Eisentraut, Tom Lane and Daniel Gustafsson.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2023-11-30 14:11:45 +09:00
Peter Eisentraut 7e5f517799 Improve "user mapping not found" error message
Display the name of the foreign server for which the user mapping was
not found.

Author: Ian Lawrence Barwick <barwick@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/CAB8KJ=jFzNaeyFtLcTZNOc6fd1+F93pGVLFa-wyt31wn7VNxqQ@mail.gmail.com
2023-11-30 05:34:28 +01:00
Alexander Korotkov 5a1dfde833 Make use FullTransactionId in 2PC filenames
Switch from using TransactionId to FullTransactionId in naming of 2PC files.
Transaction state file in the pg_twophase directory now have extra 8 bytes in
the name to address an epoch of a given xid.

Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev
Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov
Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov
Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund
Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev
Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com
2023-11-29 01:43:36 +02:00
Alexander Korotkov 2cdf131c46 Use larger segment file names for pg_notify
This avoids the wraparound in async.c and removes the corresponding code
complexity. The maximum amount of allocated SLRU pages for NOTIFY / LISTEN
queue is now determined by the max_notify_queue_pages GUC. The default
value is 1048576. It allows to consume up to 8 GB of disk space which is
exactly the limit we had previously.

Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev
Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov
Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov
Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund
Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev
Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com
2023-11-29 01:41:48 +02:00
Alexander Korotkov 4ed8f0913b Index SLRUs by 64-bit integers rather than by 32-bit integers
We've had repeated bugs in the area of handling SLRU wraparound in the past,
some of which have caused data loss. Switching to an indexing system for SLRUs
that does not wrap around should allow us to get rid of a whole bunch
of problems and improve the overall reliability of the system.

This particular patch however only changes the indexing and doesn't address
the wraparound per se. This is going to be done in the following patches.

Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev
Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov
Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov
Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund
Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev
Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com
2023-11-29 01:40:56 +02:00
Tom Lane a916b47e23 Clean up usage of bison precedence for non-operator keywords.
Assigning a precedence to a keyword that isn't a kind of expression
operator is rather dangerous, because it might mask grammar
ambiguities that we'd rather know about.  It's much safer to attach
explicit precedences to individual rules, which will affect the
behavior of only that one rule.  Moreover, when we do have to give
a precedence to a non-operator keyword, we should try to give it the
same precedence as IDENT, thereby reducing the risk of surprising
side-effects.

Apply this hard-won knowledge to SET (which I misassigned ages ago
in commit 2647ad658) and some SQL/JSON-related productions
(from commits 6ee30209a, 71bfd1543).

Patch HEAD only, since there's no evidence of actual bugs here.

Discussion: https://postgr.es/m/CADT4RqBPdbsZW7HS1jJP319TMRHs1hzUiP=iRJYR6UqgHCrgNQ@mail.gmail.com
2023-11-28 13:32:15 -05:00
Tom Lane c82207a548 Use BIO_{get,set}_app_data instead of BIO_{get,set}_data.
We should have done it this way all along, but we accidentally got
away with using the wrong BIO field up until OpenSSL 3.2.  There,
the library's BIO routines that we rely on use the "data" field
for their own purposes, and our conflicting use causes assorted
weird behaviors up to and including core dumps when SSL connections
are attempted.  Switch to using the approved field for the purpose,
i.e. app_data.

While at it, remove our configure probes for BIO_get_data as well
as the fallback implementation.  BIO_{get,set}_app_data have been
there since long before any OpenSSL version that we still support,
even in the back branches.

Also, update src/test/ssl/t/001_ssltests.pl to allow for a minor
change in an error message spelling that evidently came in with 3.2.

Tristan Partin and Bo Andreson.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAN55FZ1eDDYsYaL7mv+oSLUij2h_u6hvD4Qmv-7PK7jkji0uyQ@mail.gmail.com
2023-11-28 12:34:03 -05:00
Heikki Linnakangas 10a59925a3 Fix comment about ressortgrouprefs being unique in setop plans.
Author: Richard Guo, Tom Lane
Discussion: https://www.postgresql.org/message-id/CAMbWs49rAfFS-yd7=QxtDUrZDFfRBGy4rGBJNyGDH7=CLipFPg@mail.gmail.com
2023-11-28 14:15:14 +02:00
Heikki Linnakangas 60f227316c Fix assertions with RI triggers in heap_update and heap_delete.
If the tuple being updated is not visible to the crosscheck snapshot,
we return TM_Updated but the assertions would not hold in that case.
Move them to before the cross-check.

Fixes bug #17893. Backpatch to all supported versions.

Author: Alexander Lakhin
Backpatch-through: 12
Discussion: https://www.postgresql.org/message-id/17893-35847009eec517b5%40postgresql.org
2023-11-28 12:00:14 +02:00
David Rowley 930d2b442f Don't use bms_membership() in cases where we don't need to
00b41463c adjusted Bitmapset so that an empty set is always represented
as NULL.  This makes checking for empty sets far cheaper than it used
to be.

There were various places in the code where we'd call bms_membership()
to handle the 3 possible BMS_Membership values.  For the BMS_SINGLETON
case, we'd also call bms_singleton_member() to find the single set member.
This can now be done in a more optimal way by first checking if the set is
NULL and then not bothering with bms_membership() and simply call
bms_get_singleton_member() instead to find the single member.  This
function will return false if there are multiple members in the set.

Here we also tidy up some logic in examine_variable() for the single
member case.  There's now no need to call bms_is_member() as we've
already established that we're working with a singleton Bitmapset, so we
can just check if varRelid matches the singleton member.

Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/CAApHDvqW+CxNPcY245GaWiuqkkqgTudtG2ncGvvSjGn2wdTZLA@mail.gmail.com
2023-11-28 10:41:12 +13:00
Tomas Vondra a82ee7ef3a Check if ii_AmCache is NULL in aminsertcleanup
Fix a bug introduced by c1ec02be1d. It may happen that the executor
opens indexes on the result relation, but no rows end up being inserted.
Then the index_insert_cleanup still gets executed, but passes down NULL
to the AM callback. The AM callback may not expect this, as is the case
of brininsertcleanup, leading to a crash.

Fixed by only calling the cleanup callback if (ii_AmCache != NULL). This
way the AM can simply assume to only see a valid cache.

Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-w9qC-o9hQox9UHvdVZAYTp8OrPQOKtwbvzWaRejTT=Q@mail.gmail.com
2023-11-27 16:53:06 +01:00
Heikki Linnakangas 1f395354d8 Reduce rate of walwriter wakeups due to async commits.
XLogSetAsyncXactLSN(), called at asynchronous commit, would wake up
walwriter every time the LSN advances, but walwriter doesn't actually
do anything unless it has at least 'wal_writer_flush_after' full
blocks of WAL to write. Repeatedly waking up walwriter to do nothing
is a waste of CPU cycles in both walwriter and the backends doing the
wakeups. To fix, apply the same logic in XLogSetAsyncXactLSN() to
decide whether to wake up walwriter, as walwriter uses to determine if
it has any work to do.

In the passing, rename misleadingly named 'flushbytes' local variable
to 'flushblocks'.

Author: Andres Freund, Heikki Linnakangas
Discussion: https://www.postgresql.org/message-id/20231024230929.vsc342baqs7kmbte@awork3.anarazel.de
2023-11-27 17:42:39 +02:00
Amit Kapila 360392fa2a Avoid unconditionally filling in missing values with NULL in pgoutput.
52e4f0cd4 introduced a bug in pgoutput in which missing values in tuples
were incorrectly filled in with NULL. The problem was the use of
CreateTupleDescCopy where CreateTupleDescCopyConstr was required, as the
former drops the constraints in the tuple description (specifically, the
default value constraint) on the floor.

The bug could result in incorrectness when a table replicated via
`REPLICA IDENTITY FULL` underwent a schema change that added a column
with a default value. The problem is that in such cases updates fill NULL
values in old tuples for missing columns for default values. Then on the
subscriber, we failed to find a matching tuple and missed updating the
required row.

Author: Nikhil Benesch
Reviewed-by: Hou Zhijie, Amit Kapila
Backpatch-through: 15
Discussion: http://postgr.es/m/CAPWqQZTEpZQamYsGMn6ZDRvVywwpVPiKH6OY4KSgA+NmeqFNzA@mail.gmail.com
2023-11-27 08:49:55 +05:30