Commit Graph

22822 Commits

Author SHA1 Message Date
Tom Lane bd037dc928 Make XLogRecGetBlockTag() throw error if there's no such block.
All but a few existing callers assume without checking that this
function succeeds.  While it probably will, that's a poor excuse for
not checking.  Let's make it return void and instead throw an error
if it doesn't find the block reference.  Callers that actually need
to handle the no-such-block case must now use the underlying function
XLogRecGetBlockTagExtended.

In addition to being a bit less error-prone, this should also serve
to suppress some Coverity complaints about XLogRecGetBlockRefInfo.

While at it, clean up some inconsistency about use of the
XLogRecHasBlockRef macro: make XLogRecGetBlockTagExtended use
that instead of open-coding the same condition, and avoid calling
XLogRecHasBlockRef twice in relevant code paths.  (That is,
calling XLogRecHasBlockRef followed by XLogRecGetBlockTag is now
deprecated: use XLogRecGetBlockTagExtended instead.)

Patch HEAD only; this doesn't seem to have enough value to consider
a back-branch API break.

Discussion: https://postgr.es/m/425039.1649701221@sss.pgh.pa.us
2022-04-11 17:43:53 -04:00
Peter Geoghegan 9debd12348 Remove comment about historic heap vacuuming issue.
Remove comment block about how heap page vacuuming used to set tuples
with storage to LP_UNUSED in a rare edge case that can no longer happen
following commit 8523492d4e.  The comments seem unnecessary now, since
it's now generally clear that heap vacuuming only applies to LP_DEAD
items from VACUUM's first heap pass following more recent work from
commits 12b5ade902 and 4f8d9d1217.
2022-04-11 14:20:46 -07:00
Tom Lane 9de692c101 Remove dead code in do_pg_backup_start().
As of commit 39969e2a1, no caller of do_pg_backup_start() passes NULL
for labelfile or tblspcmapfile, nor is it plausible that any would
do so in the future.  Remove the code that coped with that case,
as (a) it's dead and (b) it causes Coverity to bleat about possibly
leaked storage.

While here, do some janitorial work on the function's header comment.
2022-04-11 15:56:01 -04:00
Tom Lane 3c702b3ed1 Explicitly ignore guaranteed-true result from pgstat_lock_entry().
With nowait passed as false, pgstat_lock_entry() must return true
so there's no need to check its result.  Coverity seems unconvinced
of this, so whack it upside the head with a (void) cast.
2022-04-11 13:22:37 -04:00
Tom Lane 93fcf2d209 fgetc() returns int, not char.
This has no practical effect, since this code doesn't actually need to
distinguish EOF (-1) from \0377; but it silences a Coverity complaint.
2022-04-11 13:15:46 -04:00
David Rowley b0e5f02ddc Fix various typos and spelling mistakes in code comments
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
2022-04-11 20:49:41 +12:00
Michael Paquier 7597cc3083 Fix the dates of some copyright notices
0ad8032 and 4e34747 are at the origin of that.  Julien has found the one
in parse_jsontable.c, while I have spotted the rest.

Author: Julien Rouhaud, Michael Paquier
Discussion: https://postgr.es/m/20220411060838.ftnzyvflpwu6f74w@jrouhaud
2022-04-11 16:36:25 +09:00
Peter Eisentraut 38abc39c81 Add missing serial commas 2022-04-09 16:15:01 +02:00
Robert Haas f37015a161 Rename delayChkpt to delayChkptFlags.
Before commit 412ad7a556, delayChkpt
was a Boolean. Now it's an integer. Extensions using it need to be
appropriately updated, so let's rename the field to make sure that
a hard compilation failure occurs.

Replacing delayChkpt with delayChkptFlags made a few comments extend
past 80 characters, so I reflowed them and changed some wording very
slightly.

The back-branches will need a different change to restore compatibility
with existing minor releases; this is just for master.

Per suggestion from Tom Lane.

Discussion: http://postgr.es/m/a7880f4d-1d74-582a-ada7-dad168d046d1@enterprisedb.com
2022-04-08 11:44:17 -04:00
Jeff Davis 12aaae5131 Check XLogRecHasBlockRef() before XLogRecHasBlockImage().
Trial fix of buildfarm failures on kestrel and tamandua.
2022-04-08 02:30:57 -07:00
Jeff Davis 2258e76f90 Add contrib/pg_walinspect.
Provides similar functionality to pg_waldump, but from a SQL interface
rather than a separate utility.

Author: Bharath Rupireddy
Reviewed-by: Greg Stark, Kyotaro Horiguchi, Andres Freund, Ashutosh Sharma, Nitin Jadhav, RKN Sai Krishna
Discussion: https://postgr.es/m/CALj2ACUGUYXsEQdKhEdsBzhGEyF3xggvLdD8C0VT72TNEfOiog%40mail.gmail.com
2022-04-08 00:26:44 -07:00
Peter Eisentraut 708007dced Remove error message hints mentioning configure options
These are usually not useful since users will use packaged
distributions and won't be interested in rebuilding their installation
from source.  Also, we have only used these kinds of hints for some
features and in some places, not consistently throughout.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/2552aed7-d0e9-280a-54aa-2dc7073f371d%40enterprisedb.com
2022-04-08 07:41:55 +02:00
Michael Paquier efb0ef909f Track I/O timing for temporary file blocks in EXPLAIN (BUFFERS)
Previously, the output of EXPLAIN (BUFFERS) option showed only the I/O
timing spent reading and writing shared and local buffers.  This commit
adds on top of that the I/O timing for temporary buffers in the output
of EXPLAIN (for spilled external sorts, hashes, materialization. etc).
This can be helpful for users in cases where the I/O related to
temporary buffers is the bottleneck.

Like its cousin, this information is available only when track_io_timing
is enabled.  Playing the patch, this is showing an extra overhead of up
to 1% even when using gettimeofday() as implementation for interval
timings, which is slightly within the usual range noise still that's
measurable.

Author: Masahiko Sawada
Reviewed-by: Georgios Kokolatos, Melanie Plageman, Julien Rouhaud,
Ranier Vilela
Discussion: https://postgr.es/m/CAD21AoAJgotTeP83p6HiAGDhs_9Fw9pZ2J=_tYTsiO5Ob-V5GQ@mail.gmail.com
2022-04-08 11:27:21 +09:00
Peter Geoghegan 10a8d13823 Truncate line pointer array during heap pruning.
Reclaim space from the line pointer array when heap pruning leaves
behind a contiguous group of LP_UNUSED items at the end of the array.
This happens during subsequent page defragmentation.  Certain kinds of
heap line pointer bloat are ameliorated by this new optimization.

Follow-up work to commit 3c3b8a4b26, which taught VACUUM to truncate the
line pointer array in about the same way during VACUUM's second pass
over the heap.  We now apply line pointer array truncation during both
the first and the second pass over the heap made by VACUUM.  We can also
perform line pointer array truncation during opportunistic pruning.

Matthias van de Meent, with small tweaks by me.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAEze2WjgaQc55Y5f5CQd3L=eS5CZcff2Obxp=O6pto8-f0hC4w@mail.gmail.com
Discussion: https://postgr.es/m/CAEze2Wg36%2B4at2eWJNcYNiW2FJmht34x3YeX54ctUSs7kKoNcA%40mail.gmail.com
2022-04-07 15:42:12 -07:00
David Rowley 9d9c02ccd1 Teach planner and executor about monotonic window funcs
Window functions such as row_number() always return a value higher than
the previously returned value for tuples in any given window partition.

Traditionally queries such as;

SELECT * FROM (
   SELECT *, row_number() over (order by c) rn
   FROM t
) t WHERE rn <= 10;

were executed fairly inefficiently.  Neither the query planner nor the
executor knew that once rn made it to 11 that nothing further would match
the outer query's WHERE clause.  It would blindly continue until all
tuples were exhausted from the subquery.

Here we implement means to make the above execute more efficiently.

This is done by way of adding a pg_proc.prosupport function to various of
the built-in window functions and adding supporting code to allow the
support function to inform the planner if the window function is
monotonically increasing, monotonically decreasing, both or neither.  The
planner is then able to make use of that information and possibly allow
the executor to short-circuit execution by way of adding a "run condition"
to the WindowAgg to allow it to determine if some of its execution work
can be skipped.

This "run condition" is not like a normal filter.  These run conditions
are only built using quals comparing values to monotonic window functions.
For monotonic increasing functions, quals making use of the btree
operators for <, <= and = can be used (assuming the window function column
is on the left). You can see here that once such a condition becomes false
that a monotonic increasing function could never make it subsequently true
again.  For monotonically decreasing functions the >, >= and = btree
operators for the given type can be used for run conditions.

The best-case situation for this is when there is a single WindowAgg node
without a PARTITION BY clause.  Here when the run condition becomes false
the WindowAgg node can simply return NULL.  No more tuples will ever match
the run condition.  It's a little more complex when there is a PARTITION
BY clause.  In this case, we cannot return NULL as we must still process
other partitions.  To speed this case up we pull tuples from the outer
plan to check if they're from the same partition and simply discard them
if they are.  When we find a tuple belonging to another partition we start
processing as normal again until the run condition becomes false or we run
out of tuples to process.

When there are multiple WindowAgg nodes to evaluate then this complicates
the situation.  For intermediate WindowAggs we must ensure we always
return all tuples to the calling node.  Any filtering done could lead to
incorrect results in WindowAgg nodes above.  For all intermediate nodes,
we can still save some work when the run condition becomes false.  We've
no need to evaluate the WindowFuncs anymore.  Other WindowAgg nodes cannot
reference the value of these and these tuples will not appear in the final
result anyway.  The savings here are small in comparison to what can be
saved in the top-level WingowAgg, but still worthwhile.

Intermediate WindowAgg nodes never filter out tuples, but here we change
WindowAgg so that the top-level WindowAgg filters out tuples that don't
match the intermediate WindowAgg node's run condition.  Such filters
appear in the "Filter" clause in EXPLAIN for the top-level WindowAgg node.

Here we add prosupport functions to allow the above to work for;
row_number(), rank(), dense_rank(), count(*) and count(expr).  It appears
technically possible to do the same for min() and max(), however, it seems
unlikely to be useful enough, so that's not done here.

Bump catversion

Author: David Rowley
Reviewed-by: Andy Fan, Zhihong Yu
Discussion: https://postgr.es/m/CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com
2022-04-08 10:34:36 +12:00
Alvaro Herrera a90641eac2
Revert "Rewrite some RI code to avoid using SPI"
This reverts commit 99392cdd78.
We'd rather rewrite ri_triggers.c as a whole rather than piecemeal.

Discussion: https://postgr.es/m/E1ncXX2-000mFt-Pe@gemulon.postgresql.org
2022-04-07 23:42:13 +02:00
Alvaro Herrera 99392cdd78
Rewrite some RI code to avoid using SPI
Modify the subroutines called by RI trigger functions that want to check
if a given referenced value exists in the referenced relation to simply
scan the foreign key constraint's unique index, instead of using SPI to
execute
  SELECT 1 FROM referenced_relation WHERE ref_key = $1
This saves a lot of work, especially when inserting into or updating a
referencing relation.

This rewrite allows to fix a PK row visibility bug caused by a partition
descriptor hack which requires ActiveSnapshot to be set to come up with
the correct set of partitions for the RI query running under REPEATABLE
READ isolation.  We now set that snapshot indepedently of the snapshot
to be used by the PK index scan, so the two no longer interfere.  The
buggy output in src/test/isolation/expected/fk-snapshot.out of the
relevant test case added by commit 00cb86e75d has been corrected.
(The bug still exists in branch 14, however, but this fix is too
invasive to backpatch.)

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Li Japin <japinli@hotmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/CA+HiwqGkfJfYdeq5vHPh6eqPKjSbfpDDY+j-kXYFePQedtSLeg@mail.gmail.com
2022-04-07 21:10:03 +02:00
Tomas Vondra 2c7ea57e56 Revert "Logical decoding of sequences"
This reverts a sequence of commits, implementing features related to
logical decoding and replication of sequences:

 - 0da92dc530
 - 80901b3291
 - b779d7d8fd
 - d5ed9da41d
 - a180c2b34d
 - 75b1521dae
 - 2d2232933b
 - 002c9dd97a
 - 05843b1aa4

The implementation has issues, mostly due to combining transactional and
non-transactional behavior of sequences. It's not clear how this could
be fixed, but it'll require reworking significant part of the patch.

Discussion: https://postgr.es/m/95345a19-d508-63d1-860a-f5c2f41e8d40@enterprisedb.com
2022-04-07 20:06:36 +02:00
Peter Eisentraut 344d62fb9a Unlogged sequences
Add support for unlogged sequences.  Unlike for unlogged tables, this
is not a performance feature.  It allows sequences associated with
unlogged tables to be excluded from replication.

A new subcommand ALTER SEQUENCE ... SET LOGGED/UNLOGGED is added.

An identity/serial sequence now automatically gets and follows the
persistence level (logged/unlogged) of its owning table.  (The
sequences owned by temporary tables were already temporary through the
separate mechanism in RangeVarAdjustRelationPersistence().)  But you
can still change the persistence of an owned sequence separately.
Also, pg_dump and pg_upgrade preserve the persistence of existing
sequences.

Discussion: https://www.postgresql.org/message-id/flat/04e12818-2f98-257c-b926-2845d74ed04f%402ndquadrant.com
2022-04-07 16:18:00 +02:00
Daniel Gustafsson bab588cd5c Fix typo in xlogrecovery.c code comment
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CALj2ACUoPtnReT=yAQMcWLtcCpk7p83xjeA8tiRX8Q0_sjh8kw@mail.gmail.com
2022-04-07 14:02:33 +02:00
Thomas Munro 5dc0418fab Prefetch data referenced by the WAL, take II.
Introduce a new GUC recovery_prefetch.  When enabled, look ahead in the
WAL and try to initiate asynchronous reading of referenced data blocks
that are not yet cached in our buffer pool.  For now, this is done with
posix_fadvise(), which has several caveats.  Since not all OSes have
that system call, "try" is provided so that it can be enabled where
available.  Better mechanisms for asynchronous I/O are possible in later
work.

Set to "try" for now for test coverage.  Default setting to be finalized
before release.

The GUC wal_decode_buffer_size limits the distance we can look ahead in
bytes of decoded data.

The existing GUC maintenance_io_concurrency is used to limit the number
of concurrent I/Os allowed, based on pessimistic heuristics used to
infer that I/Os have begun and completed.  We'll also not look more than
maintenance_io_concurrency * 4 block references ahead.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (earlier version)
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> (earlier version)
Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (earlier version)
Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> (earlier version)
Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> (earlier version)
Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
2022-04-07 19:42:14 +12:00
Jeff Davis 9553b4115f Fix warning introduced in 5c279a6d35.
Change two macros to be static inline functions instead to keep the
data type consistent. This avoids a "comparison is always true"
warning that was occurring with -Wtype-limits. In the process, change
the names to look less like macros.

Discussion: https://postgr.es/m/20220407063505.njnnrmbn4sxqfsts@alap3.anarazel.de
2022-04-07 00:39:30 -07:00
Andres Freund ad401664b8 pgstat: add pg_stat_have_stats() test helper.
Will be used by tests committed subsequently.

Bumps catversion (this time for real, the one in 0f96965c65 got lost when
rebasing over 5c279a6d35).

Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_aNxL1WegCa45r=VAViCLnpOU7uNC7bTtGw+=QAPyYivw@mail.gmail.com
2022-04-07 00:21:54 -07:00
Andres Freund 0f96965c65 pgstat: add pg_stat_force_next_flush(), use it to simplify tests.
In the stats collector days it was hard to write tests for the stats system,
because fundamentally delivery of stats messages over UDP was not
synchronous (nor guaranteed). Now we easily can force pending stats updates to
be flushed synchronously.

This moves stats.sql into a parallel group, there isn't a reason for it to run
in isolation anymore. And it may shake out some bugs.

Bumps catversion.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 23:35:56 -07:00
Andres Freund 5e07d3d6bd pgstat: fix small bug in pgstat_drop_relation().
Just after committing 5891c7a8ed, a test running with debug_discard_caches=1
failed locally...

pgstat_drop_relation() neither checked pgstat_should_count_relation() nor
called pgstat_prep_relation_pending(). With debug_discard_caches=1
rel->pgstat_info wasn't set up, leading pg_stat_get_xact_tuples_inserted()
spuriously still returning > 0 while in the transaction dropping the table.
2022-04-06 23:35:56 -07:00
Andres Freund 81ae9e6588 pgstat: prevent fix pgstat_reinit_entry() from zeroing out lwlock.
Zeroing out an lwlock in a normal build turns out to not trigger any alarms,
if nobody can use the lwlock at that moment (as the case here). But with
--disable-spinlocks --disable-atomics, the sema field needs to be initialized.

We probably should make sure that this fails on more common configurations as
well...

Per buildfarm animal rorqual
2022-04-06 23:35:56 -07:00
Andres Freund 3536b851ad Fix compilation with WAL_DEBUG.
Broke with 5c279a6d35. But looks like it had been half-broken since
70e81861fa, because 'rmid' didn't refer to the current record's rmid anymore,
but to rmid from "Initialize resource managers" - a constant.
2022-04-06 23:26:59 -07:00
Jeff Davis 5c279a6d35 Custom WAL Resource Managers.
Allow extensions to specify a new custom resource manager (rmgr),
which allows specialized WAL. This is meant to be used by a Table
Access Method or Index Access Method.

Prior to this commit, only Generic WAL was available, which offers
support for recovery and physical replication but not logical
replication.

Reviewed-by: Julien Rouhaud, Bharath Rupireddy, Andres Freund
Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com
2022-04-06 23:06:46 -07:00
Michael Paquier 06f5295af6 Add single-item cache when looking at topmost XID of a subtrans XID
This change affects SubTransGetTopmostTransaction(), used to find the
topmost transaction ID of a given transaction ID.  The cache is able to
store one value, so as we can save the backend from unnecessary lookups
at pg_subtrans/ on repetitive calls of this routine.  There is a similar
practice in transam.c, for example.

Author: Simon Riggs
Reviewed-by: Andrey Borodin, Julien Rouhaud
Discussion: https://postgr.es/m/CANbhV-G8Co=yq4v4BkW7MJDqVt68K_8A48nAZ_+8UQS7LrwLEQ@mail.gmail.com
2022-04-07 14:34:37 +09:00
Andres Freund fbfe6910ec pgstat: move pgstat.c to utils/activity.
Now that pgstat is not related to postmaster anymore, src/backend/postmaster
is not a well fitting directory.

Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 21:29:46 -07:00
Andres Freund 1db4e5a4ee pgstat: rename STATS_COLLECTOR GUC group to STATS_CUMULATIVE.
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 21:29:46 -07:00
Andres Freund 6f0cf87872 pgstat: remove stats_temp_directory.
With stats now being stored in shared memory, the GUC isn't needed
anymore. However, the pg_stat_tmp directory and PG_STAT_TMP_DIR define are
kept, as pg_stat_statements (and some out-of-core extensions) store data in
it.

Docs will be updated in a subsequent commit, together with the other pending
docs updates due to shared memory stats.

Author: Andres Freund <andres@anarazel.de>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220330233550.eiwsbearu6xhuqwe@alap3.anarazel.de
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 21:29:46 -07:00
Andres Freund 5891c7a8ed pgstat: store statistics in shared memory.
Previously the statistics collector received statistics updates via UDP and
shared statistics data by writing them out to temporary files regularly. These
files can reach tens of megabytes and are written out up to twice a
second. This has repeatedly prevented us from adding additional useful
statistics.

Now statistics are stored in shared memory. Statistics for variable-numbered
objects are stored in a dshash hashtable (backed by dynamic shared
memory). Fixed-numbered stats are stored in plain shared memory.

The header for pgstat.c contains an overview of the architecture.

The stats collector is not needed anymore, remove it.

By utilizing the transactional statistics drop infrastructure introduced in a
prior commit statistics entries cannot "leak" anymore. Previously leaked
statistics were dropped by pgstat_vacuum_stat(), called from [auto-]vacuum. On
systems with many small relations pgstat_vacuum_stat() could be quite
expensive.

Now that replicas drop statistics entries for dropped objects, it is not
necessary anymore to reset stats when starting from a cleanly shut down
replica.

Subsequent commits will perform some further code cleanup, adapt docs and add
tests.

Bumps PGSTAT_FILE_FORMAT_ID.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-By: Tomas Vondra <tomas.vondra@2ndquadrant.com> (in a much earlier version)
Reviewed-By: Arthur Zakirov <a.zakirov@postgrespro.ru> (in a much earlier version)
Reviewed-By: Antonin Houska <ah@cybertec.at> (in a much earlier version)
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
Discussion: https://postgr.es/m/20210319235115.y3wz7hpnnrshdyv6@alap3.anarazel.de
2022-04-06 21:29:46 -07:00
Andres Freund be902e2651 pgstat: normalize function naming.
Most of pgstat uses pgstat_<verb>_<subject>() or just <verb>_<subject>(). But
not all (some introduced fairly recently by me). Rename ones that aren't
intentionally following a different scheme (e.g. AtEOXact_*).
2022-04-06 21:29:46 -07:00
Amit Kapila 79b716cfb7 Reorder subskiplsn in pg_subscription to avoid alignment issues.
The column 'subskiplsn' uses TYPALIGN_DOUBLE (which has 4 bytes alignment
on AIX) for storage. But the C Struct (Form_pg_subscription) has 8-byte
alignment for this field, so retrieving it from storage causes an
unaligned read.

To fix this, we rearranged the 'subskiplsn' column in the catalog so that
it naturally comes at an 8-byte boundary.

We have fixed a similar problem in commit f3b421da5f. This patch adds a
test to avoid a similar mistake in the future.

Reported-by: Noah Misch
Diagnosed-by: Noah Misch, Masahiko Sawada, Amit Kapila
Author: Masahiko Sawada
Reviewed-by: Noah Misch, Amit Kapila
Discussion: https://postgr.es/m/20220401074423.GC3682158@rfd.leadboat.com
	    https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
2022-04-07 09:39:25 +05:30
Andres Freund e41aed674f pgstat: revise replication slot API in preparation for shared memory stats.
Previously the pgstat <-> replication slots API was done with on the basis of
names. However, the upcoming move to storing stats in shared memory makes it
more convenient to use a integer as key.

Change the replication slot functions to take the slot rather than the slot
name, and expose ReplicationSlotIndex() to compute the index of an replication
slot. Special handling will be required for restarts, as the index is not
stable across restarts. For now pgstat internally still uses names.

Rename pgstat_report_replslot_{create,drop}() to
pgstat_{create,drop}_replslot() to match the functions for other kinds of
stats.

Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
2022-04-06 18:38:24 -07:00
Andres Freund 8b1dccd37c pgstat: scaffolding for transactional stats creation / drop.
One problematic part of the current statistics collector design is that there
is no reliable way of getting rid of statistics entries. Because of that
pgstat_vacuum_stat() (called by [auto-]vacuum) matches all stats for the
current database with the catalog contents and tries to drop now-superfluous
entries. That's quite expensive. What's worse, it doesn't work on physical
replicas, despite physical replicas collection statistics entries.

This commit introduces infrastructure to create / drop statistics entries
transactionally, together with the underlying catalog objects (functions,
relations, subscriptions). pgstat_xact.c maintains a list of stats entries
created / dropped transactionally in the current transaction. To ensure the
removal of statistics entries is durable dropped statistics entries are
included in commit / abort (and prepare) records, which also ensures that
stats entries are dropped on standbys.

Statistics entries created separately from creating the underlying catalog
object (e.g. when stats were previously lost due to an immediate restart)
are *not* WAL logged. However that can only happen outside of the transaction
creating the catalog object, so it does not lead to "leaked" statistics
entries.

For this to work, functions creating / dropping functions / relations /
subscriptions need to call into pgstat. For subscriptions this was already
done when dropping subscriptions, via pgstat_report_subscription_drop() (now
renamed to pgstat_drop_subscription()).

This commit does not actually drop stats yet, it just provides the
infrastructure. It is however a largely independent piece of infrastructure,
so committing it separately makes sense.

Bumps XLOG_PAGE_MAGIC.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 18:27:52 -07:00
Andres Freund 8fb580a35c pgstat: prepare APIs used by pgstatfuncs for shared memory stats.
With the introduction of PgStat_Kind PgStat_Single_Reset_Type,
PgStat_Shared_Reset_Target don't make sense anymore. Replace them with
PgStat_Kind.

Instead of having dedicated reset functions for different kinds of stats, use
two generic helper routines (one to reset all stats of a kind, one to reset
one stats entry).

A number of reset functions were named pgstat_reset_*_counter(), despite
affecting multiple counters. The generic helper routines get rid of
pgstat_reset_single_counter(), pgstat_reset_subscription_counter().

Rename pgstat_reset_slru_counter(), pgstat_reset_replslot_counter() to
pgstat_reset_slru(), pgstat_reset_replslot() respectively, and have them only
deal with a single SLRU/slot. Resetting all SLRUs/slots goes through the
generic pgstat_reset_of_kind().

Previously pg_stat_reset_replication_slot() used SearchNamedReplicationSlot()
to check if a slot exists. API wise it seems better to move that to
pgstat_replslot.c.

This is done separately from the - quite large - shared memory statistics
patch to make review easier.

Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
2022-04-06 17:56:19 -07:00
Andres Freund 8ea7963fc7 pgstat: add pgstat_copy_relation_stats().
Until now index_concurrently_swap() directly modified pgstat internal
datastructures. That will break with the introduction of shared memory
statistics and seems off architecturally.

This is done separately from the - quite large - shared memory statistics
patch to make review easier.

Author: Andres Freund <andres@anarazel.de>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
2022-04-06 14:09:18 -07:00
Andres Freund cc96373cf3 pgstat: rename some pgstat_send_* functions to pgstat_report_*.
Only the pgstat_send_* functions that are called from outside pgstat*.c are
renamed (the rest will go away). This is done separately from the - quite
large - shared memory statistics patch to make review easier.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
2022-04-06 14:08:57 -07:00
Tom Lane dbafe127bb Suppress "variable 'pagesaving' set but not used" warning.
With asserts disabled, late-model clang notices that this variable
is incremented but never otherwise read.

Discussion: https://postgr.es/m/3171401.1649275153@sss.pgh.pa.us
2022-04-06 17:03:50 -04:00
Andres Freund bdbd3d9064 pgstat: stats collector references in comments.
Soon the stats collector will be no more, with statistics instead getting
stored in shared memory. There are a lot of references to the stats collector
in comments. This commit replaces most of these references with "cumulative
statistics system", with the remaining ones getting replaced as part of
subsequent commits.

This is done separately from the - quite large - shared memory statistics
patch to make review easier.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
2022-04-06 13:56:06 -07:00
Andres Freund ab62a642d5 pgstat: move transactional code into pgstat_xact.c.
The transactional integration code is largely independent from the rest of
pgstat.c. Subsequent commits will add more related code.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
2022-04-06 13:23:47 -07:00
Andres Freund c3e9b07936 pgstat: move pgstat_report_autovac() to pgstat_database.c.
I got the location wrong in 13619598f1. The name did make it sound like it
belonged in pgstat_relation.c...
2022-04-06 12:41:29 -07:00
Andres Freund 46a2d2499a dsm: allow use in single user mode.
It might seem pointless to allow use of dsm in single user mode, but otherwise
subsystems might need dedicated single user mode code paths.

Besides changing the assert, all that's needed is to make some windows code
assuming the presence of postmaster conditional.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKGL9hY_VY=+oUK+Gc1iSRx-Ls5qeYJ6q=dQVZnT3R63Taw@mail.gmail.com
2022-04-06 12:40:04 -07:00
Stephen Frost 39969e2a1e Remove exclusive backup mode
Exclusive-mode backups have been deprecated since 9.6 (when
non-exclusive backups were introduced) due to the issues
they can cause should the system crash while one is running and
generally because non-exclusive provides a much better interface.
Further, exclusive backup mode wasn't really being tested (nor was most
of the related code- like being able to log in just to stop an exclusive
backup and the bits of the state machine related to that) and having to
possibly deal with an exclusive backup and the backup_label file
existing during pg_basebackup, pg_rewind, etc, added other complexities
that we are better off without.

This patch removes the exclusive backup mode, the various special cases
for dealing with it, and greatly simplifies the online backup code and
documentation.

Authors: David Steele, Nathan Bossart
Reviewed-by: Chapman Flack
Discussion: https://postgr.es/m/ac7339ca-3718-3c93-929f-99e725d1172c@pgmasters.net
https://postgr.es/m/CAHg+QDfiM+WU61tF6=nPZocMZvHDzCK47Kneyb0ZRULYzV5sKQ@mail.gmail.com
2022-04-06 14:41:03 -04:00
Tom Lane a0ffa885e4 Allow granting SET and ALTER SYSTEM privileges on GUC parameters.
This patch allows "PGC_SUSET" parameters to be set by non-superusers
if they have been explicitly granted the privilege to do so.
The privilege to perform ALTER SYSTEM SET/RESET on a specific parameter
can also be granted.
Such privileges are cluster-wide, not per database.  They are tracked
in a new shared catalog, pg_parameter_acl.

Granting and revoking these new privileges works as one would expect.
One caveat is that PGC_USERSET GUCs are unaffected by the SET privilege
--- one could wish that those were handled by a revocable grant to
PUBLIC, but they are not, because we couldn't make it robust enough
for GUCs defined by extensions.

Mark Dilger, reviewed at various times by Andrew Dunstan, Robert Haas,
Joshua Brindle, and myself

Discussion: https://postgr.es/m/3D691E20-C1D5-4B80-8BA5-6BEB63AF3029@enterprisedb.com
2022-04-06 13:24:33 -04:00
Peter Eisentraut 01effb1304 Fix unsigned output format in SLRU error reporting
Avoid printing signed values as unsigned.  (No impact in practice
expected.)

Author: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CALT9ZEHN7hWJo6MgJKqoDMGj%3DGOzQU50wTvOYZXDj7x%3DsUK-kw%40mail.gmail.com
2022-04-06 09:15:05 +02:00
Etsuro Fujita c2bb02bc2e Allow asynchronous execution in more cases.
In commit 27e1f1456, create_append_plan() only allowed the subplan
created from a given subpath to be executed asynchronously when it was
an async-capable ForeignPath.  To extend coverage, this patch handles
cases when the given subpath includes some other Path types as well that
can be omitted in the plan processing, such as a ProjectionPath directly
atop an async-capable ForeignPath, allowing asynchronous execution in
partitioned-scan/partitioned-join queries with non-Var tlist expressions
and more UNION queries.

Andrey Lepikhov and Etsuro Fujita, reviewed by Alexander Pyhalov and
Zhihong Yu.

Discussion: https://postgr.es/m/659c37a8-3e71-0ff2-394c-f04428c76f08%40postgrespro.ru
2022-04-06 15:45:00 +09:00
Amit Kapila 2d09e44d30 Improve comments for row filtering and toast interaction in logical replication.
Reported-by: Antonin Houska
Author: Amit Kapila
Reviewed-by: Antonin Houska, Ajin Cherian
Discussion: https://postgr.es/m/84638.1649152255@antos
2022-04-06 08:20:40 +05:30