Commit Graph

327 Commits

Author SHA1 Message Date
Alexander Korotkov 6377e12a5a revert: Generalize relation analyze in table AM interface
This commit reverts 27bc1772fc and dd1f6b0c17.  Per review by Andres Freund.

Discussion: https://postgr.es/m/20240415201057.khoyxbwwxfgzomeo%40awork3.anarazel.de
2024-04-16 13:14:20 +03:00
Alexander Korotkov dd1f6b0c17 Provide a way block-level table AMs could re-use acquire_sample_rows()
While keeping API the same, this commit provides a way for block-level table
AMs to re-use existing acquire_sample_rows() by providing custom callbacks
for getting the next block and the next tuple.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/20240407214001.jgpg5q3yv33ve6y3%40awork3.anarazel.de
Reviewed-by: Pavel Borisov
2024-04-08 14:39:48 +03:00
Thomas Munro 041b96802e Use streaming I/O in ANALYZE.
The ANALYZE command prefetches and reads sample blocks chosen by a
BlockSampler algorithm. Instead of calling [Prefetch|Read]Buffer() for
each block, ANALYZE now uses the streaming API introduced in b5a9b18cd0.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/flat/CAN55FZ0UhXqk9v3y-zW_fp4-WCp43V8y0A72xPmLkOM%2B6M%2BmJg%40mail.gmail.com
2024-04-08 13:16:28 +12:00
Alexander Korotkov 27bc1772fc Generalize relation analyze in table AM interface
Currently, there is just one algorithm for sampling tuples from a table written
in acquire_sample_rows().  Custom table AM can just redefine the way to get the
next block/tuple by implementing scan_analyze_next_block() and
scan_analyze_next_tuple() API functions.

This approach doesn't seem general enough.  For instance, it's unclear how to
sample this way index-organized tables.  This commit allows table AM to
encapsulate the whole sampling algorithm (currently implemented in
acquire_sample_rows()) into the relation_analyze() API function.

Discussion: https://postgr.es/m/CAPpHfdurb9ycV8udYqM%3Do0sPS66PJ4RCBM1g-bBpvzUfogY0EA%40mail.gmail.com
Reviewed-by: Pavel Borisov, Matthias van de Meent
2024-03-30 22:34:04 +02:00
Peter Eisentraut 20e58105ba Separate equalRowTypes() from equalTupleDescs()
This introduces a new function equalRowTypes() that is effectively a
subset of equalTupleDescs() but only compares the number of attributes
and attribute name, type, typmod, and collation.  This is enough for
most existing uses of equalTupleDescs(), which are changed to use the
new function.  The only remaining callers of equalTupleDescs() are
those that really want to check the full tuple descriptor as such,
without concern about record or row or record type semantics.

The existing function hashTupleDesc() is renamed to hashRowType(),
because it now corresponds more to equalRowTypes().

The purpose of this change is to be clearer about the semantics of the
equality asked for by each caller.  (At least one caller had a comment
that questioned whether equalTupleDescs() was too restrictive.)  For
example, 4f622503d6 removed attstattarget from the tuple descriptor
structure.  It was not fully clear at the time how this should affect
equalTupleDescs().  Now the answer is clear: By their own definitions,
equalRowTypes() does not care, and equalTupleDescs() just compares
whatever is in the tuple descriptor but does not care why it is in
there.

Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/f656d6d9-6660-4518-a006-2f65cafbebd1%40eisentraut.org
2024-03-17 05:58:04 +01:00
Nathan Bossart ecb0fd3372 Reintroduce MAINTAIN privilege and pg_maintain predefined role.
Roles with MAINTAIN on a relation may run VACUUM, ANALYZE, REINDEX,
REFRESH MATERIALIZE VIEW, CLUSTER, and LOCK TABLE on the relation.
Roles with privileges of pg_maintain may run those same commands on
all relations.

This was previously committed for v16, but it was reverted in
commit 151c22deee due to concerns about search_path tricks that
could be used to escalate privileges to the table owner.  Commits
2af07e2f74, 59825d1639, and c7ea3f4229 resolved these concerns by
restricting search_path when running maintenance commands.

Bumps catversion.

Reviewed-by: Jeff Davis
Discussion: https://postgr.es/m/20240305161235.GA3478007%40nathanxps13
2024-03-13 14:49:26 -05:00
Jeff Davis 59825d1639 Fix buildfarm failures from 2af07e2f74.
Use GUC_ACTION_SAVE rather than GUC_ACTION_SET, necessary for working
with parallel query.

Now that the call requires more arguments, wrap the call in a new
function to avoid code duplication and offer a place for a comment.

Discussion: https://postgr.es/m/E1rhJpO-0027Wf-9L@gemulon.postgresql.org
2024-03-04 19:42:16 -08:00
Jeff Davis 2af07e2f74 Fix search_path to a safe value during maintenance operations.
While executing maintenance operations (ANALYZE, CLUSTER, REFRESH
MATERIALIZED VIEW, REINDEX, or VACUUM), set search_path to
'pg_catalog, pg_temp' to prevent inconsistent behavior.

Functions that are used for functional indexes, in index expressions,
or in materialized views and depend on a different search path must be
declared with CREATE FUNCTION ... SET search_path='...'.

This change was previously committed as 05e1737351, then reverted in
commit 2fcc7ee7af because it was too late in the cycle.

Preparation for the MAINTAIN privilege, which was previously reverted
due to search_path manipulation hazards.

Discussion: https://postgr.es/m/d4ccaf3658cb3c281ec88c851a09733cd9482f22.camel@j-davis.com
Discussion: https://postgr.es/m/E1q7j7Y-000z1H-Hr%40gemulon.postgresql.org
Discussion: https://postgr.es/m/e44327179e5c9015c8dda67351c04da552066017.camel%40j-davis.com
Reviewed-by: Greg Stark, Nathan Bossart, Noah Misch
2024-03-04 17:31:38 -08:00
Peter Eisentraut dbbca2cf29 Remove unused #include's from backend .c files
as determined by include-what-you-use (IWYU)

While IWYU also suggests to *add* a bunch of #include's (which is its
main purpose), this patch does not do that.  In some cases, a more
specific #include replaces another less specific one.

Some manual adjustments of the automatic result:

- IWYU currently doesn't know about includes that provide global
  variable declarations (like -Wmissing-variable-declarations), so
  those includes are being kept manually.

- All includes for port(ability) headers are being kept for now, to
  play it safe.

- No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the
  patch from exploding in size.

Note that this patch touches just *.c files, so nothing declared in
header files changes in hidden ways.

As a small example, in src/backend/access/transam/rmgr.c, some IWYU
pragma annotations are added to handle a special case there.

Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
2024-03-04 12:02:20 +01:00
Heikki Linnakangas 393b5599e5 Use MyBackendType in more places to check what process this is
Remove IsBackgroundWorker, IsAutoVacuumLauncherProcess(),
IsAutoVacuumWorkerProcess(), and IsLogicalSlotSyncWorker() in favor of
new Am*Process() macros that use MyBackendType. For consistency with
the existing Am*Process() macros.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/f3ecd4cb-85ee-4e54-8278-5fabfb3a4ed0@iki.fi
2024-03-04 10:25:12 +02:00
Peter Eisentraut 4f622503d6 Make attstattarget nullable
This changes the pg_attribute field attstattarget into a nullable
field in the variable-length part of the row.  If no value is set by
the user for attstattarget, it is now null instead of previously -1.
This saves space in pg_attribute and tuple descriptors for most
practical scenarios.  (ATTRIBUTE_FIXED_PART_SIZE is reduced from 108
to 104.)  Also, null is the semantically more correct value.

The ANALYZE code internally continues to represent the default
statistics target by -1, so that that code can avoid having to deal
with null values.  But that is now contained to the ANALYZE code.
Only the DDL code deals with attstattarget possibly null.

For system columns, the field is now always null.  The ANALYZE code
skips system columns anyway.

To set a column's statistics target to the default value, the new
command form ALTER TABLE ... SET STATISTICS DEFAULT can be used.  (SET
STATISTICS -1 still works.)

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/4da8d211-d54d-44b9-9847-f2a9f1184c76@eisentraut.org
2024-01-13 18:14:53 +01:00
Bruce Momjian 29275b1d17 Update copyright for 2024
Reported-by: Michael Paquier

Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz

Backpatch-through: 12
2024-01-03 20:49:05 -05:00
Heikki Linnakangas 049ef3398d Don't try to open visibilitymap when analyzing a foreign table
It's harmless, visibilitymap_count() returns 0 if the file doesn't
exist. But it's also very pointless. I noticed this when I added an
assertion in smgropen() that the relnumber is valid.

Discussion: https://www.postgresql.org/message-id/621a52fd-3cd8-4f5d-a561-d510b853bbaf@iki.fi
2023-12-08 09:16:21 +02:00
Nathan Bossart 6a72c42fd5 Retire MemoryContextResetAndDeleteChildren() macro.
As of commit eaa5808e8e, MemoryContextResetAndDeleteChildren() is
just a backwards compatibility macro for MemoryContextReset().  Now
that some time has passed, this macro seems more likely to create
confusion.

This commit removes the macro and replaces all remaining uses with
calls to MemoryContextReset().  Any third-party code that use this
macro will need to be adjusted to call MemoryContextReset()
instead.  Since the two have behaved the same way since v9.5, such
adjustments won't produce any behavior changes for all
currently-supported versions of PostgreSQL.

Reviewed-by: Amul Sul, Tom Lane, Alvaro Herrera, Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/20231113185950.GA1668018%40nathanxps13
2023-11-15 13:42:30 -06:00
Heikki Linnakangas c181f2e2bc Fix briefly showing old progress stats for ANALYZE on inherited tables.
ANALYZE on a table with inheritance children analyzes all the child
tables in a loop. When stepping to next child table, it updated the
child rel ID value in the command progress stats, but did not reset
the 'sample_blks_total' and 'sample_blks_scanned' counters.
acquire_sample_rows() updates 'sample_blks_total' as soon as the scan
starts and 'sample_blks_scanned' after processing the first block, but
until then, pg_stat_progress_analyze would display a bogus combination
of the new child table relid with old counter values from the
previously processed child table. Fix by resetting 'sample_blks_total'
and 'sample_blks_scanned' to zero at the same time that
'current_child_table_relid' is updated.

Backpatch to v13, where pg_stat_progress_analyze view was introduced.

Reported-by: Justin Pryzby
Discussion: https://www.postgresql.org/message-id/20230122162345.GP13860%40telsasoft.com
2023-09-30 17:03:50 +03:00
Nathan Bossart 151c22deee Revert MAINTAIN privilege and pg_maintain predefined role.
This reverts the following commits: 4dbdb82513, c2122aae63,
5b1a879943, 9e1e9d6560, ff9618e82a, 60684dd834, 4441fc704d,
and b5d6382496.  A role with the MAINTAIN privilege may be able to
use search_path tricks to escalate privileges to the table owner.
Unfortunately, it is too late in the v16 development cycle to apply
the proposed fix, i.e., restricting search_path when running
maintenance commands.

Bumps catversion.

Reviewed-by: Jeff Davis
Discussion: https://postgr.es/m/E1q7j7Y-000z1H-Hr%40gemulon.postgresql.org
Backpatch-through: 16
2023-07-07 11:25:13 -07:00
Peter Eisentraut c69bdf837f Take pg_attribute out of VacAttrStats
The VacAttrStats structure contained the whole Form_pg_attribute for a
column, but it actually only needs attstattarget from there.  So
remove the Form_pg_attribute field and make a separate field for
attstattarget.  This simplifies some code for extended statistics that
doesn't deal with a column but an expression, which had to fake up
pg_attribute rows to satisfy internal APIs.  Also, we can remove some
comments that essentially said "don't look at pg_attribute directly".

Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/d6069765-5971-04d3-c10d-e4f7b2e9c459%40eisentraut.org
2023-07-03 07:18:57 +02:00
Nathan Bossart 5b1a879943 Move bool parameter for vacuum_rel() to option bits.
ff9618e82a introduced the skip_privs parameter, which is used to
skip privilege checks when recursing to a relation's TOAST table.
This parameter should have been added as a flag bit in
VacuumParams->options instead.

Suggested-by: Michael Paquier
Reviewed-by: Michael Paquier, Jeff Davis
Discussion: https://postgr.es/m/ZIj4v1CwqlDVJZfB%40paquier.xyz
2023-06-20 15:14:58 -07:00
Jeff Davis 2fcc7ee7af Revert "Fix search_path to a safe value during maintenance operations."
This reverts commit 05e1737351.
2023-06-10 08:11:41 -07:00
Jeff Davis 05e1737351 Fix search_path to a safe value during maintenance operations.
While executing maintenance operations (ANALYZE, CLUSTER, REFRESH
MATERIALIZED VIEW, REINDEX, or VACUUM), set search_path to
'pg_catalog, pg_temp' to prevent inconsistent behavior.

Functions that are used for functional indexes, in index expressions,
or in materialized views and depend on a different search path must be
declared with CREATE FUNCTION ... SET search_path='...'.

This change addresses a security risk introduced in commit 60684dd834,
where a role with MAINTAIN privileges on a table may be able to
escalate privileges to the table owner. That commit is not yet part of
any release, so no need to backpatch.

Discussion: https://postgr.es/m/e44327179e5c9015c8dda67351c04da552066017.camel%40j-davis.com
Reviewed-by: Greg Stark
Reviewed-by: Nathan Bossart
2023-06-09 11:20:47 -07:00
Peter Geoghegan a349b86603 Move heaprel struct field next to index rel field.
Commit 61b313e4 added a heaprel struct member to IndexVacuumInfo, but
placed it last.  Move the heaprel struct member next to the index struct
member to improve the code's readability.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznG=TV6S9d3VA=y0vBHbXwnLs9_LLdiML=aNJuHeriwxg@mail.gmail.com
2023-04-03 11:01:11 -07:00
Andres Freund 61b313e47e Pass down table relation into more index relation functions
This is done in preparation for logical decoding on standby, which needs to
include whether visibility affecting WAL records are about a (user) catalog
table. Which is only known for the table, not the indexes.

It's also nice to be able to pass the heap relation to GlobalVisTestFor() in
vacuumRedirectAndPlaceholder().

Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/21b700c3-eecf-2e05-a699-f8c78dd31ec7@gmail.com
2023-04-01 20:18:29 -07:00
Tom Lane c2d7d679c1 Ensure acquire_inherited_sample_rows sets its output parameters.
The totalrows/totaldeadrows outputs were left uninitialized in cases
where we find no analyzable child tables of a partitioned table.  This
could lead to setting the partitioned table's pg_class.reltuples value
to garbage.  It's not clear that that would have any very bad effects
in practice, but fix it anyway because it's making valgrind unhappy.

Reported and diagnosed by Alexander Lakhin (bug #17880).

Discussion: https://postgr.es/m/17880-9282037c923d856e@postgresql.org
2023-03-31 10:08:40 -04:00
Peter Eisentraut aa69541046 Remove useless casts to (void *) in arguments of some system functions
The affected functions are: bsearch, memcmp, memcpy, memset, memmove,
qsort, repalloc

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/fd9adf5d-b1aa-e82f-e4c7-263c30145807%40enterprisedb.com
2023-02-07 06:57:59 +01:00
Bruce Momjian c8e1ba736b Update copyright for 2023
Backpatch-through: 11
2023-01-02 15:00:37 -05:00
Jeff Davis 60684dd834 Add grantable MAINTAIN privilege and pg_maintain role.
Allows VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZED VIEW, CLUSTER,
and LOCK TABLE.

Effectively reverts 4441fc704d. Instead of creating separate
privileges for VACUUM, ANALYZE, and other maintenance commands, group
them together under a single MAINTAIN privilege.

Author: Nathan Bossart
Discussion: https://postgr.es/m/20221212210136.GA449764@nathanxps13
Discussion: https://postgr.es/m/45224.1670476523@sss.pgh.pa.us
2022-12-13 17:33:28 -08:00
Andrew Dunstan b5d6382496 Provide per-table permissions for vacuum and analyze.
Currently a table can only be vacuumed or analyzed by its owner or
a superuser. This can now be extended to any user by means of an
appropriate GRANT.

Nathan Bossart

Reviewed by: Bharath Rupireddy, Kyotaro Horiguchi, Stephen Frost, Robert
Haas, Mark Dilger, Tom Lane, Corey Huinker, David G. Johnston, Michael
Paquier.

Discussion: https://postgr.es/m/20220722203735.GB3996698@nathanxps13
2022-11-28 12:08:14 -05:00
Michael Paquier 09a72188cd Avoid some overhead with open and close of catalog indexes
This commit improves two code paths to open and close indexes a
minimum amount of times when doing a series of catalog updates or
inserts.  CatalogTupleInsert() is costly when using it for multiple
inserts or updates compared to CatalogTupleInsertWithInfo(), as it would
need to open and close the indexes of the catalog worked each time an
operation is done.

This commit updates the following places:
- REINDEX CONCURRENTLY when copying statistics from one index relation
to the other.  Multi-INSERTs are avoided here, as this would begin to
show benefits only for indexes with multiple expressions, for example,
which may not be the most common pattern.  This change is noticeable in
profiles with indexes having many expressions, for example, and it would
improve any callers of CopyStatistics().
- Update of statistics on ANALYZE, that mixes inserts and updates.
In each case, the catalog indexes are opened only if at least one
insertion and/or update is required, to minimize the cost of the
operation.  Like the previous coding, no indexes are opened as long as
at least one insert or update of pg_statistic has happened.

Author: Ranier Vilela
Reviewed-by: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/CAEudQAqh0F9y6Di_Wc8xW4zkWm_5SDd-nRfVsCn=h0Nm1C_mrg@mail.gmail.com
2022-11-16 10:49:05 +09:00
Michael Paquier c42cd05c58 Cleanup useless assignments and checks
This cleans up a couple of areas:
- Remove XLogSegNo calculation for the last WAL segment in backup in
xlog.c (7d70809 has moved this logic entirely to xlogbackup.c when
building the contents of the backup history file).
- Remove check on log_min_duration in analyze.c, as it is already true
where this code path is reached.
- Simplify call to find_option() in guc.c.

Author: Ranier Vilela
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAEudQArCDQQiPiFR16=yu9k5s2tp4tgEe1U1ZbkW4ofx81AWWQ@mail.gmail.com
2022-10-04 13:16:23 +09:00
Tom Lane e64cdab003 Invent qsort_interruptible().
Justin Pryzby reported that some scenarios could cause gathering
of extended statistics to spend many seconds in an un-cancelable
qsort() operation.  To fix, invent qsort_interruptible(), which is
just like qsort_arg() except that it will also do CHECK_FOR_INTERRUPTS
every so often.  This bloats the backend by a couple of kB, which
seems like a good investment.  (We considered just enabling
CHECK_FOR_INTERRUPTS in the existing qsort and qsort_arg functions,
but there are some callers for which that'd demonstrably be unsafe.
Opt-in seems like a better way.)

For now, just apply qsort_interruptible() in statistics collection.
There's probably more places where it could be useful, but we can
always change other call sites as we find problems.

Back-patch to v14.  Before that we didn't have extended stats on
expressions, so that the problem was less severe.  Also, this patch
depends on the sort_template infrastructure introduced in v14.

Tom Lane and Justin Pryzby

Discussion: https://postgr.es/m/20220509000108.GQ28830@telsasoft.com
2022-07-12 16:30:36 -04:00
Peter Eisentraut d746021de1 Add construct_array_builtin, deconstruct_array_builtin
There were many calls to construct_array() and deconstruct_array() for
built-in types, for example, when dealing with system catalog columns.
These all hardcoded the type attributes necessary to pass to these
functions.

To simplify this a bit, add construct_array_builtin(),
deconstruct_array_builtin() as wrappers that centralize this hardcoded
knowledge.  This simplifies many call sites and reduces the amount of
hardcoded stuff that is spread around.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/2914356f-9e5f-8c59-2995-5997fc48bcba%40enterprisedb.com
2022-07-01 11:23:15 +02:00
Tom Lane 23e7b38bfe Pre-beta mechanical code beautification.
Run pgindent, pgperltidy, and reformat-dat-files.
I manually fixed a couple of comments that pgindent uglified.
2022-05-12 15:17:30 -04:00
Andres Freund bdbd3d9064 pgstat: stats collector references in comments.
Soon the stats collector will be no more, with statistics instead getting
stored in shared memory. There are a lot of references to the stats collector
in comments. This commit replaces most of these references with "cumulative
statistics system", with the remaining ones getting replaced as part of
subsequent commits.

This is done separately from the - quite large - shared memory statistics
patch to make review easier.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
2022-04-06 13:56:06 -07:00
Peter Geoghegan 872770fd6c Add VACUUM instrumentation for scanned pages, relfrozenxid.
Report on scanned pages within VACUUM VERBOSE and autovacuum logging.
These are pages that were physically examined during the VACUUM
operation.  Note that this can include a small number of pages that were
marked all-visible in the visibility map by some earlier VACUUM
operation.  VACUUM won't skip all-visible pages that aren't part of a
range of all-visible pages that's at least 32 blocks in length (partly
to avoid missing out on opportunities to advance relfrozenxid during
non-aggressive VACUUMs).

Commit 44fa8488 simplified the definition of scanned pages.  It became
the complement of the pages (of those pages from rel_pages) that were
skipped using the visibility map.  And so scanned pages precisely
indicates how effective the visibility map was at saving work.  (Before
now we displayed the number of pages skipped via the visibility map when
happened to be frozen pages, but not when they were merely all-visible,
which was less useful to users.)

Rename the user-visible OldestXmin output field to "removal cutoff", and
show some supplementary information: how far behind the cutoff is
(number of XIDs behind) by the time the VACUUM operation finished.  This
will help users to figure out what's _not_ working in extreme cases
where VACUUM is fundamentally unable to remove dead tuples or freeze
older tuples (e.g., due to a leaked replication slot).  Also report when
relfrozenxid is advanced by VACUUM in output that immediately follows
"removal cutoff".  This structure is intended to highlight the
relationship between the new relfrozenxid value for the table, and the
VACUUM operation's removal cutoff.

Finally, add instrumentation of "missed dead tuples", and the number of
pages that had at least one such tuple.  These are fully DEAD (not just
RECENTLY_DEAD) tuples with storage that could not be pruned due to
failure to acquire a cleanup lock on a heap page.  This is a replacement
for the "skipped due to pin" instrumentation removed by commit 44fa8488.
It shows more details than before for pages where failing to get a
cleanup lock actually resulted in VACUUM missing out on useful work, but
usually shows nothing at all instead (the mere fact that we couldn't get
a cleanup lock is usually of no consequence whatsoever now).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-Wznp=c=Opj8Z7RMR3G=ec3_JfGYMN_YvmCEjoPCHzWbx0g@mail.gmail.com
2022-02-11 16:48:40 -08:00
Tomas Vondra 269b532aef Add stxdinherit flag to pg_statistic_ext_data
Add pg_statistic_ext_data.stxdinherit flag, so that for each extended
statistics definition we can store two versions of data - one for the
relation alone, one for the whole inheritance tree. This is analogous to
pg_statistic.stainherit, but we failed to include such flag in catalogs
for extended statistics, and we had to work around it (see commits
859b3003de, 36c4bc6e72 and 20b9fa308e).

This changes the relationship between the two catalogs storing extended
statistics objects (pg_statistic_ext and pg_statistic_ext_data). Until
now, there was a simple 1:1 mapping - for each definition there was one
pg_statistic_ext_data row, and this row was inserted while creating the
statistics (and then updated during ANALYZE). With the stxdinherit flag,
we don't know how many rows there will be (child relations may be added
after the statistics object is defined), so there may be up to two rows.

We could make CREATE STATISTICS to always create both rows, but that
seems wasteful - without partitioning we only need stxdinherit=false
rows, and declaratively partitioned tables need only stxdinherit=true.
So we no longer initialize pg_statistic_ext_data in CREATE STATISTICS,
and instead make that a responsibility of ANALYZE. Which is what we do
for regular statistics too.

Patch by me, with extensive improvements and fixes by Justin Pryzby.

Author: Tomas Vondra, Justin Pryzby
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
2022-01-16 13:38:01 +01:00
Tomas Vondra 20b9fa308e Build inherited extended stats on partitioned tables
Commit 859b3003de disabled building of extended stats for inheritance
trees, to prevent updating the same catalog row twice. While that
resolved the issue, it also means there are no extended stats for
declaratively partitioned tables, because there are no data in the
non-leaf relations.

That also means declaratively partitioned tables were not affected by
the issue 859b3003de addressed, which means this is a regression
affecting queries that calculate estimates for the whole inheritance
tree as a whole (which includes e.g. GROUP BY queries).

But because partitioned tables are empty, we can invert the condition
and build statistics only for the case with inheritance, without losing
anything. And we can consider them when calculating estimates.

It may be necessary to run ANALYZE on partitioned tables, to collect
proper statistics. For declarative partitioning there should no prior
statistics, and it might take time before autoanalyze is triggered. For
tables partitioned by inheritance the statistics may include data from
child relations (if built 859b3003de), contradicting the current code.

Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).

Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
2022-01-15 19:06:48 +01:00
Bruce Momjian 27b77ecf9f Update copyright for 2022
Backpatch-through: 10
2022-01-07 19:04:57 -05:00
Tom Lane 3804539e48 Replace random(), pg_erand48(), etc with a better PRNG API and algorithm.
Standardize on xoroshiro128** as our basic PRNG algorithm, eliminating
a bunch of platform dependencies as well as fundamentally-obsolete PRNG
code.  In addition, this API replacement will ease replacing the
algorithm again in future, should that become necessary.

xoroshiro128** is a few percent slower than the drand48 family,
but it can produce full-width 64-bit random values not only 48-bit,
and it should be much more trustworthy.  It's likely to be noticeably
faster than the platform's random(), depending on which platform you
are thinking about; and we can have non-global state vectors easily,
unlike with random().  It is not cryptographically strong, but neither
are the functions it replaces.

Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself

Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2105241211230.165418@pseudo
2021-11-28 21:33:07 -05:00
Michael Paquier fdd8857145 Block ALTER INDEX/TABLE index_name ALTER COLUMN colname SET (options)
The grammar of this command run on indexes with column names has always
been authorized by the parser, and it has never been documented.

Since 911e702, it is possible to define opclass parameters as of CREATE
INDEX, which actually broke the old case of ALTER INDEX/TABLE where
relation-level parameters n_distinct and n_distinct_inherited could be
defined for an index (see 76a47c0 and its thread where this point has
been touched, still remained unused).  Attempting to do that in v13~
would cause the index to become unusable, as there is a new dedicated
code path to load opclass parameters instead of the relation-level ones
previously available.  Note that it is possible to fix things with a
manual catalog update to bring the relation back online.

This commit disables this command for now as the use of column names for
indexes does not make sense anyway, particularly when it comes to index
expressions where names are automatically computed.  One way to properly
support this case properly in the future would be to use column numbers
when it comes to indexes, in the same way as ALTER INDEX .. ALTER COLUMN
.. SET STATISTICS.

Partitioned indexes were already blocked, but not indexes.  Some tests
are added for both cases.

There was some code in ANALYZE to enforce n_distinct to be used for an
index expression if the parameter was defined, but just remove it for
now until/if there is support for this (note that index-level parameters
never had support in pg_dump either, previously), so this was just dead
code.

Reported-by: Matthijs van der Vleuten
Author: Nathan Bossart, Michael Paquier
Reviewed-by: Vik Fearing, Dilip Kumar
Discussion: https://postgr.es/m/17220-15d684c6c2171a83@postgresql.org
Backpatch-through: 13
2021-10-19 11:03:52 +09:00
Alvaro Herrera 375aed36ad
Keep stats up to date for partitioned tables
In the long-going saga for analyze on partitioned tables, one thing I
missed while reverting 0827e8af70 is the maintenance of analyze count
and last analyze time for partitioned tables.  This is a mostly trivial
change that enables users assess the need for invoking manual ANALYZE on
partitioned tables.

This patch, posted by Justin and modified a bit by me (Álvaro), can be
mostly traced back to Hosoya-san, though any problems introduced with
the scissors are mine.

Backpatch to 14, in line with 6f8127b739.

Co-authored-by: Yuzuko Hosoya <yuzukohosoya@gmail.com>
Co-authored-by: Justin Pryzby <pryzby@telsasoft.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20210816222810.GE10479@telsasoft.com
2021-08-28 15:58:23 -04:00
Stephen Frost ce42efaa26 Use maintenance_io_concurrency for ANALYZE prefetch
When prefetching pages for ANALYZE, we should be using
maintenance_io_concurrenty (by calling
get_tablespace_maintenance_io_concurrency(), not
get_tablespace_io_concurrency()).

ANALYZE prefetching was introduced in c6fc50c, so back-patch to 14.

Backpatch-through: 14
Reported-By: Egor Rogov
Discussion: https://postgr.es/m/9beada99-34ce-8c95-fadb-451768d08c64%40postgrespro.ru
2021-08-27 19:23:14 -04:00
Peter Geoghegan bda822554b track_io_timing logging: Don't special case 0 ms.
Adjust track_io_timing related logging code added by commit 94d13d474d.
Make it consistent with other nearby autovacuum and autoanalyze logging
code by removing logic that suppressed zero millisecond outputs.

log_autovacuum_min_duration log output now reliably shows "read:" and
"write:" millisecond-based values in its report (when track_io_timing is
enabled).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Stephen Frost <sfrost@snowman.net>
Discussion: https://postgr.es/m/CAH2-WznW0FNxSVQMSRazAMYNfZ6DR_gr5WE78hc6E1CBkkJpzw@mail.gmail.com
Backpatch: 14-, where the track_io_timing logging was introduced.
2021-08-27 13:34:00 -07:00
Peter Geoghegan fdfbfa24fa Reorder log_autovacuum_min_duration log output.
This order seems more natural.  It starts with details that are
particular to heap and index data structures, and ends with system-level
costs incurred during the autovacuum worker's VACUUM/ANALYZE operation.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkzxK6ahA9xxsOftRtBX_R0swuHZsvo4QUbak1Bz7hb7Q@mail.gmail.com
Backpatch: 14-, which enhanced the log output in various ways.
2021-08-27 13:08:41 -07:00
Alvaro Herrera 6f8127b739
Revert analyze support for partitioned tables
This reverts the following commits:
1b5617eb84 Describe (auto-)analyze behavior for partitioned tables
0e69f705cc Set pg_class.reltuples for partitioned tables
41badeaba8 Document ANALYZE storage parameters for partitioned tables
0827e8af70 autovacuum: handle analyze for partitioned tables

There are efficiency issues in this code when handling databases with
large numbers of partitions, and it doesn't look like there isn't any
trivial way to handle those.  There are some other issues as well.  It's
now too late in the cycle for nontrivial fixes, so we'll have to let
Postgres 14 users continue to manually deal with ANALYZE their
partitioned tables, and hopefully we can fix the issues for Postgres 15.

I kept [most of] be280cdad2 ("Don't reset relhasindex for partitioned
tables on ANALYZE") because while we added it due to 0827e8af70, it is
a good bugfix in its own right, since it affects manual analyze as well
as autovacuum-induced analyze, and there's no reason to revert it.

I retained the addition of relkind 'p' to tables included by
pg_stat_user_tables, because reverting that would require a catversion
bump.
Also, in pg14 only, I keep a struct member that was added to
PgStat_TabStatEntry to avoid breaking compatibility with existing stat
files.

Backpatch to 14.

Discussion: https://postgr.es/m/20210722205458.f2bug3z6qzxzpx2s@alap3.anarazel.de
2021-08-16 17:27:52 -04:00
Peter Eisentraut f4f4a649d8 Message style improvements 2021-08-07 12:09:37 +02:00
Alvaro Herrera d700518d74
Don't reset relhasindex for partitioned tables on ANALYZE
Commit 0e69f705cc introduced code to analyze partitioned table;
however, that code fails to preserve pg_class.relhasindex correctly.
Fix by observing whether any indexes exist rather than accidentally
falling through to assuming none do.

Backpatch to 14.

Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/CALNJ-vS1R3Qoe5t4tbzxrkpBtzRbPq1dDcW4RmA_a+oqweF30w@mail.gmail.com
2021-07-01 12:56:30 -04:00
Tom Lane def5b065ff Initial pgindent and pgperltidy run for v14.
Also "make reformat-dat-files".

The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
2021-05-12 13:14:10 -04:00
Michael Paquier 7ef8b52cf0 Fix typos and grammar in comments and docs
Author: Justin Pryzby
Discussion: https://postgr.es/m/20210416070310.GG3315@telsasoft.com
2021-04-19 11:32:30 +09:00
Alvaro Herrera 0e69f705cc
Set pg_class.reltuples for partitioned tables
When commit 0827e8af70 added auto-analyze support for partitioned
tables, it included code to obtain reltuples for the partitioned table
as a number of catalog accesses to read pg_class.reltuples for each
partition.  That's not only very inefficient, but also problematic
because autovacuum doesn't hold any locks on any of those tables -- and
doesn't want to.  Replace that code with a read of pg_class.reltuples
for the partitioned table, and make sure ANALYZE and TRUNCATE properly
maintain that value.

I found no code that would be affected by the change of relpages from
zero to non-zero for partitioned tables, and no other code that should
be maintaining it, but if there is, hopefully it'll be an easy fix.

Per buildfarm.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/1823909.1617862590@sss.pgh.pa.us
2021-04-09 11:50:33 -04:00
Michael Paquier 609b0652af Fix typos and grammar in documentation and code comments
Comment fixes are applied on HEAD, and documentation improvements are
applied on back-branches where needed.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20210408164008.GJ6592@telsasoft.com
Backpatch-through: 9.6
2021-04-09 13:53:07 +09:00