
/*-------------------------------------------------------------------------
*
* vacuum.h
* header file for postgres vacuum cleaner and statistics analyzer
*
*
* Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* src/include/commands/vacuum.h
*
*-------------------------------------------------------------------------
*/
#ifndef VACUUM_H
#define VACUUM_H
#include "access/htup.h"
#include "access/genam.h"
#include "access/parallel.h"
#include "access/tidstore.h"
#include "catalog/pg_class.h"
#include "catalog/pg_statistic.h"
#include "catalog/pg_type.h"
#include "executor/tuptable.h"
#include "parser/parse_node.h"
#include "storage/buf.h"
#include "storage/lock.h"
#include "storage/read_stream.h"
#include "utils/relcache.h"
/*
* Flags for amparallelvacuumoptions to control the participation of bulkdelete
* and vacuumcleanup in parallel vacuum.
*/
/*
* Both bulkdelete and vacuumcleanup are disabled by default. This will be
* used by index AMs that don't want to or cannot participate in parallel
* vacuum. For example, if an index AM doesn't have a way to communicate the
* index statistics allocated by the first ambulkdelete call to the subsequent
* ones until amvacuumcleanup, the index AM cannot participate in parallel
* vacuum.
*/
#define VACUUM_OPTION_NO_PARALLEL 0
/*
* bulkdelete can be performed in parallel. This option can be used by
* index AMs that need to scan indexes to delete tuples.
*/
#define VACUUM_OPTION_PARALLEL_BULKDEL (1 << 0)
/*
* vacuumcleanup can be performed in parallel if bulkdelete has not been
* performed yet. This will be used by index AMs that scan the index during
* cleanup only when bulkdelete was not performed.
*/
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP (1 << 1)
/*
* vacuumcleanup can be performed in parallel even if bulkdelete has already
* processed the index. This will be used by index AMs that scan the index
* during the cleanup phase irrespective of whether the index was already
* scanned during the bulkdelete phase.
*/
#define VACUUM_OPTION_PARALLEL_CLEANUP (1 << 2)
/* value for checking vacuum flags */
#define VACUUM_OPTION_MAX_VALID_VALUE ((1 << 3) - 1)
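/*
* For illustration, an index AM typically advertises which of the phases
* above it can run in parallel by OR-ing these flags into the
* amparallelvacuumoptions field of its IndexAmRoutine (see access/amapi.h)
* from its handler function, along the lines of:
*
*     amroutine->amparallelvacuumoptions =
*         VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
*
* The chosen combination is expected to fit within
* VACUUM_OPTION_MAX_VALID_VALUE.
*/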
/* Abstract type for parallel vacuum state */
typedef struct ParallelVacuumState ParallelVacuumState;
/*----------
* ANALYZE builds one of these structs for each attribute (column) that is
* to be analyzed. The struct and subsidiary data are in anl_context,
* so they live until the end of the ANALYZE operation.
*
* The type-specific typanalyze function is passed a pointer to this struct
* and must return true to continue analysis, false to skip analysis of this
* column. In the true case it must set the compute_stats and minrows fields,
* and can optionally set extra_data to pass additional info to compute_stats.
* minrows is its request for the minimum number of sample rows to be gathered
* (but note this request might not be honored, e.g. if there are fewer rows
* than that in the table).
*
* The compute_stats routine will be called after sample rows have been
* gathered. Aside from this struct, it is passed:
* fetchfunc: a function for accessing the column values from the
* sample rows
* samplerows: the number of sample tuples
* totalrows: estimated total number of rows in relation
* The fetchfunc may be called with rownum running from 0 to samplerows-1.
* It returns a Datum and an isNull flag.
*
* compute_stats should set stats_valid true if it is able to compute
* any useful statistics. If it does, the remainder of the struct holds
* the information to be stored in a pg_statistic row for the column. Be
* careful to allocate any pointed-to data in anl_context, which will NOT
* be CurrentMemoryContext when compute_stats is called.
*
* Note: all comparisons done for statistical purposes should use the
* underlying column's collation (attcollation), except in situations
* where a noncollatable container type contains a collatable type;
* in that case use the type's default collation. Be sure to record
* the appropriate collation in stacoll.
*----------
*/
typedef struct VacAttrStats *VacAttrStatsP;
typedef Datum (*AnalyzeAttrFetchFunc) (VacAttrStatsP stats, int rownum,
bool *isNull);
typedef void (*AnalyzeAttrComputeStatsFunc) (VacAttrStatsP stats,
AnalyzeAttrFetchFunc fetchfunc,
int samplerows,
double totalrows);
typedef struct VacAttrStats
{
/*
* These fields are set up by the main ANALYZE code before invoking the
* type-specific typanalyze function. They don't necessarily match what
* is in pg_attribute, because some index opclasses store a different type
* than the underlying column/expression. Therefore, use these fields for
* information about the datatype being fed to the typanalyze function.
*/
int attstattarget; /* -1 to use default */
Oid attrtypid; /* type of data being analyzed */
int32 attrtypmod; /* typmod of data being analyzed */
Form_pg_type attrtype; /* copy of pg_type row for attrtypid */
Oid attrcollid; /* collation of data being analyzed */
MemoryContext anl_context; /* where to save long-lived data */
/*
* These fields must be filled in by the typanalyze routine, unless it
* returns false.
*/
AnalyzeAttrComputeStatsFunc compute_stats; /* function pointer */
int minrows; /* Minimum # of rows wanted for stats */
void *extra_data; /* for extra type-specific data */
/*
* These fields are to be filled in by the compute_stats routine. (They
* are initialized to zero when the struct is created.)
*/
bool stats_valid;
float4 stanullfrac; /* fraction of entries that are NULL */
int32 stawidth; /* average width of column values */
float4 stadistinct; /* # distinct values */
int16 stakind[STATISTIC_NUM_SLOTS];
Oid staop[STATISTIC_NUM_SLOTS];
Oid stacoll[STATISTIC_NUM_SLOTS];
int numnumbers[STATISTIC_NUM_SLOTS];
float4 *stanumbers[STATISTIC_NUM_SLOTS];
int numvalues[STATISTIC_NUM_SLOTS];
Datum *stavalues[STATISTIC_NUM_SLOTS];
/*
* These fields describe the stavalues[n] element types. They will be
* initialized to match attrtypid, but a custom typanalyze function might
* want to store an array of something other than the analyzed column's
* elements. It should then overwrite these fields.
*/
Oid statypid[STATISTIC_NUM_SLOTS];
int16 statyplen[STATISTIC_NUM_SLOTS];
bool statypbyval[STATISTIC_NUM_SLOTS];
char statypalign[STATISTIC_NUM_SLOTS];
/*
* These fields are private to the main ANALYZE code and should not be
* looked at by type-specific functions.
*/
int tupattnum; /* attribute number within tuples */
HeapTuple *rows; /* access info for std fetch function */
TupleDesc tupDesc;
Datum *exprvals; /* access info for index fetch function */
bool *exprnulls;
int rowstride;
} VacAttrStats;
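/*
* For illustration, a minimal custom typanalyze function (reached via
* pg_type.typanalyze) could look roughly like the sketch below, where
* my_typanalyze and my_compute_stats are hypothetical names and
* my_compute_stats is an AnalyzeAttrComputeStatsFunc that fills in the
* sta* fields above:
*
*     Datum
*     my_typanalyze(PG_FUNCTION_ARGS)
*     {
*         VacAttrStats *stats = (VacAttrStats *) PG_GETARG_POINTER(0);
*
*         if (stats->attstattarget < 0)
*             stats->attstattarget = default_statistics_target;
*
*         stats->compute_stats = my_compute_stats;
*         stats->minrows = 300 * stats->attstattarget;
*         stats->extra_data = NULL;
*
*         PG_RETURN_BOOL(true);
*     }
*/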
/*
* AcquireSampleRowsFunc - a function for the sampling statistics collection.
*
* A random sample of up to `targrows` rows should be collected from the
* table and stored into the caller-provided `rows` array. The actual number
* of rows collected must be returned. In addition, the function should store
* estimates of the total numbers of live and dead rows in the table into the
* output parameters `*totalrows` and `*totaldeadrows`. (Set `*totaldeadrows`
* to zero if the storage does not have any concept of dead rows.)
*/
typedef int (*AcquireSampleRowsFunc) (Relation relation, int elevel,
HeapTuple *rows, int targrows,
double *totalrows,
double *totaldeadrows);
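/*
* For illustration, an implementation of this interface has roughly the
* following shape (my_acquire_sample_rows is a hypothetical name): it scans
* the relation, stores up to targrows randomly chosen tuples into rows[],
* and reports its estimates through the output parameters:
*
*     static int
*     my_acquire_sample_rows(Relation relation, int elevel,
*                            HeapTuple *rows, int targrows,
*                            double *totalrows, double *totaldeadrows)
*     {
*         int         numrows = 0;
*         double      live_rows_est = 0;
*         double      dead_rows_est = 0;
*
*         ... scan the relation, storing up to targrows randomly chosen
*             tuples into rows[] and estimating the table's total numbers
*             of live and dead rows ...
*
*         *totalrows = live_rows_est;
*         *totaldeadrows = dead_rows_est;
*         return numrows;
*     }
*/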
/* flag bits for VacuumParams->options */
#define VACOPT_VACUUM 0x01 /* do VACUUM */
#define VACOPT_ANALYZE 0x02 /* do ANALYZE */
#define VACOPT_VERBOSE 0x04 /* output INFO instrumentation messages */
#define VACOPT_FREEZE 0x08 /* FREEZE option */
#define VACOPT_FULL 0x10 /* FULL (non-concurrent) vacuum */
#define VACOPT_SKIP_LOCKED 0x20 /* skip if cannot get lock */
#define VACOPT_PROCESS_MAIN 0x40 /* process main relation */
#define VACOPT_PROCESS_TOAST 0x80 /* process the TOAST table, if any */
#define VACOPT_DISABLE_PAGE_SKIPPING 0x100 /* don't skip any pages */
#define VACOPT_SKIP_DATABASE_STATS 0x200 /* skip vac_update_datfrozenxid() */
#define VACOPT_ONLY_DATABASE_STATS 0x400 /* only vac_update_datfrozenxid() */
/*
* Values used by index_cleanup and truncate params.
*
* VACOPTVALUE_UNSPECIFIED is used as an initial placeholder when the VACUUM
* command has no explicit value. When that happens the final usable value
* comes from the corresponding reloption (though the reloption default is
* usually used).
*/
typedef enum VacOptValue
{
VACOPTVALUE_UNSPECIFIED = 0,
VACOPTVALUE_AUTO,
VACOPTVALUE_DISABLED,
VACOPTVALUE_ENABLED,
} VacOptValue;
/*
* Parameters customizing behavior of VACUUM and ANALYZE.
*
* Note that at least one of VACOPT_VACUUM and VACOPT_ANALYZE must be set
* in options.
*
* When adding a new VacuumParam member, consider adding it to vacuumdb as
* well.
*/
typedef struct VacuumParams
{
bits32 options; /* bitmask of VACOPT_* */
int freeze_min_age; /* min freeze age, -1 to use default */
int freeze_table_age; /* age at which to scan whole table */
int multixact_freeze_min_age; /* min multixact freeze age, -1 to
* use default */
int multixact_freeze_table_age; /* multixact age at which to scan
* whole table */
bool is_wraparound; /* force a for-wraparound vacuum */
int log_min_duration; /* minimum execution threshold in ms at
* which autovacuum is logged, -1 to use
* default */
VacOptValue index_cleanup; /* Do index vacuum and cleanup */
VacOptValue truncate; /* Truncate empty pages at the end */
Oid toast_parent; /* for privilege checks when recursing */
/*
* The number of parallel vacuum workers. 0 by default which means choose
* based on the number of indexes. -1 indicates parallel vacuum is
* disabled.
*/
int nworkers;
} VacuumParams;
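/*
* For illustration, a caller of vacuum() (declared below) might fill in
* VacuumParams roughly as follows; relations, bstrategy, vac_context and
* isTopLevel are supplied by the caller, much as ExecVacuum() does, and the
* -1 values mean "use the GUC defaults":
*
*     VacuumParams params;
*
*     MemSet(&params, 0, sizeof(VacuumParams));
*     params.options = VACOPT_VACUUM | VACOPT_VERBOSE;
*     params.freeze_min_age = -1;
*     params.freeze_table_age = -1;
*     params.multixact_freeze_min_age = -1;
*     params.multixact_freeze_table_age = -1;
*     params.is_wraparound = false;
*     params.log_min_duration = -1;
*     params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
*     params.truncate = VACOPTVALUE_UNSPECIFIED;
*     params.nworkers = 0;
*
*     vacuum(relations, &params, bstrategy, vac_context, isTopLevel);
*/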
/*
* VacuumCutoffs is immutable state that describes the cutoffs used by VACUUM.
* Established at the beginning of each VACUUM operation.
*/
struct VacuumCutoffs
{
/*
* Existing pg_class fields at start of VACUUM
*/
TransactionId relfrozenxid;
MultiXactId relminmxid;
/*
* OldestXmin is the Xid below which tuples deleted by any xact (that
* committed) should be considered DEAD, not just RECENTLY_DEAD.
*
* OldestMxact is the Mxid below which MultiXacts are definitely not seen
* as visible by any running transaction.
*
* OldestXmin and OldestMxact are also the most recent values that can
* ever be passed to vac_update_relstats() as frozenxid and minmulti
* arguments at the end of VACUUM. These same values should be passed
* when it turns out that VACUUM will leave no unfrozen XIDs/MXIDs behind
* in the table.
*/
TransactionId OldestXmin;
MultiXactId OldestMxact;
/*
* FreezeLimit is the Xid below which all Xids are definitely frozen or
* removed in pages VACUUM scans and cleanup locks.
*
* MultiXactCutoff is the value below which all MultiXactIds are
* definitely removed from Xmax in pages VACUUM scans and cleanup locks.
*/
TransactionId FreezeLimit;
MultiXactId MultiXactCutoff;
};
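/*
* For illustration, a table AM's VACUUM implementation typically establishes
* these cutoffs once, up front, with vacuum_get_cutoffs() (declared below);
* the return value indicates whether the operation should be aggressive:
*
*     struct VacuumCutoffs cutoffs;
*     bool        aggressive;
*
*     aggressive = vacuum_get_cutoffs(rel, params, &cutoffs);
*
* The scan then treats tuples deleted before cutoffs.OldestXmin as DEAD, and
* freezes XIDs/MXIDs older than cutoffs.FreezeLimit and
* cutoffs.MultiXactCutoff in the pages it scans and cleanup locks.
*/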
/*
* VacDeadItemsInfo stores supplemental information for dead tuple TID
* storage (i.e. TidStore).
*/
typedef struct VacDeadItemsInfo
{
size_t max_bytes; /* the maximum bytes TidStore can use */
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
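/*
* For illustration, lazy vacuum maintains this bookkeeping alongside the
* TidStore roughly as follows when remembering the dead items found on a
* heap page (offsets[] holding the page's num_offsets dead item pointers):
*
*     TidStoreSetBlockOffsets(dead_items, blkno, offsets, num_offsets);
*     dead_items_info->num_items += num_offsets;
*
*     if (TidStoreMemoryUsage(dead_items) > dead_items_info->max_bytes)
*         ... do a round of index vacuuming, then reset the store ...
*/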
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
extern PGDLLIMPORT int vacuum_freeze_table_age;
extern PGDLLIMPORT int vacuum_multixact_freeze_min_age;
extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
extern PGDLLIMPORT int vacuum_failsafe_age;
extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
/*
* Maximum value for default_statistics_target and per-column statistics
* targets. This is fairly arbitrary, mainly to prevent users from creating
* unreasonably large statistics that the system cannot handle well.
*/
#define MAX_STATISTICS_TARGET 10000
/* Variables for cost-based parallel vacuum */
extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
extern PGDLLIMPORT int VacuumCostBalanceLocal;
extern PGDLLIMPORT bool VacuumFailsafeActive;
extern PGDLLIMPORT double vacuum_cost_delay;
extern PGDLLIMPORT int vacuum_cost_limit;
/* in commands/vacuum.c */
extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
extern void vacuum(List *relations, VacuumParams *params,
BufferAccessStrategy bstrategy, MemoryContext vac_context,
bool isTopLevel);
extern void vac_open_indexes(Relation relation, LOCKMODE lockmode,
int *nindexes, Relation **Irel);
extern void vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode);
extern double vac_estimate_reltuples(Relation relation,
BlockNumber total_pages,
BlockNumber scanned_pages,
double scanned_tuples);
extern void vac_update_relstats(Relation relation,
BlockNumber num_pages,
double num_tuples,
BlockNumber num_all_visible_pages,
bool hasindex,
TransactionId frozenxid,
MultiXactId minmulti,
bool *frozenxid_updated,
bool *minmulti_updated,
bool in_outer_xact);
extern bool vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
struct VacuumCutoffs *cutoffs);
extern bool vacuum_xid_failsafe_check(const struct VacuumCutoffs *cutoffs);
extern void vac_update_datfrozenxid(void);
extern void vacuum_delay_point(void);
extern bool vacuum_is_permitted_for_relation(Oid relid, Form_pg_class reltuple,
bits32 options);
extern Relation vacuum_open_relation(Oid relid, RangeVar *relation,
bits32 options, bool verbose,
LOCKMODE lmode);
extern IndexBulkDeleteResult *vac_bulkdel_one_index(IndexVacuumInfo *ivinfo,
IndexBulkDeleteResult *istat,
TidStore *dead_items,
VacDeadItemsInfo *dead_items_info);
extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
IndexBulkDeleteResult *istat);
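/*
* For illustration, the per-index calling pattern for the two helpers above
* is roughly:
*
*     IndexVacuumInfo ivinfo;
*     IndexBulkDeleteResult *istat = NULL;
*
*     ... fill in ivinfo (index, heaprel, message_level, num_heap_tuples,
*         strategy, etc.; see access/genam.h) ...
*
*     istat = vac_bulkdel_one_index(&ivinfo, istat, dead_items,
*                                   dead_items_info);
*     ... possibly further bulk-delete passes ...
*     istat = vac_cleanup_one_index(&ivinfo, istat);
*/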
/* In postmaster/autovacuum.c */
extern void AutoVacuumUpdateCostLimit(void);
extern void VacuumUpdateCosts(void);
/* in commands/vacuumparallel.c */
extern ParallelVacuumState *parallel_vacuum_init(Relation rel, Relation *indrels,
int nindexes, int nrequested_workers,
int vac_work_mem, int elevel,
BufferAccessStrategy bstrategy);
extern void parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats);
extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
VacDeadItemsInfo **dead_items_info_p);
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
bool estimated_count);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
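/*
* For illustration, the leader-side lifecycle of this API, as driven by lazy
* vacuum, is roughly:
*
*     pvs = parallel_vacuum_init(rel, indrels, nindexes, nrequested_workers,
*                                vac_work_mem, elevel, bstrategy);
*     dead_items = parallel_vacuum_get_dead_items(pvs, &dead_items_info);
*
*     ... accumulate dead item TIDs in dead_items, then ...
*     parallel_vacuum_bulkdel_all_indexes(pvs, num_table_tuples,
*                                         num_index_scans);
*     parallel_vacuum_reset_dead_items(pvs);
*
*     ... once the heap scan is complete ...
*     parallel_vacuum_cleanup_all_indexes(pvs, num_table_tuples,
*                                         num_index_scans, estimated_count);
*     parallel_vacuum_end(pvs, indstats);
*/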
/* in commands/analyze.c */
struct TableScanDescData;
/*
* A callback to prepare to analyze a block from `stream` of `scan`. The scan
* has been started with table_beginscan_analyze().
*
* The callback may acquire resources like locks that are held until
* ScanAnalyzeNextTupleFunc returns false. In some cases it could be
* useful to hold a lock until all tuples in a block have been analyzed by
* ScanAnalyzeNextTupleFunc.
*
* The callback can return false if the block is not suitable for
* sampling, e.g. because it's a metapage that could never contain tuples.
*
* This is primarily suited for block-based AMs. It's not clear what a
* good interface for non block-based AMs would be, so there isn't one
* yet and sampling using a custom implementation of acquire_sample_rows
* may be preferred.
*/
typedef bool (*ScanAnalyzeNextBlockFunc) (struct TableScanDescData *scan,
ReadStream *stream);
/*
* A callback to iterate over tuples in the block selected with
* ScanAnalyzeNextBlockFunc (which needs to have returned true, and
* this routine may not have returned false for the same block before). If
* a tuple that's suitable for sampling is found, true is returned and a
* tuple is stored in `slot`.
*
* *liverows and *deadrows are incremented according to the encountered
* tuples.
*
* Not every AM might have a meaningful concept of dead rows, in which
* case it's OK to not increment *deadrows - but note that that may
* influence autovacuum scheduling (see comment for relation_vacuum
* callback).
*/
typedef bool (*ScanAnalyzeNextTupleFunc) (struct TableScanDescData *scan,
TransactionId OldestXmin,
double *liverows,
double *deadrows,
TupleTableSlot *slot);
extern void analyze_rel(Oid relid, RangeVar *relation,
VacuumParams *params, List *va_cols, bool in_outer_xact,
BufferAccessStrategy bstrategy);
extern void block_level_table_analyze(Relation relation,
AcquireSampleRowsFunc *func,
BlockNumber *totalpages,
BufferAccessStrategy bstrategy,
ScanAnalyzeNextBlockFunc scan_analyze_next_block_cb,
ScanAnalyzeNextTupleFunc scan_analyze_next_tuple_cb);
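/*
* For illustration, a block-based table AM can wire its ANALYZE support up
* to this infrastructure by calling block_level_table_analyze() from its
* analyze callback, passing its own implementations of the two scan
* callbacks above (the myam_* names here are hypothetical):
*
*     static void
*     myam_relation_analyze(Relation relation, AcquireSampleRowsFunc *func,
*                           BlockNumber *totalpages,
*                           BufferAccessStrategy bstrategy)
*     {
*         block_level_table_analyze(relation, func, totalpages, bstrategy,
*                                   myam_scan_analyze_next_block,
*                                   myam_scan_analyze_next_tuple);
*     }
*/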
extern bool std_typanalyze(VacAttrStats *stats);
/* in utils/misc/sampling.c --- duplicate of declarations in utils/sampling.h */
extern double anl_random_fract(void);
extern double anl_init_selection_state(int n);
extern double anl_get_next_S(double t, int n, double *stateptr);
#endif /* VACUUM_H */