Add eager and lazy freezing strategies to VACUUM.

Eager freezing strategy avoids large build-ups of all-visible pages.  It
makes VACUUM trigger page-level freezing whenever doing so will enable
the page to become all-frozen in the visibility map.  This is useful for
tables that experience continual growth, particularly strict append-only
tables such as pgbench's history table.  Eager freezing significantly
improves performance stability by spreading out the cost of freezing
over time, rather than doing most freezing during aggressive VACUUMs.
It complements the insert autovacuum mechanism added by commit b07642db.

VACUUM determines its freezing strategy based on the value of the new
vacuum_freeze_strategy_threshold GUC (or reloption) with logged tables.
Tables that exceed the size threshold use the eager freezing strategy.
Unlogged tables and temp tables always use eager freezing strategy,
since the added cost is negligible there.  Non-permanent relations won't
incur any extra overhead in WAL written (for the obvious reason), nor in
pages dirtied (since any extra freezing will only take place on pages
whose PD_ALL_VISIBLE bit needed to be set either way).
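
As a rough illustration of the rule described above, the strategy choice
can be sketched in standalone C.  TableInfo and use_eager_freezing() are
hypothetical names invented for this sketch, not PostgreSQL identifiers;
only the comparison itself mirrors lazy_scan_strategy() in this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-table facts VACUUM consults. */
typedef struct TableInfo
{
    uint32_t rel_pages;     /* size of the main fork, in pages */
    bool     is_permanent;  /* false for unlogged and temp tables */
} TableInfo;

/*
 * Returns true when the eager freezing strategy applies: the table is
 * strictly larger than the threshold, or it is not a permanent relation.
 */
static bool
use_eager_freezing(const TableInfo *tab, uint32_t threshold_pages)
{
    assert(tab != NULL);
    return tab->rel_pages > threshold_pages || !tab->is_permanent;
}
```

Note that the comparison is strict: a table whose size equals the
threshold exactly still uses the lazy strategy.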

VACUUM uses lazy freezing strategy for logged tables that fall under the
GUC size threshold.  Page-level freezing triggers based on the criteria
established in commit 1de58df4, which added basic page-level freezing.

Eager freezing is strictly more aggressive than lazy freezing.  Settings
like vacuum_freeze_min_age still get applied in just the same way in
every VACUUM, independent of the strategy in use.  The only mechanical
difference between eager and lazy freezing strategies is that only the
former applies its own additional criteria to trigger freezing pages.
Note that even lazy freezing strategy will trigger freezing whenever a
page happens to have required that an FPI be written during pruning,
provided that the page will thereby become all-frozen in the visibility
map afterwards (due to the FPI optimization from commit 1de58df4).
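
The per-page decision can be sketched as follows.  This is a simplified
standalone model, assuming a hypothetical PageState struct in place of
the pruning and freezing state that vacuumlazy.c actually tracks; the
boolean expression follows the condition changed by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one heap page after pruning. */
typedef struct PageState
{
    bool freeze_required;    /* an XID/MXID older than the cutoffs is present */
    int  tuples_frozen;      /* number of tuples that have a freeze plan */
    bool all_visible;        /* every tuple is visible to all transactions */
    bool all_frozen;         /* page would be all-frozen once plans execute */
    bool prune_emitted_fpi;  /* pruning already wrote a full-page image */
} PageState;

/* Returns true when VACUUM would execute the page's freeze plans. */
static bool
should_freeze_page(const PageState *page, bool eager_freeze_strategy)
{
    assert(page != NULL);

    /* Mandatory freezing, or there are no freeze plans to execute */
    if (page->freeze_required || page->tuples_frozen == 0)
        return true;

    /* Optional freezing only pays off if the page ends up all-frozen */
    if (!(page->all_visible && page->all_frozen))
        return false;

    /* Lazy strategy needs the FPI trigger; eager strategy does not */
    return page->prune_emitted_fpi || eager_freeze_strategy;
}
```

The only term that depends on the strategy is the last one, which is
why eager freezing can only ever freeze more pages than lazy freezing.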

The vacuum_freeze_strategy_threshold default setting is 4GB.  This is a
relatively low setting that prioritizes performance stability.  It will
be reviewed at the end of the Postgres 16 beta period.
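
To make the default concrete, the megabyte-to-page conversion done in
vacuum_get_cutoffs() can be sketched as below.  The block size of 8192
bytes and the maximum block number of 0xFFFFFFFE are assumptions that
match a default PostgreSQL build; the ASSUMED_* macros and the function
name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ASSUMED_BLCKSZ            8192u        /* default build block size */
#define ASSUMED_MAX_BLOCK_NUMBER  0xFFFFFFFEu  /* largest valid block number */

/*
 * Convert a threshold in megabytes to a threshold in pages.  The
 * multiplication is done in 64 bits to avoid overflow, and the result
 * is clamped to the largest valid block number.
 */
static uint32_t
threshold_mb_to_pages(int threshold_mb)
{
    uint64_t pages;

    assert(threshold_mb >= 0);
    pages = ((uint64_t) threshold_mb * 1024 * 1024) / ASSUMED_BLCKSZ;
    if (pages > ASSUMED_MAX_BLOCK_NUMBER)
        pages = ASSUMED_MAX_BLOCK_NUMBER;
    return (uint32_t) pages;
}
```

Under these assumptions the 4GB default (4096 megabytes) corresponds to
524288 pages, and a setting of 0 makes every non-empty table exceed the
threshold.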

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com
Peter Geoghegan 2023-01-25 14:15:38 -08:00
parent 642e8821d7
commit 4d41799261
12 changed files with 197 additions and 14 deletions

@ -9272,6 +9272,36 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
<varlistentry id="guc-vacuum-freeze-strategy-threshold" xreflabel="vacuum_freeze_strategy_threshold">
<term><varname>vacuum_freeze_strategy_threshold</varname> (<type>integer</type>)
<indexterm>
<primary><varname>vacuum_freeze_strategy_threshold</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the cutoff storage size that
<command>VACUUM</command> should use to determine its freezing
strategy. This is applied by comparing it to the size of the
target table's <glossterm linkend="glossary-fork">main
fork</glossterm> at the beginning of each <command>VACUUM</command>.
Eager freezing strategy is used by <command>VACUUM</command>
when the table's main fork size exceeds this value.
<command>VACUUM</command> <emphasis>always</emphasis> uses
eager freezing strategy when processing <glossterm
linkend="glossary-unlogged">unlogged</glossterm> and temporary tables,
regardless of this setting. Otherwise <command>VACUUM</command>
uses lazy freezing strategy. For more information see <xref
linkend="vacuum-for-wraparound"/>.
</para>
<para>
If this value is specified without units, it is taken as
megabytes. The default is four gigabytes
(<literal>4GB</literal>).
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-vacuum-failsafe-age" xreflabel="vacuum_failsafe_age">
<term><varname>vacuum_failsafe_age</varname> (<type>integer</type>)
<indexterm>

@ -478,13 +478,30 @@
</note>
<para>
-<xref linkend="guc-vacuum-freeze-min-age"/>
-controls how old an XID value has to be before rows bearing that XID will be
-frozen. Increasing this setting may avoid unnecessary work if the
-rows that would otherwise be frozen will soon be modified again,
-but decreasing this setting increases
-the number of transactions that can elapse before the table must be
-vacuumed again.
+<xref linkend="guc-vacuum-freeze-strategy-threshold"/> controls
+<command>VACUUM</command>'s freezing strategy. The
+<firstterm>eager freezing strategy</firstterm> makes
+<command>VACUUM</command> freeze all rows on a page whenever each
+and every row on the page is considered visible to all current
+transactions (immediately after dead row versions are removed).
+Freezing pages early and in batch often spreads out the overhead
+of freezing over time. <command>VACUUM</command> consistently
+avoids allowing unfrozen all-visible pages to build up, improving
+system level performance stability. The <firstterm>lazy freezing
+strategy</firstterm> makes <command>VACUUM</command> determine
+whether pages should be frozen on the basis of the age of the
+oldest XID on the page. Freezing pages lazily sometimes avoids
+the overhead of freezing that turns out to have been unnecessary
+because the rows were modified soon after freezing took place.
+</para>
+<para>
+<xref linkend="guc-vacuum-freeze-min-age"/> controls how old an
+XID value has to be before pages with rows bearing that XID are
+frozen. This setting is an additional trigger criterion for
+freezing a page's tuples. It is used by both freezing strategies,
+though it typically has little impact when <command>VACUUM</command>
+uses the eager freezing strategy.
</para>
<para>
@ -506,12 +523,21 @@
always use its aggressive strategy.
</para>
<para>
Controlling the overhead of freezing existing all-visible pages
during aggressive vacuuming is the goal of the eager freezing
strategy. Increasing <varname>vacuum_freeze_strategy_threshold</varname>
may avoid unnecessary work, but it increases the risk of an
eventual aggressive vacuum that performs an excessive amount of
<quote>catch up</quote> freezing all at once.
</para>
<para>
The maximum time that a table can go unvacuumed is two billion
transactions minus the <varname>vacuum_freeze_min_age</varname> value at
the time of the last aggressive vacuum. If it were to go
-unvacuumed for longer than
-that, data loss could result. To ensure that this does not happen,
+unvacuumed for longer than that, the system could temporarily refuse to
+allocate new transaction IDs. To ensure that this never happens,
autovacuum is invoked on any table that might contain unfrozen rows with
XIDs older than the age specified by the configuration parameter <xref
linkend="guc-autovacuum-freeze-max-age"/>. (This will happen even if
@ -551,7 +577,7 @@
</para>
<para>
-The sole disadvantage of increasing <varname>autovacuum_freeze_max_age</varname>
+One disadvantage of increasing <varname>autovacuum_freeze_max_age</varname>
(and <varname>vacuum_freeze_table_age</varname> along with it) is that
the <filename>pg_xact</filename> and <filename>pg_commit_ts</filename>
subdirectories of the database cluster will take more space, because it
@ -837,8 +863,8 @@ vacuum insert threshold = vacuum base insert threshold + vacuum insert scale fac
For tables which receive <command>INSERT</command> operations but no or
almost no <command>UPDATE</command>/<command>DELETE</command> operations,
it may be beneficial to lower the table's
-<xref linkend="reloption-autovacuum-freeze-min-age"/> as this may allow
-tuples to be frozen by earlier vacuums. The number of obsolete tuples and
+<xref linkend="reloption-autovacuum-freeze-strategy-threshold"/>
+to allow freezing to take place proactively. The number of obsolete tuples and
the number of inserted tuples are obtained from the cumulative statistics system;
it is a semi-accurate count updated by each <command>UPDATE</command>,
<command>DELETE</command> and <command>INSERT</command> operation. (It is

@ -1781,6 +1781,20 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
<varlistentry id="reloption-autovacuum-freeze-strategy-threshold" xreflabel="autovacuum_freeze_strategy_threshold">
<term><literal>autovacuum_freeze_strategy_threshold</literal>, <literal>toast.autovacuum_freeze_strategy_threshold</literal> (<type>integer</type>)
<indexterm>
<primary><varname>autovacuum_freeze_strategy_threshold</varname> storage parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Per-table value for <xref linkend="guc-vacuum-freeze-strategy-threshold"/>
parameter.
</para>
</listitem>
</varlistentry>
<varlistentry id="reloption-log-autovacuum-min-duration" xreflabel="log_autovacuum_min_duration">
<term><literal>log_autovacuum_min_duration</literal>, <literal>toast.log_autovacuum_min_duration</literal> (<type>integer</type>)
<indexterm>

@ -312,6 +312,14 @@ static relopt_int intRelOpts[] =
ShareUpdateExclusiveLock
}, -1, 0, 2000000000
},
{
{
"autovacuum_freeze_strategy_threshold",
"Table size at which VACUUM freezes using eager strategy, in megabytes.",
RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
ShareUpdateExclusiveLock
}, -1, 0, MAX_KILOBYTES
},
{
{
"log_autovacuum_min_duration",
@ -1863,6 +1871,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, multixact_freeze_max_age)},
{"autovacuum_multixact_freeze_table_age", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, multixact_freeze_table_age)},
{"autovacuum_freeze_strategy_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, freeze_strategy_threshold)},
{"log_autovacuum_min_duration", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, log_min_duration)},
{"toast_tuple_target", RELOPT_TYPE_INT,

@ -7057,6 +7057,7 @@ heap_freeze_tuple(HeapTupleHeader tuple,
cutoffs.OldestMxact = MultiXactCutoff;
cutoffs.FreezeLimit = FreezeLimit;
cutoffs.MultiXactCutoff = MultiXactCutoff;
cutoffs.freeze_strategy_threshold_pages = 0;
pagefrz.freeze_required = true;
pagefrz.FreezePageRelfrozenXid = FreezeLimit;

@ -153,6 +153,8 @@ typedef struct LVRelState
bool aggressive;
/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
bool skipwithvm;
/* Eagerly freeze pages that are eligible to become all-frozen? */
bool eager_freeze_strategy;
/* Wraparound failsafe has been triggered? */
bool failsafe_active;
/* Consider index vacuuming bypass optimization? */
@ -243,6 +245,7 @@ typedef struct LVSavedErrInfo
/* non-export function prototypes */
static void lazy_scan_heap(LVRelState *vacrel);
static void lazy_scan_strategy(LVRelState *vacrel);
static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
BlockNumber next_block,
bool *next_unskippable_allvis,
@ -472,6 +475,10 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
vacrel->skipwithvm = skipwithvm;
/*
* Now determine VACUUM's freezing strategy
*/
lazy_scan_strategy(vacrel);
if (verbose)
{
if (vacrel->aggressive)
@ -1267,6 +1274,38 @@ lazy_scan_heap(LVRelState *vacrel)
lazy_cleanup_all_indexes(vacrel);
}
/*
* lazy_scan_strategy() -- Determine freezing strategy.
*
* Our lazy freezing strategy is useful when putting off the work of freezing
* totally avoids freezing that turns out to have been wasted effort later on.
* Our eager freezing strategy is useful with larger tables that experience
* continual growth, where freezing pages proactively is needed just to avoid
* falling behind on freezing (eagerness is also likely to be cheaper in the
* short/medium term for such tables, but the long term picture matters most).
*/
static void
lazy_scan_strategy(LVRelState *vacrel)
{
BlockNumber rel_pages = vacrel->rel_pages;
/*
* Decide freezing strategy.
*
* The eager freezing strategy is used whenever rel_pages exceeds a
* threshold controlled by the freeze_strategy_threshold GUC/reloption.
*
* Also freeze eagerly with an unlogged or temp table, where the total
* cost of freezing pages is mostly just the cycles needed to prepare a
* set of freeze plans. Executing the freeze plans adds very little cost.
* Dirtying extra pages isn't a concern, either; VACUUM will definitely
* set PD_ALL_VISIBLE on affected pages, regardless of freezing strategy.
*/
vacrel->eager_freeze_strategy =
(rel_pages > vacrel->cutoffs.freeze_strategy_threshold_pages ||
!RelationIsPermanent(vacrel->rel));
}
/*
* lazy_scan_skip() -- set up range of skippable blocks using visibility map.
*
@ -1795,10 +1834,12 @@ retry:
* one XID/MXID from before FreezeLimit/MultiXactCutoff is present. Also
* freeze when pruning generated an FPI, if doing so means that we set the
* page all-frozen afterwards (might not happen until final heap pass).
* When ongoing VACUUM opted to use the eager freezing strategy we freeze
* any page that will thereby become all-frozen in the visibility map.
*/
if (pagefrz.freeze_required || tuples_frozen == 0 ||
(prunestate->all_visible && prunestate->all_frozen &&
-fpi_before != pgWalUsage.wal_fpi))
+(fpi_before != pgWalUsage.wal_fpi || vacrel->eager_freeze_strategy)))
{
/*
* We're freezing the page. Our final NewRelfrozenXid doesn't need to

@ -68,6 +68,7 @@ int vacuum_freeze_min_age;
int vacuum_freeze_table_age;
int vacuum_multixact_freeze_min_age;
int vacuum_multixact_freeze_table_age;
int vacuum_freeze_strategy_threshold;
int vacuum_failsafe_age;
int vacuum_multixact_failsafe_age;
@ -264,6 +265,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
params.freeze_table_age = 0;
params.multixact_freeze_min_age = 0;
params.multixact_freeze_table_age = 0;
params.freeze_strategy_threshold = 0;
}
else
{
@ -271,6 +273,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
params.freeze_table_age = -1;
params.multixact_freeze_min_age = -1;
params.multixact_freeze_table_age = -1;
params.freeze_strategy_threshold = -1;
}
/* user-invoked vacuum is never "for wraparound" */
@ -962,7 +965,9 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
multixact_freeze_min_age,
freeze_table_age,
multixact_freeze_table_age,
-effective_multixact_freeze_max_age;
+effective_multixact_freeze_max_age,
+freeze_strategy_threshold;
+uint64 threshold_strategy_pages;
TransactionId nextXID,
safeOldestXmin,
aggressiveXIDCutoff;
@ -975,6 +980,7 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
multixact_freeze_min_age = params->multixact_freeze_min_age;
freeze_table_age = params->freeze_table_age;
multixact_freeze_table_age = params->multixact_freeze_table_age;
freeze_strategy_threshold = params->freeze_strategy_threshold;
/* Set pg_class fields in cutoffs */
cutoffs->relfrozenxid = rel->rd_rel->relfrozenxid;
@ -1089,6 +1095,23 @@ vacuum_get_cutoffs(Relation rel, const VacuumParams *params,
if (MultiXactIdPrecedes(cutoffs->OldestMxact, cutoffs->MultiXactCutoff))
cutoffs->MultiXactCutoff = cutoffs->OldestMxact;
/*
* Determine the freeze_strategy_threshold to use: as specified by the
* caller, or vacuum_freeze_strategy_threshold
*/
if (freeze_strategy_threshold < 0)
freeze_strategy_threshold = vacuum_freeze_strategy_threshold;
Assert(freeze_strategy_threshold >= 0);
/*
* Convert MB-based freeze_strategy_threshold to page-based value used by
* our vacuumlazy.c caller, while being careful to avoid overflow
*/
threshold_strategy_pages =
((uint64) freeze_strategy_threshold * 1024 * 1024) / BLCKSZ;
threshold_strategy_pages = Min(threshold_strategy_pages, MaxBlockNumber);
cutoffs->freeze_strategy_threshold_pages = threshold_strategy_pages;
/*
* Finally, figure out if caller needs to do an aggressive VACUUM or not.
*

@ -151,6 +151,7 @@ static int default_freeze_min_age;
static int default_freeze_table_age;
static int default_multixact_freeze_min_age;
static int default_multixact_freeze_table_age;
static int default_freeze_strategy_threshold;
/* Memory context for long-lived data */
static MemoryContext AutovacMemCxt;
@ -2010,6 +2011,7 @@ do_autovacuum(void)
default_freeze_table_age = 0;
default_multixact_freeze_min_age = 0;
default_multixact_freeze_table_age = 0;
default_freeze_strategy_threshold = 0;
}
else
{
@ -2017,6 +2019,7 @@ do_autovacuum(void)
default_freeze_table_age = vacuum_freeze_table_age;
default_multixact_freeze_min_age = vacuum_multixact_freeze_min_age;
default_multixact_freeze_table_age = vacuum_multixact_freeze_table_age;
default_freeze_strategy_threshold = vacuum_freeze_strategy_threshold;
}
ReleaseSysCache(tuple);
@ -2801,6 +2804,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
int freeze_table_age;
int multixact_freeze_min_age;
int multixact_freeze_table_age;
int freeze_strategy_threshold;
int vac_cost_limit;
double vac_cost_delay;
int log_min_duration;
@ -2850,6 +2854,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
? avopts->multixact_freeze_table_age
: default_multixact_freeze_table_age;
freeze_strategy_threshold = (avopts &&
avopts->freeze_strategy_threshold >= 0)
? avopts->freeze_strategy_threshold
: default_freeze_strategy_threshold;
tab = palloc(sizeof(autovac_table));
tab->at_relid = relid;
tab->at_sharedrel = classForm->relisshared;
@ -2877,6 +2886,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
tab->at_params.freeze_strategy_threshold = freeze_strategy_threshold;
tab->at_params.is_wraparound = wraparound;
tab->at_params.log_min_duration = log_min_duration;
tab->at_vacuum_cost_limit = vac_cost_limit;

@ -2535,6 +2535,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
{
{"vacuum_freeze_strategy_threshold", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Table size at which VACUUM freezes using eager strategy, in megabytes."),
gettext_noop("This is applied by comparing it to the size of a table's main fork at "
"the beginning of each VACUUM. Eager freezing strategy is used when size "
"exceeds the threshold, or when table is a temporary or unlogged table. "
"Otherwise lazy freezing strategy is used."),
GUC_UNIT_MB
},
&vacuum_freeze_strategy_threshold,
4096, 0, MAX_KILOBYTES,
NULL, NULL, NULL
},
{
{"vacuum_defer_cleanup_age", PGC_SIGHUP, REPLICATION_PRIMARY,
gettext_noop("Number of transactions by which VACUUM and HOT cleanup should be deferred, if any."),

@ -700,6 +700,7 @@
#vacuum_multixact_freeze_table_age = 150000000
#vacuum_multixact_freeze_min_age = 5000000
#vacuum_multixact_failsafe_age = 1600000000
#vacuum_freeze_strategy_threshold = 4GB
#bytea_output = 'hex' # hex, escape
#xmlbinary = 'base64'
#xmloption = 'content'

@ -222,6 +222,9 @@ typedef struct VacuumParams
* use default */
int multixact_freeze_table_age; /* multixact age at which to scan
* whole table */
int freeze_strategy_threshold; /* threshold to use eager
* freezing, in megabytes, -1 to
* use default */
bool is_wraparound; /* force a for-wraparound vacuum */
int log_min_duration; /* minimum execution threshold in ms at
* which autovacuum is logged, -1 to use
@ -274,6 +277,14 @@ struct VacuumCutoffs
*/
TransactionId FreezeLimit;
MultiXactId MultiXactCutoff;
/*
* Eager freezing strategy is used whenever target rel's main fork size
* exceeds freeze_strategy_threshold_pages. Otherwise lazy freezing
* strategy is used. (Actually, there are exceptions. Non-permanent
* tables always use eager freezing strategy.)
*/
BlockNumber freeze_strategy_threshold_pages;
};
/*
@ -297,6 +308,7 @@ extern PGDLLIMPORT int vacuum_freeze_min_age;
extern PGDLLIMPORT int vacuum_freeze_table_age;
extern PGDLLIMPORT int vacuum_multixact_freeze_min_age;
extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
extern PGDLLIMPORT int vacuum_freeze_strategy_threshold;
extern PGDLLIMPORT int vacuum_failsafe_age;
extern PGDLLIMPORT int vacuum_multixact_failsafe_age;

@ -314,6 +314,7 @@ typedef struct AutoVacOpts
int multixact_freeze_min_age;
int multixact_freeze_max_age;
int multixact_freeze_table_age;
int freeze_strategy_threshold;
int log_min_duration;
float8 vacuum_cost_delay;
float8 vacuum_scale_factor;