Remove vacuum_defer_cleanup_age

vacuum_defer_cleanup_age was introduced before hot_standby_feedback and
replication slots existed. It is hard to use reasonably - commonly it will
either be set too low (not preventing recovery conflicts, while still causing
some bloat), or too high (causing a lot of bloat). The alternatives do not
have that issue.

That on its own might not be sufficient reason to remove
vacuum_defer_cleanup_age, but it also complicates computation of xid
horizons. See e.g. the bug fixed in be504a3e97. It also is untested.

This commit removes TransactionIdRetreatSafely(), as there are no users
anymore. There might be potential future users, hence noting that here.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20230317230930.nhsgk3qfk7f4axls@awork3.anarazel.de
This commit is contained in:
Andres Freund 2023-04-24 12:20:52 -07:00
parent 441ee1677e
commit 1118cd37eb
8 changed files with 15 additions and 170 deletions

View File

@ -4614,41 +4614,6 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
</listitem>
</varlistentry>
<varlistentry id="guc-vacuum-defer-cleanup-age" xreflabel="vacuum_defer_cleanup_age">
<term><varname>vacuum_defer_cleanup_age</varname> (<type>integer</type>)
<indexterm>
<primary><varname>vacuum_defer_cleanup_age</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the number of transactions by which <command>VACUUM</command> and
<link linkend="storage-hot"><acronym>HOT</acronym> updates</link>
will defer cleanup of dead row versions. The
default is zero transactions, meaning that dead row versions can be
removed as soon as possible, that is, as soon as they are no longer
visible to any open transaction. You may wish to set this to a
non-zero value on a primary server that is supporting hot standby
servers, as described in <xref linkend="hot-standby"/>. This allows
more time for queries on the standby to complete without incurring
conflicts due to early cleanup of rows. However, since the value
is measured in terms of number of write transactions occurring on the
primary server, it is difficult to predict just how much additional
grace time will be made available to standby queries.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
<para>
You should also consider setting <varname>hot_standby_feedback</varname>
on standby server(s) as an alternative to using this parameter.
</para>
<para>
This does not prevent cleanup of dead rows which have reached the age
specified by <varname>old_snapshot_threshold</varname>.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>

View File

@ -944,12 +944,11 @@ primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
retained by replication slots.
</para>
<para>
Similarly, <xref linkend="guc-hot-standby-feedback"/>
and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
relevant rows being removed by vacuum, but the former provides no
protection during any time period when the standby is not connected,
and the latter often needs to be set to a high value to provide adequate
protection. Replication slots overcome these disadvantages.
Similarly, <xref linkend="guc-hot-standby-feedback"/> on its own, without
also using a replication slot, provides protection against relevant rows
being removed by vacuum, but provides no protection during any time period
when the standby is not connected. Replication slots overcome these
disadvantages.
</para>
<sect3 id="streaming-replication-slots-manipulation">
<title>Querying and Manipulating Replication Slots</title>
@ -1910,17 +1909,6 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
by newly-arrived streaming WAL entries after reconnection.
</para>
<para>
Another option is to increase <xref linkend="guc-vacuum-defer-cleanup-age"/>
on the primary server, so that dead rows will not be cleaned up as quickly
as they normally would be. This will allow more time for queries to
execute before they are canceled on the standby, without having to set
a high <varname>max_standby_streaming_delay</varname>. However it is
difficult to guarantee any specific execution-time window with this
approach, since <varname>vacuum_defer_cleanup_age</varname> is measured in
transactions executed on the primary server.
</para>
<para>
The number of query cancels and the reason for them can be viewed using
the <structname>pg_stat_database_conflicts</structname> system view on the standby
@ -2257,8 +2245,7 @@ HINT: You can then restart the server after making the necessary configuration
</para>
<para>
On the primary, parameters <xref linkend="guc-wal-level"/> and
<xref linkend="guc-vacuum-defer-cleanup-age"/> can be used.
On the primary, the <xref linkend="guc-wal-level"/> parameter can be used.
<xref linkend="guc-max-standby-archive-delay"/> and
<xref linkend="guc-max-standby-streaming-delay"/> have no effect if set on
the primary.
@ -2268,9 +2255,6 @@ HINT: You can then restart the server after making the necessary configuration
On the standby, parameters <xref linkend="guc-hot-standby"/>,
<xref linkend="guc-max-standby-archive-delay"/> and
<xref linkend="guc-max-standby-streaming-delay"/> can be used.
<xref linkend="guc-vacuum-defer-cleanup-age"/> has no effect
as long as the server remains in standby mode, though it will
become relevant if the standby becomes primary.
</para>
</sect2>

View File

@ -367,9 +367,6 @@ static inline void ProcArrayEndTransactionInternal(PGPROC *proc, TransactionId l
static void ProcArrayGroupClearXid(PGPROC *proc, TransactionId latestXid);
static void MaintainLatestCompletedXid(TransactionId latestXid);
static void MaintainLatestCompletedXidRecovery(TransactionId latestXid);
static void TransactionIdRetreatSafely(TransactionId *xid,
int retreat_by,
FullTransactionId rel);
static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
TransactionId xid);
@ -1709,10 +1706,7 @@ TransactionIdIsActive(TransactionId xid)
* do about that --- data is only protected if the walsender runs continuously
* while queries are executed on the standby. (The Hot Standby code deals
* with such cases by failing standby queries that needed to access
* already-removed data, so there's no integrity bug.) The computed values
* are also adjusted with vacuum_defer_cleanup_age, so increasing that setting
* on the fly is another easy way to make horizons move backwards, with no
* consequences for data integrity.
* already-removed data, so there's no integrity bug.)
*
* Note: the approximate horizons (see definition of GlobalVisState) are
* updated by the computations done here. That's currently required for
@ -1877,50 +1871,11 @@ ComputeXidHorizons(ComputeXidHorizonsResult *h)
TransactionIdOlder(h->data_oldest_nonremovable, kaxmin);
/* temp relations cannot be accessed in recovery */
}
else
{
/*
* Compute the cutoff XID by subtracting vacuum_defer_cleanup_age.
*
* vacuum_defer_cleanup_age provides some additional "slop" for the
* benefit of hot standby queries on standby servers. This is quick
* and dirty, and perhaps not all that useful unless the primary has a
* predictable transaction rate, but it offers some protection when
* there's no walsender connection. Note that we are assuming
* vacuum_defer_cleanup_age isn't large enough to cause wraparound ---
* so guc.c should limit it to no more than the xidStopLimit threshold
* in varsup.c. Also note that we intentionally don't apply
* vacuum_defer_cleanup_age on standby servers.
*
* Need to use TransactionIdRetreatSafely() instead of open-coding the
* subtraction, to prevent creating an xid before
* FirstNormalTransactionId.
*/
Assert(TransactionIdPrecedesOrEquals(h->oldest_considered_running,
h->shared_oldest_nonremovable));
Assert(TransactionIdPrecedesOrEquals(h->shared_oldest_nonremovable,
h->data_oldest_nonremovable));
if (vacuum_defer_cleanup_age > 0)
{
TransactionIdRetreatSafely(&h->oldest_considered_running,
vacuum_defer_cleanup_age,
h->latest_completed);
TransactionIdRetreatSafely(&h->shared_oldest_nonremovable,
vacuum_defer_cleanup_age,
h->latest_completed);
TransactionIdRetreatSafely(&h->data_oldest_nonremovable,
vacuum_defer_cleanup_age,
h->latest_completed);
/* defer doesn't apply to temp relations */
Assert(TransactionIdPrecedesOrEquals(h->oldest_considered_running,
h->shared_oldest_nonremovable));
Assert(TransactionIdPrecedesOrEquals(h->shared_oldest_nonremovable,
h->data_oldest_nonremovable));
}
}
Assert(TransactionIdPrecedesOrEquals(h->oldest_considered_running,
h->shared_oldest_nonremovable));
Assert(TransactionIdPrecedesOrEquals(h->shared_oldest_nonremovable,
h->data_oldest_nonremovable));
/*
* Check whether there are replication slots requiring an older xmin.
@ -1947,8 +1902,8 @@ ComputeXidHorizons(ComputeXidHorizonsResult *h)
h->slot_catalog_xmin);
/*
* It's possible that slots / vacuum_defer_cleanup_age backed up the
* horizons further than oldest_considered_running. Fix.
* It's possible that slots backed up the horizons further than
* oldest_considered_running. Fix.
*/
h->oldest_considered_running =
TransactionIdOlder(h->oldest_considered_running,
@ -2490,15 +2445,9 @@ GetSnapshotData(Snapshot snapshot)
*/
oldestfxid = FullXidRelativeTo(latest_completed, oldestxid);
/* apply vacuum_defer_cleanup_age */
def_vis_xid_data = xmin;
TransactionIdRetreatSafely(&def_vis_xid_data,
vacuum_defer_cleanup_age,
oldestfxid);
/* Check whether there's a replication slot requiring an older xmin. */
def_vis_xid_data =
TransactionIdOlder(def_vis_xid_data, replication_slot_xmin);
TransactionIdOlder(xmin, replication_slot_xmin);
/*
* Rows in non-shared, non-catalog tables possibly could be vacuumed
@ -4320,44 +4269,6 @@ GlobalVisCheckRemovableXid(Relation rel, TransactionId xid)
return GlobalVisTestIsRemovableXid(state, xid);
}
/*
* Safely retract *xid by retreat_by, store the result in *xid.
*
* Need to be careful to prevent *xid from retreating below
* FirstNormalTransactionId during epoch 0. This is important to prevent
* generating xids that cannot be converted to a FullTransactionId without
* wrapping around.
*
* If retreat_by would lead to a too old xid, FirstNormalTransactionId is
* returned instead.
*/
static void
TransactionIdRetreatSafely(TransactionId *xid, int retreat_by, FullTransactionId rel)
{
TransactionId original_xid = *xid;
FullTransactionId fxid;
uint64 fxid_i;
Assert(TransactionIdIsNormal(original_xid));
Assert(retreat_by >= 0); /* relevant GUCs are stored as ints */
AssertTransactionIdInAllowableRange(original_xid);
if (retreat_by == 0)
return;
fxid = FullXidRelativeTo(rel, original_xid);
fxid_i = U64FromFullTransactionId(fxid);
if ((fxid_i - FirstNormalTransactionId) <= retreat_by)
*xid = FirstNormalTransactionId;
else
{
*xid = TransactionIdRetreatedBy(original_xid, retreat_by);
Assert(TransactionIdIsNormal(*xid));
Assert(NormalTransactionIdPrecedes(*xid, original_xid));
}
}
/*
* Convert a 32 bit transaction id into 64 bit transaction id, by assuming it
* is within MaxTransactionId / 2 of XidFromFullTransactionId(rel).

View File

@ -38,7 +38,6 @@
#include "utils/timestamp.h"
/* User-settable GUC parameters */
int vacuum_defer_cleanup_age;
int max_standby_archive_delay = 30 * 1000;
int max_standby_streaming_delay = 30 * 1000;
bool log_recovery_conflict_waits = false;

View File

@ -2576,15 +2576,6 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
{
{"vacuum_defer_cleanup_age", PGC_SIGHUP, REPLICATION_PRIMARY,
gettext_noop("Number of transactions by which VACUUM and HOT cleanup should be deferred, if any."),
NULL
},
&vacuum_defer_cleanup_age,
0, 0, 1000000, /* see ComputeXidHorizons */
NULL, NULL, NULL
},
{
{"vacuum_failsafe_age", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Age at which VACUUM should trigger failsafe to avoid a wraparound outage."),

View File

@ -329,7 +329,6 @@
# method to choose sync standbys, number of sync standbys,
# and comma-separated list of application_name
# from standby(s); '*' = all
#vacuum_defer_cleanup_age = 0 # number of xacts by which cleanup is delayed
# - Standby Servers -

View File

@ -234,9 +234,6 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
* we only modify the new cluster, so only use it there. If there is a
* crash, the new cluster has to be recreated anyway. fsync=off is a big
* win on ext4.
*
* Force vacuum_defer_cleanup_age to 0 on the new cluster, so that
* vacuumdb --freeze actually freezes the tuples.
*/
snprintf(cmd, sizeof(cmd),
"\"%s/pg_ctl\" -w -l \"%s/%s\" -D \"%s\" -o \"-p %d -b%s %s%s\" start",
@ -244,7 +241,7 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
log_opts.logdir,
SERVER_LOG_FILE, cluster->pgconfig, cluster->port,
(cluster == &new_cluster) ?
" -c synchronous_commit=off -c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0" : "",
" -c synchronous_commit=off -c fsync=off -c full_page_writes=off" : "",
cluster->pgopts ? cluster->pgopts : "", socket_string);
/*

View File

@ -21,7 +21,6 @@
#include "storage/standbydefs.h"
/* User-settable GUC parameters */
extern PGDLLIMPORT int vacuum_defer_cleanup_age;
extern PGDLLIMPORT int max_standby_archive_delay;
extern PGDLLIMPORT int max_standby_streaming_delay;
extern PGDLLIMPORT bool log_recovery_conflict_waits;