Add a multi-worker capability to autovacuum. This allows multiple worker

processes to be running simultaneously.  Also, now autovacuum processes do not
count towards the max_connections limit; they are counted separately from
regular processes, and are limited by the new GUC variable
autovacuum_max_workers.

The launcher now has intelligence to launch workers on each database every
autovacuum_naptime seconds, limited only on the max amount of worker slots
available.

Also, the global worker I/O utilization is limited by the vacuum cost-based
delay feature.  Workers are "balanced" so that the total I/O consumption does
not exceed the established limit.  This part of the patch was contributed by
ITAGAKI Takahiro.

Per discussion.
This commit is contained in:
Alvaro Herrera 2007-04-16 18:30:04 +00:00
parent 42dc4b66e6
commit e2a186b03c
12 changed files with 1174 additions and 162 deletions

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.119 2007/04/02 15:27:02 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.120 2007/04/16 18:29:50 alvherre Exp $ -->
<chapter Id="runtime-config">
<title>Server Configuration</title>
@ -3166,7 +3166,7 @@ SELECT * FROM parent WHERE key = 2400;
<listitem>
<para>
Controls whether the server should run the
autovacuum daemon. This is off by default.
autovacuum launcher daemon. This is on by default.
<varname>stats_start_collector</> and <varname>stats_row_level</>
must also be turned on for autovacuum to work.
This parameter can only be set in the <filename>postgresql.conf</>
@ -3175,6 +3175,21 @@ SELECT * FROM parent WHERE key = 2400;
</listitem>
</varlistentry>
<varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
<term><varname>autovacuum_max_workers</varname> (<type>integer</type>)</term>
<indexterm>
<primary><varname>autovacuum_max_workers</> configuration parameter</primary>
</indexterm>
<listitem>
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) which may be running at any one time. The default
is three (<literal>3</literal>). This parameter can only be set in
the <filename>postgresql.conf</> file or on the server command line.
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-autovacuum-naptime" xreflabel="autovacuum_naptime">
<term><varname>autovacuum_naptime</varname> (<type>integer</type>)</term>
<indexterm>
@ -3182,9 +3197,9 @@ SELECT * FROM parent WHERE key = 2400;
</indexterm>
<listitem>
<para>
Specifies the delay between activity rounds for the autovacuum
daemon. In each round the daemon examines one database
and issues <command>VACUUM</> and <command>ANALYZE</> commands
Specifies the minimum delay between autovacuum runs on any given
database. In each round the daemon examines the
database and issues <command>VACUUM</> and <command>ANALYZE</> commands
as needed for tables in that database. The delay is measured
in seconds, and the default is one minute (<literal>1m</>).
This parameter can only be set in the <filename>postgresql.conf</>
@ -3318,7 +3333,10 @@ SELECT * FROM parent WHERE key = 2400;
Specifies the cost limit value that will be used in automatic
<command>VACUUM</> operations. If <literal>-1</> is specified (which is the
default), the regular
<xref linkend="guc-vacuum-cost-limit"> value will be used.
<xref linkend="guc-vacuum-cost-limit"> value will be used. Note that
the value is distributed proportionally among the running autovacuum
workers, if there is more than one, so that the sum of the limits of
each worker never exceeds the limit on this variable.
This parameter can only be set in the <filename>postgresql.conf</>
file or on the server command line.
This setting can be overridden for individual tables by entries in

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.70 2007/02/01 19:10:24 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.71 2007/04/16 18:29:50 alvherre Exp $ -->
<chapter id="maintenance">
<title>Routine Database Maintenance Tasks</title>
@ -466,26 +466,43 @@ HINT: Stop the postmaster and use a standalone backend to VACUUM in "mydb".
<secondary>general information</secondary>
</indexterm>
<para>
Beginning in <productname>PostgreSQL </productname> 8.1, there is a
separate optional server process called the <firstterm>autovacuum
daemon</firstterm>, whose purpose is to automate the execution of
Beginning in <productname>PostgreSQL</productname> 8.1, there is an
optional feature called <firstterm>autovacuum</firstterm>,
whose purpose is to automate the execution of
<command>VACUUM</command> and <command>ANALYZE </command> commands.
When enabled, the autovacuum daemon runs periodically and checks for
When enabled, autovacuum checks for
tables that have had a large number of inserted, updated or deleted
tuples. These checks use the row-level statistics collection facility;
therefore, the autovacuum daemon cannot be used unless <xref
therefore, autovacuum cannot be used unless <xref
linkend="guc-stats-start-collector"> and <xref
linkend="guc-stats-row-level"> are set to <literal>true</literal>. Also,
it's important to allow a slot for the autovacuum process when choosing
the value of <xref linkend="guc-superuser-reserved-connections">. In
the default configuration, autovacuuming is enabled and the related
linkend="guc-stats-row-level"> are set to <literal>true</literal>.
In the default configuration, autovacuuming is enabled and the related
configuration parameters are appropriately set.
</para>
<para>
The autovacuum daemon, when enabled, runs every <xref
linkend="guc-autovacuum-naptime"> seconds. On each run, it selects
one database to process and checks each table within that database.
Beginning in <productname>PostgreSQL</productname> 8.3, autovacuum has a
multi-process architecture: there is a daemon process, called the
<firstterm>autovacuum launcher</firstterm>, which is in charge of starting
an <firstterm>autovacuum worker</firstterm> process on each database every
<xref linkend="guc-autovacuum-naptime"> seconds.
</para>
<para>
There is a limit of <xref linkend="guc-autovacuum-max-workers"> worker
processes that may be running at at any time, so if the <command>VACUUM</>
and <command>ANALYZE</> work to do takes too long to run, the deadline may
be failed to meet for other databases. Also, if a particular database
takes long to process, more than one worker may be processing it
simultaneously. The workers are smart enough to avoid repeating work that
other workers have done, so this is normally not a problem. Note that the
number of running workers does not count towards the <xref
linkend="guc-max-connections"> nor the <xref
linkend="guc-superuser-reserved-connections"> limits.
</para>
<para>
On each run, the worker process checks each table within that database, and
<command>VACUUM</command> or <command>ANALYZE</command> commands are
issued as needed.
</para>
@ -581,6 +598,12 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
</para>
</caution>
<para>
When multiple workers are running, the cost limit is "balanced" among all
the running workers, so that the total impact on the system is the same,
regardless of the number of workers actually running.
</para>
</sect2>
</sect1>

View File

@ -13,7 +13,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/commands/vacuum.c,v 1.349 2007/03/14 18:48:55 tgl Exp $
* $PostgreSQL: pgsql/src/backend/commands/vacuum.c,v 1.350 2007/04/16 18:29:50 alvherre Exp $
*
*-------------------------------------------------------------------------
*/
@ -3504,6 +3504,9 @@ vacuum_delay_point(void)
VacuumCostBalance = 0;
/* update balance values for workers */
AutoVacuumUpdateDelay();
/* Might have gotten an interrupt while sleeping */
CHECK_FOR_INTERRUPTS();
}

File diff suppressed because it is too large Load Diff

View File

@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/storage/lmgr/proc.c,v 1.187 2007/04/03 16:34:36 tgl Exp $
* $PostgreSQL: pgsql/src/backend/storage/lmgr/proc.c,v 1.188 2007/04/16 18:29:53 alvherre Exp $
*
*-------------------------------------------------------------------------
*/
@ -96,7 +96,7 @@ ProcGlobalShmemSize(void)
size = add_size(size, sizeof(PROC_HDR));
/* AuxiliaryProcs */
size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC)));
/* MyProcs */
/* MyProcs, including autovacuum */
size = add_size(size, mul_size(MaxBackends, sizeof(PGPROC)));
/* ProcStructLock */
size = add_size(size, sizeof(slock_t));
@ -110,7 +110,10 @@ ProcGlobalShmemSize(void)
int
ProcGlobalSemas(void)
{
/* We need a sema per backend, plus one for each auxiliary process. */
/*
* We need a sema per backend (including autovacuum), plus one for each
* auxiliary process.
*/
return MaxBackends + NUM_AUXILIARY_PROCS;
}
@ -127,8 +130,8 @@ ProcGlobalSemas(void)
* running out when trying to start another backend is a common failure.
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxBackends higher than his kernel will support, he'll find out sooner
* rather than later.
* MaxConnections or autovacuum_max_workers higher than his kernel will
* support, he'll find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
@ -163,25 +166,39 @@ InitProcGlobal(void)
* Initialize the data structures.
*/
ProcGlobal->freeProcs = INVALID_OFFSET;
ProcGlobal->autovacFreeProcs = INVALID_OFFSET;
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
/*
* Pre-create the PGPROC structures and create a semaphore for each.
*/
procs = (PGPROC *) ShmemAlloc(MaxBackends * sizeof(PGPROC));
procs = (PGPROC *) ShmemAlloc((MaxConnections) * sizeof(PGPROC));
if (!procs)
ereport(FATAL,
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of shared memory")));
MemSet(procs, 0, MaxBackends * sizeof(PGPROC));
for (i = 0; i < MaxBackends; i++)
MemSet(procs, 0, MaxConnections * sizeof(PGPROC));
for (i = 0; i < MaxConnections; i++)
{
PGSemaphoreCreate(&(procs[i].sem));
procs[i].links.next = ProcGlobal->freeProcs;
ProcGlobal->freeProcs = MAKE_OFFSET(&procs[i]);
}
procs = (PGPROC *) ShmemAlloc((autovacuum_max_workers) * sizeof(PGPROC));
if (!procs)
ereport(FATAL,
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of shared memory")));
MemSet(procs, 0, autovacuum_max_workers * sizeof(PGPROC));
for (i = 0; i < autovacuum_max_workers; i++)
{
PGSemaphoreCreate(&(procs[i].sem));
procs[i].links.next = ProcGlobal->autovacFreeProcs;
ProcGlobal->autovacFreeProcs = MAKE_OFFSET(&procs[i]);
}
MemSet(AuxiliaryProcs, 0, NUM_AUXILIARY_PROCS * sizeof(PGPROC));
for (i = 0; i < NUM_AUXILIARY_PROCS; i++)
{
@ -226,12 +243,18 @@ InitProcess(void)
set_spins_per_delay(procglobal->spins_per_delay);
myOffset = procglobal->freeProcs;
if (IsAutoVacuumWorkerProcess())
myOffset = procglobal->autovacFreeProcs;
else
myOffset = procglobal->freeProcs;
if (myOffset != INVALID_OFFSET)
{
MyProc = (PGPROC *) MAKE_PTR(myOffset);
procglobal->freeProcs = MyProc->links.next;
if (IsAutoVacuumWorkerProcess())
procglobal->autovacFreeProcs = MyProc->links.next;
else
procglobal->freeProcs = MyProc->links.next;
SpinLockRelease(ProcStructLock);
}
else
@ -239,7 +262,8 @@ InitProcess(void)
/*
* If we reach here, all the PGPROCs are in use. This is one of the
* possible places to detect "too many backends", so give the standard
* error message.
* error message. XXX do we need to give a different failure message
* in the autovacuum case?
*/
SpinLockRelease(ProcStructLock);
ereport(FATAL,
@ -571,8 +595,16 @@ ProcKill(int code, Datum arg)
SpinLockAcquire(ProcStructLock);
/* Return PGPROC structure (and semaphore) to freelist */
MyProc->links.next = procglobal->freeProcs;
procglobal->freeProcs = MAKE_OFFSET(MyProc);
if (IsAutoVacuumWorkerProcess())
{
MyProc->links.next = procglobal->autovacFreeProcs;
procglobal->autovacFreeProcs = MAKE_OFFSET(MyProc);
}
else
{
MyProc->links.next = procglobal->freeProcs;
procglobal->freeProcs = MAKE_OFFSET(MyProc);
}
/* PGPROC struct isn't mine anymore */
MyProc = NULL;
@ -581,6 +613,10 @@ ProcKill(int code, Datum arg)
procglobal->spins_per_delay = update_spins_per_delay(procglobal->spins_per_delay);
SpinLockRelease(ProcStructLock);
/* wake autovac launcher if needed -- see comments in FreeWorkerInfo */
if (AutovacuumLauncherPid != 0)
kill(AutovacuumLauncherPid, SIGUSR1);
}
/*

View File

@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/utils/init/globals.c,v 1.100 2007/01/05 22:19:44 momjian Exp $
* $PostgreSQL: pgsql/src/backend/utils/init/globals.c,v 1.101 2007/04/16 18:29:54 alvherre Exp $
*
* NOTES
* Globals used all over the place should be declared here and not
@ -95,9 +95,14 @@ bool allowSystemTableMods = false;
int work_mem = 1024;
int maintenance_work_mem = 16384;
/* Primary determinants of sizes of shared-memory structures: */
/*
* Primary determinants of sizes of shared-memory structures. MaxBackends is
* MaxConnections + autovacuum_max_workers (it is computed by the GUC assign
* hook):
*/
int NBuffers = 1000;
int MaxBackends = 100;
int MaxConnections = 90;
int VacuumCostPageHit = 1; /* GUC parameters for vacuum */
int VacuumCostPageMiss = 10;

View File

@ -10,7 +10,7 @@
* Written by Peter Eisentraut <peter_e@gmx.net>.
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.384 2007/04/12 06:53:47 neilc Exp $
* $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.385 2007/04/16 18:29:55 alvherre Exp $
*
*--------------------------------------------------------------------
*/
@ -163,6 +163,8 @@ static bool assign_tcp_keepalives_count(int newval, bool doit, GucSource source)
static const char *show_tcp_keepalives_idle(void);
static const char *show_tcp_keepalives_interval(void);
static const char *show_tcp_keepalives_count(void);
static bool assign_autovacuum_max_workers(int newval, bool doit, GucSource source);
static bool assign_maxconnections(int newval, bool doit, GucSource source);
/*
* GUC option variables that are exported from this module
@ -1149,16 +1151,19 @@ static struct config_int ConfigureNamesInt[] =
* number.
*
* MaxBackends is limited to INT_MAX/4 because some places compute
* 4*MaxBackends without any overflow check. Likewise we have to limit
* NBuffers to INT_MAX/2.
* 4*MaxBackends without any overflow check. This check is made on
* assign_maxconnections, since MaxBackends is computed as MaxConnections +
* autovacuum_max_workers.
*
* Likewise we have to limit NBuffers to INT_MAX/2.
*/
{
{"max_connections", PGC_POSTMASTER, CONN_AUTH_SETTINGS,
gettext_noop("Sets the maximum number of concurrent connections."),
NULL
},
&MaxBackends,
100, 1, INT_MAX / 4, NULL, NULL
&MaxConnections,
100, 1, INT_MAX / 4, assign_maxconnections, NULL
},
{
@ -1622,6 +1627,15 @@ static struct config_int ConfigureNamesInt[] =
&autovacuum_freeze_max_age,
200000000, 100000000, 2000000000, NULL, NULL
},
{
/* see max_connections */
{"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
&autovacuum_max_workers,
3, 1, INT_MAX / 4, assign_autovacuum_max_workers, NULL
},
{
{"tcp_keepalives_idle", PGC_USERSET, CLIENT_CONN_OTHER,
@ -6692,5 +6706,32 @@ show_tcp_keepalives_count(void)
return nbuf;
}
static bool
assign_maxconnections(int newval, bool doit, GucSource source)
{
if (doit)
{
if (newval + autovacuum_max_workers > INT_MAX / 4)
return false;
MaxBackends = newval + autovacuum_max_workers;
}
return true;
}
static bool
assign_autovacuum_max_workers(int newval, bool doit, GucSource source)
{
if (doit)
{
if (newval + MaxConnections > INT_MAX / 4)
return false;
MaxBackends = newval + MaxConnections;
}
return true;
}
#include "guc-file.c"

View File

@ -376,6 +376,7 @@
#autovacuum = on # enable autovacuum subprocess?
# 'on' requires stats_start_collector
# and stats_row_level to also be on
#autovacuum_max_workers = 3 # max # of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 500 # min # of tuple updates before
# vacuum

View File

@ -13,7 +13,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/miscadmin.h,v 1.193 2007/03/01 14:52:04 petere Exp $
* $PostgreSQL: pgsql/src/include/miscadmin.h,v 1.194 2007/04/16 18:29:56 alvherre Exp $
*
* NOTES
* some of the information in this file should be moved to other files.
@ -129,6 +129,7 @@ extern DLLIMPORT char *DataDir;
extern DLLIMPORT int NBuffers;
extern int MaxBackends;
extern int MaxConnections;
extern DLLIMPORT int MyProcPid;
extern DLLIMPORT struct Port *MyProcPort;

View File

@ -7,15 +7,18 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/postmaster/autovacuum.h,v 1.8 2007/02/15 23:23:23 alvherre Exp $
* $PostgreSQL: pgsql/src/include/postmaster/autovacuum.h,v 1.9 2007/04/16 18:30:03 alvherre Exp $
*
*-------------------------------------------------------------------------
*/
#ifndef AUTOVACUUM_H
#define AUTOVACUUM_H
#include "storage/lock.h"
/* GUC variables */
extern bool autovacuum_start_daemon;
extern int autovacuum_max_workers;
extern int autovacuum_naptime;
extern int autovacuum_vac_thresh;
extern double autovacuum_vac_scale;
@ -25,6 +28,9 @@ extern int autovacuum_freeze_max_age;
extern int autovacuum_vac_cost_delay;
extern int autovacuum_vac_cost_limit;
/* autovacuum launcher PID, only valid when worker is shutting down */
extern int AutovacuumLauncherPid;
/* Status inquiry functions */
extern bool AutoVacuumingActive(void);
extern bool IsAutoVacuumLauncherProcess(void);
@ -35,6 +41,9 @@ extern void autovac_init(void);
extern int StartAutoVacLauncher(void);
extern int StartAutoVacWorker(void);
/* autovacuum cost-delay balancer */
extern void AutoVacuumUpdateDelay(void);
#ifdef EXEC_BACKEND
extern void AutoVacLauncherMain(int argc, char *argv[]);
extern void AutoVacWorkerMain(int argc, char *argv[]);

View File

@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/storage/lwlock.h,v 1.35 2007/04/03 16:34:36 tgl Exp $
* $PostgreSQL: pgsql/src/include/storage/lwlock.h,v 1.36 2007/04/16 18:30:04 alvherre Exp $
*
*-------------------------------------------------------------------------
*/
@ -61,6 +61,7 @@ typedef enum LWLockId
BtreeVacuumLock,
AddinShmemInitLock,
AutovacuumLock,
AutovacuumScheduleLock,
/* Individual lock IDs end here */
FirstBufMappingLock,
FirstLockMgrLock = FirstBufMappingLock + NUM_BUFFER_PARTITIONS,

View File

@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/storage/proc.h,v 1.97 2007/04/03 16:34:36 tgl Exp $
* $PostgreSQL: pgsql/src/include/storage/proc.h,v 1.98 2007/04/16 18:30:04 alvherre Exp $
*
*-------------------------------------------------------------------------
*/
@ -115,6 +115,8 @@ typedef struct PROC_HDR
{
/* Head of list of free PGPROC structures */
SHMEM_OFFSET freeProcs;
/* Head of list of autovacuum's free PGPROC structures */
SHMEM_OFFSET autovacFreeProcs;
/* Current shared estimate of appropriate spins_per_delay value */
int spins_per_delay;
} PROC_HDR;