mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-09-28 00:31:51 +02:00
05ce730978
Discussion: https://postgr.es/m/CAKFQuwZ4CXtTyR19vFbd9WwmW-4BvgAenmF2CfUpx0LWwRPGYg@mail.gmail.com Author: David G. Johnston Backpatch-through: master
1205 lines
57 KiB
Plaintext
<!-- doc/src/sgml/maintenance.sgml -->

<chapter id="maintenance">
<title>Routine Database Maintenance Tasks</title>

<indexterm zone="maintenance">
<primary>maintenance</primary>
</indexterm>

<indexterm zone="maintenance">
<primary>routine maintenance</primary>
</indexterm>

<para>
<productname>PostgreSQL</productname>, like any database software, requires that certain tasks
be performed regularly to achieve optimum performance. The tasks
discussed here are <emphasis>required</emphasis>, but they
are repetitive in nature and can easily be automated using standard
tools such as <application>cron</application> scripts or
Windows' <application>Task Scheduler</application>. It is the database
administrator's responsibility to set up appropriate scripts, and to
check that they execute successfully.
</para>

<para>
One obvious maintenance task is the creation of backup copies of the data on a
regular schedule. Without a recent backup, you have no chance of recovery
after a catastrophe (disk failure, fire, mistakenly dropping a critical
table, etc.). The backup and recovery mechanisms available in
<productname>PostgreSQL</productname> are discussed at length in
<xref linkend="backup"/>.
</para>

<para>
The other main category of maintenance task is periodic <quote>vacuuming</quote>
of the database. This activity is discussed in
<xref linkend="routine-vacuuming"/>. Closely related to this is updating
the statistics that will be used by the query planner, as discussed in
<xref linkend="vacuum-for-statistics"/>.
</para>

<para>
Another task that might need periodic attention is log file management.
This is discussed in <xref linkend="logfile-maintenance"/>.
</para>

<para>
<ulink
url="https://bucardo.org/check_postgres/"><application>check_postgres</application></ulink>
is available for monitoring database health and reporting unusual
conditions. <application>check_postgres</application> integrates with
Nagios and MRTG, but can be run standalone too.
</para>

<para>
<productname>PostgreSQL</productname> is low-maintenance compared
to some other database management systems. Nonetheless,
appropriate attention to these tasks will go far towards ensuring a
pleasant and productive experience with the system.
</para>

<sect1 id="routine-vacuuming">
<title>Routine Vacuuming</title>

<indexterm zone="routine-vacuuming">
<primary>vacuum</primary>
</indexterm>

<para>
<productname>PostgreSQL</productname> databases require periodic
maintenance known as <firstterm>vacuuming</firstterm>. For many installations, it
is sufficient to let vacuuming be performed by the <firstterm>autovacuum
daemon</firstterm>, which is described in <xref linkend="autovacuum"/>. You might
need to adjust the autovacuuming parameters described there to obtain best
results for your situation. Some database administrators will want to
supplement or replace the daemon's activities with manually-managed
<command>VACUUM</command> commands, which typically are executed according to a
schedule by <application>cron</application> or <application>Task
Scheduler</application> scripts. To set up manually-managed vacuuming properly,
it is essential to understand the issues discussed in the next few
subsections. Administrators who rely on autovacuuming may still wish
to skim this material to help them understand and adjust autovacuuming.
</para>

<sect2 id="vacuum-basics">
<title>Vacuuming Basics</title>

<para>
<productname>PostgreSQL</productname>'s
<link linkend="sql-vacuum"><command>VACUUM</command></link> command has to
process each table on a regular basis for several reasons:

<orderedlist>
<listitem>
<simpara>To recover or reuse disk space occupied by updated or deleted
rows.</simpara>
</listitem>

<listitem>
<simpara>To update data statistics used by the
<productname>PostgreSQL</productname> query planner.</simpara>
</listitem>

<listitem>
<simpara>To update the visibility map, which speeds
up <link linkend="indexes-index-only-scans">index-only
scans</link>.</simpara>
</listitem>

<listitem>
<simpara>To protect against loss of very old data due to
<firstterm>transaction ID wraparound</firstterm> or
<firstterm>multixact ID wraparound</firstterm>.</simpara>
</listitem>
</orderedlist>

Each of these reasons dictates performing <command>VACUUM</command> operations
of varying frequency and scope, as explained in the following subsections.
</para>

<para>
There are two variants of <command>VACUUM</command>: standard <command>VACUUM</command>
and <command>VACUUM FULL</command>. <command>VACUUM FULL</command> can reclaim more
disk space but runs much more slowly. Also,
the standard form of <command>VACUUM</command> can run in parallel with production
database operations. (Commands such as <command>SELECT</command>,
<command>INSERT</command>, <command>UPDATE</command>, and
<command>DELETE</command> will continue to function normally, though you
will not be able to modify the definition of a table with commands such as
<command>ALTER TABLE</command> while it is being vacuumed.)
<command>VACUUM FULL</command> requires an
<literal>ACCESS EXCLUSIVE</literal> lock on the table it is
working on, and therefore cannot be done in parallel with other use
of the table. Generally, therefore,
administrators should strive to use standard <command>VACUUM</command> and
avoid <command>VACUUM FULL</command>.
</para>
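
<para>
As a minimal illustration of the two variants, assuming a hypothetical
table named <structname>orders</structname> (the table name is not part of
this chapter and is used only for the example):
<programlisting>
-- Standard VACUUM: runs alongside normal reads and writes on the table.
VACUUM (VERBOSE) orders;

-- VACUUM FULL: rewrites the table under an ACCESS EXCLUSIVE lock, so it
-- should only be run when the table can be taken out of use for a while.
VACUUM FULL orders;
</programlisting>
</para>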

<para>
<command>VACUUM</command> creates a substantial amount of I/O
traffic, which can cause poor performance for other active sessions.
There are configuration parameters that can be adjusted to reduce the
performance impact of background vacuuming; see
<xref linkend="runtime-config-resource-vacuum-cost"/>.
</para>
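
<para>
The following sketch shows one way to throttle a manual
<command>VACUUM</command> within the current session. The values are
illustrative assumptions rather than recommendations, and
<structname>orders</structname> is a hypothetical table:
<programlisting>
-- Pause for 2ms each time the accumulated I/O cost reaches the limit.
SET vacuum_cost_delay = '2ms';
SET vacuum_cost_limit = 200;

VACUUM orders;

-- Restore the session to the server's configured values.
RESET vacuum_cost_delay;
RESET vacuum_cost_limit;
</programlisting>
</para>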
</sect2>

<sect2 id="vacuum-for-space-recovery">
<title>Recovering Disk Space</title>

<indexterm zone="vacuum-for-space-recovery">
<primary>disk space</primary>
</indexterm>

<para>
In <productname>PostgreSQL</productname>, an
<command>UPDATE</command> or <command>DELETE</command> of a row does not
immediately remove the old version of the row.
This approach is necessary to gain the benefits of multiversion
concurrency control (<acronym>MVCC</acronym>, see <xref linkend="mvcc"/>): the row version
must not be deleted while it is still potentially visible to other
transactions. But eventually, an outdated or deleted row version is no
longer of interest to any transaction. The space it occupies must then be
reclaimed for reuse by new rows, to avoid unbounded growth of disk
space requirements. This is done by running <command>VACUUM</command>.
</para>

<para>
The standard form of <command>VACUUM</command> removes dead row
versions in tables and indexes and marks the space available for
future reuse. However, it will not return the space to the operating
system, except in the special case where one or more pages at the
end of a table become entirely free and an exclusive table lock can be
easily obtained. In contrast, <command>VACUUM FULL</command> actively compacts
tables by writing a complete new version of the table file with no dead
space. This minimizes the size of the table, but can take a long time.
It also requires extra disk space for the new copy of the table, until
the operation completes.
</para>

<para>
The usual goal of routine vacuuming is to do standard <command>VACUUM</command>s
often enough to avoid needing <command>VACUUM FULL</command>. The
autovacuum daemon attempts to work this way, and in fact will
never issue <command>VACUUM FULL</command>. In this approach, the idea
is not to keep tables at their minimum size, but to maintain steady-state
usage of disk space: each table occupies space equivalent to its
minimum size plus however much space gets used up between vacuum runs.
Although <command>VACUUM FULL</command> can be used to shrink a table back
to its minimum size and return the disk space to the operating system,
there is not much point in this if the table will just grow again in the
future. Thus, moderately-frequent standard <command>VACUUM</command> runs are a
better approach than infrequent <command>VACUUM FULL</command> runs for
maintaining heavily-updated tables.
</para>
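
<para>
One way to judge whether vacuuming is keeping up is to look at the
cumulative statistics views. The query below is a sketch, not a prescribed
procedure; it lists the tables with the most dead row versions along with
their on-disk size. The row counts reported by
<structname>pg_stat_user_tables</structname> are estimates:
<programlisting>
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
</programlisting>
</para>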

<para>
Some administrators prefer to schedule vacuuming themselves, for example
doing all the work at night when load is low.
The difficulty with doing vacuuming according to a fixed schedule
is that if a table has an unexpected spike in update activity, it may
get bloated to the point that <command>VACUUM FULL</command> is really necessary
to reclaim space. Using the autovacuum daemon alleviates this problem,
since the daemon schedules vacuuming dynamically in response to update
activity. It is unwise to disable the daemon completely unless you
have an extremely predictable workload. One possible compromise is
to set the daemon's parameters so that it will only react to unusually
heavy update activity, thus keeping things from getting out of hand,
while scheduled <command>VACUUM</command>s are expected to do the bulk of the
work when the load is typical.
</para>
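
<para>
The compromise described above could be sketched with per-table storage
parameters, as follows. The table name <structname>orders</structname> and
the numbers are assumptions chosen for illustration; suitable values depend
on the workload:
<programlisting>
-- Autovacuum acts on this table only once its dead rows exceed
-- 10000 plus 50% of the table's estimated row count.
ALTER TABLE orders SET (
    autovacuum_vacuum_threshold = 10000,
    autovacuum_vacuum_scale_factor = 0.5
);

-- Return the table to the server-wide settings.
ALTER TABLE orders RESET (
    autovacuum_vacuum_threshold,
    autovacuum_vacuum_scale_factor
);
</programlisting>
</para>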

<para>
For those not using autovacuum, a typical approach is to schedule a
database-wide <command>VACUUM</command> once a day during a low-usage period,
supplemented by more frequent vacuuming of heavily-updated tables as
necessary. (Some installations with extremely high update rates vacuum
their busiest tables as often as once every few minutes.) If you have
multiple databases in a cluster, don't forget to
<command>VACUUM</command> each one; the program <xref
linkend="app-vacuumdb"/> might be helpful.
</para>

<tip>
<para>
Plain <command>VACUUM</command> may not be satisfactory when
a table contains large numbers of dead row versions as a result of
massive update or delete activity. If you have such a table and
you need to reclaim the excess disk space it occupies, you will need
to use <command>VACUUM FULL</command>, or alternatively
<link linkend="sql-cluster"><command>CLUSTER</command></link>
or one of the table-rewriting variants of
<link linkend="sql-altertable"><command>ALTER TABLE</command></link>.
These commands rewrite an entire new copy of the table and build
new indexes for it. All these options require an
<literal>ACCESS EXCLUSIVE</literal> lock. Note that
they also temporarily use extra disk space approximately equal to the size
of the table, since the old copies of the table and indexes can't be
released until the new ones are complete.
</para>
</tip>

<tip>
<para>
If you have a table whose entire contents are deleted on a periodic
basis, consider doing it with
<link linkend="sql-truncate"><command>TRUNCATE</command></link> rather
than using <command>DELETE</command> followed by
<command>VACUUM</command>. <command>TRUNCATE</command> removes the
entire content of the table immediately, without requiring a
subsequent <command>VACUUM</command> or <command>VACUUM
FULL</command> to reclaim the now-unused disk space.
The disadvantage is that strict MVCC semantics are violated.
</para>
</tip>
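
<para>
For example, a periodically emptied staging table (here the hypothetical
<structname>staging_events</structname>) might be cleared like this. Note
that <command>TRUNCATE</command> takes an
<literal>ACCESS EXCLUSIVE</literal> lock on the table while it runs:
<programlisting>
-- Removes all rows and releases the table's disk space at once,
-- with no follow-up VACUUM needed to reclaim the space.
TRUNCATE TABLE staging_events;
</programlisting>
</para>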
</sect2>

<sect2 id="vacuum-for-statistics">
<title>Updating Planner Statistics</title>

<indexterm zone="vacuum-for-statistics">
<primary>statistics</primary>
<secondary>of the planner</secondary>
</indexterm>

<indexterm zone="vacuum-for-statistics">
<primary>ANALYZE</primary>
</indexterm>

<para>
The <productname>PostgreSQL</productname> query planner relies on
statistical information about the contents of tables in order to
generate good plans for queries. These statistics are gathered by
the <link linkend="sql-analyze"><command>ANALYZE</command></link> command,
which can be invoked by itself or
as an optional step in <command>VACUUM</command>. It is important to have
reasonably accurate statistics, otherwise poor choices of plans might
degrade database performance.
</para>

<para>
The autovacuum daemon, if enabled, will automatically issue
<command>ANALYZE</command> commands whenever the content of a table has
changed sufficiently. However, administrators might prefer to rely
on manually-scheduled <command>ANALYZE</command> operations, particularly
if it is known that update activity on a table will not affect the
statistics of <quote>interesting</quote> columns. The daemon schedules
<command>ANALYZE</command> strictly as a function of the number of rows
inserted or updated; it has no knowledge of whether that will lead
to meaningful statistical changes.
</para>

<para>
Tuples changed in partitions and inheritance children do not trigger
analyze on the parent table. If the parent table is empty or rarely
changed, it may never be processed by autovacuum, and the statistics for
the inheritance tree as a whole won't be collected. It is necessary to
run <command>ANALYZE</command> on the parent table manually in order to
keep the statistics up to date.
</para>

<para>
As with vacuuming for space recovery, frequent updates of statistics
are more useful for heavily-updated tables than for seldom-updated
ones. But even for a heavily-updated table, there might be no need for
statistics updates if the statistical distribution of the data is
not changing much. A simple rule of thumb is to think about how much
the minimum and maximum values of the columns in the table change.
For example, a <type>timestamp</type> column that contains the time
of row update will have a constantly-increasing maximum value as
rows are added and updated; such a column will probably need more
frequent statistics updates than, say, a column containing URLs for
pages accessed on a website. The URL column might receive changes just
as often, but the statistical distribution of its values probably
changes relatively slowly.
</para>

<para>
It is possible to run <command>ANALYZE</command> on specific tables and even
just specific columns of a table, so the flexibility exists to update some
statistics more frequently than others if your application requires it.
In practice, however, it is usually best to just analyze the entire
database, because it is a fast operation. <command>ANALYZE</command> uses a
statistically random sampling of the rows of a table rather than reading
every single row.
</para>
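
<para>
The three levels of granularity mentioned above look like this in practice.
The table <structname>orders</structname> and its columns are hypothetical
names used only for the example:
<programlisting>
-- Every table in the current database that the user may analyze.
ANALYZE;

-- A single table.
ANALYZE orders;

-- Only selected columns of a table.
ANALYZE orders (customer_id, created_at);
</programlisting>
</para>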

<tip>
<para>
Although per-column tweaking of <command>ANALYZE</command> frequency might not be
very productive, you might find it worthwhile to do per-column
adjustment of the level of detail of the statistics collected by
<command>ANALYZE</command>. Columns that are heavily used in <literal>WHERE</literal>
clauses and have highly irregular data distributions might require a
finer-grain data histogram than other columns. See <command>ALTER TABLE
SET STATISTICS</command>, or change the database-wide default using the <xref
linkend="guc-default-statistics-target"/> configuration parameter.
</para>

<para>
Also, by default there is limited information available about
the selectivity of functions. However, if you create a statistics
object or an expression
index that uses a function call, useful statistics will be
gathered about the function, which can greatly improve query
plans that use the expression index.
</para>
</tip>
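
<para>
A sketch of the adjustments described in the tip above, assuming a
hypothetical table <structname>orders</structname> with columns
<structfield>status</structfield> and <structfield>email</structfield>.
The statistics target of 500 is an arbitrary illustrative value, and the
expression form of <command>CREATE STATISTICS</command> assumes a server
version that supports statistics on expressions:
<programlisting>
-- Collect a more detailed histogram for one irregularly distributed column.
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;

-- Either of these causes statistics to be gathered on lower(email).
CREATE INDEX orders_lower_email_idx ON orders (lower(email));
CREATE STATISTICS orders_lower_email_stats ON (lower(email)) FROM orders;

-- The new settings take effect the next time the table is analyzed.
ANALYZE orders;
</programlisting>
</para>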

<tip>
<para>
The autovacuum daemon does not issue <command>ANALYZE</command> commands for
foreign tables, since it has no means of determining how often that
might be useful. If your queries require statistics on foreign tables
for proper planning, it's a good idea to run manually-managed
<command>ANALYZE</command> commands on those tables on a suitable schedule.
</para>
</tip>

<tip>
<para>
The autovacuum daemon does not issue <command>ANALYZE</command> commands
for partitioned tables. Inheritance parents will only be analyzed if the
parent itself is changed; changes to child tables do not trigger
autoanalyze on the parent table. If your queries require statistics on
parent tables for proper planning, it is necessary to periodically run
a manual <command>ANALYZE</command> on those tables to keep the statistics
up to date.
</para>
</tip>

</sect2>

<sect2 id="vacuum-for-visibility-map">
<title>Updating the Visibility Map</title>

<para>
Vacuum maintains a <link linkend="storage-vm">visibility map</link> for each
table to keep track of which pages contain only tuples that are known to be
visible to all active transactions (and all future transactions, until the
page is again modified). This has two purposes. First, vacuum
itself can skip such pages on the next run, since there is nothing to
clean up.
</para>

<para>
Second, it allows <productname>PostgreSQL</productname> to answer some
queries using only the index, without reference to the underlying table.
Since <productname>PostgreSQL</productname> indexes don't contain tuple
visibility information, a normal index scan fetches the heap tuple for each
matching index entry, to check whether it should be seen by the current
transaction.
An <link linkend="indexes-index-only-scans"><firstterm>index-only
scan</firstterm></link>, on the other hand, checks the visibility map first.
If it's known that all tuples on the page are
visible, the heap fetch can be skipped. This is most useful on
large data sets where the visibility map can prevent disk accesses.
The visibility map is vastly smaller than the heap, so it can easily be
cached even when the heap is very large.
</para>
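
<para>
As a rough gauge of how much of each table is currently marked all-visible,
and therefore how often index-only scans can skip the heap, the estimates
kept in <structname>pg_class</structname> can be compared. This is only a
sketch; both columns are approximations that are refreshed by commands such
as <command>VACUUM</command> and <command>ANALYZE</command>:
<programlisting>
SELECT relname,
       relpages,        -- estimated number of pages in the table
       relallvisible    -- estimated pages marked all-visible
FROM pg_class
WHERE relkind = 'r'
ORDER BY relpages DESC
LIMIT 10;
</programlisting>
</para>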
</sect2>

<sect2 id="vacuum-for-wraparound">
<title>Preventing Transaction ID Wraparound Failures</title>

<indexterm zone="vacuum-for-wraparound">
<primary>transaction ID</primary>
<secondary>wraparound</secondary>
</indexterm>

<indexterm>
<primary>wraparound</primary>
<secondary>of transaction IDs</secondary>
</indexterm>

<para>
<productname>PostgreSQL</productname>'s
<link linkend="mvcc-intro">MVCC</link> transaction semantics
depend on being able to compare transaction ID (<acronym>XID</acronym>)
numbers: a row version with an insertion XID greater than the current
transaction's XID is <quote>in the future</quote> and should not be visible
to the current transaction. But since transaction IDs have limited size
(32 bits), a cluster that runs for a long time (more
than 4 billion transactions) would suffer <firstterm>transaction ID
wraparound</firstterm>: the XID counter wraps around to zero, and all of a sudden
transactions that were in the past appear to be in the future, which
means their output becomes invisible. In short, catastrophic data loss.
(Actually the data is still there, but that's cold comfort if you cannot
get at it.) To avoid this, it is necessary to vacuum every table
in every database at least once every two billion transactions.
</para>

<para>
The reason that periodic vacuuming solves the problem is that
<command>VACUUM</command> will mark rows as <emphasis>frozen</emphasis>, indicating that
they were inserted by a transaction that committed sufficiently far in
the past that the effects of the inserting transaction are certain to be
visible to all current and future transactions.
Normal XIDs are
compared using modulo-2<superscript>32</superscript> arithmetic. This means
that for every normal XID, there are two billion XIDs that are
<quote>older</quote> and two billion that are <quote>newer</quote>; another
way to say it is that the normal XID space is circular with no
endpoint. Therefore, once a row version has been created with a particular
normal XID, the row version will appear to be <quote>in the past</quote> for
the next two billion transactions, no matter which normal XID we are
talking about. If the row version still exists after more than two billion
transactions, it will suddenly appear to be in the future. To
prevent this, <productname>PostgreSQL</productname> reserves a special XID,
<literal>FrozenTransactionId</literal>, which does not follow the normal XID
comparison rules and is always considered older
than every normal XID.
Frozen row versions are treated as if the inserting XID were
<literal>FrozenTransactionId</literal>, so that they will appear to be
<quote>in the past</quote> to all normal transactions regardless of wraparound
issues, and so such row versions will be valid until deleted, no matter
how long that is.
</para>

<note>
<para>
In <productname>PostgreSQL</productname> versions before 9.4, freezing was
implemented by actually replacing a row's insertion XID
with <literal>FrozenTransactionId</literal>, which was visible in the
row's <structname>xmin</structname> system column. Newer versions just set a flag
bit, preserving the row's original <structname>xmin</structname> for possible
forensic use. However, rows with <structname>xmin</structname> equal
to <literal>FrozenTransactionId</literal> (2) may still be found
in databases <application>pg_upgrade</application>'d from pre-9.4 versions.
</para>
<para>
Also, system catalogs may contain rows with <structname>xmin</structname> equal
to <literal>BootstrapTransactionId</literal> (1), indicating that they were
inserted during the first phase of <application>initdb</application>.
Like <literal>FrozenTransactionId</literal>, this special XID is treated as
older than every normal XID.
</para>
</note>

<para>
<xref linkend="guc-vacuum-freeze-min-age"/>
controls how old an XID value has to be before rows bearing that XID will be
frozen. Increasing this setting may avoid unnecessary work if the
rows that would otherwise be frozen will soon be modified again,
but decreasing this setting increases
the number of transactions that can elapse before the table must be
vacuumed again.
</para>

<para>
<command>VACUUM</command> uses the <link linkend="storage-vm">visibility map</link>
to determine which pages of a table must be scanned. Normally, it
will skip pages that don't have any dead row versions even if those pages
might still have row versions with old XID values. Therefore, normal
<command>VACUUM</command>s won't always freeze every old row version in the table.
When that happens, <command>VACUUM</command> will eventually need to perform an
<firstterm>aggressive vacuum</firstterm>, which will freeze all eligible unfrozen
XID and MXID values, including those from all-visible but not all-frozen pages.
In practice most tables require periodic aggressive vacuuming.
<xref linkend="guc-vacuum-freeze-table-age"/>
controls when <command>VACUUM</command> does that: all-visible but not all-frozen
pages are scanned if the number of transactions that have passed since the
last such scan is greater than <varname>vacuum_freeze_table_age</varname> minus
<varname>vacuum_freeze_min_age</varname>. Setting
<varname>vacuum_freeze_table_age</varname> to 0 forces <command>VACUUM</command> to
always use its aggressive strategy.
</para>
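
<para>
A minimal sketch of forcing the aggressive strategy for a single manual run,
assuming a hypothetical table <structname>orders</structname>. The setting
is changed only for the current session:
<programlisting>
-- Make the next VACUUM in this session use its aggressive strategy.
SET vacuum_freeze_table_age = 0;
VACUUM orders;
RESET vacuum_freeze_table_age;

-- The FREEZE option also scans aggressively, and additionally freezes
-- rows regardless of how recently they were written.
VACUUM (FREEZE) orders;
</programlisting>
</para>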

<para>
The maximum time that a table can go unvacuumed is two billion
transactions minus the <varname>vacuum_freeze_min_age</varname> value at
the time of the last aggressive vacuum. If it were to go
unvacuumed for longer than
that, data loss could result. To ensure that this does not happen,
autovacuum is invoked on any table that might contain unfrozen rows with
XIDs older than the age specified by the configuration parameter <xref
linkend="guc-autovacuum-freeze-max-age"/>. (This will happen even if
autovacuum is disabled.)
</para>

<para>
This implies that if a table is not otherwise vacuumed,
autovacuum will be invoked on it approximately once every
<varname>autovacuum_freeze_max_age</varname> minus
<varname>vacuum_freeze_min_age</varname> transactions.
For tables that are regularly vacuumed for space reclamation purposes,
this is of little importance. However, for static tables
(including tables that receive inserts, but no updates or deletes),
there is no need to vacuum for space reclamation, so it can
be useful to try to maximize the interval between forced autovacuums
on very large static tables. Obviously one can do this either by
increasing <varname>autovacuum_freeze_max_age</varname> or decreasing
<varname>vacuum_freeze_min_age</varname>.
</para>
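
<para>
For a large static table, the second option could be applied per table
instead of server-wide. The sketch below assumes a hypothetical table
<structname>archive_2020</structname> and uses the
<literal>autovacuum_freeze_min_age</literal> storage parameter, which is
the per-table autovacuum counterpart of
<varname>vacuum_freeze_min_age</varname>:
<programlisting>
-- Let autovacuum freeze rows in this table as early as possible, so that
-- each pass leaves the oldest unfrozen XID as recent as it can.
ALTER TABLE archive_2020 SET (autovacuum_freeze_min_age = 0);

-- Inspect the storage parameters now set on the table.
SELECT relname, reloptions
FROM pg_class
WHERE oid = 'archive_2020'::regclass;
</programlisting>
</para>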

<para>
The effective maximum for <varname>vacuum_freeze_table_age</varname> is 0.95 *
<varname>autovacuum_freeze_max_age</varname>; a setting higher than that will be
capped to the maximum. A value higher than
<varname>autovacuum_freeze_max_age</varname> wouldn't make sense because an
anti-wraparound autovacuum would be triggered at that point anyway, and
the 0.95 multiplier leaves some breathing room to run a manual
<command>VACUUM</command> before that happens. As a rule of thumb,
<varname>vacuum_freeze_table_age</varname> should be set to a value somewhat
below <varname>autovacuum_freeze_max_age</varname>, leaving enough gap so that
a regularly scheduled <command>VACUUM</command> or an autovacuum triggered by
normal delete and update activity is run in that window. Setting it too
close could lead to anti-wraparound autovacuums, even though the table
was recently vacuumed to reclaim space, whereas lower values lead to more
frequent aggressive vacuuming.
</para>

<para>
The sole disadvantage of increasing <varname>autovacuum_freeze_max_age</varname>
(and <varname>vacuum_freeze_table_age</varname> along with it) is that
the <filename>pg_xact</filename> and <filename>pg_commit_ts</filename>
subdirectories of the database cluster will take more space, because they
must store the commit status and (if <varname>track_commit_timestamp</varname> is
enabled) timestamp of all transactions back to
the <varname>autovacuum_freeze_max_age</varname> horizon. The commit status uses
two bits per transaction, so if
<varname>autovacuum_freeze_max_age</varname> is set to its maximum allowed value
of two billion, <filename>pg_xact</filename> can be expected to grow to about half
a gigabyte and <filename>pg_commit_ts</filename> to about 20GB. If this
is trivial compared to your total database size,
setting <varname>autovacuum_freeze_max_age</varname> to its maximum allowed value
is recommended. Otherwise, set it depending on what you are willing to
allow for <filename>pg_xact</filename> and <filename>pg_commit_ts</filename> storage.
(The default, 200 million transactions, translates to about 50MB
of <filename>pg_xact</filename> storage and about 2GB of <filename>pg_commit_ts</filename>
storage.)
</para>
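
<para>
The storage figures above follow from simple arithmetic, which can be
reproduced as a sketch. The two bits per transaction for
<filename>pg_xact</filename> come from the text; the figure of ten bytes
per transaction for <filename>pg_commit_ts</filename> is an assumption
inferred from the 20GB and 2GB estimates given above:
<programlisting>
SELECT pg_size_pretty(2000000000::bigint * 2 / 8) AS pg_xact_at_maximum,
       pg_size_pretty(200000000::bigint * 2 / 8)  AS pg_xact_at_default,
       pg_size_pretty(2000000000::bigint * 10)    AS pg_commit_ts_at_maximum,
       pg_size_pretty(200000000::bigint * 10)     AS pg_commit_ts_at_default;
</programlisting>
The results are reported in binary units, so they come out slightly below
the round numbers quoted in the text.
</para>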

<para>
One disadvantage of decreasing <varname>vacuum_freeze_min_age</varname> is that
it might cause <command>VACUUM</command> to do useless work: freezing a row
version is a waste of time if the row is modified
soon thereafter (causing it to acquire a new XID). So the setting should
be large enough that rows are not frozen until they are unlikely to change
any more.
</para>

<para>
To track the age of the oldest unfrozen XIDs in a database,
<command>VACUUM</command> stores XID
statistics in the system tables <structname>pg_class</structname> and
<structname>pg_database</structname>. In particular,
the <structfield>relfrozenxid</structfield> column of a table's
<structname>pg_class</structname> row contains the oldest remaining unfrozen
XID at the end of the most recent <command>VACUUM</command> that successfully
advanced <structfield>relfrozenxid</structfield> (typically the most recent
aggressive VACUUM). Similarly, the
<structfield>datfrozenxid</structfield> column of a database's
<structname>pg_database</structname> row is a lower bound on the unfrozen XIDs
appearing in that database; it is just the minimum of the
per-table <structfield>relfrozenxid</structfield> values within the database.
A convenient way to
examine this information is to execute queries such as:

<programlisting>
SELECT c.oid::regclass as table_name,
       greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm');

SELECT datname, age(datfrozenxid) FROM pg_database;
</programlisting>

The <literal>age</literal> column measures the number of transactions from the
cutoff XID to the current transaction's XID.
</para>
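
<para>
To see how close each database is to a forced autovacuum, the age can be
set beside the configured limit. This is a sketch that extends the second
query above; the column aliases are arbitrary:
<programlisting>
SELECT datname,
       age(datfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::bigint AS freeze_max_age
FROM pg_database
ORDER BY xid_age DESC;
</programlisting>
</para>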

<tip>
<para>
When the <command>VACUUM</command> command's <literal>VERBOSE</literal>
parameter is specified, <command>VACUUM</command> prints various
statistics about the table. This includes information about how
<structfield>relfrozenxid</structfield> and
<structfield>relminmxid</structfield> advanced, and the number of
newly frozen pages. The same details appear in the server log when
autovacuum logging (controlled by <xref
linkend="guc-log-autovacuum-min-duration"/>) reports on a
<command>VACUUM</command> operation executed by autovacuum.
</para>
</tip>
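
<para>
Both sources of detail mentioned in the tip above can be enabled as
follows. This is a sketch assuming a hypothetical table
<structname>orders</structname> and a role with sufficient privileges to
change server settings; a value of zero logs every autovacuum operation,
which may be too verbose for busy systems:
<programlisting>
-- Print the statistics for a manual run.
VACUUM (VERBOSE) orders;

-- Log all autovacuum operations, then reload the configuration.
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
SELECT pg_reload_conf();
</programlisting>
</para>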
|
|
|
|
<para>
|
|
<command>VACUUM</command> normally only scans pages that have been modified
|
|
since the last vacuum, but <structfield>relfrozenxid</structfield> can only be
|
|
advanced when every page of the table
|
|
that might contain unfrozen XIDs is scanned. This happens when
|
|
<structfield>relfrozenxid</structfield> is more than
|
|
<varname>vacuum_freeze_table_age</varname> transactions old, when
|
|
<command>VACUUM</command>'s <literal>FREEZE</literal> option is used, or when all
|
|
pages that are not already all-frozen happen to
|
|
require vacuuming to remove dead row versions. When <command>VACUUM</command>
|
|
scans every page in the table that is not already all-frozen, it should
|
|
set <literal>age(relfrozenxid)</literal> to a value just a little more than the
|
|
<varname>vacuum_freeze_min_age</varname> setting
|
|
that was used (more by the number of transactions started since the
|
|
<command>VACUUM</command> started). <command>VACUUM</command>
|
|
will set <structfield>relfrozenxid</structfield> to the oldest XID
|
|
that remains in the table, so it's possible that the final value
|
|
will be much more recent than strictly required.
|
|
If no <structfield>relfrozenxid</structfield>-advancing
|
|
<command>VACUUM</command> is issued on the table until
|
|
<varname>autovacuum_freeze_max_age</varname> is reached, an autovacuum will soon
|
|
be forced for the table.
|
|
</para>
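<para>
The following query is a minimal sketch for spotting tables whose next
<command>VACUUM</command> is likely to scan every page that is not already
all-frozen. It compares <literal>age(relfrozenxid)</literal> with the
session's <varname>vacuum_freeze_table_age</varname> and, as a simplifying
assumption, ignores any per-table storage parameter overrides.
<programlisting>
-- Tables whose relfrozenxid is older than vacuum_freeze_table_age.
SELECT c.oid::regclass AS table_name,
       age(c.relfrozenxid) AS xid_age,
       current_setting('vacuum_freeze_table_age')::integer AS freeze_table_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm')
  AND age(c.relfrozenxid) &gt; current_setting('vacuum_freeze_table_age')::integer
ORDER BY xid_age DESC;
</programlisting>
</para>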
|
|
|
|
<para>
|
|
If for some reason autovacuum fails to clear old XIDs from a table, the
|
|
system will begin to emit warning messages like this when the database's
|
|
oldest XIDs reach forty million transactions from the wraparound point:
|
|
|
|
<programlisting>
WARNING: database "mydb" must be vacuumed within 39985967 transactions
HINT: To avoid XID assignment failures, execute a database-wide VACUUM in that database.
</programlisting>
|
|
|
|
(A manual <command>VACUUM</command> should fix the problem, as suggested by the
|
|
hint; but note that the <command>VACUUM</command> should be performed by a
|
|
superuser, else it will fail to process system catalogs, which prevents it from
|
|
being able to advance the database's <structfield>datfrozenxid</structfield>.)
|
|
If these warnings are ignored, the system will refuse to assign new XIDs once
|
|
there are fewer than three million transactions left until wraparound:
|
|
|
|
<programlisting>
ERROR: database is not accepting commands that assign new XIDs to avoid wraparound data loss in database "mydb"
HINT: Execute a database-wide VACUUM in that database.
</programlisting>
|
|
|
|
In this condition any transactions already in progress can continue,
|
|
but only read-only transactions can be started. Operations that
|
|
modify database records or truncate relations will fail.
|
|
The <command>VACUUM</command> command can still be run normally.
|
|
Note that, contrary to what was sometimes recommended in earlier releases,
|
|
it is not necessary or desirable to stop the postmaster or enter
single-user mode in order to restore normal operation.
|
|
Instead, follow these steps:
|
|
|
|
<orderedlist>
|
|
<listitem>
|
|
<simpara>Resolve old prepared transactions. You can find these by checking
|
|
<link linkend="view-pg-prepared-xacts">pg_prepared_xacts</link> for rows where
|
|
<literal>age(transaction)</literal> is large. Such transactions should be
|
|
committed or rolled back.</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>End long-running open transactions. You can find these by checking
|
|
<link linkend="monitoring-pg-stat-activity-view">pg_stat_activity</link> for rows where
|
|
<literal>age(backend_xid)</literal> or <literal>age(backend_xmin)</literal> is
|
|
large. Such transactions should be committed or rolled back, or the session
|
|
can be terminated using <literal>pg_terminate_backend</literal>.</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>Drop any old replication slots. Use
|
|
<link linkend="view-pg-replication-slots">pg_replication_slots</link> to
|
|
find slots where <literal>age(xmin)</literal> or <literal>age(catalog_xmin)</literal>
|
|
is large. In many cases, such slots were created for replication to servers that no
|
|
longer exist, or that have been down for a long time. If you drop a slot for a server
|
|
that still exists and might still try to connect to that slot, that replica may
|
|
need to be rebuilt.</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>Execute <command>VACUUM</command> in the target database. A database-wide
|
|
<literal>VACUUM</literal> is simplest; to reduce the time required, it is also possible
|
|
to issue manual <command>VACUUM</command> commands on the tables where
|
|
<structfield>relfrozenxid</structfield> is oldest. Do not use <literal>VACUUM FULL</literal>
|
|
in this scenario, because it requires an XID and will therefore fail, except in single-user
|
|
mode, where it will instead consume an XID and thus increase the risk of transaction ID
|
|
wraparound. Do not use <literal>VACUUM FREEZE</literal> either, because it will do
|
|
more than the minimum amount of work required to restore normal operation.</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>Once normal operation is restored, ensure that autovacuum is properly configured
|
|
in the target database in order to avoid future problems.</simpara>
|
|
</listitem>
|
|
</orderedlist>
|
|
</para>
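<para>
The checks in the steps above can be sketched as the following queries. They
are illustrative only: the <literal>LIMIT</literal> value and the choice of
columns are assumptions, and the replication slot query reads
<structname>pg_replication_slots</structname>, which is the view that exposes
<structfield>xmin</structfield> and <structfield>catalog_xmin</structfield>.
<programlisting>
-- Step 1: prepared transactions, oldest first.
SELECT gid, prepared, owner, database, age(transaction) AS xid_age
FROM pg_prepared_xacts
ORDER BY age(transaction) DESC;

-- Step 2: sessions holding back the XID horizon, oldest first.
SELECT pid, datname, usename, state, xact_start,
       age(backend_xid) AS xid_age,
       age(backend_xmin) AS xmin_age
FROM pg_stat_activity
WHERE backend_xid IS NOT NULL OR backend_xmin IS NOT NULL
ORDER BY greatest(age(backend_xid), age(backend_xmin)) DESC;

-- Step 3: replication slots holding back the XID horizon.
SELECT slot_name, slot_type, active,
       age(xmin) AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots
ORDER BY greatest(age(xmin), age(catalog_xmin)) DESC NULLS LAST;

-- Step 4: tables in the current database with the oldest relfrozenxid.
SELECT oid::regclass AS table_name, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind IN ('r', 'm', 't')
ORDER BY age(relfrozenxid) DESC
LIMIT 20;
</programlisting>
All of these are read-only queries, so they can still be run while the
system is refusing to assign new XIDs.
</para>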
|
|
|
|
<note>
|
|
<para>
|
|
In earlier versions, it was sometimes necessary to stop the postmaster and
|
|
<command>VACUUM</command> the database in single-user mode. In typical scenarios, this
|
|
is no longer necessary, and should be avoided whenever possible, since it involves taking
|
|
the system down. It is also riskier, since it disables transaction ID wraparound safeguards
|
|
that are designed to prevent data loss. The only reason to use single-user mode in this
|
|
scenario is if you wish to <command>TRUNCATE</command> or <command>DROP</command> unneeded
|
|
tables to avoid needing to <command>VACUUM</command> them. The three-million-transaction
|
|
safety margin exists to let the administrator do this. See the
|
|
<xref linkend="app-postgres"/> reference page for details about using single-user mode.
|
|
</para>
|
|
</note>
|
|
|
|
<sect3 id="vacuum-for-multixact-wraparound">
|
|
<title>Multixacts and Wraparound</title>
|
|
|
|
<indexterm>
|
|
<primary>MultiXactId</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>wraparound</primary>
|
|
<secondary>of multixact IDs</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<firstterm>Multixact IDs</firstterm> are used to support row locking by
|
|
multiple transactions. Since there is only limited space in a tuple
|
|
header to store lock information, that information is encoded as
|
|
a <quote>multiple transaction ID</quote>, or multixact ID for short,
|
|
whenever there is more than one transaction concurrently locking a
|
|
row. Information about which transaction IDs are included in any
|
|
particular multixact ID is stored separately in
|
|
the <filename>pg_multixact</filename> subdirectory, and only the multixact ID
|
|
appears in the <structfield>xmax</structfield> field in the tuple header.
|
|
Like transaction IDs, multixact IDs are implemented as a
|
|
32-bit counter and corresponding storage, all of which requires
|
|
careful aging management, storage cleanup, and wraparound handling.
|
|
There is a separate storage area which holds the list of members in
|
|
each multixact, which also uses a 32-bit counter and which must also
|
|
be managed.
|
|
</para>
|
|
|
|
<para>
|
|
Whenever <command>VACUUM</command> scans any part of a table, it will replace
|
|
any multixact ID it encounters which is older than
|
|
<xref linkend="guc-vacuum-multixact-freeze-min-age"/>
|
|
by a different value, which can be the zero value, a single
|
|
transaction ID, or a newer multixact ID. For each table,
|
|
<structname>pg_class</structname>.<structfield>relminmxid</structfield> stores the oldest
|
|
possible multixact ID still appearing in any tuple of that table.
|
|
If this value is older than
|
|
<xref linkend="guc-vacuum-multixact-freeze-table-age"/>, an aggressive
|
|
vacuum is forced. As discussed in the previous section, an aggressive
|
|
vacuum means that only those pages which are known to be all-frozen will
|
|
be skipped. <function>mxid_age()</function> can be used on
|
|
<structname>pg_class</structname>.<structfield>relminmxid</structfield> to find its age.
|
|
</para>
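<para>
As a minimal sketch, <function>mxid_age()</function> can be applied per table
and per database as shown here. The <literal>LIMIT</literal> value is an
arbitrary choice for illustration.
<programlisting>
-- Tables in the current database with the oldest multixact IDs.
SELECT oid::regclass AS table_name, mxid_age(relminmxid) AS multixact_age
FROM pg_class
WHERE relkind IN ('r', 'm', 't')
ORDER BY mxid_age(relminmxid) DESC
LIMIT 10;

-- Multixact age of each database in the cluster.
SELECT datname, mxid_age(datminmxid) AS multixact_age
FROM pg_database
ORDER BY mxid_age(datminmxid) DESC;
</programlisting>
</para>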
|
|
|
|
<para>
|
|
Aggressive <command>VACUUM</command>s, regardless of what causes
|
|
them, are <emphasis>guaranteed</emphasis> to be able to advance
|
|
the table's <structfield>relminmxid</structfield>.
|
|
Eventually, as all tables in all databases are scanned and their
|
|
oldest multixact values are advanced, on-disk storage for older
|
|
multixacts can be removed.
|
|
</para>
|
|
|
|
<para>
|
|
As a safety device, an aggressive vacuum scan will
|
|
occur for any table whose multixact-age is greater than <xref
|
|
linkend="guc-autovacuum-multixact-freeze-max-age"/>. Also, if the
|
|
storage occupied by multixact members exceeds 2GB, aggressive vacuum
|
|
scans will occur more often for all tables, starting with those that
|
|
have the oldest multixact-age. Both of these kinds of aggressive
|
|
scans will occur even if autovacuum is nominally disabled.
|
|
</para>
|
|
|
|
<para>
|
|
Similar to the XID case, if autovacuum fails to clear old MXIDs from a table, the
|
|
system will begin to emit warning messages when the database's oldest MXIDs reach forty
|
|
million transactions from the wraparound point. And, just as in the XID case, if these
|
|
warnings are ignored, the system will refuse to generate new MXIDs once there are fewer
|
|
than three million left until wraparound.
|
|
</para>
|
|
|
|
<para>
|
|
Normal operation when MXIDs are exhausted can be restored in much the same way as
|
|
when XIDs are exhausted. Follow the same steps in the previous section, but with the
|
|
following differences:
|
|
|
|
<orderedlist>
|
|
<listitem>
|
|
<simpara>Running transactions and prepared transactions can be ignored if there
|
|
is no chance that they might appear in a multixact.</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>MXID information is not directly visible in system views such as
|
|
<literal>pg_stat_activity</literal>; however, looking for old XIDs is still a good
|
|
way of determining which transactions are causing MXID wraparound problems.</simpara>
|
|
</listitem>
|
|
<listitem>
|
|
<simpara>XID exhaustion will block all write transactions, but MXID exhaustion will
|
|
only block a subset of write transactions, specifically those that involve
|
|
row locks that require an MXID.</simpara>
|
|
</listitem>
|
|
</orderedlist>
|
|
</para>
|
|
|
|
</sect3>
|
|
</sect2>
|
|
|
|
<sect2 id="autovacuum">
|
|
<title>The Autovacuum Daemon</title>
|
|
|
|
<indexterm>
|
|
<primary>autovacuum</primary>
|
|
<secondary>general information</secondary>
|
|
</indexterm>
|
|
<para>
|
|
<productname>PostgreSQL</productname> has an optional but highly
|
|
recommended feature called <firstterm>autovacuum</firstterm>,
|
|
whose purpose is to automate the execution of
|
|
<command>VACUUM</command> and <command>ANALYZE</command> commands.
|
|
When enabled, autovacuum checks for
|
|
tables that have had a large number of inserted, updated or deleted
|
|
tuples. These checks use the statistics collection facility;
|
|
therefore, autovacuum cannot be used unless <xref
|
|
linkend="guc-track-counts"/> is set to <literal>true</literal>.
|
|
In the default configuration, autovacuuming is enabled and the related
|
|
configuration parameters are appropriately set.
|
|
</para>
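<para>
One simple way to confirm that both prerequisites are in effect on a running
server is to read them from <structname>pg_settings</structname>:
<programlisting>
-- Both settings should report "on" for autovacuum to operate.
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum', 'track_counts');
</programlisting>
</para>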
|
|
|
|
<para>
|
|
The <quote>autovacuum daemon</quote> actually consists of multiple processes.
|
|
There is a persistent daemon process, called the
|
|
<firstterm>autovacuum launcher</firstterm>, which is in charge of starting
|
|
<firstterm>autovacuum worker</firstterm> processes for all databases. The
|
|
launcher will distribute the work across time, attempting to start one
|
|
worker within each database every <xref linkend="guc-autovacuum-naptime"/>
|
|
seconds. (Therefore, if the installation has <replaceable>N</replaceable> databases,
|
|
a new worker will be launched every
|
|
<varname>autovacuum_naptime</varname>/<replaceable>N</replaceable> seconds.)
|
|
A maximum of <xref linkend="guc-autovacuum-max-workers"/> worker processes
|
|
are allowed to run at the same time. If there are more than
|
|
<varname>autovacuum_max_workers</varname> databases to be processed,
|
|
the next database will be processed as soon as the first worker finishes.
|
|
Each worker process will check each table within its database and
|
|
execute <command>VACUUM</command> and/or <command>ANALYZE</command> as needed.
|
|
<xref linkend="guc-log-autovacuum-min-duration"/> can be set to monitor
|
|
autovacuum workers' activity.
|
|
</para>
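<para>
The following is a sketch of how worker activity might be observed. The
ten-second logging threshold is an arbitrary value chosen for illustration,
and the progress query is restricted to the current database because
<literal>relid::regclass</literal> can only resolve names there.
<programlisting>
-- Log every autovacuum action that runs for at least 10 seconds.
ALTER SYSTEM SET log_autovacuum_min_duration = '10s';
SELECT pg_reload_conf();

-- Vacuum operations currently in progress in this database.
SELECT pid, relid::regclass AS table_name, phase,
       heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_vacuum
WHERE datname = current_database();
</programlisting>
</para>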
|
|
|
|
<para>
|
|
If several large tables all become eligible for vacuuming in a short
|
|
amount of time, all autovacuum workers might become occupied with
|
|
vacuuming those tables for a long period. This would result
|
|
in other tables and databases not being vacuumed until a worker becomes
|
|
available. There is no limit on how many workers might be in a
|
|
single database, but workers do try to avoid repeating work that has
|
|
already been done by other workers. Note that the number of running
|
|
workers does not count towards <xref linkend="guc-max-connections"/> or
|
|
<xref linkend="guc-superuser-reserved-connections"/> limits.
|
|
</para>
|
|
|
|
<para>
|
|
Tables whose <structfield>relfrozenxid</structfield> value is more than
|
|
<xref linkend="guc-autovacuum-freeze-max-age"/> transactions old are always
|
|
vacuumed (this also applies to those tables whose freeze max age has
|
|
been modified via storage parameters; see below). Otherwise, if the
|
|
number of tuples obsoleted since the last
|
|
<command>VACUUM</command> exceeds the <quote>vacuum threshold</quote>, the
|
|
table is vacuumed. The vacuum threshold is defined as:
|
|
<programlisting>
vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
</programlisting>
|
|
where the vacuum base threshold is
|
|
<xref linkend="guc-autovacuum-vacuum-threshold"/>,
|
|
the vacuum scale factor is
|
|
<xref linkend="guc-autovacuum-vacuum-scale-factor"/>,
|
|
and the number of tuples is
|
|
<structname>pg_class</structname>.<structfield>reltuples</structfield>.
|
|
</para>
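<para>
For example, with a base threshold of 50 and a scale factor of 0.2, a table
with 10,000 tuples has a vacuum threshold of 50 + 0.2 * 10000 = 2050 obsolete
tuples. The query below is a rough sketch that applies the same formula to
every user table. It assumes the global settings apply (per-table storage
parameters are ignored), and its result is not meaningful for tables whose
<structfield>reltuples</structfield> has never been set by
<command>VACUUM</command> or <command>ANALYZE</command>.
<programlisting>
-- Compare obsolete tuples with the computed vacuum threshold.
SELECT s.relid::regclass AS table_name,
       s.n_dead_tup AS obsolete_tuples,
       current_setting('autovacuum_vacuum_threshold')::integer
         + current_setting('autovacuum_vacuum_scale_factor')::float8 * c.reltuples
         AS vacuum_threshold
FROM pg_stat_user_tables s
     JOIN pg_class c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC
LIMIT 20;
</programlisting>
</para>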
|
|
|
|
<para>
|
|
The table is also vacuumed if the number of tuples inserted since the last
|
|
vacuum has exceeded the defined insert threshold, which is defined as:
|
|
<programlisting>
vacuum insert threshold = vacuum insert base threshold + vacuum insert scale factor * number of tuples
</programlisting>
|
|
where the vacuum insert base threshold is
|
|
<xref linkend="guc-autovacuum-vacuum-insert-threshold"/>,
|
|
and vacuum insert scale factor is
|
|
<xref linkend="guc-autovacuum-vacuum-insert-scale-factor"/>.
|
|
Such vacuums may allow portions of the table to be marked as
|
|
<firstterm>all visible</firstterm> and also allow tuples to be frozen, which
|
|
can reduce the work required in subsequent vacuums.
|
|
For tables which receive <command>INSERT</command> operations but no or
|
|
almost no <command>UPDATE</command>/<command>DELETE</command> operations,
|
|
it may be beneficial to lower the table's
|
|
<xref linkend="reloption-autovacuum-freeze-min-age"/> as this may allow
|
|
tuples to be frozen by earlier vacuums. The number of obsolete tuples and
|
|
the number of inserted tuples are obtained from the cumulative statistics system;
|
|
it is an eventually-consistent count updated by each <command>UPDATE</command>,
|
|
<command>DELETE</command> and <command>INSERT</command> operation.
|
|
If the <structfield>relfrozenxid</structfield> value of the table
|
|
is more than <varname>vacuum_freeze_table_age</varname> transactions old,
|
|
an aggressive vacuum is performed to freeze old tuples and advance
|
|
<structfield>relfrozenxid</structfield>; otherwise, only pages that have been modified
|
|
since the last vacuum are scanned.
|
|
</para>
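<para>
As an illustration for an insert-mostly table, the per-table freeze age can
be lowered and the insert counter inspected as sketched here. The table name
<literal>event_log</literal> and the value <literal>1000000</literal> are
hypothetical; a suitable value depends on the workload.
<programlisting>
-- Allow tuples in this table to be frozen by earlier vacuums.
ALTER TABLE event_log SET (autovacuum_freeze_min_age = 1000000);

-- Tuples inserted since the last vacuum, as seen by the statistics system.
SELECT relname, n_ins_since_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'event_log';
</programlisting>
</para>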
|
|
|
|
<para>
|
|
For analyze, a similar condition is used: the threshold, defined as:
|
|
<programlisting>
analyze threshold = analyze base threshold + analyze scale factor * number of tuples
</programlisting>
|
|
is compared to the total number of tuples inserted, updated, or deleted
|
|
since the last <command>ANALYZE</command>.
|
|
</para>
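<para>
A comparable sketch for the analyze condition follows. It assumes the global
<varname>autovacuum_analyze_threshold</varname> and
<varname>autovacuum_analyze_scale_factor</varname> settings apply and ignores
per-table storage parameters.
<programlisting>
-- Compare modified tuples with the computed analyze threshold.
SELECT s.relid::regclass AS table_name,
       s.n_mod_since_analyze AS modified_tuples,
       current_setting('autovacuum_analyze_threshold')::integer
         + current_setting('autovacuum_analyze_scale_factor')::float8 * c.reltuples
         AS analyze_threshold
FROM pg_stat_user_tables s
     JOIN pg_class c ON c.oid = s.relid
ORDER BY s.n_mod_since_analyze DESC
LIMIT 20;
</programlisting>
</para>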
|
|
|
|
<para>
|
|
Partitioned tables do not directly store tuples and consequently
|
|
are not processed by autovacuum. (Autovacuum does process table
|
|
partitions just like other tables.) Unfortunately, this means that
|
|
autovacuum does not run <command>ANALYZE</command> on partitioned
|
|
tables, and this can cause suboptimal plans for queries that reference
|
|
partitioned table statistics. You can work around this problem by
|
|
manually running <command>ANALYZE</command> on partitioned tables
|
|
when they are first populated, and again whenever the distribution
|
|
of data in their partitions changes significantly.
|
|
</para>
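<para>
For instance, assuming a partitioned table named
<literal>measurement</literal> (a hypothetical name), the workaround amounts
to:
<programlisting>
-- Gathers statistics for the partitioned table itself; run this after the
-- initial load and after significant changes in data distribution.
ANALYZE measurement;
</programlisting>
</para>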
|
|
|
|
<para>
|
|
Temporary tables cannot be accessed by autovacuum. Therefore,
|
|
appropriate vacuum and analyze operations should be performed via
|
|
session SQL commands.
|
|
</para>
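<para>
A minimal sketch of doing this within a session is shown below. The table
<literal>scratch</literal> and its contents are made up for illustration.
<programlisting>
-- Create and fill a temporary table, then collect statistics by hand,
-- since autovacuum will never do so for this table.
CREATE TEMPORARY TABLE scratch (id integer, payload text);
INSERT INTO scratch
SELECT g, 'row ' || g FROM generate_series(1, 10000) AS g;
ANALYZE scratch;

-- After heavy updates or deletes, reclaim space manually as well.
DELETE FROM scratch WHERE id % 2 = 0;
VACUUM scratch;
</programlisting>
</para>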
|
|
|
|
<para>
|
|
The default thresholds and scale factors are taken from
|
|
<filename>postgresql.conf</filename>, but it is possible to override them
|
|
(and many other autovacuum control parameters) on a per-table basis; see
|
|
<xref linkend="sql-createtable-storage-parameters"/> for more information.
|
|
If a setting has been changed via a table's storage parameters, that value
|
|
is used when processing that table; otherwise the global settings are
|
|
used. See <xref linkend="runtime-config-autovacuum"/> for more details on
|
|
the global settings.
|
|
</para>
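<para>
As an example of a per-table override, the commands below make autovacuum
more eager for one large table, show which tables carry overrides, and then
return the table to the global settings. The table name
<literal>orders</literal> and the chosen values are assumptions for
illustration only.
<programlisting>
-- Vacuum after roughly 1% of the table changes instead of the global default.
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.01,
                        autovacuum_vacuum_threshold = 1000);

-- List tables that have any storage parameters set.
SELECT oid::regclass AS table_name, reloptions
FROM pg_class
WHERE reloptions IS NOT NULL;

-- Revert to the global settings.
ALTER TABLE orders RESET (autovacuum_vacuum_scale_factor,
                          autovacuum_vacuum_threshold);
</programlisting>
</para>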
|
|
|
|
<para>
|
|
When multiple workers are running, the autovacuum cost delay parameters
|
|
(see <xref linkend="runtime-config-resource-vacuum-cost"/>) are
|
|
<quote>balanced</quote> among all the running workers, so that the
|
|
total I/O impact on the system is the same regardless of the number
|
|
of workers actually running. However, any workers processing tables whose
|
|
per-table <literal>autovacuum_vacuum_cost_delay</literal> or
|
|
<literal>autovacuum_vacuum_cost_limit</literal> storage parameters have been set
|
|
are not considered in the balancing algorithm.
|
|
</para>
|
|
|
|
<para>
|
|
Autovacuum workers generally don't block other commands. If a process
|
|
attempts to acquire a lock that conflicts with the
|
|
<literal>SHARE UPDATE EXCLUSIVE</literal> lock held by autovacuum, lock
|
|
acquisition will interrupt the autovacuum. For conflicting lock modes,
|
|
see <xref linkend="table-lock-compatibility"/>. However, if the autovacuum
|
|
is running to prevent transaction ID wraparound (i.e., the autovacuum query
|
|
name in the <structname>pg_stat_activity</structname> view ends with
|
|
<literal>(to prevent wraparound)</literal>), the autovacuum is not
|
|
automatically interrupted.
|
|
</para>
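<para>
A sketch of a query that lists only those autovacuum workers that are running
to prevent wraparound, based on the query text described above:
<programlisting>
-- Autovacuum workers that will not be interrupted by conflicting locks.
SELECT pid, datname, xact_start, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker'
  AND query LIKE '%(to prevent wraparound)';
</programlisting>
</para>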
|
|
|
|
<warning>
|
|
<para>
|
|
Regularly running commands that acquire locks conflicting with a
|
|
<literal>SHARE UPDATE EXCLUSIVE</literal> lock (e.g., <command>ANALYZE</command>) can
|
|
effectively prevent autovacuums from ever completing.
|
|
</para>
|
|
</warning>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
|
|
<sect1 id="routine-reindex">
|
|
<title>Routine Reindexing</title>
|
|
|
|
<indexterm zone="routine-reindex">
|
|
<primary>reindex</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
In some situations it is worthwhile to rebuild indexes periodically
|
|
with the <xref linkend="sql-reindex"/> command or a series of individual
|
|
rebuilding steps.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
B-tree index pages that have become completely empty are reclaimed for
|
|
re-use. However, there is still a possibility
|
|
of inefficient use of space: if all but a few index keys on a page have
|
|
been deleted, the page remains allocated. Therefore, a usage
|
|
pattern in which most, but not all, keys in each range are eventually
|
|
deleted will see poor use of space. For such usage patterns,
|
|
periodic reindexing is recommended.
|
|
</para>
|
|
|
|
<para>
|
|
The potential for bloat in non-B-tree indexes has not been well
|
|
researched. It is a good idea to periodically monitor the index's physical
|
|
size when using any non-B-tree index type.
|
|
</para>
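<para>
One way to monitor this, sketched here, is to list the non-B-tree indexes of
the current database by physical size and to compare the results over time:
<programlisting>
-- Non-B-tree indexes, largest first.
SELECT c.oid::regclass AS index_name,
       am.amname AS access_method,
       pg_size_pretty(pg_relation_size(c.oid)) AS index_size
FROM pg_class c
     JOIN pg_am am ON am.oid = c.relam
WHERE c.relkind = 'i'
  AND am.amname &lt;&gt; 'btree'
ORDER BY pg_relation_size(c.oid) DESC;
</programlisting>
</para>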
|
|
|
|
<para>
|
|
Also, for B-tree indexes, a freshly-constructed index is slightly faster to
|
|
access than one that has been updated many times because logically
|
|
adjacent pages are usually also physically adjacent in a newly built index.
|
|
(This consideration does not apply to non-B-tree indexes.) It
|
|
might be worthwhile to reindex periodically just to improve access speed.
|
|
</para>
|
|
|
|
<para>
|
|
<xref linkend="sql-reindex"/> can be used safely and easily in all cases.
|
|
This command requires an <literal>ACCESS EXCLUSIVE</literal> lock by
|
|
default, hence it is often preferable to execute it with its
|
|
<literal>CONCURRENTLY</literal> option, which requires only a
|
|
<literal>SHARE UPDATE EXCLUSIVE</literal> lock.
|
|
</para>
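<para>
For example, with the hypothetical object names
<literal>orders_customer_idx</literal> and <literal>orders</literal>:
<programlisting>
-- Rebuild one index without blocking writes to its table.
REINDEX INDEX CONCURRENTLY orders_customer_idx;

-- Rebuild all indexes of a table the same way.
REINDEX TABLE CONCURRENTLY orders;
</programlisting>
Note that <literal>REINDEX CONCURRENTLY</literal> cannot be run inside a
transaction block.
</para>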
|
|
</sect1>
|
|
|
|
|
|
<sect1 id="logfile-maintenance">
|
|
<title>Log File Maintenance</title>
|
|
|
|
<indexterm zone="logfile-maintenance">
|
|
<primary>server log</primary>
|
|
<secondary>log file maintenance</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
It is a good idea to save the database server's log output
|
|
somewhere, rather than just discarding it via <filename>/dev/null</filename>.
|
|
The log output is invaluable when diagnosing
|
|
problems.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
The server log can contain sensitive information and needs to be protected,
|
|
no matter how or where it is stored, or the destination to which it is routed.
|
|
For example, some DDL statements might contain plaintext passwords or other
|
|
authentication details. Logged statements at the <literal>ERROR</literal>
|
|
level might show the SQL source code for applications
|
|
and might also contain some parts of data rows. Recording data, events and
|
|
related information is the intended function of this facility, so this is
|
|
not a leakage or a bug. Please ensure the server logs are visible only to
|
|
appropriately authorized people.
|
|
</para>
|
|
</note>
|
|
|
|
<para>
|
|
Log output tends to be voluminous
|
|
(especially at higher debug levels) so you won't want to save it
|
|
indefinitely. You need to <emphasis>rotate</emphasis> the log files so that
|
|
new log files are started and old ones removed after a reasonable
|
|
period of time.
|
|
</para>
|
|
|
|
<para>
|
|
If you simply direct the <systemitem>stderr</systemitem> of
|
|
<command>postgres</command> into a
|
|
file, you will have log output, but
|
|
the only way to truncate the log file is to stop and restart
|
|
the server. This might be acceptable if you are using
|
|
<productname>PostgreSQL</productname> in a development environment,
|
|
but few production servers would find this behavior acceptable.
|
|
</para>
|
|
|
|
<para>
|
|
A better approach is to send the server's
|
|
<systemitem>stderr</systemitem> output to some type of log rotation program.
|
|
There is a built-in log rotation facility, which you can use by
|
|
setting the configuration parameter <varname>logging_collector</varname> to
|
|
<literal>true</literal> in <filename>postgresql.conf</filename>. The control
|
|
parameters for this program are described in <xref
|
|
linkend="runtime-config-logging-where"/>. You can also use this approach
|
|
to capture the log data in machine-readable <acronym>CSV</acronym>
|
|
(comma-separated values) format.
|
|
</para>
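<para>
As a sketch, the same settings can be applied with
<command>ALTER SYSTEM</command> instead of editing
<filename>postgresql.conf</filename> by hand. The rotation age and size shown
are arbitrary example values.
<programlisting>
-- Enable the logging collector; this setting takes effect only after a
-- server restart.
ALTER SYSTEM SET logging_collector = on;

-- Write both plain-text and CSV logs.
ALTER SYSTEM SET log_destination = 'stderr,csvlog';

-- Start a new log file daily or after 100MB, whichever comes first.
ALTER SYSTEM SET log_rotation_age = '1d';
ALTER SYSTEM SET log_rotation_size = '100MB';
</programlisting>
</para>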
|
|
|
|
<para>
|
|
Alternatively, you might prefer to use an external log rotation
|
|
program if you have one that you are already using with other
|
|
server software. For example, the <application>rotatelogs</application>
|
|
tool included in the <productname>Apache</productname> distribution
|
|
can be used with <productname>PostgreSQL</productname>. One way to
|
|
do this is to pipe the server's
|
|
<systemitem>stderr</systemitem> output to the desired program.
|
|
If you start the server with
|
|
<command>pg_ctl</command>, then <systemitem>stderr</systemitem>
|
|
is already redirected to <systemitem>stdout</systemitem>, so you just need a
|
|
pipe command, for example:
|
|
|
|
<programlisting>
pg_ctl start | rotatelogs /var/log/pgsql_log 86400
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
You can combine these approaches by setting up <application>logrotate</application>
|
|
to collect log files produced by <productname>PostgreSQL</productname>'s built-in
|
|
logging collector. In this case, the logging collector defines the names and
|
|
location of the log files, while <application>logrotate</application>
|
|
periodically archives these files. When initiating log rotation,
|
|
<application>logrotate</application> must ensure that the application
|
|
sends further output to the new file. This is commonly done with a
|
|
<literal>postrotate</literal> script that sends a <literal>SIGHUP</literal>
|
|
signal to the application, which then reopens the log file.
|
|
In <productname>PostgreSQL</productname>, you can run <command>pg_ctl</command>
|
|
with the <literal>logrotate</literal> option instead. When the server receives
|
|
this command, the server either switches to a new log file or reopens the
|
|
existing file, depending on the logging configuration
|
|
(see <xref linkend="runtime-config-logging-where"/>).
|
|
</para>
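<para>
When the logging collector is running, a rotation can also be requested from
an SQL session, which may be convenient where shell access to the server is
not available. By default this function is restricted to superusers.
<programlisting>
-- Ask the logging collector to switch to a new log file now.
SELECT pg_rotate_logfile();
</programlisting>
</para>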
|
|
|
|
<note>
|
|
<para>
|
|
When using static log file names, the server might fail to reopen the log
|
|
file if the max open file limit is reached or a file table overflow occurs.
|
|
In this case, log messages are sent to the old log file until a
|
|
successful log rotation. If <application>logrotate</application> is
|
|
configured to compress the log file and delete it, the server may lose
|
|
the messages logged in this time frame. To avoid this issue, you can
|
|
configure the logging collector to dynamically assign log file names
|
|
and use a <literal>prerotate</literal> script to ignore open log files.
|
|
</para>
|
|
</note>
|
|
|
|
<para>
|
|
Another production-grade approach to managing log output is to
|
|
send it to <application>syslog</application> and let
|
|
<application>syslog</application> deal with file rotation. To do this, set the
|
|
configuration parameter <varname>log_destination</varname> to <literal>syslog</literal>
|
|
(to log to <application>syslog</application> only) in
|
|
<filename>postgresql.conf</filename>. Then you can send a <literal>SIGHUP</literal>
|
|
signal to the <application>syslog</application> daemon whenever you want to force it
|
|
to start writing a new log file. If you want to automate log
|
|
rotation, the <application>logrotate</application> program can be
|
|
configured to work with log files from
|
|
<application>syslog</application>.
|
|
</para>
|
|
|
|
<para>
|
|
On many systems, however, <application>syslog</application> is not very reliable,
|
|
particularly with large log messages; it might truncate or drop messages
|
|
just when you need them the most. Also, on <productname>Linux</productname>,
|
|
<application>syslog</application> will flush each message to disk, yielding poor
|
|
performance. (You can use a <quote><literal>-</literal></quote> at the start of the file name
|
|
in the <application>syslog</application> configuration file to disable syncing.)
|
|
</para>
|
|
|
|
<para>
|
|
Note that all the solutions described above take care of starting new
|
|
log files at configurable intervals, but they do not handle deletion
|
|
of old, no-longer-useful log files. You will probably want to set
|
|
up a batch job to periodically delete old log files. Another possibility
|
|
is to configure the rotation program so that old log files are overwritten
|
|
cyclically.
|
|
</para>
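<para>
With the built-in logging collector, cyclic overwriting can be sketched as
follows. This configuration keeps one file per day of the week and reuses
each file a week later; the file name pattern is an example, not a
requirement.
<programlisting>
-- One log file per weekday, e.g. postgresql-Mon.log.
ALTER SYSTEM SET log_filename = 'postgresql-%a.log';

-- Overwrite, rather than append to, an existing file of the same name
-- when a time-based rotation occurs.
ALTER SYSTEM SET log_truncate_on_rotation = on;

-- Rotate daily and disable size-based rotation.
ALTER SYSTEM SET log_rotation_age = '1d';
ALTER SYSTEM SET log_rotation_size = 0;

SELECT pg_reload_conf();
</programlisting>
</para>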
|
|
|
|
<para>
|
|
<ulink url="https://pgbadger.darold.net/"><productname>pgBadger</productname></ulink>
|
|
is an external project that does sophisticated log file analysis.
|
|
<ulink
|
|
url="https://bucardo.org/check_postgres/"><productname>check_postgres</productname></ulink>
|
|
provides Nagios alerts when important messages appear in the log
|
|
files, as well as detection of many other extraordinary conditions.
|
|
</para>
|
|
</sect1>
|
|
</chapter>
|