<!-- doc/src/sgml/config.sgml -->

<chapter id="runtime-config">
  <title>Server Configuration</title>

  <indexterm>
   <primary>configuration</primary>
   <secondary>of the server</secondary>
  </indexterm>

  <para>
   Many configuration parameters affect the behavior of
   the database system. The first section of this chapter
   describes how to interact with configuration parameters. The subsequent sections
   discuss each parameter in detail.
  </para>

  <sect1 id="config-setting">
   <title>Setting Parameters</title>

   <sect2 id="config-setting-names-values">
    <title>Parameter Names and Values</title>

    <para>
     All parameter names are case-insensitive. Every parameter takes a
     value of one of five types: boolean, string, integer, floating point,
     or enumerated (enum). The type determines the syntax for setting the
     parameter:
    </para>

    <itemizedlist>
     <listitem>
      <para>
       <emphasis>Boolean:</emphasis>
       Values can be written as
       <literal>on</literal>,
       <literal>off</literal>,
       <literal>true</literal>,
       <literal>false</literal>,
       <literal>yes</literal>,
       <literal>no</literal>,
       <literal>1</literal>,
       <literal>0</literal>
       (all case-insensitive) or any unambiguous prefix of one of these.
      </para>
     </listitem>

     <listitem>
      <para>
       <emphasis>String:</emphasis>
       In general, enclose the value in single quotes, doubling any single
       quotes within the value. However, quotes can usually be omitted if
       the value is a simple number or identifier.
       (Values that match an SQL keyword require quoting in some contexts.)
      </para>
     </listitem>

     <listitem>
      <para>
       <emphasis>Numeric (integer and floating point):</emphasis>
       Numeric parameters can be specified in the customary integer and
       floating-point formats; fractional values are rounded to the nearest
       integer if the parameter is of integer type. Integer parameters
       additionally accept hexadecimal input (beginning
       with <literal>0x</literal>) and octal input (beginning
       with <literal>0</literal>), but these formats cannot have a fraction.
       Do not use thousands separators.
       Quotes are not required, except for hexadecimal input.
      </para>
     </listitem>

     <listitem>
      <para>
       <emphasis>Numeric with Unit:</emphasis>
       Some numeric parameters have an implicit unit, because they describe
       quantities of memory or time. The unit might be bytes, kilobytes, blocks
       (typically eight kilobytes), milliseconds, seconds, or minutes.
       An unadorned numeric value for one of these settings will use the
       setting's default unit, which can be learned from
       <structname>pg_settings</structname>.<structfield>unit</structfield>.
       For convenience, settings can be given with a unit specified explicitly,
       for example <literal>'120 ms'</literal> for a time value, and they will be
       converted to whatever the parameter's actual unit is. Note that the
       value must be written as a string (with quotes) to use this feature.
       The unit name is case-sensitive, and there can be whitespace between
       the numeric value and the unit.

       <itemizedlist>
        <listitem>
         <para>
          Valid memory units are <literal>B</literal> (bytes),
          <literal>kB</literal> (kilobytes),
          <literal>MB</literal> (megabytes), <literal>GB</literal>
          (gigabytes), and <literal>TB</literal> (terabytes).
          The multiplier for memory units is 1024, not 1000.
         </para>
        </listitem>

        <listitem>
         <para>
          Valid time units are
          <literal>us</literal> (microseconds),
          <literal>ms</literal> (milliseconds),
          <literal>s</literal> (seconds), <literal>min</literal> (minutes),
          <literal>h</literal> (hours), and <literal>d</literal> (days).
         </para>
        </listitem>
       </itemizedlist>

       If a fractional value is specified with a unit, it will be rounded
       to a multiple of the next smaller unit if there is one.
       For example, <literal>30.1 GB</literal> will be converted
       to <literal>30822 MB</literal> not <literal>32319628902 B</literal>.
       If the parameter is of integer type, a final rounding to integer
       occurs after any unit conversion.
      </para>
     </listitem>

     <listitem>
      <para>
       <emphasis>Enumerated:</emphasis>
       Enumerated-type parameters are written in the same way as string
       parameters, but are restricted to have one of a limited set of
       values. The values allowable for such a parameter can be found from
       <structname>pg_settings</structname>.<structfield>enumvals</structfield>.
       Enum parameter values are case-insensitive.
      </para>
     </listitem>
    </itemizedlist>
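
    <para>
     As an illustrative sketch of these rules, the unit and the allowed
     values of a parameter can be looked up, and a value with an explicit
     unit supplied, roughly as follows. The parameters
     <varname>work_mem</varname>, <varname>statement_timeout</varname>, and
     <varname>wal_level</varname> are chosen only as examples, and a session
     that is permitted to change the first two is assumed:
<programlisting>
-- Look up the value type, the implicit unit and, for enum parameters,
-- the allowed values (the three names are only examples).
SELECT name, vartype, unit, enumvals
  FROM pg_settings
 WHERE name IN ('work_mem', 'statement_timeout', 'wal_level');

-- A value with an explicit unit must be quoted; it is converted to the
-- parameter's own unit.
SET work_mem = '64MB';
SHOW work_mem;

-- An unadorned number is interpreted in the parameter's default unit.
SET statement_timeout = 120000;
SHOW statement_timeout;
</programlisting>
     Assuming the usual kilobyte unit for <varname>work_mem</varname>, the
     quoted value <literal>'64MB'</literal> corresponds to 65536 kB, because
     the multiplier for memory units is 1024.
    </para>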
   </sect2>

   <sect2 id="config-setting-configuration-file">
    <title>Parameter Interaction via the Configuration File</title>

    <para>
     The most fundamental way to set these parameters is to edit the file
     <filename>postgresql.conf</filename><indexterm><primary>postgresql.conf</primary></indexterm>,
     which is normally kept in the data directory. A default copy is
     installed when the database cluster directory is initialized.
     An example of what this file might look like is:
<programlisting>
# This is a comment
log_connections = yes
log_destination = 'syslog'
search_path = '"$user", public'
shared_buffers = 128MB
</programlisting>
     One parameter is specified per line. The equal sign between name and
     value is optional. Whitespace is insignificant (except within a quoted
     parameter value) and blank lines are
     ignored. Hash marks (<literal>#</literal>) designate the remainder
     of the line as a comment. Parameter values that are not simple
     identifiers or numbers must be single-quoted. To embed a single
     quote in a parameter value, write either two quotes (preferred)
     or backslash-quote.
     If the file contains multiple entries for the same parameter,
     all but the last one are ignored.
    </para>

    <para>
     Parameters set in this way provide default values for the cluster.
     The settings seen by active sessions will be these values unless they
     are overridden. The following sections describe ways in which the
     administrator or user can override these defaults.
    </para>

    <para>
     <indexterm>
      <primary>SIGHUP</primary>
     </indexterm>
     The configuration file is reread whenever the main server process
     receives a <systemitem>SIGHUP</systemitem> signal; this signal is most easily
     sent by running <literal>pg_ctl reload</literal> from the command line or by
     calling the SQL function <function>pg_reload_conf()</function>. The main
     server process also propagates this signal to all currently running
     server processes, so that existing sessions also adopt the new values
     (this will happen after they complete any currently-executing client
     command). Alternatively, you can
     send the signal to a single server process directly. Some parameters
     can only be set at server start; any changes to their entries in the
     configuration file will be ignored until the server is restarted.
     Invalid parameter settings in the configuration file are likewise
     ignored (but logged) during <systemitem>SIGHUP</systemitem> processing.
    </para>
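
    <para>
     One possible way to trigger a reload from SQL and then look for
     changes that still await a server restart is sketched below. It
     assumes a role that is allowed to call
     <function>pg_reload_conf()</function> and a release in which the
     <structname>pg_settings</structname> view provides the
     <structfield>pending_restart</structfield> column:
<programlisting>
-- Ask the main server process to reread the configuration files
-- (the SQL counterpart of running pg_ctl reload).
SELECT pg_reload_conf();

-- Parameters whose context is 'postmaster' can only be set at server
-- start. pending_restart is true when the file holds a changed value
-- that will not take effect until the server is restarted.
SELECT name, setting, context, pending_restart
  FROM pg_settings
 WHERE context = 'postmaster' AND pending_restart;
</programlisting>
     An empty result from the second query suggests that no start-only
     parameter has been changed in the files since the server was started.
    </para>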

    <para>
     In addition to <filename>postgresql.conf</filename>,
     a <productname>PostgreSQL</productname> data directory contains a file
     <filename>postgresql.auto.conf</filename><indexterm><primary>postgresql.auto.conf</primary></indexterm>,
     which has the same format as <filename>postgresql.conf</filename> but
     is intended to be edited automatically, not manually. This file holds
     settings provided through the <link linkend="sql-altersystem"><command>ALTER SYSTEM</command></link> command.
     This file is read whenever <filename>postgresql.conf</filename> is,
     and its settings take effect in the same way. Settings
     in <filename>postgresql.auto.conf</filename> override those
     in <filename>postgresql.conf</filename>.
    </para>
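
    <para>
     A minimal sketch of how an entry in
     <filename>postgresql.auto.conf</filename> might be created and removed
     again is shown here. The parameter
     <varname>log_min_duration_statement</varname> and the value are only
     examples, and a role with sufficient privileges to run
     <command>ALTER SYSTEM</command> is assumed:
<programlisting>
-- Record a cluster-wide default in postgresql.auto.conf ...
ALTER SYSTEM SET log_min_duration_statement = '250ms';

-- ... and have the running server reread its configuration files.
SELECT pg_reload_conf();

-- Remove the entry again, so that the value from postgresql.conf
-- (or the built-in default) applies after the next reload.
ALTER SYSTEM RESET log_min_duration_statement;
SELECT pg_reload_conf();
</programlisting>
     Note that <command>ALTER SYSTEM</command> only rewrites the file; the
     new value is not used until the configuration is reread or, for
     start-only parameters, until the server is restarted.
    </para>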

    <para>
     External tools may also
     modify <filename>postgresql.auto.conf</filename>. It is not
     recommended to do this while the server is running, since a
     concurrent <command>ALTER SYSTEM</command> command could overwrite
     such changes. Such tools might simply append new settings to the end,
     or they might choose to remove duplicate settings and/or comments
     (as <command>ALTER SYSTEM</command> will).
    </para>

    <para>
     The system view
     <link linkend="view-pg-file-settings"><structname>pg_file_settings</structname></link>
     can be helpful for pre-testing changes to the configuration files, or for
     diagnosing problems if a <systemitem>SIGHUP</systemitem> signal did not have the
     desired effects.
    </para>
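
    <para>
     For instance, a query along the following lines could be run after
     editing the files but before reloading. It is only a sketch, and it
     assumes a role that is permitted to read
     <structname>pg_file_settings</structname>:
<programlisting>
-- List file entries that would be rejected (error is set) or that are
-- superseded by a later entry for the same parameter (applied is false).
SELECT sourcefile, sourceline, name, setting, applied, error
  FROM pg_file_settings
 WHERE error IS NOT NULL OR NOT applied
 ORDER BY sourcefile, sourceline;
</programlisting>
     Rows with <structfield>applied</structfield> false and no error
     typically indicate duplicate entries, of which all but the last are
     ignored.
    </para>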
   </sect2>

   <sect2 id="config-setting-sql-command-interaction">
    <title>Parameter Interaction via SQL</title>

    <para>
     <productname>PostgreSQL</productname> provides three SQL
     commands to establish configuration defaults.
     The already-mentioned <command>ALTER SYSTEM</command> command
     provides an SQL-accessible means of changing global defaults; it is
     functionally equivalent to editing <filename>postgresql.conf</filename>.
     In addition, there are two commands that allow setting of defaults
     on a per-database or per-role basis:
    </para>

    <itemizedlist>
     <listitem>
      <para>
       The <link linkend="sql-alterdatabase"><command>ALTER DATABASE</command></link> command allows global
       settings to be overridden on a per-database basis.
      </para>
     </listitem>

     <listitem>
      <para>
       The <link linkend="sql-alterrole"><command>ALTER ROLE</command></link> command allows both global and
       per-database settings to be overridden with user-specific values.
      </para>
     </listitem>
    </itemizedlist>
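
    <para>
     The following sketch shows what such overrides might look like. The
     database name <literal>salesdb</literal>, the role name
     <literal>reporting</literal>, and the chosen parameters and values are
     hypothetical and serve only as an illustration:
<programlisting>
-- Per-database default: sessions connecting to salesdb start with it.
ALTER DATABASE salesdb SET work_mem = '32MB';

-- Per-role default: sessions of role reporting start with it.
ALTER ROLE reporting SET statement_timeout = '5min';

-- Default for one role in one database only.
ALTER ROLE reporting IN DATABASE salesdb SET work_mem = '256MB';

-- Remove a per-role default again.
ALTER ROLE reporting RESET statement_timeout;
</programlisting>
     Sessions that are already connected keep their current values; the
     new defaults are picked up by sessions started afterwards.
    </para>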

    <para>
     Values set with <command>ALTER DATABASE</command> and <command>ALTER ROLE</command>
     are applied only when starting a fresh database session. They
     override values obtained from the configuration files or server
     command line, and constitute defaults for the rest of the session.
     Note that some settings cannot be changed after server start, and
     so cannot be set with these commands (or the ones listed below).
    </para>

    <para>
     Once a client is connected to the database, <productname>PostgreSQL</productname>
     provides two additional SQL commands (and equivalent functions) to
     interact with session-local configuration settings:
    </para>

    <itemizedlist>
     <listitem>
      <para>
       The <link linkend="sql-show"><command>SHOW</command></link> command allows inspection of the
       current value of any parameter. The corresponding SQL function is
       <function>current_setting(setting_name text)</function>
       (see <xref linkend="functions-admin-set"/>).
      </para>
     </listitem>

     <listitem>
      <para>
       The <link linkend="sql-set"><command>SET</command></link> command allows modification of the
       current value of those parameters that can be set locally to a
       session; it has no effect on other sessions.
       The corresponding SQL function is
       <function>set_config(setting_name, new_value, is_local)</function>
       (see <xref linkend="functions-admin-set"/>).
      </para>
     </listitem>
    </itemizedlist>
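
    <para>
     As a brief sketch of how the commands and functions correspond
     (<varname>work_mem</varname> and the values are only examples):
<programlisting>
-- Inspect the current value, as a command and as a function call.
SHOW work_mem;
SELECT current_setting('work_mem');

-- Change the value for the rest of the session, as a command and as a
-- function call (is_local = false).
SET work_mem = '64MB';
SELECT set_config('work_mem', '64MB', false);

-- With is_local = true the new value lasts only until the end of the
-- current transaction.
BEGIN;
SELECT set_config('work_mem', '128MB', true);
COMMIT;
</programlisting>
     The function forms can be convenient when the parameter name or value
     is computed by a query rather than written literally.
    </para>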

    <para>
     In addition, the system view <link
     linkend="view-pg-settings"><structname>pg_settings</structname></link> can be
     used to view and change session-local values:
    </para>
|
2013-12-18 15:42:44 +01:00
|
|
|
|
2014-09-11 02:50:15 +02:00
|
|
|
<itemizedlist>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Querying this view is similar to using <command>SHOW ALL</command> but
|
2014-12-15 00:09:51 +01:00
|
|
|
provides more detail. It is also more flexible, since it's possible
|
|
|
|
to specify filter conditions or join against other relations.
|
2014-09-11 02:50:15 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
|
2014-12-15 00:09:51 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2020-10-03 16:16:51 +02:00
|
|
|
Using <command>UPDATE</command> on this view, specifically
|
2017-10-09 03:44:17 +02:00
|
|
|
updating the <structname>setting</structname> column, is the equivalent
|
|
|
|
of issuing <command>SET</command> commands. For example, the equivalent of
|
2014-09-11 02:50:15 +02:00
|
|
|
<programlisting>
|
|
|
|
SET configuration_parameter TO DEFAULT;
|
2014-12-15 00:09:51 +01:00
|
|
|
</programlisting>
|
|
|
|
is:
|
2014-09-11 02:50:15 +02:00
|
|
|
<programlisting>
|
|
|
|
UPDATE pg_settings SET setting = reset_val WHERE name = 'configuration_parameter';
|
|
|
|
</programlisting>
|
2014-12-15 00:09:51 +01:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</itemizedlist>
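<para>
The flexibility of querying <structname>pg_settings</structname> can be
illustrated with a filtered query. This is only a sketch; the
<literal>log%</literal> pattern is an arbitrary choice, and any other
filter condition could be used instead:
<programlisting>
-- List logging-related parameters with their current value, unit,
-- the context in which they can be changed, and where the value came from.
SELECT name, setting, unit, context, source
  FROM pg_settings
 WHERE name LIKE 'log%'
 ORDER BY name;
</programlisting>
</para>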
|
2014-09-11 02:50:15 +02:00
|
|
|
|
2012-05-11 05:01:28 +02:00
|
|
|
</sect2>
|
|
|
|
|
2014-09-11 02:50:15 +02:00
|
|
|
<sect2>
|
2014-12-15 00:09:51 +01:00
|
|
|
<title>Parameter Interaction via the Shell</title>
|
|
|
|
|
2014-09-11 02:50:15 +02:00
|
|
|
<para>
|
|
|
|
In addition to setting global defaults or attaching
|
|
|
|
overrides at the database or role level, you can pass settings to
|
|
|
|
<productname>PostgreSQL</productname> via shell facilities.
|
2017-10-09 03:44:17 +02:00
|
|
|
Both the server and the <application>libpq</application> client library
|
2014-09-11 02:50:15 +02:00
|
|
|
accept parameter values via the shell.
|
|
|
|
</para>
|
2012-05-11 05:01:28 +02:00
|
|
|
|
2014-09-11 02:50:15 +02:00
|
|
|
<itemizedlist>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2014-12-15 00:09:51 +01:00
|
|
|
During server startup, parameter settings can be
|
|
|
|
passed to the <command>postgres</command> command via the
|
2017-10-09 03:44:17 +02:00
|
|
|
<option>-c</option> command-line parameter. For example,
|
2014-09-11 02:50:15 +02:00
|
|
|
<programlisting>
|
|
|
|
postgres -c log_connections=yes -c log_destination='syslog'
|
|
|
|
</programlisting>
|
2014-12-15 00:09:51 +01:00
|
|
|
Settings provided in this way override those set via
|
2017-10-09 03:44:17 +02:00
|
|
|
<filename>postgresql.conf</filename> or <command>ALTER SYSTEM</command>,
|
2014-12-15 00:09:51 +01:00
|
|
|
so they cannot be changed globally without restarting the server.
|
2014-09-11 02:50:15 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
2012-05-11 05:01:28 +02:00
|
|
|
|
2014-09-11 02:50:15 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
When starting a client session via <application>libpq</application>,
|
2014-12-15 00:09:51 +01:00
|
|
|
parameter settings can be
|
2014-09-11 02:50:15 +02:00
|
|
|
specified using the <envar>PGOPTIONS</envar> environment variable.
|
2014-12-15 00:09:51 +01:00
|
|
|
Settings established in this way constitute defaults for the life
|
|
|
|
of the session, but do not affect other sessions.
|
|
|
|
For historical reasons, the format of <envar>PGOPTIONS</envar> is
|
|
|
|
similar to that used when launching the <command>postgres</command>
|
2017-10-09 03:44:17 +02:00
|
|
|
command; specifically, the <option>-c</option> flag must be specified.
|
2014-12-15 00:09:51 +01:00
|
|
|
For example,
|
2014-09-11 02:50:15 +02:00
|
|
|
<programlisting>
|
2014-12-15 00:09:51 +01:00
|
|
|
env PGOPTIONS="-c geqo=off -c statement_timeout=5min" psql
|
2014-09-11 02:50:15 +02:00
|
|
|
</programlisting>
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Other clients and libraries might provide their own mechanisms,
|
|
|
|
via the shell or otherwise, that allow the user to alter session
|
2014-12-15 00:09:51 +01:00
|
|
|
settings without direct use of SQL commands.
|
2014-09-11 02:50:15 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</itemizedlist>
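<para>
To check from within a session whether a value was supplied through one
of these shell mechanisms, the <structfield>source</structfield> column of
<structname>pg_settings</structname> can be inspected. The sketch below
assumes the parameter names used in the examples above; the exact
<structfield>source</structfield> text reported can vary, but values passed
with <option>-c</option> at server start are typically reported as
<literal>command line</literal> and values from <envar>PGOPTIONS</envar>
as <literal>client</literal>:
<programlisting>
-- Show where the current value of each parameter originated.
SELECT name, setting, source
  FROM pg_settings
 WHERE name IN ('log_connections', 'log_destination',
                'geqo', 'statement_timeout');
</programlisting>
</para>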
|
2012-05-11 05:01:28 +02:00
|
|
|
|
|
|
|
</sect2>
|
2012-09-24 16:55:53 +02:00
|
|
|
|
|
|
|
<sect2 id="config-includes">
|
2014-12-15 00:09:51 +01:00
|
|
|
<title>Managing Configuration File Contents</title>
|
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<productname>PostgreSQL</productname> provides several features for breaking
|
|
|
|
down complex <filename>postgresql.conf</filename> files into sub-files.
|
2014-12-15 00:09:51 +01:00
|
|
|
These features are especially useful when managing multiple servers
|
|
|
|
with related, but not identical, configurations.
|
|
|
|
</para>
|
2012-09-24 16:55:53 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><literal>include</literal></primary>
|
2012-09-24 16:55:53 +02:00
|
|
|
<secondary>in configuration file</secondary>
|
2014-12-15 00:09:51 +01:00
|
|
|
</indexterm>
|
|
|
|
In addition to individual parameter settings,
|
2017-10-09 03:44:17 +02:00
|
|
|
the <filename>postgresql.conf</filename> file can contain <firstterm>include
|
|
|
|
directives</firstterm>, which specify another file to read and process as if
|
2014-12-15 00:09:51 +01:00
|
|
|
it were inserted into the configuration file at this point. This
|
|
|
|
feature allows a configuration file to be divided into physically
|
|
|
|
separate parts. Include directives simply look like:
|
2012-09-24 16:55:53 +02:00
|
|
|
<programlisting>
|
|
|
|
include 'filename'
|
|
|
|
</programlisting>
|
2014-12-15 00:09:51 +01:00
|
|
|
If the file name is not an absolute path, it is taken as relative to
|
|
|
|
the directory containing the referencing configuration file.
|
|
|
|
Inclusions can be nested.
|
2012-09-24 16:55:53 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><literal>include_if_exists</literal></primary>
|
2012-09-24 16:55:53 +02:00
|
|
|
<secondary>in configuration file</secondary>
|
|
|
|
</indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
There is also an <literal>include_if_exists</literal> directive, which acts
|
|
|
|
the same as the <literal>include</literal> directive, except
|
2014-12-15 00:09:51 +01:00
|
|
|
when the referenced file does not exist or cannot be read. A regular
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>include</literal> will consider this an error condition, but
|
|
|
|
<literal>include_if_exists</literal> merely logs a message and continues
|
2014-12-15 00:09:51 +01:00
|
|
|
processing the referencing configuration file.
|
2012-09-24 16:55:53 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><literal>include_dir</literal></primary>
|
2012-09-24 16:55:53 +02:00
|
|
|
<secondary>in configuration file</secondary>
|
|
|
|
</indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
The <filename>postgresql.conf</filename> file can also contain
|
2014-12-15 00:09:51 +01:00
|
|
|
<literal>include_dir</literal> directives, which specify an entire
|
|
|
|
directory of configuration files to include. These look like:
|
2014-07-08 17:39:07 +02:00
|
|
|
<programlisting>
|
|
|
|
include_dir 'directory'
|
|
|
|
</programlisting>
|
2014-12-15 00:09:51 +01:00
|
|
|
Non-absolute directory names are taken as relative to the directory
|
|
|
|
containing the referencing configuration file. Within the specified
|
|
|
|
directory, only non-directory files whose names end with the
|
|
|
|
suffix <literal>.conf</literal> will be included. File names that
|
|
|
|
start with the <literal>.</literal> character are also ignored, to
|
|
|
|
prevent mistakes since such files are hidden on some platforms. Multiple
|
|
|
|
files within an include directory are processed in file name order
|
2020-09-01 00:33:37 +02:00
|
|
|
(according to C locale rules, i.e., numbers before letters, and
|
2014-12-15 00:09:51 +01:00
|
|
|
uppercase letters before lowercase ones).
|
2012-09-24 16:55:53 +02:00
|
|
|
</para>
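<para>
The C-locale ordering can be previewed with a small query. This is a
sketch only; the file names are hypothetical, and the query merely sorts
strings the same way rather than reading any directory:
<programlisting>
-- Sort hypothetical file names using C locale rules.
SELECT f AS file_name
  FROM (VALUES ('10-b.conf'), ('Z.conf'), ('a.conf'), ('02-a.conf')) AS t(f)
 ORDER BY f COLLATE "C";
</programlisting>
The rows come back as <literal>02-a.conf</literal>,
<literal>10-b.conf</literal>, <literal>Z.conf</literal>,
<literal>a.conf</literal>: digits first, then uppercase letters, then
lowercase letters.
</para>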
|
|
|
|
|
|
|
|
<para>
|
2014-12-15 00:09:51 +01:00
|
|
|
Include files or directories can be used to logically separate portions
|
|
|
|
of the database configuration, rather than having a single large
|
2017-10-09 03:44:17 +02:00
|
|
|
<filename>postgresql.conf</filename> file. Consider a company that has two
|
2014-12-15 00:09:51 +01:00
|
|
|
database servers, each with a different amount of memory. There are
|
|
|
|
likely elements of the configuration both will share, for things such
|
|
|
|
as logging. But memory-related parameters on the server will vary
|
|
|
|
between the two. And there might be server-specific customizations,
|
|
|
|
too. One way to manage this situation is to break the custom
|
|
|
|
configuration changes for your site into three files. You could add
|
2017-10-09 03:44:17 +02:00
|
|
|
this to the end of your <filename>postgresql.conf</filename> file to include
|
2014-12-15 00:09:51 +01:00
|
|
|
them:
|
2014-07-08 17:39:07 +02:00
|
|
|
<programlisting>
|
|
|
|
include 'shared.conf'
|
|
|
|
include 'memory.conf'
|
|
|
|
include 'server.conf'
|
|
|
|
</programlisting>
|
2017-10-09 03:44:17 +02:00
|
|
|
All systems would have the same <filename>shared.conf</filename>. Each
|
2014-12-15 00:09:51 +01:00
|
|
|
server with a particular amount of memory could share the
|
2017-10-09 03:44:17 +02:00
|
|
|
same <filename>memory.conf</filename>; you might have one for all servers
|
2014-12-15 00:09:51 +01:00
|
|
|
with 8GB of RAM, another for those having 16GB. And
|
2017-10-09 03:44:17 +02:00
|
|
|
finally <filename>server.conf</filename> could have truly server-specific
|
2014-12-15 00:09:51 +01:00
|
|
|
configuration information in it.
|
2012-09-24 16:55:53 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2014-12-15 00:09:51 +01:00
|
|
|
Another possibility is to create a configuration file directory and
|
2017-10-09 03:44:17 +02:00
|
|
|
put this information into files there. For example, a <filename>conf.d</filename>
|
|
|
|
directory could be referenced at the end of <filename>postgresql.conf</filename>:
|
2014-07-08 17:39:07 +02:00
|
|
|
<programlisting>
|
|
|
|
include_dir 'conf.d'
|
|
|
|
</programlisting>
|
2017-10-09 03:44:17 +02:00
|
|
|
Then you could name the files in the <filename>conf.d</filename> directory
|
2014-12-15 00:09:51 +01:00
|
|
|
like this:
|
2014-07-08 17:39:07 +02:00
|
|
|
<programlisting>
|
|
|
|
00shared.conf
|
|
|
|
01memory.conf
|
|
|
|
02server.conf
|
|
|
|
</programlisting>
|
2014-12-15 00:09:51 +01:00
|
|
|
This naming convention establishes a clear order in which these
|
|
|
|
files will be loaded. This is important because only the last
|
|
|
|
setting encountered for a particular parameter while the server is
|
|
|
|
reading configuration files will be used. In this example,
|
2017-10-09 03:44:17 +02:00
|
|
|
something set in <filename>conf.d/02server.conf</filename> would override a
|
|
|
|
value set in <filename>conf.d/01memory.conf</filename>.
|
2012-09-24 16:55:53 +02:00
|
|
|
</para>
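<para>
One way to see which of several competing entries takes effect is to
query the <structname>pg_file_settings</structname> view, which by default
is readable only by superusers. The following is a sketch that assumes the
<filename>conf.d</filename> layout from the example above:
<programlisting>
-- List every entry read from the configuration files in processing order.
-- For a parameter set in more than one file, only the last entry read
-- is marked as applied.
SELECT sourcefile, sourceline, seqno, name, setting, applied
  FROM pg_file_settings
 ORDER BY seqno;
</programlisting>
</para>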
|
|
|
|
|
|
|
|
<para>
|
2014-12-15 00:09:51 +01:00
|
|
|
You might instead use this approach to naming the files
|
|
|
|
descriptively:
|
2014-07-08 17:39:07 +02:00
|
|
|
<programlisting>
|
|
|
|
00shared.conf
|
|
|
|
01memory-8GB.conf
|
|
|
|
02server-foo.conf
|
|
|
|
</programlisting>
|
2014-12-15 00:09:51 +01:00
|
|
|
This sort of arrangement gives a unique name for each configuration file
|
|
|
|
variation. This can help eliminate ambiguity when several servers have
|
|
|
|
their configurations all stored in one place, such as in a version
|
|
|
|
control repository. (Storing database configuration files under version
|
|
|
|
control is another good practice to consider.)
|
2012-09-24 16:55:53 +02:00
|
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
2005-09-13 00:11:38 +02:00
|
|
|
|
|
|
|
<sect1 id="runtime-config-file-locations">
|
|
|
|
<title>File Locations</title>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
In addition to the <filename>postgresql.conf</filename> file
|
|
|
|
already mentioned, <productname>PostgreSQL</productname> uses
|
|
|
|
two other manually edited configuration files, which control
|
|
|
|
client authentication (their use is discussed in <xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="client-authentication"/>). By default, all three
|
2005-09-13 00:11:38 +02:00
|
|
|
configuration files are stored in the database cluster's data
|
2006-01-23 19:16:41 +01:00
|
|
|
directory. The parameters described in this section allow the
|
2005-09-13 00:11:38 +02:00
|
|
|
configuration files to be placed elsewhere. (Doing so can ease
|
|
|
|
administration. In particular, it is often easier to ensure that
|
|
|
|
the configuration files are properly backed up when they are
|
|
|
|
kept separate.)
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry id="guc-data-directory" xreflabel="data_directory">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>data_directory</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>data_directory</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the directory to use for data storage.
|
2006-01-23 19:16:41 +01:00
|
|
|
This parameter can only be set at server start.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-config-file" xreflabel="config_file">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>config_file</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>config_file</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the main server configuration file
|
2017-10-09 03:44:17 +02:00
|
|
|
(customarily called <filename>postgresql.conf</filename>).
|
2006-10-23 20:10:32 +02:00
|
|
|
This parameter can only be set on the <command>postgres</command> command line.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-hba-file" xreflabel="hba_file">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>hba_file</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>hba_file</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the configuration file for host-based authentication
|
2017-10-09 03:44:17 +02:00
|
|
|
(customarily called <filename>pg_hba.conf</filename>).
|
2006-01-23 19:16:41 +01:00
|
|
|
This parameter can only be set at server start.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-ident-file" xreflabel="ident_file">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>ident_file</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>ident_file</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-03-19 04:43:47 +01:00
|
|
|
Specifies the configuration file for user name mapping
|
2017-10-09 03:44:17 +02:00
|
|
|
(customarily called <filename>pg_ident.conf</filename>).
|
2006-01-23 19:16:41 +01:00
|
|
|
This parameter can only be set at server start.
|
2017-11-23 15:39:47 +01:00
|
|
|
See also <xref linkend="auth-username-maps"/>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-external-pid-file" xreflabel="external_pid_file">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>external_pid_file</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>external_pid_file</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2011-06-18 23:37:30 +02:00
|
|
|
Specifies the name of an additional process-ID (PID) file that the
|
2006-06-18 17:38:37 +02:00
|
|
|
server should create for use by server administration programs.
|
2006-01-23 19:16:41 +01:00
|
|
|
This parameter can only be set at server start.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
|
|
|
|
|
|
<para>
|
2006-01-23 19:16:41 +01:00
|
|
|
In a default installation, none of the above parameters are set
|
|
|
|
explicitly. Instead, the
|
2005-09-13 00:11:38 +02:00
|
|
|
data directory is specified by the <option>-D</option> command-line
|
|
|
|
option or the <envar>PGDATA</envar> environment variable, and the
|
|
|
|
configuration files are all found within the data directory.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
If you wish to keep the configuration files elsewhere than the
|
2006-10-23 20:10:32 +02:00
|
|
|
data directory, the <command>postgres</command> <option>-D</option>
|
2005-09-13 00:11:38 +02:00
|
|
|
command-line option or <envar>PGDATA</envar> environment variable
|
|
|
|
must point to the directory containing the configuration files,
|
2017-10-09 03:44:17 +02:00
|
|
|
and the <varname>data_directory</varname> parameter must be set in
|
2005-09-13 00:11:38 +02:00
|
|
|
<filename>postgresql.conf</filename> (or on the command line) to show
|
|
|
|
where the data directory is actually located. Notice that
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>data_directory</varname> overrides <option>-D</option> and
|
2005-09-13 00:11:38 +02:00
|
|
|
<envar>PGDATA</envar> for the location
|
|
|
|
of the data directory, but not for the location of the configuration
|
|
|
|
files.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
If you wish, you can specify the configuration file names and locations
|
2017-10-09 03:44:17 +02:00
|
|
|
individually using the parameters <varname>config_file</varname>,
|
|
|
|
<varname>hba_file</varname> and/or <varname>ident_file</varname>.
|
|
|
|
<varname>config_file</varname> can only be specified on the
|
2006-06-18 17:38:37 +02:00
|
|
|
<command>postgres</command> command line, but the others can be
|
2006-01-23 19:16:41 +01:00
|
|
|
set within the main configuration file. If all three parameters plus
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>data_directory</varname> are explicitly set, then it is not necessary
|
2005-09-13 00:11:38 +02:00
|
|
|
to specify <option>-D</option> or <envar>PGDATA</envar>.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2006-01-23 19:16:41 +01:00
|
|
|
When setting any of these parameters, a relative path will be interpreted
|
2006-06-18 17:38:37 +02:00
|
|
|
with respect to the directory in which <command>postgres</command>
|
2005-09-13 00:11:38 +02:00
|
|
|
is started.
|
|
|
|
</para>
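<para>
The locations actually in use by a running server can be confirmed from
SQL. This sketch assumes a role that is allowed to read these settings,
since they are normally visible only to superusers and suitably
privileged roles:
<programlisting>
-- Show the file-location parameters and where each value was set.
SELECT name, setting, source
  FROM pg_settings
 WHERE category = 'File Locations'
 ORDER BY name;
</programlisting>
</para>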
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="runtime-config-connection">
|
|
|
|
<title>Connections and Authentication</title>
|
|
|
|
|
|
|
|
<sect2 id="runtime-config-connection-settings">
|
|
|
|
<title>Connection Settings</title>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
|
|
|
|
<varlistentry id="guc-listen-addresses" xreflabel="listen_addresses">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>listen_addresses</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>listen_addresses</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the TCP/IP address(es) on which the server is
|
2007-06-03 19:08:34 +02:00
|
|
|
to listen for connections from client applications.
|
2005-09-13 00:11:38 +02:00
|
|
|
The value takes the form of a comma-separated list of host names
|
2017-10-09 03:44:17 +02:00
|
|
|
and/or numeric IP addresses. The special entry <literal>*</literal>
|
2011-03-11 16:31:25 +01:00
|
|
|
corresponds to all available IP interfaces. The entry
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>0.0.0.0</literal> allows listening for all IPv4 addresses and
|
|
|
|
<literal>::</literal> allows listening for all IPv6 addresses.
|
2005-09-13 00:11:38 +02:00
|
|
|
If the list is empty, the server does not listen on any IP interface
|
|
|
|
at all, in which case only Unix-domain sockets can be used to connect
|
|
|
|
to it.
|
2017-10-09 03:44:17 +02:00
|
|
|
The default value is <systemitem class="systemname">localhost</systemitem>,
|
|
|
|
which allows only local TCP/IP <quote>loopback</quote> connections to be
|
2009-10-04 01:10:47 +02:00
|
|
|
made. While client authentication (<xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="client-authentication"/>) allows fine-grained control
|
2009-10-04 01:10:47 +02:00
|
|
|
over who can access the server, <varname>listen_addresses</varname>
|
|
|
|
controls which interfaces accept connection attempts, which
|
|
|
|
can help prevent repeated malicious connection requests on
|
|
|
|
insecure network interfaces. This parameter can only be set
|
|
|
|
at server start.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
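<para>
As a small sketch, the current value can be checked from any session.
The <structfield>context</structfield> column reports
<literal>postmaster</literal> for parameters that, like this one, can
only be set at server start:
<programlisting>
-- Display the addresses the server was told to listen on.
SELECT name, setting, context
  FROM pg_settings
 WHERE name = 'listen_addresses';
</programlisting>
</para>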
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-port" xreflabel="port">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>port</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>port</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
The TCP port the server listens on; 5432 by default. Note that the
|
|
|
|
same port number is used for all IP addresses the server listens on.
|
|
|
|
This parameter can only be set at server start.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-max-connections" xreflabel="max_connections">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>max_connections</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>max_connections</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Determines the maximum number of concurrent connections to the
|
2007-01-20 22:30:26 +01:00
|
|
|
database server. The default is typically 100 connections, but
|
2007-01-31 21:56:20 +01:00
|
|
|
might be less if your kernel settings will not support it (as
|
2017-10-09 03:44:17 +02:00
|
|
|
determined during <application>initdb</application>). This parameter can
|
2007-01-20 22:30:26 +01:00
|
|
|
only be set at server start.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
2009-12-19 02:32:45 +01:00
|
|
|
<para>
|
|
|
|
When running a standby server, you must set this parameter to the
|
2020-06-15 19:12:58 +02:00
|
|
|
same or higher value than on the primary server. Otherwise, queries
|
2009-12-19 02:32:45 +01:00
|
|
|
will not be allowed on the standby server.
|
|
|
|
</para>
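<para>
A simple way to compare the two servers, offered here only as a sketch,
is to run the same query on each and check that the standby's value is
not lower than the primary's:
<programlisting>
-- Run on both the primary and the standby and compare the results.
SELECT pg_is_in_recovery() AS is_standby,
       current_setting('max_connections')::int AS max_connections;
</programlisting>
</para>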
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-superuser-reserved-connections"
|
|
|
|
xreflabel="superuser_reserved_connections">
|
|
|
|
<term><varname>superuser_reserved_connections</varname>
|
2014-05-07 03:28:58 +02:00
|
|
|
(<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>superuser_reserved_connections</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Determines the number of connection <quote>slots</quote> that
|
2017-10-09 03:44:17 +02:00
|
|
|
are reserved for connections by <productname>PostgreSQL</productname>
|
2017-11-23 15:39:47 +01:00
|
|
|
superusers. At most <xref linkend="guc-max-connections"/>
|
2005-09-13 00:11:38 +02:00
|
|
|
connections can ever be active simultaneously. Whenever the
|
|
|
|
number of active concurrent connections is at least
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>max_connections</varname> minus
|
2005-09-13 00:11:38 +02:00
|
|
|
<varname>superuser_reserved_connections</varname>, new
|
2010-04-26 12:52:00 +02:00
|
|
|
connections will be accepted only for superusers, and no
|
|
|
|
new replication connections will be accepted.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2007-01-20 22:30:26 +01:00
|
|
|
The default value is three connections. The value must be less
|
2019-02-12 02:07:56 +01:00
|
|
|
than <varname>max_connections</varname>.
|
2018-03-08 17:25:26 +01:00
|
|
|
This parameter can only be set at server start.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
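<para>
The threshold described above is a plain subtraction, which can be
sketched as a query against the running server. The column alias is
illustrative only:
<programlisting>
-- Number of connections available before only superusers are accepted.
SELECT current_setting('max_connections')::int
     - current_setting('superuser_reserved_connections')::int
       AS slots_for_ordinary_users;
</programlisting>
With the default values mentioned in this section (100 and 3), the result
would be 97.
</para>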
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2012-08-10 23:26:44 +02:00
|
|
|
<varlistentry id="guc-unix-socket-directories" xreflabel="unix_socket_directories">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>unix_socket_directories</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>unix_socket_directories</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2012-08-10 23:26:44 +02:00
|
|
|
Specifies the directory of the Unix-domain socket(s) on which the
|
|
|
|
server is to listen for connections from client applications.
|
|
|
|
Multiple sockets can be created by listing multiple directories
|
|
|
|
separated by commas. Whitespace between entries is
|
|
|
|
ignored; surround a directory name with double quotes if you need
|
|
|
|
to include whitespace or commas in the name.
|
|
|
|
An empty value
|
|
|
|
specifies not listening on any Unix-domain sockets, in which case
|
|
|
|
only TCP/IP sockets can be used to connect to the server.
|
2020-11-25 08:14:23 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
A value that starts with <literal>@</literal> specifies that a
|
|
|
|
Unix-domain socket in the abstract namespace should be created
|
|
|
|
(currently supported on Linux and Windows). In that case, this value
|
|
|
|
does not specify a <quote>directory</quote> but a prefix from which
|
|
|
|
the actual socket name is computed in the same manner as for the
|
|
|
|
file-system namespace. While the abstract socket name prefix can be
|
|
|
|
chosen freely, since it is not a file-system location, the convention
|
|
|
|
is to nonetheless use file-system-like values such as
|
|
|
|
<literal>@/tmp</literal>.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2012-08-10 23:26:44 +02:00
|
|
|
The default value is normally
|
|
|
|
<filename>/tmp</filename>, but that can be changed at build time.
|
2020-04-02 08:01:30 +02:00
|
|
|
On Windows, the default is empty, which means no Unix-domain socket is
|
|
|
|
created by default.
|
2005-09-13 00:11:38 +02:00
|
|
|
This parameter can only be set at server start.
|
|
|
|
</para>
|
2010-08-27 00:00:19 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
In addition to the socket file itself, which is named
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>.s.PGSQL.<replaceable>nnnn</replaceable></literal> where
|
|
|
|
<replaceable>nnnn</replaceable> is the server's port number, an ordinary file
|
|
|
|
named <literal>.s.PGSQL.<replaceable>nnnn</replaceable>.lock</literal> will be
|
|
|
|
created in each of the <varname>unix_socket_directories</varname> directories.
|
2012-08-10 23:26:44 +02:00
|
|
|
Neither file should ever be removed manually.
|
2020-11-25 08:14:23 +01:00
|
|
|
For sockets in the abstract namespace, no lock file is created.
|
2010-08-27 00:00:19 +02:00
|
|
|
</para>
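<para>
As an illustrative sketch only (the second directory name is hypothetical
and is not a default of any installation), a
<filename>postgresql.conf</filename> entry that creates sockets in two
directories, one of which has whitespace in its name, might look like this:
</para>
<programlisting>
# Two socket directories, separated by a comma.  The second name is
# wrapped in double quotes because it contains a space.
unix_socket_directories = '/tmp, "/var/run/pg sockets"'

# Alternative: a single socket in the abstract namespace
# (no directory is involved and no lock file is created).
#unix_socket_directories = '@/tmp'
</programlisting>
<para>
Assuming the server listens on port 5432, the first setting would produce
the socket file <filename>.s.PGSQL.5432</filename> and the lock file
<filename>.s.PGSQL.5432.lock</filename> in each of the two directories.
</para>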
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-unix-socket-group" xreflabel="unix_socket_group">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>unix_socket_group</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>unix_socket_group</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2012-08-10 23:26:44 +02:00
|
|
|
Sets the owning group of the Unix-domain socket(s). (The owning
|
|
|
|
user of the sockets is always the user that starts the
|
2006-01-23 19:16:41 +01:00
|
|
|
server.) In combination with the parameter
|
2005-09-13 00:11:38 +02:00
|
|
|
<varname>unix_socket_permissions</varname> this can be used as
|
|
|
|
an additional access control mechanism for Unix-domain connections.
|
2010-02-03 18:25:06 +01:00
|
|
|
By default this is the empty string, which uses the default
|
|
|
|
group of the server user. This parameter can only be set at
|
2005-09-13 00:11:38 +02:00
|
|
|
server start.
|
|
|
|
</para>
|
2010-08-27 00:00:19 +02:00
|
|
|
|
|
|
|
<para>
|
2020-04-02 08:01:30 +02:00
|
|
|
This parameter is not supported on Windows. Any setting will be
|
2020-11-25 08:14:23 +01:00
|
|
|
ignored. Sockets in the abstract namespace have no file owner,
so this setting is ignored in that case as well.
|
2010-08-27 00:00:19 +02:00
|
|
|
</para>
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-unix-socket-permissions" xreflabel="unix_socket_permissions">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>unix_socket_permissions</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>unix_socket_permissions</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2012-08-10 23:26:44 +02:00
|
|
|
Sets the access permissions of the Unix-domain socket(s). Unix-domain
|
2005-09-13 00:11:38 +02:00
|
|
|
sockets use the usual Unix file system permission set.
|
2006-01-23 19:16:41 +01:00
|
|
|
The parameter value is expected to be a numeric mode
|
2010-02-03 18:25:06 +01:00
|
|
|
specified in the format accepted by the
|
2005-09-13 00:11:38 +02:00
|
|
|
<function>chmod</function> and <function>umask</function>
|
|
|
|
system calls. (To use the customary octal format the number
|
|
|
|
must start with a <literal>0</literal> (zero).)
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The default permissions are <literal>0777</literal>, meaning
|
|
|
|
anyone can connect. Reasonable alternatives are
|
|
|
|
<literal>0770</literal> (only user and group, see also
|
|
|
|
<varname>unix_socket_group</varname>) and <literal>0700</literal>
|
|
|
|
(only user). (Note that for a Unix-domain socket, only write
|
2010-02-03 18:25:06 +01:00
|
|
|
permission matters, so there is no point in setting or revoking
|
2005-09-13 00:11:38 +02:00
|
|
|
read or execute permissions.)
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
This access control mechanism is independent of the one
|
2017-11-23 15:39:47 +01:00
|
|
|
described in <xref linkend="client-authentication"/>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2006-01-23 19:16:41 +01:00
|
|
|
This parameter can only be set at server start.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
2010-08-27 00:00:19 +02:00
|
|
|
|
|
|
|
<para>
|
2014-03-29 05:52:31 +01:00
|
|
|
This parameter is irrelevant on systems, notably Solaris as of Solaris
|
|
|
|
10, that ignore socket permissions entirely. There, one can achieve a
|
2017-10-09 03:44:17 +02:00
|
|
|
similar effect by pointing <varname>unix_socket_directories</varname> to a
|
2014-03-29 05:52:31 +01:00
|
|
|
directory having search permission limited to the desired audience.
|
2010-08-27 00:00:19 +02:00
|
|
|
</para>
|
2020-11-25 08:14:23 +01:00
|
|
|
|
|
|
|
<para>
|
|
|
|
Sockets in the abstract namespace have no file permissions, so this
|
|
|
|
setting is also ignored in that case.
|
|
|
|
</para>
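<para>
For example, to restrict Unix-domain connections to the server user and
the members of one group, the two parameters could be combined as
sketched below. The group name <literal>pgclients</literal> is a
hypothetical placeholder; substitute a group that exists on your system.
</para>
<programlisting>
unix_socket_group = 'pgclients'     # hypothetical group name
unix_socket_permissions = 0770      # leading zero selects octal notation
</programlisting>
<para>
Without the leading zero, the value would be read as the decimal number
770 rather than as the octal mode, which is almost certainly not what is
intended.
</para>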
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
2009-09-08 19:08:36 +02:00
|
|
|
<varlistentry id="guc-bonjour" xreflabel="bonjour">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>bonjour</varname> (<type>boolean</type>)
|
2009-09-08 19:08:36 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>bonjour</varname> configuration parameter</primary>
|
2009-09-08 19:08:36 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2009-09-08 19:08:36 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables advertising the server's existence via
|
|
|
|
<productname>Bonjour</productname>. The default is off.
|
|
|
|
This parameter can only be set at server start.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-bonjour-name" xreflabel="bonjour_name">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>bonjour_name</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>bonjour_name</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2009-09-08 19:08:36 +02:00
|
|
|
Specifies the <productname>Bonjour</productname> service
|
2006-01-23 19:16:41 +01:00
|
|
|
name. The computer name is used if this parameter is set to the
|
2017-10-09 03:44:17 +02:00
|
|
|
empty string <literal>''</literal> (which is the default). This parameter is
|
2006-01-23 19:16:41 +01:00
|
|
|
ignored if the server was not compiled with
|
|
|
|
<productname>Bonjour</productname> support.
|
|
|
|
This parameter can only be set at server start.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-tcp-keepalives-idle" xreflabel="tcp_keepalives_idle">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>tcp_keepalives_idle</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>tcp_keepalives_idle</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
Specifies the amount of time with no network activity after which
|
|
|
|
the operating system should send a TCP keepalive message to the client.
|
|
|
|
If this value is specified without units, it is taken as seconds.
|
|
|
|
A value of 0 (the default) selects the operating system's default.
|
2017-06-28 18:30:16 +02:00
|
|
|
This parameter is supported only on systems that support
|
2017-10-09 03:44:17 +02:00
|
|
|
<symbol>TCP_KEEPIDLE</symbol> or an equivalent socket option, and on
|
2012-10-31 19:26:20 +01:00
|
|
|
Windows; on other systems, it must be zero.
|
|
|
|
In sessions connected via a Unix-domain socket, this parameter is
|
|
|
|
ignored and always reads as zero.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
2010-08-27 00:00:19 +02:00
|
|
|
<note>
|
|
|
|
<para>
|
On Windows, setting a value of 0 will set this parameter to 2 hours,
|
2010-08-27 00:00:19 +02:00
|
|
|
since Windows does not provide a way to read the system default value.
|
|
|
|
</para>
|
|
|
|
</note>
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-tcp-keepalives-interval" xreflabel="tcp_keepalives_interval">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>tcp_keepalives_interval</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>tcp_keepalives_interval</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
Specifies the amount of time after which a TCP keepalive message
|
|
|
|
that has not been acknowledged by the client should be retransmitted.
|
|
|
|
If this value is specified without units, it is taken as seconds.
|
|
|
|
A value of 0 (the default) selects the operating system's default.
|
2017-06-28 18:30:16 +02:00
|
|
|
This parameter is supported only on systems that support
|
2017-10-09 03:44:17 +02:00
|
|
|
<symbol>TCP_KEEPINTVL</symbol> or an equivalent socket option, and on
|
2017-06-28 18:30:16 +02:00
|
|
|
Windows; on other systems, it must be zero.
|
2012-10-31 19:26:20 +01:00
|
|
|
In sessions connected via a Unix-domain socket, this parameter is
|
|
|
|
ignored and always reads as zero.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
2010-08-27 00:00:19 +02:00
|
|
|
<note>
|
|
|
|
<para>
|
On Windows, setting a value of 0 will set this parameter to 1 second,
|
2010-08-27 00:00:19 +02:00
|
|
|
since Windows does not provide a way to read the system default value.
|
|
|
|
</para>
|
|
|
|
</note>
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-tcp-keepalives-count" xreflabel="tcp_keepalives_count">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>tcp_keepalives_count</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>tcp_keepalives_count</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
Specifies the number of TCP keepalive messages that can be lost before
|
2017-06-28 18:30:16 +02:00
|
|
|
the server's connection to the client is considered dead.
|
A value of 0 (the default) selects the operating system's default.
|
2017-06-28 18:30:16 +02:00
|
|
|
This parameter is supported only on systems that support
|
2017-10-09 03:44:17 +02:00
|
|
|
<symbol>TCP_KEEPCNT</symbol> or an equivalent socket option;
|
2017-06-28 18:30:16 +02:00
|
|
|
on other systems, it must be zero.
|
2012-10-31 19:26:20 +01:00
|
|
|
In sessions connected via a Unix-domain socket, this parameter is
|
|
|
|
ignored and always reads as zero.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
2010-08-27 00:00:19 +02:00
|
|
|
<note>
|
|
|
|
<para>
|
|
|
|
This parameter is not supported on Windows, and must be zero.
|
|
|
|
</para>
|
|
|
|
</note>
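<para>
The three keepalive parameters are usually set together. The values
below are illustrative assumptions, not recommendations, and apply only
to platforms that support all three socket options:
</para>
<programlisting>
tcp_keepalives_idle = 60        # idle time before the first keepalive, in seconds
tcp_keepalives_interval = 10    # wait before retransmitting an unacknowledged keepalive
tcp_keepalives_count = 6        # lost keepalives tolerated (must stay 0 on Windows)
</programlisting>
<para>
With these assumed values, an unresponsive client would typically be
considered dead after roughly two minutes: 60 seconds of idle time
followed by six keepalives sent at 10-second intervals. The exact timing
depends on the operating system's TCP implementation.
</para>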
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
<varlistentry id="guc-tcp-user-timeout" xreflabel="tcp_user_timeout">
|
|
|
|
<term><varname>tcp_user_timeout</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>tcp_user_timeout</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
Specifies the amount of time that transmitted data may
|
|
|
|
remain unacknowledged before the TCP connection is forcibly closed.
|
|
|
|
If this value is specified without units, it is taken as milliseconds.
|
|
|
|
A value of 0 (the default) selects the operating system's default.
|
This parameter is supported only on systems that support
|
|
|
|
<symbol>TCP_USER_TIMEOUT</symbol>; on other systems, it must be zero.
|
|
|
|
In sessions connected via a Unix-domain socket, this parameter is
|
2019-04-08 22:27:35 +02:00
|
|
|
ignored and always reads as zero.
|
</para>
|
|
|
|
<note>
|
|
|
|
<para>
|
This parameter is not supported on Windows, and must be zero.
|
</para>
|
|
|
|
</note>
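<para>
Because a value without units is taken as milliseconds, the two settings
sketched below are equivalent. The 30-second figure is an arbitrary
value chosen for illustration:
</para>
<programlisting>
tcp_user_timeout = 30s      # explicit unit
#tcp_user_timeout = 30000   # same amount of time, written in milliseconds
</programlisting>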
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2021-04-02 21:52:30 +02:00
|
|
|
<varlistentry id="guc-client-connection-check-interval" xreflabel="client_connection_check_interval">
|
|
|
|
<term><varname>client_connection_check_interval</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>client_connection_check_interval</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the time interval between optional checks that the client is still
|
|
|
|
connected, while running queries. The check is performed by polling
|
|
|
|
the socket, and allows long-running queries to be aborted sooner if
|
|
|
|
the kernel reports that the connection is closed.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This option is currently available only on systems that support the
|
|
|
|
non-standard <symbol>POLLRDHUP</symbol> extension to the
|
|
|
|
<symbol>poll</symbol> system call, including Linux.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
If this value is specified without units, it is taken as milliseconds.
|
|
|
|
The default value is <literal>0</literal>, which disables connection
|
|
|
|
checks. Without connection checks, the server will detect the loss of
|
|
|
|
the connection only at the next interaction with the socket, when it
|
|
|
|
waits for, receives or sends data.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
For the kernel itself to detect lost TCP connections reliably and within
|
|
|
|
a known timeframe in all scenarios including network failure, it may
|
|
|
|
also be necessary to adjust the TCP keepalive settings of the operating
|
|
|
|
system, or the <xref linkend="guc-tcp-keepalives-idle"/>,
|
|
|
|
<xref linkend="guc-tcp-keepalives-interval"/> and
|
|
|
|
<xref linkend="guc-tcp-keepalives-count"/> settings of
|
|
|
|
<productname>PostgreSQL</productname>.
|
|
|
|
</para>
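<para>
As a minimal sketch, the following <filename>postgresql.conf</filename>
fragment enables the check and tightens the keepalive settings mentioned
above. All values are assumptions chosen for illustration, not
recommendations:
</para>
<programlisting>
# Poll the client socket every 10 seconds while a query is running
client_connection_check_interval = 10s

# Illustrative keepalive values so that the kernel notices a dead peer sooner
tcp_keepalives_idle = 60        # seconds of inactivity before the first probe
tcp_keepalives_interval = 10    # seconds between unanswered probes
tcp_keepalives_count = 3        # unanswered probes before the connection is dropped
</programlisting>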
</listitem>
</varlistentry>

</variablelist>
</sect2>

<sect2 id="runtime-config-connection-authentication">
<title>Authentication</title>

<variablelist>
<varlistentry id="guc-authentication-timeout" xreflabel="authentication_timeout">
<term><varname>authentication_timeout</varname> (<type>integer</type>)
<indexterm><primary>timeout</primary><secondary>client authentication</secondary></indexterm>
<indexterm><primary>client authentication</primary><secondary>timeout during</secondary></indexterm>
<indexterm>
<primary><varname>authentication_timeout</varname> configuration parameter</primary>
</indexterm>
</term>

<listitem>
<para>
Maximum amount of time allowed to complete client authentication. If a
would-be client has not completed the authentication protocol in
this much time, the server closes the connection. This prevents
hung clients from occupying a connection indefinitely.
If this value is specified without units, it is taken as seconds.
The default is one minute (<literal>1m</literal>).
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
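<para>
For illustration only, the two lines below set the same 30-second limit,
once with an explicit unit and once relying on the default unit of
seconds. The value itself is an assumption, not a recommendation:
</para>
<programlisting>
authentication_timeout = 30s    # explicit unit
#authentication_timeout = 30    # equivalent: a bare number is taken as seconds
</programlisting>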
</listitem>
</varlistentry>

<varlistentry id="guc-password-encryption" xreflabel="password_encryption">
<term><varname>password_encryption</varname> (<type>enum</type>)
<indexterm>
<primary><varname>password_encryption</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When a password is specified in <xref linkend="sql-createrole"/> or
<xref linkend="sql-alterrole"/>, this parameter determines the
algorithm to use to encrypt the password. Possible values are
<literal>scram-sha-256</literal>, which will encrypt the password with
SCRAM-SHA-256, and <literal>md5</literal>, which stores the password
as an MD5 hash. The default is <literal>scram-sha-256</literal>.
</para>
<para>
Note that older clients might lack support for the SCRAM authentication
mechanism, and hence not work with passwords encrypted with
SCRAM-SHA-256. See <xref linkend="auth-password"/> for more details.
</para>
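<para>
The following sketch shows how the setting might be applied when changing
a password. The role name <literal>alice</literal> and the password are
hypothetical, and reading <structname>pg_authid</structname> requires
superuser privileges:
</para>
<programlisting>
SET password_encryption = 'scram-sha-256';
ALTER ROLE alice PASSWORD 'new-secret';

-- Inspect the leading characters of the stored value to see which
-- algorithm was used for this role
SELECT rolname, left(rolpassword, 14) AS scheme
  FROM pg_authid
 WHERE rolname = 'alice';
</programlisting>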
</listitem>
</varlistentry>

<varlistentry id="guc-krb-server-keyfile" xreflabel="krb_server_keyfile">
<term><varname>krb_server_keyfile</varname> (<type>string</type>)
<indexterm>
<primary><varname>krb_server_keyfile</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the location of the server's Kerberos key file. The default is
<filename>FILE:/usr/local/pgsql/etc/krb5.keytab</filename>
(where the directory part is whatever was specified
as <varname>sysconfdir</varname> at build time; use
<literal>pg_config --sysconfdir</literal> to determine that).
If this parameter is set to an empty string, it is ignored and a
system-dependent default is used.
This parameter can only be set in the
<filename>postgresql.conf</filename> file or on the server command line.
See <xref linkend="gssapi-auth"/> for more information.
</para>
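<para>
As a sketch, a site that keeps its keytab outside
<varname>sysconfdir</varname> might use a setting like the one below.
The path is hypothetical:
</para>
<programlisting>
krb_server_keyfile = 'FILE:/etc/postgresql/krb5.keytab'    # hypothetical location
</programlisting>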
</listitem>
</varlistentry>

<varlistentry id="guc-krb-caseins-users" xreflabel="krb_caseins_users">
<term><varname>krb_caseins_users</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>krb_caseins_users</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets whether GSSAPI user names should be treated
case-insensitively.
The default is <literal>off</literal> (case-sensitive). This parameter can only be
set in the <filename>postgresql.conf</filename> file or on the server command line.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-db-user-namespace" xreflabel="db_user_namespace">
<term><varname>db_user_namespace</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>db_user_namespace</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This parameter enables per-database user names. It is off by default.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>

<para>
If this is on, you should create users as <replaceable>username@dbname</replaceable>.
When <replaceable>username</replaceable> is passed by a connecting client,
<literal>@</literal> and the database name are appended to the user
name and that database-specific user name is looked up by the
server. Note that when you create users with names containing
<literal>@</literal> within the SQL environment, you will need to
quote the user name.
</para>

<para>
With this parameter enabled, you can still create ordinary global
users. Simply append <literal>@</literal> when specifying the user
name in the client, e.g., <literal>joe@</literal>. The <literal>@</literal>
will be stripped off before the user name is looked up by the
server.
</para>
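<para>
The naming rules above can be sketched as follows, assuming
<varname>db_user_namespace</varname> is on. The role and database names
are hypothetical:
</para>
<programlisting>
-- Per-database user: the name must be quoted because it contains @
CREATE ROLE "joe@sales" LOGIN;

-- Ordinary global user
CREATE ROLE joe LOGIN;

-- A client connecting to database "sales" as user "joe"  is looked up as "joe@sales".
-- A client connecting to any database as user "joe@" is looked up as "joe".
</programlisting>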

<para>
<varname>db_user_namespace</varname> causes the client's and
server's user name representation to differ.
Authentication checks are always done with the server's user name,
so authentication methods must be configured for the
server's user name, not the client's. Because
<literal>md5</literal> uses the user name as salt on both the
client and server, <literal>md5</literal> cannot be used with
<varname>db_user_namespace</varname>.
</para>

<note>
<para>
This feature is intended as a temporary measure until a
complete solution is found. At that time, this option will
be removed.
</para>
</note>
</listitem>
</varlistentry>
</variablelist>
</sect2>

<sect2 id="runtime-config-connection-ssl">
<title>SSL</title>

<para>
See <xref linkend="ssl-tcp"/> for more information about setting up SSL.
</para>

<variablelist>
<varlistentry id="guc-ssl" xreflabel="ssl">
<term><varname>ssl</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>ssl</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables <acronym>SSL</acronym> connections.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
The default is <literal>off</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-ssl-ca-file" xreflabel="ssl_ca_file">
<term><varname>ssl_ca_file</varname> (<type>string</type>)
<indexterm>
<primary><varname>ssl_ca_file</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the name of the file containing the SSL server certificate
authority (CA).
Relative paths are relative to the data directory.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
The default is empty, meaning no CA file is loaded,
and client certificate verification is not performed.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-ssl-cert-file" xreflabel="ssl_cert_file">
<term><varname>ssl_cert_file</varname> (<type>string</type>)
<indexterm>
<primary><varname>ssl_cert_file</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the name of the file containing the SSL server certificate.
Relative paths are relative to the data directory.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
The default is <filename>server.crt</filename>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-ssl-crl-file" xreflabel="ssl_crl_file">
<term><varname>ssl_crl_file</varname> (<type>string</type>)
<indexterm>
<primary><varname>ssl_crl_file</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the name of the file containing the SSL server certificate
revocation list (CRL).
Relative paths are relative to the data directory.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
The default is empty, meaning no CRL file is loaded (unless
<xref linkend="guc-ssl-crl-dir"/> is set).
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-ssl-crl-dir" xreflabel="ssl_crl_dir">
<term><varname>ssl_crl_dir</varname> (<type>string</type>)
<indexterm>
<primary><varname>ssl_crl_dir</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the name of the directory containing the SSL server
certificate revocation list (CRL). Relative paths are relative to the
data directory. This parameter can only be set in
the <filename>postgresql.conf</filename> file or on the server command
line. The default is empty, meaning no CRLs are used (unless
<xref linkend="guc-ssl-crl-file"/> is set).
</para>

<para>
The directory needs to be prepared with the
<productname>OpenSSL</productname> command
<literal>openssl rehash</literal> or <literal>c_rehash</literal>. See
its documentation for details.
</para>

<para>
When using this setting, CRLs in the specified directory are loaded
on demand at connection time. New CRLs can be added to the directory
and will be used immediately. This is unlike <xref
linkend="guc-ssl-crl-file"/>, which causes the CRL in the file to be
loaded at server start time or when the configuration is reloaded.
Both settings can be used together.
</para>
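<para>
A minimal sketch of preparing and using such a directory follows. The
directory name <filename>crls</filename> and its location under the data
directory are assumptions made for this example:
</para>
<programlisting>
# Shell: build the hash links OpenSSL needs to find CRLs in the directory
openssl rehash $PGDATA/crls

# postgresql.conf: the relative path is resolved against the data directory
ssl_crl_dir = 'crls'
</programlisting>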
</listitem>
</varlistentry>

<varlistentry id="guc-ssl-key-file" xreflabel="ssl_key_file">
<term><varname>ssl_key_file</varname> (<type>string</type>)
<indexterm>
<primary><varname>ssl_key_file</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the name of the file containing the SSL server private key.
Relative paths are relative to the data directory.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
The default is <filename>server.key</filename>.
</para>
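<para>
Taken together, the file-related settings in this section might appear
in <filename>postgresql.conf</filename> as sketched below. The CA file
name is hypothetical; the certificate and key names are the defaults
stated above:
</para>
<programlisting>
ssl = on
ssl_cert_file = 'server.crt'    # default, relative to the data directory
ssl_key_file = 'server.key'     # default, relative to the data directory
ssl_ca_file = 'root.crt'        # hypothetical name; loads a CA file for verifying client certificates
</programlisting>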
</listitem>
</varlistentry>

<varlistentry id="guc-ssl-ciphers" xreflabel="ssl_ciphers">
<term><varname>ssl_ciphers</varname> (<type>string</type>)
<indexterm>
<primary><varname>ssl_ciphers</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies a list of <acronym>SSL</acronym> cipher suites that are
allowed to be used by SSL connections. See the
<citerefentry><refentrytitle>ciphers</refentrytitle></citerefentry>
manual page in the <productname>OpenSSL</productname> package for the
syntax of this setting and a list of supported values. Only
connections using TLS version 1.2 and lower are affected. There is
currently no setting that controls the cipher choices used by TLS
version 1.3 connections. The default value is
<literal>HIGH:MEDIUM:+3DES:!aNULL</literal>. The default is usually a
reasonable choice unless you have specific security requirements.
</para>

<para>
This parameter can only be set in the
<filename>postgresql.conf</filename> file or on the server command
line.
</para>

<para>
Explanation of the default value:
<variablelist>
<varlistentry>
<term><literal>HIGH</literal></term>
<listitem>
<para>
Cipher suites that use ciphers from the <literal>HIGH</literal> group (e.g.,
AES, Camellia, 3DES)
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>MEDIUM</literal></term>
<listitem>
<para>
Cipher suites that use ciphers from the <literal>MEDIUM</literal> group
(e.g., RC4, SEED)
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>+3DES</literal></term>
<listitem>
<para>
The <productname>OpenSSL</productname> default order for
<literal>HIGH</literal> is problematic because it orders 3DES
higher than AES128. This is wrong because 3DES offers less
security than AES128, and it is also much slower.
<literal>+3DES</literal> reorders it after all other
<literal>HIGH</literal> and <literal>MEDIUM</literal> ciphers.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>!aNULL</literal></term>
<listitem>
<para>
Disables anonymous cipher suites that do no authentication. Such
cipher suites are vulnerable to <acronym>MITM</acronym> attacks and
therefore should not be used.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>

<para>
Available cipher suite details will vary across
<productname>OpenSSL</productname> versions. Use the command
<literal>openssl ciphers -v 'HIGH:MEDIUM:+3DES:!aNULL'</literal> to
see actual details for the currently installed
<productname>OpenSSL</productname> version. Note that this list is
filtered at run time based on the server key type.
</para>
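<para>
As an illustration of a stricter policy, the sketch below drops the
<literal>MEDIUM</literal> group and 3DES entirely. This particular
cipher string is an assumption made for the example, not a
recommendation, and it only affects connections using TLS version 1.2
and lower:
</para>
<programlisting>
# Shell: preview which suites the installed OpenSSL selects for the string
openssl ciphers -v 'HIGH:!3DES:!aNULL'

# postgresql.conf
ssl_ciphers = 'HIGH:!3DES:!aNULL'
</programlisting>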
</listitem>
</varlistentry>

<varlistentry id="guc-ssl-prefer-server-ciphers" xreflabel="ssl_prefer_server_ciphers">
<term><varname>ssl_prefer_server_ciphers</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>ssl_prefer_server_ciphers</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies whether to use the server's SSL cipher preferences, rather
than the client's.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
The default is <literal>on</literal>.
</para>

<para>
Older <productname>PostgreSQL</productname> versions do not have this setting and always use the
client's preferences. This setting is mainly for backward
compatibility with those versions. Using the server's preferences is
usually better because it is more likely that the server is appropriately
configured.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-ssl-ecdh-curve" xreflabel="ssl_ecdh_curve">
<term><varname>ssl_ecdh_curve</varname> (<type>string</type>)
<indexterm>
<primary><varname>ssl_ecdh_curve</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the name of the curve to use in <acronym>ECDH</acronym> key
exchange. It needs to be supported by all clients that connect.
It does not need to be the same curve used by the server's Elliptic
Curve key.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
The default is <literal>prime256v1</literal>.
</para>

<para>
<productname>OpenSSL</productname> names for the most common curves
are:
<literal>prime256v1</literal> (NIST P-256),
<literal>secp384r1</literal> (NIST P-384),
<literal>secp521r1</literal> (NIST P-521).
The full list of available curves can be shown with the command
<command>openssl ecparam -list_curves</command>. Not all of them
are usable in <acronym>TLS</acronym> though.
</para>
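<para>
For example, a server whose clients are all known to support NIST P-384
might select that curve as sketched below. The choice of curve is an
assumption made for illustration:
</para>
<programlisting>
ssl_ecdh_curve = 'secp384r1'    # OpenSSL name for NIST P-384
</programlisting>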
</listitem>
</varlistentry>

<varlistentry id="guc-ssl-min-protocol-version" xreflabel="ssl_min_protocol_version">
<term><varname>ssl_min_protocol_version</varname> (<type>enum</type>)
<indexterm>
<primary><varname>ssl_min_protocol_version</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the minimum SSL/TLS protocol version to use. Valid values are
currently: <literal>TLSv1</literal>, <literal>TLSv1.1</literal>,
<literal>TLSv1.2</literal>, <literal>TLSv1.3</literal>. Older
versions of the <productname>OpenSSL</productname> library do not
support all values; an error will be raised if an unsupported setting
is chosen. Protocol versions before TLS 1.0, namely SSL version 2 and
3, are always disabled.
</para>

<para>
The default is <literal>TLSv1.2</literal>, which satisfies industry
best practices as of this writing.
</para>

<para>
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
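<para>
As a sketch, a server that should accept only the newest protocol
version listed above could be configured as follows. This assumes that
the installed <productname>OpenSSL</productname> library and all clients
support TLS 1.3:
</para>
<programlisting>
ssl_min_protocol_version = 'TLSv1.3'
</programlisting>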
</listitem>
</varlistentry>

<varlistentry id="guc-ssl-max-protocol-version" xreflabel="ssl_max_protocol_version">
<term><varname>ssl_max_protocol_version</varname> (<type>enum</type>)
<indexterm>
<primary><varname>ssl_max_protocol_version</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the maximum SSL/TLS protocol version to use. Valid values are as
for <xref linkend="guc-ssl-min-protocol-version"/>, with the addition of
an empty string, which allows any protocol version. The default is to
allow any version. Setting the maximum protocol version is mainly
useful for testing or if some component has issues working with a
newer protocol.
</para>

<para>
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
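<para>
For instance, while testing a component that is suspected of
mishandling TLS 1.3, the maximum could be capped temporarily as sketched
below. The scenario is hypothetical:
</para>
<programlisting>
ssl_max_protocol_version = 'TLSv1.2'    # temporary cap for testing
#ssl_max_protocol_version = ''          # default: allow any version
</programlisting>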
|
2018-11-20 21:49:01 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
Always use 2048 bit DH parameters for OpenSSL ephemeral DH ciphers.
1024 bits is considered weak these days, but OpenSSL always passes 1024 as
the key length to the tmp_dh callback. All the code to handle other key
lengths is, in fact, dead.
To remedy those issues:
* Only include hard-coded 2048-bit parameters.
* Set the parameters directly with SSL_CTX_set_tmp_dh(), without the
callback
* The name of the file containing the DH parameters is now a GUC. This
replaces the old hardcoded "dh1024.pem" filename. (The files for other
key lengths, dh512.pem, dh2048.pem, etc. were never actually used.)
This is not a new problem, but it doesn't seem worth the risk and churn to
backport. If you care enough about the strength of the DH parameters on
old versions, you can create custom DH parameters, with as many bits as you
wish, and put them in the "dh1024.pem" file.
Per report by Nicolas Guini and Damian Quiroga. Reviewed by Michael Paquier.
Discussion: https://www.postgresql.org/message-id/CAMxBoUyjOOautVozN6ofzym828aNrDjuCcOTcCquxjwS-L2hGQ@mail.gmail.com
2017-07-31 21:36:09 +02:00
|
|
|
<varlistentry id="guc-ssl-dh-params-file" xreflabel="ssl_dh_params_file">
|
|
|
|
<term><varname>ssl_dh_params_file</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>ssl_dh_params_file</varname> configuration parameter</primary>
|
2017-07-31 21:36:09 +02:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the name of the file containing Diffie-Hellman parameters
|
|
|
|
used for the so-called ephemeral DH family of SSL ciphers. The default is
|
|
|
|
empty, in which case compiled-in default DH parameters are used. Using
|
|
|
|
custom DH parameters reduces the exposure if an attacker manages to
|
|
|
|
crack the well-known compiled-in DH parameters. You can create your own
|
|
|
|
DH parameters file with the command
|
|
|
|
<command>openssl dhparam -out dhparams.pem 2048</command>.
|
|
|
|
</para>
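<para>
 As a minimal sketch, assuming the generated file has been moved to the
 hypothetical location <filename>/etc/postgresql/dhparams.pem</filename>
 and is readable by the operating-system user that runs the server, the
 matching <filename>postgresql.conf</filename> entry might look like this:
<programlisting>
# Hypothetical path; point this at wherever dhparams.pem was stored.
ssl_dh_params_file = '/etc/postgresql/dhparams.pem'
</programlisting>
</para>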
|
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2017-07-31 21:36:09 +02:00
|
|
|
file or on the server command line.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2018-02-26 19:28:38 +01:00
|
|
|
|
|
|
|
<varlistentry id="guc-ssl-passphrase-command" xreflabel="ssl_passphrase_command">
|
|
|
|
<term><varname>ssl_passphrase_command</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>ssl_passphrase_command</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets an external command to be invoked when a passphrase for
|
|
|
|
decrypting an SSL file such as a private key needs to be obtained. By
|
|
|
|
default, this parameter is empty, which means the built-in prompting
|
|
|
|
mechanism is used.
|
|
|
|
</para>
|
|
|
|
<para>
|
2020-12-28 03:37:42 +01:00
|
|
|
The command must print the passphrase to the standard output and exit
|
|
|
|
with code 0. In the parameter value, <literal>%p</literal> is
|
|
|
|
replaced by a prompt string. (Write <literal>%%</literal> for a
|
|
|
|
literal <literal>%</literal>.) Note that the prompt string will
|
|
|
|
probably contain whitespace, so be sure to quote adequately. A single
|
|
|
|
newline is stripped from the end of the output if present.
|
|
|
|
</para>
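<para>
 For illustration, assume a hypothetical site-specific helper program
 <filename>/usr/local/bin/pg-ask-passphrase</filename> that takes the
 prompt as its only argument and writes the passphrase to standard
 output. The setting could then be written as shown here, where the
 double quotes keep a prompt containing whitespace together as a single
 argument:
<programlisting>
# pg-ask-passphrase is a hypothetical helper, not part of PostgreSQL.
ssl_passphrase_command = '/usr/local/bin/pg-ask-passphrase "%p"'
</programlisting>
</para>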
|
|
|
|
<para>
|
|
|
|
The command does not actually have to prompt the user for a
|
|
|
|
passphrase. It can read it from a file, obtain it from a keychain
|
|
|
|
facility, or similar. It is up to the user to make sure the chosen
|
|
|
|
mechanism is adequately secure.
|
2018-02-26 19:28:38 +01:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
|
|
|
file or on the server command line.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-ssl-passphrase-command-supports-reload" xreflabel="ssl_passphrase_command_supports_reload">
|
|
|
|
<term><varname>ssl_passphrase_command_supports_reload</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>ssl_passphrase_command_supports_reload</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-03-21 12:27:26 +01:00
|
|
|
This parameter determines whether the passphrase command set by
|
2018-02-26 19:28:38 +01:00
|
|
|
<varname>ssl_passphrase_command</varname> will also be called during a
|
|
|
|
configuration reload if a key file needs a passphrase. If this
|
2018-12-06 04:15:15 +01:00
|
|
|
parameter is off (the default), then
|
2018-02-26 19:28:38 +01:00
|
|
|
<varname>ssl_passphrase_command</varname> will be ignored during a
|
|
|
|
reload and the SSL configuration will not be reloaded if a passphrase
|
2020-12-28 03:37:42 +01:00
|
|
|
is needed. That setting is appropriate for a command that requires a
|
|
|
|
TTY for prompting, which might not be available when the server is
|
|
|
|
running. Setting this parameter to on might be appropriate if the
|
|
|
|
passphrase is obtained from a file, for example.
|
2018-02-26 19:28:38 +01:00
|
|
|
</para>
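<para>
 A sketch of a configuration in which the passphrase is read from a
 file, so that no TTY is needed and the command can also run during a
 reload. The file path is hypothetical, and keeping that file adequately
 protected is up to the administrator:
<programlisting>
# Hypothetical passphrase file; restrict its permissions appropriately.
ssl_passphrase_command = 'cat /etc/postgresql/ssl-passphrase'
ssl_passphrase_command_supports_reload = on
</programlisting>
</para>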
|
|
|
|
<para>
|
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
|
|
|
file or on the server command line.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="runtime-config-resource">
|
|
|
|
<title>Resource Consumption</title>
|
|
|
|
|
|
|
|
<sect2 id="runtime-config-resource-memory">
|
|
|
|
<title>Memory</title>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry id="guc-shared-buffers" xreflabel="shared_buffers">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>shared_buffers</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>shared_buffers</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2007-01-20 22:30:26 +01:00
|
|
|
Sets the amount of memory the database server uses for shared
|
2012-10-01 08:23:06 +02:00
|
|
|
memory buffers. The default is typically 128 megabytes
|
2017-10-09 03:44:17 +02:00
|
|
|
(<literal>128MB</literal>), but might be less if your kernel settings will
|
|
|
|
not support it (as determined during <application>initdb</application>).
|
Doc: improve documentation of configuration settings that have units.
When we added the GUC units feature, we didn't make any great effort
to adjust the documentation of individual GUCs; they tended to still
say things like "this is the number of milliseconds that ...", even
though users might prefer to write some other units, and SHOW might
even show the value in other units. Commit 6c9fb69f2 made an effort
to improve this situation, but I thought it made things less readable
by injecting units information in mid-sentence. It also wasn't very
consistent, and did not touch all the GUCs that have units.
To improve matters, standardize on the phrasing "If this value is
specified without units, it is taken as <units>". Also, try to
standardize where this is mentioned, right before the specification
of the default. (In a couple of places, doing that would've required
more rewriting than seemed justified, so I wasn't 100% consistent
about that.) I also tried to use the phrases "amount of time",
"amount of memory", etc rather than describing the contents of GUCs
in other ways, as those were the majority usage in places that weren't
overcommitting to a particular unit. (I left "length of time" alone
in a couple of places, though.)
I failed to resist the temptation to copy-edit some awkward text, too.
Backpatch to v12, like 6c9fb69f2, mainly because v12 hasn't diverged
much from HEAD yet.
Discussion: https://postgr.es/m/15882.1571942223@sss.pgh.pa.us
2019-10-26 18:30:41 +02:00
|
|
|
This setting must be at least 128 kilobytes. However,
|
2007-01-20 22:30:26 +01:00
|
|
|
settings significantly higher than the minimum are usually needed
|
2019-10-26 18:30:41 +02:00
|
|
|
for good performance.
|
|
|
|
If this value is specified without units, it is taken as blocks,
|
|
|
|
that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
|
|
|
|
(Non-default values of <symbol>BLCKSZ</symbol> change the minimum
|
|
|
|
value.)
|
|
|
|
This parameter can only be set at server start.
|
2010-04-16 23:46:07 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
If you have a dedicated database server with 1GB or more of RAM, a
|
|
|
|
reasonable starting value for <varname>shared_buffers</varname> is 25%
|
|
|
|
of the memory in your system. There are some workloads where even
|
2016-11-30 18:00:00 +01:00
|
|
|
larger settings for <varname>shared_buffers</varname> are effective, but
|
2010-04-16 23:46:07 +02:00
|
|
|
because <productname>PostgreSQL</productname> also relies on the
|
|
|
|
operating system cache, it is unlikely that an allocation of more than
|
|
|
|
40% of RAM to <varname>shared_buffers</varname> will work better than a
|
|
|
|
smaller amount. Larger settings for <varname>shared_buffers</varname>
|
|
|
|
usually require a corresponding increase in
|
2015-02-23 17:53:02 +01:00
|
|
|
<varname>max_wal_size</varname>, in order to spread out the
|
2010-04-16 23:46:07 +02:00
|
|
|
process of writing large quantities of new or changed data over a
|
|
|
|
longer period of time.
|
|
|
|
</para>
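<para>
 As a worked example, assume a dedicated server with 16GB of RAM and the
 typical <symbol>BLCKSZ</symbol> of 8kB. The 25% starting point gives
 4GB, which is 524288 blocks if written without units. The
 <varname>max_wal_size</varname> value shown is only an illustrative
 assumption and should be chosen from observed checkpoint behavior:
<programlisting>
shared_buffers = 4GB        # 25% of 16GB; same as 524288 blocks of 8kB
max_wal_size = 4GB          # illustrative only; raised alongside shared_buffers
</programlisting>
</para>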
|
|
|
|
|
|
|
|
<para>
|
|
|
|
On systems with less than 1GB of RAM, a smaller percentage of RAM is
|
|
|
|
appropriate, so as to leave adequate space for the operating system.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2014-03-03 19:52:48 +01:00
|
|
|
<varlistentry id="guc-huge-pages" xreflabel="huge_pages">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>huge_pages</varname> (<type>enum</type>)
|
Allow using huge TLB pages on Linux (MAP_HUGETLB)
This patch adds an option, huge_tlb_pages, which allows requesting the
shared memory segment to be allocated using huge pages, by using the
MAP_HUGETLB flag in mmap(). This can improve performance.
The default is 'try', which means that we will attempt using huge pages,
and fall back to non-huge pages if it doesn't work. Currently, only Linux
has MAP_HUGETLB. On other platforms, the default 'try' behaves the same as
'off'.
In the passing, don't try to round the mmap() size to a multiple of
pagesize. mmap() doesn't require that, and there's no particular reason for
PostgreSQL to do that either. When using MAP_HUGETLB, however, round the
request size up to nearest 2MB boundary. This is to work around a bug in
some Linux kernel versions, but also to avoid wasting memory, because the
kernel will round the size up anyway.
Many people were involved in writing this patch, including Christian Kruse,
Richard Poole, Abhijit Menon-Sen, reviewed by Peter Geoghegan, Andres Freund
and me.
2014-01-29 12:44:45 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>huge_pages</varname> configuration parameter</primary>
|
2014-01-29 12:44:45 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2014-01-29 12:44:45 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-01-25 17:14:24 +01:00
|
|
|
Controls whether huge pages are requested for the main shared memory
|
|
|
|
area. Valid values are <literal>try</literal> (the default),
|
|
|
|
<literal>on</literal>, and <literal>off</literal>. With
|
|
|
|
<varname>huge_pages</varname> set to <literal>try</literal>, the
|
|
|
|
server will try to request huge pages, but fall back to the default if
|
|
|
|
that fails. With <literal>on</literal>, failure to request huge pages
|
|
|
|
will prevent the server from starting up. With <literal>off</literal>,
|
|
|
|
huge pages will not be requested.
|
2014-01-29 12:44:45 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2018-01-25 17:14:24 +01:00
|
|
|
At present, this setting is supported only on Linux and Windows. The
|
|
|
|
setting is ignored on other systems when set to
|
2021-10-26 01:54:55 +02:00
|
|
|
<literal>try</literal>. On Linux, it is only supported when
|
|
|
|
<varname>shared_memory_type</varname> is set to <literal>mmap</literal>
|
|
|
|
(the default).
|
2014-01-29 12:44:45 +01:00
|
|
|
</para>
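<para>
 For example, on a Linux system where huge pages have already been
 reserved by the operating system, a configuration that refuses to start
 without them might look like this (a sketch, not a recommendation):
<programlisting>
shared_memory_type = mmap   # the default; needed for huge pages on Linux
huge_pages = on             # fail at startup instead of falling back
</programlisting>
</para>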
|
|
|
|
|
|
|
|
<para>
|
2014-03-03 19:52:48 +01:00
|
|
|
The use of huge pages results in smaller page tables and less CPU time
|
2018-01-21 15:40:46 +01:00
|
|
|
spent on memory management, increasing performance. For more details about
|
|
|
|
using huge pages on Linux, see <xref linkend="linux-huge-pages"/>.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Huge pages are known as large pages on Windows. To use them, you need to
|
2021-02-04 13:31:13 +01:00
|
|
|
assign the user right <quote>Lock pages in memory</quote> to the Windows user account
|
2018-01-21 15:40:46 +01:00
|
|
|
that runs <productname>PostgreSQL</productname>.
|
|
|
|
You can use the Windows Group Policy tool (gpedit.msc) to assign the user right
|
2021-02-04 13:31:13 +01:00
|
|
|
<quote>Lock pages in memory</quote>.
|
2018-01-21 15:40:46 +01:00
|
|
|
To start the database server on the command prompt as a standalone process,
|
2018-01-22 10:18:09 +01:00
|
|
|
not as a Windows service, the command prompt must be run as an administrator or
|
2018-01-21 15:40:46 +01:00
|
|
|
User Account Control (UAC) must be disabled. When UAC is enabled, the normal
|
2021-02-04 13:31:13 +01:00
|
|
|
command prompt revokes the user right <quote>Lock pages in memory</quote> when started.
|
2014-01-29 12:44:45 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2018-01-25 17:14:24 +01:00
|
|
|
Note that this setting only affects the main shared memory area.
|
|
|
|
Operating systems such as Linux, FreeBSD, and Illumos can also use
|
|
|
|
huge pages (also known as <quote>super</quote> pages or
|
|
|
|
<quote>large</quote> pages) automatically for normal memory
|
|
|
|
allocation, without an explicit request from
|
|
|
|
<productname>PostgreSQL</productname>. On Linux, this is called
|
|
|
|
<quote>transparent huge pages</quote><indexterm><primary>transparent
|
|
|
|
huge pages</primary></indexterm> (THP). That feature has been known to
|
|
|
|
cause performance degradation with
|
|
|
|
<productname>PostgreSQL</productname> for some users on some Linux
|
|
|
|
versions, so its use is currently discouraged (unlike explicit use of
|
|
|
|
<varname>huge_pages</varname>).
|
2014-01-29 12:44:45 +01:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2020-07-17 04:33:00 +02:00
|
|
|
<varlistentry id="guc-huge-page-size" xreflabel="huge_page_size">
|
|
|
|
<term><varname>huge_page_size</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>huge_page_size</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Controls the size of huge pages, when they are enabled with
|
|
|
|
<xref linkend="guc-huge-pages"/>.
|
|
|
|
The default is zero (<literal>0</literal>).
|
|
|
|
When set to <literal>0</literal>, the default huge page size on the
|
2021-05-27 07:57:28 +02:00
|
|
|
system will be used. This parameter can only be set at server start.
|
2020-07-17 04:33:00 +02:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
Some commonly available page sizes on modern 64-bit server architectures include:
|
|
|
|
<literal>2MB</literal> and <literal>1GB</literal> (Intel and AMD), <literal>16MB</literal> and
|
|
|
|
<literal>16GB</literal> (IBM POWER), and <literal>64kB</literal>, <literal>2MB</literal>,
|
|
|
|
<literal>32MB</literal> and <literal>1GB</literal> (ARM). For more information
|
|
|
|
about usage and support, see <xref linkend="linux-huge-pages"/>.
|
|
|
|
</para>
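<para>
 A sketch for a Linux server on Intel or AMD hardware, assuming that 1GB
 huge pages have already been made available by the operating system:
<programlisting>
huge_pages = try
huge_page_size = 1GB        # 0 (the default) uses the system's default huge page size
</programlisting>
</para>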
|
|
|
|
<para>
|
|
|
|
Non-default settings are currently supported only on Linux.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>temp_buffers</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>temp_buffers</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2019-10-26 18:30:41 +02:00
|
|
|
Sets the maximum amount of memory used for temporary buffers within
|
|
|
|
each database session. These are session-local buffers used only
|
|
|
|
for access to temporary tables.
|
|
|
|
If this value is specified without units, it is taken as blocks,
|
|
|
|
that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
|
|
|
|
The default is eight megabytes (<literal>8MB</literal>).
|
|
|
|
(If <symbol>BLCKSZ</symbol> is not 8kB, the default value scales
|
|
|
|
proportionally to it.)
|
|
|
|
This setting can be changed within individual
|
2010-02-03 18:25:06 +01:00
|
|
|
sessions, but only before the first use of temporary tables
|
|
|
|
within the session; subsequent attempts to change the value will
|
2007-01-20 22:30:26 +01:00
|
|
|
have no effect on that session.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
A session will allocate temporary buffers as needed up to the limit
|
2017-10-09 03:44:17 +02:00
|
|
|
given by <varname>temp_buffers</varname>. The cost of setting a large
|
2010-02-03 18:25:06 +01:00
|
|
|
value in sessions that do not actually need many temporary
|
2005-09-13 00:11:38 +02:00
|
|
|
buffers is only a buffer descriptor, or about 64 bytes, per
|
2017-10-09 03:44:17 +02:00
|
|
|
increment in <varname>temp_buffers</varname>. However, if a buffer is
|
2005-09-13 00:11:38 +02:00
|
|
|
actually used, an additional 8192 bytes will be consumed for it
|
|
|
|
(or in general, <symbol>BLCKSZ</symbol> bytes).
|
|
|
|
</para>
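<para>
 As a worked example with the typical <symbol>BLCKSZ</symbol> of 8kB, a
 setting of 256MB corresponds to 32768 buffers. A session that never
 touches temporary tables pays roughly 32768 times 64 bytes, or about
 2MB, for buffer descriptors, while a session that fills every buffer
 consumes a further 256MB. The value is only illustrative, and the
 command must be issued before the session first uses a temporary table:
<programlisting>
SET temp_buffers = '256MB';
</programlisting>
</para>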
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-max-prepared-transactions" xreflabel="max_prepared_transactions">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>max_prepared_transactions</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>max_prepared_transactions</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the maximum number of transactions that can be in the
|
2017-10-09 03:44:17 +02:00
|
|
|
<quote>prepared</quote> state simultaneously (see <xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="sql-prepare-transaction"/>).
|
2009-04-23 02:23:46 +02:00
|
|
|
Setting this parameter to zero (which is the default)
|
|
|
|
disables the prepared-transaction feature.
|
2006-01-23 19:16:41 +01:00
|
|
|
This parameter can only be set at server start.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2009-04-23 02:23:46 +02:00
|
|
|
If you are not planning to use prepared transactions, this parameter
|
|
|
|
should be set to zero to prevent accidental creation of prepared
|
|
|
|
transactions. If you are using prepared transactions, you will
|
|
|
|
probably want <varname>max_prepared_transactions</varname> to be at
|
2017-11-23 15:39:47 +01:00
|
|
|
least as large as <xref linkend="guc-max-connections"/>, so that every
|
2009-04-23 02:23:46 +02:00
|
|
|
session can have a prepared transaction pending.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
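<para>
 For instance, assuming <varname>max_connections</varname> is set to 100,
 a configuration that follows this advice could be sketched as:
<programlisting>
max_connections = 100
max_prepared_transactions = 100   # at least as large as max_connections
</programlisting>
</para>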
|
|
|
|
|
2009-12-19 02:32:45 +01:00
|
|
|
<para>
|
|
|
|
When running a standby server, you must set this parameter to the
|
2020-06-15 19:12:58 +02:00
|
|
|
same or higher value than on the primary server. Otherwise, queries
|
2009-12-19 02:32:45 +01:00
|
|
|
will not be allowed on the standby server.
|
|
|
|
</para>
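<para>
As an illustrative sketch only (the numbers below are assumed values,
not recommendations), a primary that uses prepared transactions and
allows 100 connections could be configured as follows, with the standby
using the same or a higher value:
</para>
<programlisting>
# postgresql.conf on the primary (example values)
max_connections = 100
max_prepared_transactions = 100    # at least max_connections when prepared transactions are used

# postgresql.conf on the standby
max_prepared_transactions = 100    # must not be lower than on the primary
</programlisting>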
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-work-mem" xreflabel="work_mem">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>work_mem</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>work_mem</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2020-07-29 23:14:58 +02:00
|
|
|
Sets the base maximum amount of memory to be used by a query operation
|
2019-10-26 18:30:41 +02:00
|
|
|
(such as a sort or hash table) before writing to temporary disk files.
|
|
|
|
If this value is specified without units, it is taken as kilobytes.
|
|
|
|
The default value is four megabytes (<literal>4MB</literal>).
|
2005-09-13 00:11:38 +02:00
|
|
|
Note that for a complex query, several sort or hash operations might be
|
2020-07-29 23:14:58 +02:00
|
|
|
running in parallel; each operation will generally be allowed
|
|
|
|
to use as much memory as this value specifies before it starts
|
|
|
|
to write data into temporary files. Also, several running
|
|
|
|
sessions could be doing such operations concurrently.
|
|
|
|
Therefore, the total memory used could be many times the value
|
|
|
|
of <varname>work_mem</varname>; it is necessary to keep this
|
|
|
|
fact in mind when choosing the value. Sort operations are used
|
|
|
|
for <literal>ORDER BY</literal>, <literal>DISTINCT</literal>,
|
|
|
|
and merge joins.
|
2021-04-02 03:10:56 +02:00
|
|
|
Hash tables are used in hash joins, hash-based aggregation, result
|
|
|
|
cache nodes and hash-based processing of <literal>IN</literal>
|
|
|
|
subqueries.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
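<para>
A rough worked example, using assumed numbers: with
<varname>work_mem</varname> at its default of 4MB and
<varname>hash_mem_multiplier</varname> at 1.0, 50 sessions that each
run a query containing three sort or hash operations could together use
about 50 x 3 x 4MB = 600MB. Rather than raising the server-wide value,
a larger limit can be set only where it is known to be needed:
</para>
<programlisting>
-- Raise the limit for the current session only (example value).
SET work_mem = '64MB';

-- Or raise it for one transaction; the setting reverts at COMMIT or ROLLBACK.
BEGIN;
SET LOCAL work_mem = '256MB';
-- ... run the large sort or hash query here ...
COMMIT;

-- Return to the configured default.
RESET work_mem;
</programlisting>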
|
2020-07-29 23:14:58 +02:00
|
|
|
<para>
|
|
|
|
Hash-based operations are generally more sensitive to memory
|
|
|
|
availability than equivalent sort-based operations. The
|
|
|
|
memory available for hash tables is computed by multiplying
|
|
|
|
<varname>work_mem</varname> by
|
|
|
|
<varname>hash_mem_multiplier</varname>. This makes it
|
|
|
|
possible for hash-based operations to use an amount of memory
|
|
|
|
that exceeds the usual <varname>work_mem</varname> base
|
|
|
|
amount.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-hash-mem-multiplier" xreflabel="hash_mem_multiplier">
|
|
|
|
<term><varname>hash_mem_multiplier</varname> (<type>floating point</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>hash_mem_multiplier</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Used to compute the maximum amount of memory that hash-based
|
|
|
|
operations can use. The final limit is determined by
|
|
|
|
multiplying <varname>work_mem</varname> by
|
|
|
|
<varname>hash_mem_multiplier</varname>. The default value is
|
|
|
|
1.0, which makes hash-based operations subject to the same
|
|
|
|
simple <varname>work_mem</varname> maximum as sort-based
|
|
|
|
operations.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
Consider increasing <varname>hash_mem_multiplier</varname> in
|
|
|
|
environments where spilling by query operations is a regular
|
|
|
|
occurrence, especially when simply increasing
|
|
|
|
<varname>work_mem</varname> results in memory pressure (memory
|
|
|
|
pressure typically takes the form of intermittent
|
|
|
|
out-of-memory errors). A setting of 1.5 or 2.0 might be effective with
|
|
|
|
mixed workloads. Higher settings in the range of 2.0 to 8.0 or
|
|
|
|
more may be effective in environments where
|
|
|
|
<varname>work_mem</varname> has already been increased to 40MB
|
|
|
|
or more.
|
|
|
|
</para>
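<para>
For instance, under the assumed settings below, sort-based operations
are limited to 32MB each, while hash-based operations can use up to
32MB x 2.0 = 64MB each before spilling to disk. The values are
examples only:
</para>
<programlisting>
# postgresql.conf (example values only)
work_mem = 32MB
hash_mem_multiplier = 2.0    # hash tables may use up to 64MB
</programlisting>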
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-maintenance-work-mem" xreflabel="maintenance_work_mem">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>maintenance_work_mem</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>maintenance_work_mem</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2010-02-03 18:25:06 +01:00
|
|
|
Specifies the maximum amount of memory to be used by maintenance
|
2005-09-13 00:11:38 +02:00
|
|
|
operations, such as <command>VACUUM</command>, <command>CREATE
|
2019-10-26 18:30:41 +02:00
|
|
|
INDEX</command>, and <command>ALTER TABLE ADD FOREIGN KEY</command>.
|
|
|
|
If this value is specified without units, it is taken as kilobytes.
|
|
|
|
It defaults
|
2017-10-09 03:44:17 +02:00
|
|
|
to 64 megabytes (<literal>64MB</literal>). Since only one of these
|
2007-01-20 22:30:26 +01:00
|
|
|
operations can be executed at a time by a database session, and
|
|
|
|
an installation normally doesn't have many of them running
|
|
|
|
concurrently, it's safe to set this value significantly larger
|
2007-01-31 21:56:20 +01:00
|
|
|
than <varname>work_mem</varname>. Larger settings might improve
|
2007-01-20 22:30:26 +01:00
|
|
|
performance for vacuuming and for restoring database dumps.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
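<para>
As a sketch, a session that is about to build a large index or restore
a dump can raise the value for itself without changing the server-wide
setting. The table and index names here are hypothetical, and the value
is only an example:
</para>
<programlisting>
SET maintenance_work_mem = '1GB';                            -- example value
CREATE INDEX orders_customer_idx ON orders (customer_id);    -- hypothetical table and index
RESET maintenance_work_mem;
</programlisting>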
|
2008-12-08 16:11:39 +01:00
|
|
|
<para>
|
|
|
|
Note that when autovacuum runs, up to
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="guc-autovacuum-max-workers"/> times this memory
|
2013-12-12 12:42:39 +01:00
|
|
|
might be allocated, so be careful not to set the default value
|
|
|
|
too high. It can be useful to control this by separately
|
2017-11-23 15:39:47 +01:00
|
|
|
setting <xref linkend="guc-autovacuum-work-mem"/>.
|
2013-12-12 12:42:39 +01:00
|
|
|
</para>
|
2021-07-04 12:28:38 +02:00
|
|
|
<para>
|
2021-08-09 06:45:35 +02:00
|
|
|
Note that for the collection of dead tuple identifiers,
|
|
|
|
<command>VACUUM</command> can use at most
|
|
|
|
<literal>1GB</literal> of memory.
|
2021-07-04 12:28:38 +02:00
|
|
|
</para>
|
2013-12-12 12:42:39 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-autovacuum-work-mem" xreflabel="autovacuum_work_mem">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>autovacuum_work_mem</varname> (<type>integer</type>)
|
2013-12-12 12:42:39 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>autovacuum_work_mem</varname> configuration parameter</primary>
|
2013-12-12 12:42:39 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2013-12-12 12:42:39 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the maximum amount of memory to be used by each
|
2019-10-26 18:30:41 +02:00
|
|
|
autovacuum worker process.
|
|
|
|
If this value is specified without units, it is taken as kilobytes.
|
|
|
|
It defaults to -1, indicating that
|
2017-11-23 15:39:47 +01:00
|
|
|
the value of <xref linkend="guc-maintenance-work-mem"/> should
|
2013-12-12 12:42:39 +01:00
|
|
|
be used instead. The setting has no effect on the behavior of
|
|
|
|
<command>VACUUM</command> when run in other contexts.
|
2021-05-27 07:57:28 +02:00
|
|
|
This parameter can only be set in the
|
|
|
|
<filename>postgresql.conf</filename> file or on the server command
|
|
|
|
line.
|
2008-12-08 16:11:39 +01:00
|
|
|
</para>
|
2021-08-09 06:45:35 +02:00
|
|
|
<para>
|
|
|
|
For the collection of dead tuple identifiers, autovacuum can use at
|
|
|
|
most <literal>1GB</literal> of memory, so
|
|
|
|
setting <varname>autovacuum_work_mem</varname> to a value higher than
|
|
|
|
that has no effect on the number of dead tuples that autovacuum can
|
|
|
|
collect while scanning a table.
|
|
|
|
</para>
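<para>
To illustrate with assumed values: if
<varname>maintenance_work_mem</varname> is 1GB and
<varname>autovacuum_max_workers</varname> is 3, autovacuum alone could
allocate up to 3GB. Setting <varname>autovacuum_work_mem</varname>
separately limits the workers to 3 x 256MB = 768MB in total, while
manually run maintenance commands keep the 1GB limit:
</para>
<programlisting>
# postgresql.conf (example values only)
maintenance_work_mem = 1GB
autovacuum_max_workers = 3
autovacuum_work_mem = 256MB
</programlisting>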
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2019-11-16 13:19:33 +01:00
|
|
|
<varlistentry id="guc-logical-decoding-work-mem" xreflabel="logical_decoding_work_mem">
|
|
|
|
<term><varname>logical_decoding_work_mem</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>logical_decoding_work_mem</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the maximum amount of memory to be used by logical decoding,
|
|
|
|
before some of the decoded changes are written to local disk. This
|
|
|
|
limits the amount of memory used by logical streaming replication
|
|
|
|
connections. If this value is specified without units, it is taken as kilobytes. It defaults to 64 megabytes (<literal>64MB</literal>).
|
|
|
|
Since each replication connection only uses a single buffer of this size,
|
|
|
|
and an installation normally doesn't have many such connections
|
|
|
|
concurrently (as limited by <varname>max_wal_senders</varname>), it's
|
|
|
|
safe to set this value significantly higher than <varname>work_mem</varname>,
|
|
|
|
reducing the amount of decoded changes written to disk.
|
|
|
|
</para>
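<para>
A minimal sketch with assumed values: because each replication
connection uses one buffer of this size, the settings below would bound
the memory used for decoded changes at roughly 10 x 256MB in the worst
case, where every allowed connection performs logical decoding:
</para>
<programlisting>
# postgresql.conf (example values only)
logical_decoding_work_mem = 256MB    # fewer decoded changes written to disk
max_wal_senders = 10                 # limits the number of replication connections
</programlisting>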
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-max-stack-depth" xreflabel="max_stack_depth">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>max_stack_depth</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>max_stack_depth</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the maximum safe depth of the server's execution stack.
|
|
|
|
The ideal setting for this parameter is the actual stack size limit
|
2017-10-09 03:44:17 +02:00
|
|
|
enforced by the kernel (as set by <literal>ulimit -s</literal> or local
|
2005-09-13 00:11:38 +02:00
|
|
|
equivalent), less a safety margin of a megabyte or so. The safety
|
|
|
|
margin is needed because the stack depth is not checked in every
|
2018-12-10 16:44:06 +01:00
|
|
|
routine in the server, but only in key potentially-recursive routines.
|
2019-10-26 18:30:41 +02:00
|
|
|
If this value is specified without units, it is taken as kilobytes.
|
2018-12-10 16:44:06 +01:00
|
|
|
The default setting is two megabytes (<literal>2MB</literal>), which
|
|
|
|
is conservatively small and unlikely to risk crashes. However,
|
|
|
|
it might be too small to allow execution of complex functions.
|
|
|
|
Only superusers can change this setting.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
2006-10-07 21:25:29 +02:00
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Setting <varname>max_stack_depth</varname> higher than
|
2006-10-07 21:25:29 +02:00
|
|
|
the actual kernel limit will mean that a runaway recursive function
|
|
|
|
can crash an individual backend process. On platforms where
|
|
|
|
<productname>PostgreSQL</productname> can determine the kernel limit,
|
2010-02-03 18:25:06 +01:00
|
|
|
the server will not allow this variable to be set to an unsafe
|
|
|
|
value. However, not all platforms provide the information,
|
|
|
|
so caution is recommended in selecting a value.
|
2006-10-07 21:25:29 +02:00
|
|
|
</para>
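<para>
As a sketch, assume that the shell from which the server is started
reports a stack limit of 8192 kilobytes (the actual figure depends on
the platform and its configuration). Leaving a safety margin of one
megabyte then gives:
</para>
<programlisting>
$ ulimit -s
8192

# postgresql.conf
max_stack_depth = 7MB    # 8MB kernel limit less a 1MB safety margin
</programlisting>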
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2019-02-03 09:55:39 +01:00
|
|
|
<varlistentry id="guc-shared-memory-type" xreflabel="shared_memory_type">
|
|
|
|
<term><varname>shared_memory_type</varname> (<type>enum</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>shared_memory_type</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the shared memory implementation that the server
|
|
|
|
should use for the main shared memory region that holds
|
|
|
|
<productname>PostgreSQL</productname>'s shared buffers and other
|
|
|
|
shared data. Possible values are <literal>mmap</literal> (for
|
|
|
|
anonymous shared memory allocated using <function>mmap</function>),
|
|
|
|
<literal>sysv</literal> (for System V shared memory allocated via
|
|
|
|
<function>shmget</function>) and <literal>windows</literal> (for Windows
|
|
|
|
shared memory). Not all values are supported on all platforms; the
|
|
|
|
first supported option is the default for that platform. The use of
|
|
|
|
the <literal>sysv</literal> option, which is not the default on any
|
|
|
|
platform, is generally discouraged because it typically requires
|
|
|
|
non-default kernel settings to allow for large allocations (see <xref
|
|
|
|
linkend="sysvipc"/>).
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>

<varlistentry id="guc-dynamic-shared-memory-type" xreflabel="dynamic_shared_memory_type">
<term><varname>dynamic_shared_memory_type</varname> (<type>enum</type>)
<indexterm>
<primary><varname>dynamic_shared_memory_type</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the dynamic shared memory implementation that the server
should use. Possible values are <literal>posix</literal> (for POSIX shared
memory allocated using <literal>shm_open</literal>), <literal>sysv</literal>
(for System V shared memory allocated via <literal>shmget</literal>),
<literal>windows</literal> (for Windows shared memory),
and <literal>mmap</literal> (to simulate shared memory using
memory-mapped files stored in the data directory).
Not all values are supported on all platforms; the first supported
option is the default for that platform. The use of the
<literal>mmap</literal> option, which is not the default on any platform,
is generally discouraged because the operating system may write
modified pages back to disk repeatedly, increasing system I/O load;
however, it may be useful for debugging, when the
<literal>pg_dynshmem</literal> directory is stored on a RAM disk, or when
other shared memory facilities are not available.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-min-dynamic-shared-memory" xreflabel="min_dynamic_shared_memory">
<term><varname>min_dynamic_shared_memory</varname> (<type>integer</type>)
<indexterm>
<primary><varname>min_dynamic_shared_memory</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the amount of memory that should be allocated at server
startup for use by parallel queries. When this memory region is
insufficient or exhausted by concurrent queries, new parallel queries
try to allocate extra shared memory temporarily from the operating
system using the method configured with
<varname>dynamic_shared_memory_type</varname>, which may be slower due
to memory management overheads. Memory that is allocated at startup
with <varname>min_dynamic_shared_memory</varname> is affected by
the <varname>huge_pages</varname> setting on operating systems where
that is supported, and may be more likely to benefit from larger pages
on operating systems where that is managed automatically.
The default value is <literal>0</literal> (none). This parameter can
only be set at server start.
</para>
</listitem>
</varlistentry>

</variablelist>
</sect2>

<sect2 id="runtime-config-resource-disk">
<title>Disk</title>

<variablelist>
<varlistentry id="guc-temp-file-limit" xreflabel="temp_file_limit">
<term><varname>temp_file_limit</varname> (<type>integer</type>)
<indexterm>
<primary><varname>temp_file_limit</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the maximum amount of disk space that a process can use
for temporary files, such as sort and hash temporary files, or the
storage file for a held cursor. A transaction attempting to exceed
this limit will be canceled.
If this value is specified without units, it is taken as kilobytes.
<literal>-1</literal> (the default) means no limit.
Only superusers can change this setting.
</para>
<para>
This setting constrains the total space used at any instant by all
temporary files used by a given <productname>PostgreSQL</productname> process.
Note that disk space used for explicit temporary
tables, as opposed to temporary files used behind the scenes in query
execution, does <emphasis>not</emphasis> count against this limit.
</para>
</listitem>
</varlistentry>

</variablelist>
</sect2>

<sect2 id="runtime-config-resource-kernel">
<title>Kernel Resource Usage</title>

<variablelist>
<varlistentry id="guc-max-files-per-process" xreflabel="max_files_per_process">
<term><varname>max_files_per_process</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_files_per_process</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the maximum number of simultaneously open files allowed to each
server subprocess. The default is one thousand files. If the kernel is enforcing
a safe per-process limit, you don't need to worry about this setting.
But on some platforms (notably, most BSD systems), the kernel will
allow individual processes to open many more files than the system
can actually support if many processes all try to open
that many files. If you see <quote>Too many open
files</quote> failures, try reducing this setting.
This parameter can only be set at server start.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>

<sect2 id="runtime-config-resource-vacuum-cost">
<title>Cost-based Vacuum Delay</title>

<para>
During the execution of <xref linkend="sql-vacuum"/>
and <xref linkend="sql-analyze"/>
commands, the system maintains an
internal counter that keeps track of the estimated cost of the
various I/O operations that are performed. When the accumulated
cost reaches a limit (specified by
<varname>vacuum_cost_limit</varname>), the process performing
the operation will sleep for a short period of time, as specified by
<varname>vacuum_cost_delay</varname>. Then it will reset the
counter and continue execution.
</para>
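<para>
As a rough illustration of this accounting, the following Python sketch
replays a sequence of page events against a cost counter. It is a
simplified model and not the server's implementation: the function name
<function>simulate_vacuum</function>, the event labels, and the fixed
sleep per limit crossing are assumptions made for this example. The
per-page costs and the limit use the documented defaults of
<varname>vacuum_cost_page_hit</varname>,
<varname>vacuum_cost_page_miss</varname>,
<varname>vacuum_cost_page_dirty</varname>, and
<varname>vacuum_cost_limit</varname>.
</para>
<programlisting>
```python
def simulate_vacuum(page_events, cost_delay_ms, cost_limit=200, costs=None):
    """Return (sleeps, total_sleep_ms) for a sequence of page events.

    Hypothetical model: each event is "hit", "miss", or "dirty".
    """
    if costs is None:
        # Documented defaults for the three per-page cost parameters.
        costs = {"hit": 1, "miss": 2, "dirty": 20}
    balance = 0
    sleeps = 0
    total_sleep_ms = 0.0
    for event in page_events:
        balance += costs[event]
        # A zero delay means the feature is disabled, so never sleep.
        if cost_delay_ms != 0 and balance >= cost_limit:
            sleeps += 1
            total_sleep_ms += cost_delay_ms
            balance = 0  # reset the counter and continue
    return sleeps, total_sleep_ms
```
</programlisting>
<para>
Under these assumptions, ten newly dirtied pages reach the default limit
of 200 and trigger one sleep, whereas 199 buffer hits do not.
</para>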

<para>
The intent of this feature is to allow administrators to reduce
the I/O impact of these commands on concurrent database
activity. There are many situations where it is not
important that maintenance commands like
<command>VACUUM</command> and <command>ANALYZE</command> finish
quickly; however, it is usually very important that these
commands do not significantly interfere with the ability of the
system to perform other database operations. Cost-based vacuum
delay provides a way for administrators to achieve this.
</para>

<para>
This feature is disabled by default for manually issued
<command>VACUUM</command> commands. To enable it, set the
<varname>vacuum_cost_delay</varname> variable to a nonzero
value.
</para>

<variablelist>
<varlistentry id="guc-vacuum-cost-delay" xreflabel="vacuum_cost_delay">
<term><varname>vacuum_cost_delay</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>vacuum_cost_delay</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The amount of time that the process will sleep
when the cost limit has been exceeded.
If this value is specified without units, it is taken as milliseconds.
The default value is zero, which disables the cost-based vacuum
delay feature. Positive values enable cost-based vacuuming.
</para>

<para>
When using cost-based vacuuming, appropriate values for
<varname>vacuum_cost_delay</varname> are usually quite small, perhaps
less than 1 millisecond. While <varname>vacuum_cost_delay</varname>
can be set to fractional-millisecond values, such delays may not be
measured accurately on older platforms. On such platforms,
increasing <command>VACUUM</command>'s throttled resource consumption
above what you get at 1ms will require changing the other vacuum cost
parameters. You should, nonetheless,
keep <varname>vacuum_cost_delay</varname> as small as your platform
will consistently measure; large delays are not helpful.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-vacuum-cost-page-hit" xreflabel="vacuum_cost_page_hit">
<term><varname>vacuum_cost_page_hit</varname> (<type>integer</type>)
<indexterm>
<primary><varname>vacuum_cost_page_hit</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The estimated cost for vacuuming a buffer found in the shared buffer
cache. It represents the cost to lock the buffer pool, look up
the shared hash table and scan the content of the page. The
default value is 1.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-vacuum-cost-page-miss" xreflabel="vacuum_cost_page_miss">
<term><varname>vacuum_cost_page_miss</varname> (<type>integer</type>)
<indexterm>
<primary><varname>vacuum_cost_page_miss</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The estimated cost for vacuuming a buffer that has to be read from
disk. This represents the effort to lock the buffer pool,
look up the shared hash table, read the desired block in from
the disk and scan its content. The default value is 2.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-vacuum-cost-page-dirty" xreflabel="vacuum_cost_page_dirty">
<term><varname>vacuum_cost_page_dirty</varname> (<type>integer</type>)
<indexterm>
<primary><varname>vacuum_cost_page_dirty</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The estimated cost charged when vacuum modifies a block that was
previously clean. It represents the extra I/O required to
flush the dirty block out to disk again. The default value is
20.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-vacuum-cost-limit" xreflabel="vacuum_cost_limit">
<term><varname>vacuum_cost_limit</varname> (<type>integer</type>)
<indexterm>
<primary><varname>vacuum_cost_limit</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The accumulated cost that will cause the vacuuming process to sleep.
The default value is 200.
</para>
</listitem>
</varlistentry>
</variablelist>

<note>
<para>
Certain operations hold critical locks and should
therefore complete as quickly as possible. Cost-based vacuum
delays do not occur during such operations. Therefore it is
possible that the cost accumulates far higher than the specified
limit. To avoid uselessly long delays in such cases, the actual
delay is calculated as <varname>vacuum_cost_delay</varname> *
<varname>accumulated_balance</varname> /
<varname>vacuum_cost_limit</varname> with a maximum of
<varname>vacuum_cost_delay</varname> * 4.
</para>
</note>
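<para>
The delay calculation described in the note can be written out as a
small Python sketch. The function name
<function>actual_delay_ms</function> and its argument names are
assumptions made for this illustration; only the formula and the
fourfold cap come from the text above.
</para>
<programlisting>
```python
def actual_delay_ms(accumulated_balance, cost_delay_ms, cost_limit=200):
    """Scale the delay by how far the balance overshot the limit.

    Hypothetical helper: the result is capped at four times
    vacuum_cost_delay, as the note describes.
    """
    scaled = cost_delay_ms * accumulated_balance / cost_limit
    return min(scaled, cost_delay_ms * 4)
```
</programlisting>
<para>
For example, with a 2 ms delay and the default limit of 200, a balance
of 400 gives a 4 ms sleep, while a balance of 2000 would scale to 20 ms
and is therefore capped at 8 ms.
</para>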
</sect2>

<sect2 id="runtime-config-resource-background-writer">
<title>Background Writer</title>

<para>
There is a separate server
process called the <firstterm>background writer</firstterm>, whose function
is to issue writes of <quote>dirty</quote> (new or modified) shared
buffers. When the number of clean shared buffers appears to be
insufficient, the background writer writes some dirty buffers to the
file system and marks them as clean. This reduces the likelihood
that server processes handling user queries will be unable to find
clean buffers and have to write dirty buffers themselves.
However, the background writer does cause a net overall
increase in I/O load, because while a repeatedly-dirtied page might
otherwise be written only once per checkpoint interval, the
background writer might write it several times as it is dirtied
in the same interval. The parameters discussed in this subsection
can be used to tune the behavior for local needs.
</para>

<variablelist>
<varlistentry id="guc-bgwriter-delay" xreflabel="bgwriter_delay">
<term><varname>bgwriter_delay</varname> (<type>integer</type>)
<indexterm>
<primary><varname>bgwriter_delay</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the delay between activity rounds for the
background writer. In each round the writer issues writes
for some number of dirty buffers (controllable by the
following parameters). It then sleeps for
the length of <varname>bgwriter_delay</varname>, and repeats.
When there are no dirty buffers in the
buffer pool, though, it goes into a longer sleep regardless of
<varname>bgwriter_delay</varname>.
If this value is specified without units, it is taken as milliseconds.
The default value is 200
milliseconds (<literal>200ms</literal>). Note that on many systems, the
effective resolution of sleep delays is 10 milliseconds; setting
<varname>bgwriter_delay</varname> to a value that is not a multiple of 10
might have the same results as setting it to the next higher multiple
of 10. This parameter can only be set in the
<filename>postgresql.conf</filename> file or on the server command line.
</para>
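<para>
The rounding effect mentioned above can be sketched in Python as
follows. This is only an illustration: the helper name
<function>effective_delay_ms</function> is hypothetical, and the
10-millisecond resolution is an assumption that holds on many systems
but not necessarily on yours.
</para>
<programlisting>
```python
def effective_delay_ms(requested_ms, resolution_ms=10):
    """Round a requested sleep up to the next multiple of the resolution.

    Hypothetical helper using integer arithmetic; resolution_ms=10
    models the sleep granularity described in the text.
    """
    rounds = (requested_ms + resolution_ms - 1) // resolution_ms
    return rounds * resolution_ms
```
</programlisting>
<para>
Under this assumption, a setting of 205 behaves like 210, while the
default of 200 is unchanged.
</para>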
</listitem>
</varlistentry>

<varlistentry id="guc-bgwriter-lru-maxpages" xreflabel="bgwriter_lru_maxpages">
<term><varname>bgwriter_lru_maxpages</varname> (<type>integer</type>)
<indexterm>
<primary><varname>bgwriter_lru_maxpages</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
In each round, no more than this many buffers will be written
by the background writer. Setting this to zero disables
background writing. (Note that checkpoints, which are managed by
a separate, dedicated auxiliary process, are unaffected.)
The default value is 100 buffers.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-bgwriter-lru-multiplier" xreflabel="bgwriter_lru_multiplier">
<term><varname>bgwriter_lru_multiplier</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>bgwriter_lru_multiplier</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The number of dirty buffers written in each round is based on the
number of new buffers that have been needed by server processes
during recent rounds. The average recent need is multiplied by
<varname>bgwriter_lru_multiplier</varname> to arrive at an estimate of the
number of buffers that will be needed during the next round. Dirty
buffers are written until there are that many clean, reusable buffers
available. (However, no more than <varname>bgwriter_lru_maxpages</varname>
buffers will be written per round.)
Thus, a setting of 1.0 represents a <quote>just in time</quote> policy
of writing exactly the number of buffers predicted to be needed.
Larger values provide some cushion against spikes in demand,
while smaller values intentionally leave writes to be done by
server processes.
The default is 2.0.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
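<para>
The per-round estimate described above can be approximated by the
following Python sketch. It is a simplified model rather than the
server's algorithm: the function name
<function>buffers_to_write</function> and its arguments
<literal>recent_need_avg</literal> and
<literal>clean_available</literal> are assumptions made for this
example, and the sketch supposes that enough dirty buffers exist to
cover the shortfall.
</para>
<programlisting>
```python
import math


def buffers_to_write(recent_need_avg, clean_available,
                     lru_multiplier=2.0, lru_maxpages=100):
    """Estimate how many dirty buffers one round would write.

    Hypothetical model: predict next-round demand, write enough to
    cover the shortfall of clean buffers, and cap the result at
    bgwriter_lru_maxpages.
    """
    predicted_need = recent_need_avg * lru_multiplier
    shortfall = max(0.0, predicted_need - clean_available)
    return int(min(math.ceil(shortfall), lru_maxpages))
```
</programlisting>
<para>
With the defaults, an average recent need of 30 buffers and 10 clean
buffers already available leads to 50 writes; with a multiplier of 1.0
the same situation leads to 20 writes, matching the <quote>just in
time</quote> policy.
</para>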
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
Allow to trigger kernel writeback after a configurable number of writes.
Currently writes to the main data files of postgres all go through the
OS page cache. This means that some operating systems can end up
collecting a large number of dirty buffers in their respective page
caches. When these dirty buffers are flushed to storage rapidly, be it
because of fsync(), timeouts, or dirty ratios, latency for other reads
and writes can increase massively. This is the primary reason for
regular massive stalls observed in real world scenarios and artificial
benchmarks; on rotating disks stalls on the order of hundreds of seconds
have been observed.
On linux it is possible to control this by reducing the global dirty
limits significantly, reducing the above problem. But global
configuration is rather problematic because it'll affect other
applications; also PostgreSQL itself doesn't always generally want this
behavior, e.g. for temporary files it's undesirable.
Several operating systems allow some control over the kernel page
cache. Linux has sync_file_range(2), several posix systems have msync(2)
and posix_fadvise(2). sync_file_range(2) is preferable because it
requires no special setup, whereas msync() requires the to-be-flushed
range to be mmap'ed. For the purpose of flushing dirty data
posix_fadvise(2) is the worst alternative, as flushing dirty data is
just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages
from the page cache. Thus the feature is enabled by default only on
linux, but can be enabled on all systems that have any of the above
APIs.
While desirable and likely possible this patch does not contain an
implementation for windows.
With the infrastructure added, writes made via checkpointer, bgwriter
and normal user backends can be flushed after a configurable number of
writes. Each of these sources of writes is controlled by a separate GUC,
checkpointer_flush_after, bgwriter_flush_after and backend_flush_after
respectively; they're separate because the number of flushes that are
good are separate, and because the performance considerations of
controlled flushing for each of these are different.
A later patch will add checkpoint sorting - after that flushes from the
checkpoint will almost always be desirable. Bgwriter flushes are most of
the time going to be random, which are slow on lots of storage hardware.
Flushing in backends works well if the storage and bgwriter can keep up,
but if not it can have negative consequences. This patch is likely to
have negative performance consequences without checkpoint sorting, but
unfortunately so has sorting without flush control.
Discussion: alpine.DEB.2.10.1506011320000.28433@sto
Author: Fabien Coelho and Andres Freund
2016-02-19 21:13:05 +01:00
|
|
|
|
|
|
|
<varlistentry id="guc-bgwriter-flush-after" xreflabel="bgwriter_flush_after">
|
2016-04-24 21:26:55 +02:00
|
|
|
<term><varname>bgwriter_flush_after</varname> (<type>integer</type>)
|
2016-02-19 21:13:05 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>bgwriter_flush_after</varname> configuration parameter</primary>
|
2016-02-19 21:13:05 +01:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
Doc: improve documentation of configuration settings that have units.
When we added the GUC units feature, we didn't make any great effort
to adjust the documentation of individual GUCs; they tended to still
say things like "this is the number of milliseconds that ...", even
though users might prefer to write some other units, and SHOW might
even show the value in other units. Commit 6c9fb69f2 made an effort
to improve this situation, but I thought it made things less readable
by injecting units information in mid-sentence. It also wasn't very
consistent, and did not touch all the GUCs that have units.
To improve matters, standardize on the phrasing "If this value is
specified without units, it is taken as <units>". Also, try to
standardize where this is mentioned, right before the specification
of the default. (In a couple of places, doing that would've required
more rewriting than seemed justified, so I wasn't 100% consistent
about that.) I also tried to use the phrases "amount of time",
"amount of memory", etc rather than describing the contents of GUCs
in other ways, as those were the majority usage in places that weren't
overcommitting to a particular unit. (I left "length of time" alone
in a couple of places, though.)
I failed to resist the temptation to copy-edit some awkward text, too.
Backpatch to v12, like 6c9fb69f2, mainly because v12 hasn't diverged
much from HEAD yet.
Discussion: https://postgr.es/m/15882.1571942223@sss.pgh.pa.us
2019-10-26 18:30:41 +02:00
|
|
|
Whenever more than this amount of data has
|
2017-06-18 20:01:45 +02:00
|
|
|
been written by the background writer, attempt to force the OS to issue these
|
2016-02-19 21:13:05 +01:00
|
|
|
writes to the underlying storage. Doing so will limit the amount of
|
|
|
|
dirty data in the kernel's page cache, reducing the likelihood of
|
2017-06-18 20:01:45 +02:00
|
|
|
stalls when an <function>fsync</function> is issued at the end of a checkpoint, or when
|
2016-02-19 21:13:05 +01:00
|
|
|
the OS writes data back in larger batches in the background. Often
|
|
|
|
that will result in greatly reduced transaction latency, but there
|
|
|
|
are also some cases, especially with workloads that are bigger than
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="guc-shared-buffers"/>, but smaller than the OS's page
|
2016-02-19 21:13:05 +01:00
|
|
|
cache, where performance might degrade. This setting may have no
|
2019-10-26 18:30:41 +02:00
|
|
|
effect on some platforms.
|
|
|
|
If this value is specified without units, it is taken as blocks,
|
|
|
|
that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
|
|
|
|
The valid range is between
|
2016-11-26 00:36:10 +01:00
|
|
|
<literal>0</literal>, which disables forced writeback, and
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>2MB</literal>. The default is <literal>512kB</literal> on Linux,
|
|
|
|
<literal>0</literal> elsewhere. (If <symbol>BLCKSZ</symbol> is not 8kB,
|
2016-11-26 00:36:10 +01:00
|
|
|
the default and maximum values scale proportionally to it.)
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2016-02-19 21:13:05 +01:00
|
|
|
file or on the server command line.
|
|
|
|
</para>
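<para>
As an illustrative sketch only (the value <literal>256kB</literal> is an
arbitrary example, not a recommendation), the setting could be changed
with <command>ALTER SYSTEM</command>, which records it in
<filename>postgresql.auto.conf</filename>, followed by a configuration
reload. With the typical 8kB <symbol>BLCKSZ</symbol>, a unitless value of
<literal>32</literal> would denote the same amount, that is, 32 blocks:
<programlisting>
-- Example value only; tune against your own workload.
ALTER SYSTEM SET bgwriter_flush_after = '256kB';
SELECT pg_reload_conf();     -- the new value takes effect on reload
SHOW bgwriter_flush_after;
</programlisting>
</para>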
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
|
|
|
|
<para>
|
2007-09-25 22:03:38 +02:00
|
|
|
Smaller values of <varname>bgwriter_lru_maxpages</varname> and
|
|
|
|
<varname>bgwriter_lru_multiplier</varname> reduce the extra I/O load
|
2005-09-13 00:11:38 +02:00
|
|
|
caused by the background writer, but make it more likely that server
|
|
|
|
processes will have to issue writes for themselves, delaying interactive
|
|
|
|
queries.
|
|
|
|
</para>
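<para>
One possible way to judge whether these settings are too low is to look
at the cumulative background writer statistics. The query below is only
a sketch, and how to interpret the counters depends on the workload:
<programlisting>
SELECT buffers_clean,       -- buffers written by the background writer
       maxwritten_clean,    -- cleaning scans stopped after writing too many buffers
       buffers_alloc        -- buffers allocated
FROM pg_stat_bgwriter;
</programlisting>
A steadily growing <structfield>maxwritten_clean</structfield> suggests
that the background writer is regularly reaching the
<varname>bgwriter_lru_maxpages</varname> limit.
</para>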
|
|
|
|
</sect2>
|
2009-01-12 06:10:45 +01:00
|
|
|
|
|
|
|
<sect2 id="runtime-config-resource-async-behavior">
|
|
|
|
<title>Asynchronous Behavior</title>
|
|
|
|
|
|
|
|
<variablelist>
|
2021-04-22 02:47:43 +02:00
|
|
|
<varlistentry id="guc-backend-flush-after" xreflabel="backend_flush_after">
|
|
|
|
<term><varname>backend_flush_after</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>backend_flush_after</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Whenever more than this amount of data has
|
|
|
|
been written by a single backend, attempt to force the OS to issue
|
|
|
|
these writes to the underlying storage. Doing so will limit the
|
|
|
|
amount of dirty data in the kernel's page cache, reducing the
|
|
|
|
likelihood of stalls when an <function>fsync</function> is issued at the end of a
|
|
|
|
checkpoint, or when the OS writes data back in larger batches in the
|
|
|
|
background. Often that will result in greatly reduced transaction
|
|
|
|
latency, but there are also some cases, especially with workloads
|
|
|
|
that are bigger than <xref linkend="guc-shared-buffers"/>, but smaller
|
|
|
|
than the OS's page cache, where performance might degrade. This
|
|
|
|
setting may have no effect on some platforms.
|
|
|
|
If this value is specified without units, it is taken as blocks,
|
|
|
|
that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
|
|
|
|
The valid range is
|
|
|
|
between <literal>0</literal>, which disables forced writeback,
|
|
|
|
and <literal>2MB</literal>. The default is <literal>0</literal>, i.e., no
|
|
|
|
forced writeback. (If <symbol>BLCKSZ</symbol> is not 8kB,
|
|
|
|
the maximum value scales proportionally to it.)
|
|
|
|
</para>
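<para>
A minimal sketch, assuming a session that is about to perform a large
bulk write; the values shown are arbitrary examples. Because a unitless
value is taken as blocks, <literal>16</literal> corresponds to 128kB when
<symbol>BLCKSZ</symbol> is 8kB:
<programlisting>
SET backend_flush_after = 16;        -- 16 blocks
SET backend_flush_after = '128kB';   -- the same amount with 8kB blocks
RESET backend_flush_after;           -- return to the configured default
</programlisting>
</para>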
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2009-01-12 06:10:45 +01:00
|
|
|
<varlistentry id="guc-effective-io-concurrency" xreflabel="effective_io_concurrency">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>effective_io_concurrency</varname> (<type>integer</type>)
|
2009-01-12 06:10:45 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>effective_io_concurrency</varname> configuration parameter</primary>
|
2009-01-12 06:10:45 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2009-01-12 06:10:45 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the number of concurrent disk I/O operations that
|
2017-10-09 03:44:17 +02:00
|
|
|
<productname>PostgreSQL</productname> expects can be executed
|
2009-01-12 06:10:45 +01:00
|
|
|
simultaneously. Raising this value will increase the number of I/O
|
2017-10-09 03:44:17 +02:00
|
|
|
operations that any individual <productname>PostgreSQL</productname> session
|
2009-01-12 06:10:45 +01:00
|
|
|
attempts to initiate in parallel. The allowed range is 1 to 1000,
|
2010-10-27 03:44:14 +02:00
|
|
|
or zero to disable issuance of asynchronous I/O requests. Currently,
|
|
|
|
this setting only affects bitmap heap scans.
|
2009-01-12 06:10:45 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2016-06-28 22:09:04 +02:00
|
|
|
For magnetic drives, a good starting point for this setting is the
|
|
|
|
number of separate
|
2009-01-12 06:10:45 +01:00
|
|
|
drives comprising a RAID 0 stripe or RAID 1 mirror being used for the
|
|
|
|
database. (For RAID 5 the parity drive should not be counted.)
|
|
|
|
However, if the database is often busy with multiple queries issued in
|
|
|
|
concurrent sessions, lower values may be sufficient to keep the disk
|
|
|
|
array busy. A value higher than needed to keep the disks busy will
|
|
|
|
only result in extra CPU overhead.
|
2016-06-28 22:09:04 +02:00
|
|
|
SSDs and other memory-based storage can often process many
|
|
|
|
concurrent requests, so the best value might be in the hundreds.
|
2009-01-12 06:10:45 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Asynchronous I/O depends on an effective <function>posix_fadvise</function>
|
2009-01-12 06:10:45 +01:00
|
|
|
function, which some operating systems lack. If the function is not
|
|
|
|
present then setting this parameter to anything but zero will result
|
2009-06-17 23:58:49 +02:00
|
|
|
in an error. On some operating systems (e.g., Solaris), the function
|
|
|
|
is present but does not actually do anything.
|
2009-01-12 06:10:45 +01:00
|
|
|
</para>
|
2014-10-19 03:35:46 +02:00
|
|
|
|
|
|
|
<para>
|
2015-09-08 17:51:42 +02:00
|
|
|
The default is 1 on supported systems, otherwise 0. This value can
|
2015-09-11 03:22:21 +02:00
|
|
|
be overridden for tables in a particular tablespace by setting the
|
2015-09-08 17:51:42 +02:00
|
|
|
tablespace parameter of the same name (see
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="sql-altertablespace"/>).
|
2014-10-19 03:35:46 +02:00
|
|
|
</para>
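<para>
For illustration, the following sketch raises the value for one session
and overrides it for a single tablespace. The tablespace name
<literal>fast_ssd</literal> and both values are hypothetical:
<programlisting>
-- Session-level change, for example while testing a bitmap heap scan.
SET effective_io_concurrency = 4;

-- Per-tablespace override for storage that handles many concurrent requests.
ALTER TABLESPACE fast_ssd SET (effective_io_concurrency = 200);
</programlisting>
</para>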
|
2009-01-12 06:10:45 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
Add new GUC, max_worker_processes, limiting number of bgworkers.
In 9.3, there's no particular limit on the number of bgworkers;
instead, we just count up the number that are actually registered,
and use that to set MaxBackends. However, that approach causes
problems for Hot Standby, which needs both MaxBackends and the
size of the lock table to be the same on the standby as on the
master, yet it may not be desirable to run the same bgworkers in
both places. 9.3 handles that by failing to notice the problem,
which will probably work fine in nearly all cases anyway, but is
not theoretically sound.
A further problem with simply counting the number of registered
workers is that new workers can't be registered without a
postmaster restart. This is inconvenient for administrators,
since bouncing the postmaster causes an interruption of service.
Moreover, there are a number of applications for background
processes where, by necessity, the background process must be
started on the fly (e.g. parallel query). While this patch
doesn't actually make it possible to register new background
workers after startup time, it's a necessary prerequisite.
Patch by me. Review by Michael Paquier.
2013-07-04 17:24:24 +02:00
|
|
|
|
2020-03-16 00:31:34 +01:00
|
|
|
<varlistentry id="guc-maintenance-io-concurrency" xreflabel="maintenance_io_concurrency">
|
|
|
|
<term><varname>maintenance_io_concurrency</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>maintenance_io_concurrency</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Similar to <varname>effective_io_concurrency</varname>, but used
|
|
|
|
for maintenance work that is done on behalf of many client sessions.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The default is 10 on supported systems, otherwise 0. This value can
|
|
|
|
be overridden for tables in a particular tablespace by setting the
|
|
|
|
tablespace parameter of the same name (see
|
|
|
|
<xref linkend="sql-altertablespace"/>).
|
|
|
|
</para>
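<para>
As a sketch under the same assumptions as for
<varname>effective_io_concurrency</varname> (the tablespace name
<literal>fast_ssd</literal> and the values are hypothetical), the
parameter could be set server-wide and then overridden for one
tablespace:
<programlisting>
ALTER SYSTEM SET maintenance_io_concurrency = 20;
SELECT pg_reload_conf();

ALTER TABLESPACE fast_ssd SET (maintenance_io_concurrency = 100);
</programlisting>
</para>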
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2020-04-02 21:04:51 +02:00
|
|
|
|
2013-07-05 16:19:16 +02:00
|
|
|
<varlistentry id="guc-max-worker-processes" xreflabel="max_worker_processes">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>max_worker_processes</varname> (<type>integer</type>)
|
2013-07-04 17:24:24 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>max_worker_processes</varname> configuration parameter</primary>
|
2013-07-04 17:24:24 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
Add new GUC, max_worker_processes, limiting number of bgworkers.
In 9.3, there's no particular limit on the number of bgworkers;
instead, we just count up the number that are actually registered,
and use that to set MaxBackends. However, that approach causes
problems for Hot Standby, which needs both MaxBackends and the
size of the lock table to be the same on the standby as on the
master, yet it may not be desirable to run the same bgworkers in
both places. 9.3 handles that by failing to notice the problem,
which will probably work fine in nearly all cases anyway, but is
not theoretically sound.
A further problem with simply counting the number of registered
workers is that new workers can't be registered without a
postmaster restart. This is inconvenient for administrators,
since bouncing the postmaster causes an interruption of service.
Moreover, there are a number of applications for background
processes where, by necessity, the background process must be
started on the fly (e.g. parallel query). While this patch
doesn't actually make it possible to register new background
workers after startup time, it's a necessary prerequisite.
Patch by me. Review by Michael Paquier.
2013-07-04 17:24:24 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the maximum number of background processes that the system
|
2016-05-12 15:15:49 +02:00
|
|
|
can support. This parameter can only be set at server start. The
|
2016-12-05 16:53:21 +01:00
|
|
|
default is 8.
|
Add new GUC, max_worker_processes, limiting number of bgworkers.
In 9.3, there's no particular limit on the number of bgworkers;
instead, we just count up the number that are actually registered,
and use that to set MaxBackends. However, that approach causes
problems for Hot Standby, which needs both MaxBackends and the
size of the lock table to be the same on the standby as on the
master, yet it may not be desirable to run the same bgworkers in
both places. 9.3 handles that by failing to notice the problem,
which will probably work fine in nearly all cases anyway, but is
not theoretically sound.
A further problem with simply counting the number of registered
workers is that new workers can't be registered without a
postmaster restart. This is inconvenient for administrators,
since bouncing the postmaster causes an interruption of service.
Moreover, there are a number of applications for background
processes where, by necessity, the background process must be
started on the fly (e.g. parallel query). While this patch
doesn't actually make it possible to register new background
workers after startup time, it's a necessary prerequisite.
Patch by me. Review by Michael Paquier.
2013-07-04 17:24:24 +02:00
|
|
|
</para>
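
<para>
As an illustrative sketch (the value 16 is an arbitrary assumption, not
a recommendation), the setting can be changed with
<command>ALTER SYSTEM</command>. Because the parameter is read only at
server start, the new value takes effect only after the server has been
restarted:
</para>
<programlisting>
-- Records the setting in postgresql.auto.conf; 16 is an example value only.
ALTER SYSTEM SET max_worker_processes = 16;

-- After restarting the server, confirm the value now in effect.
SHOW max_worker_processes;
</programlisting>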

<para>
When running a standby server, you must set this parameter to a value
equal to or higher than the value on the primary server. Otherwise,
queries will not be allowed on the standby server.
</para>

<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
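
<para>
The following <filename>postgresql.conf</filename> fragment is a minimal
sketch of adjusting these settings together. All numbers are assumed
example values, chosen only to show each limit fitting inside the one
above it; suitable values depend on the workload and hardware.
</para>
<programlisting>
max_worker_processes = 16              # pool shared by all background workers
max_parallel_workers = 12              # portion of the pool usable for parallel operations
max_parallel_maintenance_workers = 4   # limit for a single utility command
max_parallel_workers_per_gather = 4    # limit for a single Gather or Gather Merge node
</programlisting>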
</listitem>
</varlistentry>

<varlistentry id="guc-max-parallel-workers-per-gather" xreflabel="max_parallel_workers_per_gather">
<term><varname>max_parallel_workers_per_gather</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_parallel_workers_per_gather</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the maximum number of workers that can be started by a single
<literal>Gather</literal> or <literal>Gather Merge</literal> node.
Parallel workers are taken from the pool of processes established by
<xref linkend="guc-max-worker-processes"/>, limited by
<xref linkend="guc-max-parallel-workers"/>. Note that the requested
number of workers may not actually be available at run time. If this
occurs, the plan will run with fewer workers than expected, which may
be inefficient. The default value is 2. Setting this value to 0
disables parallel query execution.
</para>
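
<para>
One way to see whether the requested workers were actually available
is to compare the <literal>Workers Planned</literal> and
<literal>Workers Launched</literal> lines that
<command>EXPLAIN (ANALYZE)</command> reports for a
<literal>Gather</literal> node. The sketch below assumes a hypothetical
table named <literal>measurements</literal> that is large enough for
the planner to choose a parallel plan; for a small table the planner
may not use parallelism at all.
</para>
<programlisting>
-- Session-level change; 4 is an example value only.
SET max_parallel_workers_per_gather = 4;

-- "measurements" is a hypothetical table used for illustration.
EXPLAIN (ANALYZE) SELECT count(*) FROM measurements;
</programlisting>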

<para>
Note that parallel queries may consume substantially more
resources than non-parallel queries, because each worker process is
a completely separate process which has roughly the same impact on the
system as an additional user session. This should be taken into
account when choosing a value for this setting, as well as when
configuring other settings that control resource utilization, such
as <xref linkend="guc-work-mem"/>. Resource limits such as
<varname>work_mem</varname> are applied individually to each worker,
which means the total utilization may be much higher across all
processes than it would normally be for any single process.
For example, a parallel query using 4 workers may use up to 5 times
as much CPU time, memory, I/O bandwidth, and so forth as a query which
uses no workers at all.
</para>
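
<para>
The arithmetic behind that factor of 5 can be sketched as follows. The
values are assumptions chosen for illustration, and the result is only
a rough upper bound for a single sort step in a plan where the leader
also participates.
</para>
<programlisting>
SET work_mem = '64MB';                     -- example value only
SET max_parallel_workers_per_gather = 4;   -- example value only

-- 4 workers + 1 participating leader = 5 processes
-- 5 processes * 64MB = 320MB for one sort step executed in every process
</programlisting>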

<para>
For more information on parallel query, see
<xref linkend="parallel-query"/>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-max-parallel-maintenance-workers" xreflabel="max_parallel_maintenance_workers">
<term><varname>max_parallel_maintenance_workers</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_parallel_maintenance_workers</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the maximum number of parallel workers that can be
started by a single utility command. Currently, the utility
commands that support the use of parallel workers are
<command>CREATE INDEX</command>, but only when building a B-tree index,
and <command>VACUUM</command> without the <literal>FULL</literal>
option. Parallel workers are taken from the pool of processes
established by <xref linkend="guc-max-worker-processes"/>, limited
by <xref linkend="guc-max-parallel-workers"/>. Note that the requested
number of workers may not actually be available at run time.
If this occurs, the utility operation will run with fewer
workers than expected. The default value is 2. Setting this
value to 0 disables the use of parallel workers by utility
commands.
</para>
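
<para>
A minimal sketch of both supported commands follows. The table
<literal>measurements</literal>, its column
<literal>recorded_at</literal>, and the index name are hypothetical, and
the second statement assumes a server version in which
<command>VACUUM</command> accepts the <literal>PARALLEL</literal> option.
</para>
<programlisting>
-- Session-level change; 4 is an example value only.
SET max_parallel_maintenance_workers = 4;

-- B-tree is the default index type, so this build may use parallel workers.
CREATE INDEX measurements_recorded_at_idx ON measurements (recorded_at);

-- Requests up to 4 workers for this run; the number actually used may be lower.
VACUUM (PARALLEL 4) measurements;
</programlisting>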

<para>
Note that parallel utility commands should not consume
substantially more memory than equivalent non-parallel
operations. This strategy differs from that of parallel
query, where resource limits generally apply per worker
process. Parallel utility commands treat the resource limit
<varname>maintenance_work_mem</varname> as a limit to be applied to
the entire utility command, regardless of the number of
parallel worker processes. However, parallel utility
commands may still consume substantially more CPU resources
and I/O bandwidth.
</para>
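
<para>
To make the contrast with parallel query concrete, the sketch below
uses assumed example values and a hypothetical table and index. The
memory budget stated in the comment follows from the description above
and is not a measured result.
</para>
<programlisting>
SET maintenance_work_mem = '1GB';          -- example value only
SET max_parallel_maintenance_workers = 4;  -- example value only

-- The 1GB limit applies to the whole command, not to each worker process.
CREATE INDEX measurements_sensor_id_idx ON measurements (sensor_id);
</programlisting>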
</listitem>
</varlistentry>

<varlistentry id="guc-max-parallel-workers" xreflabel="max_parallel_workers">
<term><varname>max_parallel_workers</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_parallel_workers</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the maximum number of workers that the system can support for
parallel operations. The default value is 8. When increasing or
decreasing this value, consider also adjusting
<xref linkend="guc-max-parallel-maintenance-workers"/> and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
Also, note that setting this value higher than
<xref linkend="guc-max-worker-processes"/> has no effect,
since parallel workers are taken from the pool of worker processes
established by that setting.
</para>
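
<para>
As a quick check that these related settings are consistent with one
another, their current values can be read side by side from the
<structname>pg_settings</structname> view. This is only a sketch of one
possible check, not a required procedure.
</para>
<programlisting>
-- Lists the four related limits so they can be compared at a glance.
SELECT name, setting
FROM pg_settings
WHERE name IN ('max_worker_processes',
               'max_parallel_workers',
               'max_parallel_maintenance_workers',
               'max_parallel_workers_per_gather')
ORDER BY name;
</programlisting>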
</listitem>
</varlistentry>

<varlistentry id="guc-parallel-leader-participation" xreflabel="parallel_leader_participation">
<term>
<varname>parallel_leader_participation</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>parallel_leader_participation</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Allows the leader process to execute the query plan under
<literal>Gather</literal> and <literal>Gather Merge</literal> nodes
instead of waiting for worker processes. The default is
<literal>on</literal>. Setting this value to <literal>off</literal>
reduces the likelihood that workers will become blocked because the
leader is not reading tuples fast enough, but requires the leader
process to wait for worker processes to start up before the first
tuples can be produced. The degree to which the leader can help or
hinder performance depends on the plan type, number of workers and
query duration.
2016-02-19 21:13:05 +01:00
|
|
|
</para>
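<para>
As a rough sketch of how the effect of this setting might be compared for
a single query, the parameter can be changed for the current session and
the resulting plan timings inspected. The table name
<literal>big_table</literal> is hypothetical, and whether a parallel plan
is chosen at all depends on the planner settings and the table size:
<programlisting>
-- Keep the leader out of the work below Gather / Gather Merge nodes
SET parallel_leader_participation = off;

-- Run the query and show per-node timings (big_table is a placeholder)
EXPLAIN (ANALYZE) SELECT count(*) FROM big_table;

-- Return to the configured default for the rest of the session
RESET parallel_leader_participation;
</programlisting>
Running the same <command>EXPLAIN</command> with the setting
<literal>on</literal> and <literal>off</literal> gives a direct
comparison for that plan shape and worker count.
</para>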
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2016-04-08 21:36:30 +02:00
|
|
|
|
|
|
|
<varlistentry id="guc-old-snapshot-threshold" xreflabel="old_snapshot_threshold">
|
|
|
|
<term><varname>old_snapshot_threshold</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>old_snapshot_threshold</varname> configuration parameter</primary>
|
2016-04-08 21:36:30 +02:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2019-10-26 18:30:41 +02:00
|
|
|
Sets the minimum amount of time that a query snapshot can be used
|
|
|
|
without risk of a <quote>snapshot too old</quote> error occurring
|
|
|
|
when using the snapshot. Data that has been dead for longer than
|
|
|
|
this threshold is allowed to be vacuumed away. This can help
|
2016-04-08 21:36:30 +02:00
|
|
|
prevent bloat in the face of snapshots which remain in use for a
|
|
|
|
long time. To prevent incorrect results due to cleanup of data which
|
|
|
|
would otherwise be visible to the snapshot, an error is generated
|
|
|
|
when the snapshot is older than this threshold and the snapshot is
|
|
|
|
used to read a page which has been modified since the snapshot was
|
|
|
|
built.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2019-10-26 18:30:41 +02:00
|
|
|
If this value is specified without units, it is taken as minutes.
|
|
|
|
A value of <literal>-1</literal> (the default) disables this feature,
|
|
|
|
effectively setting the snapshot age limit to infinity.
|
|
|
|
This parameter can only be set at server start.
|
|
|
|
</para>
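<para>
For illustration only, one way to set this parameter with explicit units
is shown below. The value <literal>6h</literal> is an arbitrary example,
not a recommendation. Because the parameter can only be set at server
start, the new value takes effect after the next restart:
<programlisting>
-- Written to postgresql.auto.conf; applied at the next server start
ALTER SYSTEM SET old_snapshot_threshold = '6h';

-- After restarting the server, check the value in effect
SHOW old_snapshot_threshold;
</programlisting>
Writing the unit explicitly avoids relying on the default unit of
minutes.
</para>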
|
|
|
|
|
|
|
|
<para>
|
2016-04-08 21:36:30 +02:00
|
|
|
Useful values for production work probably range from a small number
|
2019-10-26 18:30:41 +02:00
|
|
|
of hours to a few days. Small values (such as <literal>0</literal> or
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>1min</literal>) are only allowed because they may sometimes be
|
|
|
|
useful for testing. While a setting as high as <literal>60d</literal> is
|
2016-04-08 21:36:30 +02:00
|
|
|
allowed, please note that in many workloads extreme bloat or
|
|
|
|
transaction ID wraparound may occur in much shorter time frames.
|
|
|
|
</para>
|
2016-05-06 14:47:12 +02:00
|
|
|
|
2016-07-19 23:25:53 +02:00
|
|
|
<para>
|
|
|
|
When this feature is enabled, freed space at the end of a relation
|
|
|
|
cannot be released to the operating system, since that could remove
|
2019-10-26 18:30:41 +02:00
|
|
|
information needed to detect the <quote>snapshot too old</quote>
|
2016-07-19 23:25:53 +02:00
|
|
|
condition. All space allocated to a relation remains associated with
|
|
|
|
that relation for reuse only within that relation unless explicitly
|
2017-10-09 03:44:17 +02:00
|
|
|
freed (for example, with <command>VACUUM FULL</command>).
|
2016-07-19 23:25:53 +02:00
|
|
|
</para>
|
|
|
|
|
2016-05-06 14:47:12 +02:00
|
|
|
<para>
|
|
|
|
This setting does not attempt to guarantee that an error will be
|
|
|
|
generated under any particular circumstances. In fact, if the
|
|
|
|
correct results can be generated from (for example) a cursor which
|
|
|
|
has materialized a result set, no error will be generated even if the
|
|
|
|
underlying rows in the referenced table have been vacuumed away.
|
|
|
|
Some tables cannot safely be vacuumed early, and so will not be
|
2017-03-14 18:27:02 +01:00
|
|
|
affected by this setting, such as system catalogs. For such tables
|
|
|
|
this setting will neither reduce bloat nor create a possibility
|
2019-10-26 18:30:41 +02:00
|
|
|
of a <quote>snapshot too old</quote> error on scanning.
|
2016-05-06 14:47:12 +02:00
|
|
|
</para>
|
2016-04-08 21:36:30 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2009-01-12 06:10:45 +01:00
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
2005-09-13 00:11:38 +02:00
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="runtime-config-wal">
|
|
|
|
<title>Write Ahead Log</title>
|
|
|
|
|
|
|
|
<para>
|
2013-03-15 22:41:47 +01:00
|
|
|
For additional information on tuning these settings,
|
2017-11-23 15:39:47 +01:00
|
|
|
see <xref linkend="wal-configuration"/>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<sect2 id="runtime-config-wal-settings">
|
|
|
|
<title>Settings</title>
|
|
|
|
<variablelist>
|
2010-04-29 23:36:19 +02:00
|
|
|
|
2010-04-28 18:10:43 +02:00
|
|
|
<varlistentry id="guc-wal-level" xreflabel="wal_level">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>wal_level</varname> (<type>enum</type>)
|
2010-04-28 18:10:43 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>wal_level</varname> configuration parameter</primary>
|
2010-04-28 18:10:43 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2010-04-28 18:10:43 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>wal_level</varname> determines how much information is written to
|
|
|
|
the WAL. The default value is <literal>replica</literal>, which writes enough
|
2017-01-14 17:14:56 +01:00
|
|
|
data to support WAL archiving and replication, including running
|
2017-10-09 03:44:17 +02:00
|
|
|
read-only queries on a standby server. <literal>minimal</literal> removes all
|
2017-01-14 17:14:56 +01:00
|
|
|
logging except the information required to recover from a crash or
|
|
|
|
immediate shutdown. Finally,
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>logical</literal> adds information necessary to support logical
|
2013-12-11 00:33:45 +01:00
|
|
|
decoding. Each level includes the information logged at all lower
|
|
|
|
levels. This parameter can only be set at server start.
|
2010-04-28 18:10:43 +02:00
|
|
|
</para>
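<para>
As a minimal sketch, the level can be raised to
<literal>logical</literal> as follows. The choice of level here is only
an example. Since the parameter can only be set at server start, the
change takes effect after a restart:
<programlisting>
-- Written to postgresql.auto.conf; applied at the next server start
ALTER SYSTEM SET wal_level = 'logical';

-- After restarting the server, confirm the level in effect
SHOW wal_level;
</programlisting>
Because each level includes the information logged at all lower levels,
a server running at <literal>logical</literal> still supports WAL
archiving and replication.
</para>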
|
|
|
|
<para>
|
2020-04-04 21:25:34 +02:00
|
|
|
At the <literal>minimal</literal> level, no information is logged for
|
|
|
|
permanent relations for the remainder of a transaction that creates or
|
|
|
|
rewrites them. This can make operations much faster (see
|
|
|
|
<xref linkend="populate-pitr"/>). Operations that initiate this
|
|
|
|
optimization include:
|
2011-08-04 18:06:53 +02:00
|
|
|
<simplelist>
|
2020-04-04 21:25:34 +02:00
|
|
|
<member><command>ALTER ... SET TABLESPACE</command></member>
|
2017-10-09 03:44:17 +02:00
|
|
|
<member><command>CLUSTER</command></member>
|
2020-04-04 21:25:34 +02:00
|
|
|
<member><command>CREATE TABLE</command></member>
|
|
|
|
<member><command>REFRESH MATERIALIZED VIEW</command>
|
|
|
|
(without <option>CONCURRENTLY</option>)</member>
|
|
|
|
<member><command>REINDEX</command></member>
|
|
|
|
<member><command>TRUNCATE</command></member>
|
2011-08-04 18:06:53 +02:00
|
|
|
</simplelist>
|
2013-12-11 00:33:45 +01:00
|
|
|
However, minimal WAL does not contain enough information to reconstruct the
|
2017-10-09 03:44:17 +02:00
|
|
|
data from a base backup and the WAL logs, so <literal>replica</literal> or
|
2013-12-11 00:33:45 +01:00
|
|
|
higher must be used to enable WAL archiving
|
2017-11-23 15:39:47 +01:00
|
|
|
(<xref linkend="guc-archive-mode"/>) and streaming replication.
|
Stop archive recovery if WAL generated with wal_level=minimal is found.
Previously if hot standby was enabled, archive recovery exited with
an error when it found WAL generated with wal_level=minimal.
But if hot standby was disabled, it just reported a warning and
continued in that case. Which could lead to data loss or errors
during normal operation. A warning was emitted, but users could
easily miss that and not notice this serious situation until
they encountered the actual errors.
To improve this situation, this commit changes archive recovery
so that it exits with FATAL error when it finds WAL generated with
wal_level=minimal whatever the setting of hot standby. This enables
users to notice the serious situation soon.
The FATAL error is thrown if archive recovery starts from a base
backup taken before wal_level is changed to minimal. When archive
recovery exits with the error, if users have a base backup taken
after setting wal_level to higher than minimal, they can recover
the database by starting archive recovery from that newer backup.
But note that if such backup doesn't exist, there is no easy way to
complete archive recovery, which may make the database server
unstartable and users may lose whole database. The commit adds
the note about this risk into the document.
Even in the case of unstartable database server, previously by just
disabling hot standby users could avoid the error during archive
recovery, forcibly start up the server and salvage data from it.
But note that this commit makes this procedure unavailable at all.
Author: Takamichi Osumi
Reviewed-by: Laurenz Albe, Kyotaro Horiguchi, David Steele, Fujii Masao
Discussion: https://postgr.es/m/OSBPR01MB4888CBE1DA08818FD2D90ED8EDF90@OSBPR01MB4888.jpnprd01.prod.outlook.com
2021-04-06 15:56:51 +02:00
|
|
|
Note that changing <varname>wal_level</varname> to
|
|
|
|
<literal>minimal</literal> makes any previously taken base backups
|
|
|
|
unusable for archive recovery and for standby servers, which may
|
2021-04-09 06:53:07 +02:00
|
|
|
lead to data loss.
|
Introduce wal_level GUC to explicitly control if information needed for
archival or hot standby should be WAL-logged, instead of deducing that from
other options like archive_mode. This replaces recovery_connections GUC in
the primary, where it now has no effect, but it's still used in the standby
to enable/disable hot standby.
Remove the WAL-logging of "unlogged operations", like creating an index
without WAL-logging and fsyncing it at the end. Instead, we keep a copy of
the wal_mode setting and the settings that affect how much shared memory a
hot standby server needs to track master transactions (max_connections,
max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings
change, at server restart, write a WAL record noting the new settings and
update pg_control. This allows us to notice the change in those settings in
the standby at the right moment, they used to be included in checkpoint
records, but that meant that a changed value was not reflected in the
standby until the first checkpoint after the change.
Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to
the sequence it used to follow, before hot standby and subsequent patches
changed it to 0x9003.
2010-04-28 18:10:43 +02:00
|
|
|
</para>
|
2013-12-11 00:33:45 +01:00
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
At the <literal>logical</literal> level, the same information is logged as
|
|
|
|
with <literal>replica</literal>, plus information needed to allow
|
2015-09-11 03:22:21 +02:00
|
|
|
extracting logical change sets from the WAL. Using a level of
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>logical</literal> will increase the WAL volume, particularly if many
|
2013-12-11 00:33:45 +01:00
|
|
|
tables are configured for <literal>REPLICA IDENTITY FULL</literal> and
|
2017-10-09 03:44:17 +02:00
|
|
|
many <command>UPDATE</command> and <command>DELETE</command> statements are
|
2013-12-11 00:33:45 +01:00
|
|
|
executed.
|
2010-04-28 18:10:43 +02:00
|
|
|
</para>
|
2016-03-01 02:01:54 +01:00
|
|
|
<para>
|
|
|
|
In releases prior to 9.6, this parameter also allowed the
|
|
|
|
values <literal>archive</literal> and <literal>hot_standby</literal>.
|
|
|
|
These are still accepted but mapped to <literal>replica</literal>.
|
|
|
|
</para>
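<para>
As a minimal illustrative sketch (not a recommendation), the levels
described above would be selected in <filename>postgresql.conf</filename>
as follows. The choice of <literal>replica</literal> here is only an
assumption for the example.
</para>
<programlisting>
# postgresql.conf
# replica (or higher) is needed for WAL archiving and streaming replication;
# logical additionally allows extracting logical change sets from the WAL.
wal_level = replica        # other values: minimal, logical
</programlisting>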
|
2010-04-28 18:10:43 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-fsync" xreflabel="fsync">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>fsync</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>fsync</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
If this parameter is on, the <productname>PostgreSQL</productname> server
|
2005-10-22 23:56:07 +02:00
|
|
|
will try to make sure that updates are physically written to
|
2017-10-09 03:44:17 +02:00
|
|
|
disk, by issuing <function>fsync()</function> system calls or various
|
2017-11-23 15:39:47 +01:00
|
|
|
equivalent methods (see <xref linkend="guc-wal-sync-method"/>).
|
2005-10-22 23:56:07 +02:00
|
|
|
This ensures that the database cluster can recover to a
|
2005-09-13 00:11:38 +02:00
|
|
|
consistent state after an operating system or hardware crash.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2010-05-31 17:50:48 +02:00
|
|
|
While turning off <varname>fsync</varname> is often a performance
|
|
|
|
benefit, this can result in unrecoverable data corruption in
|
2010-12-09 02:01:09 +01:00
|
|
|
the event of a power failure or system crash. Thus it
|
|
|
|
is only advisable to turn off <varname>fsync</varname> if
|
2010-05-31 17:50:48 +02:00
|
|
|
you can easily recreate your entire database from external
|
|
|
|
data.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2010-05-31 17:50:48 +02:00
|
|
|
Examples of safe circumstances for turning off
|
2010-12-09 02:01:09 +01:00
|
|
|
<varname>fsync</varname> include the initial loading of a new
|
2010-05-31 17:50:48 +02:00
|
|
|
database cluster from a backup file, using a database cluster
|
2010-12-09 02:01:09 +01:00
|
|
|
for processing a batch of data after which the database
|
|
|
|
will be thrown away and recreated,
|
|
|
|
or for a read-only database clone which
|
2010-05-31 17:50:48 +02:00
|
|
|
gets recreated frequently and is not used for failover. High
|
|
|
|
quality hardware alone is not a sufficient justification for
|
|
|
|
turning off <varname>fsync</varname>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
2012-12-04 04:47:59 +01:00
|
|
|
<para>
|
|
|
|
For reliable recovery when changing <varname>fsync</varname>
|
|
|
|
from off to on, it is necessary to force all modified buffers in the
|
|
|
|
kernel to durable storage. This can be done while the cluster
|
2017-06-18 20:01:45 +02:00
|
|
|
is shut down or while <varname>fsync</varname> is on by running <command>initdb
|
2017-10-09 03:44:17 +02:00
|
|
|
--sync-only</command>, running <command>sync</command>, unmounting the
|
2012-12-04 04:47:59 +01:00
|
|
|
file system, or rebooting the server.
|
|
|
|
</para>
|
|
|
|
|
2007-08-02 00:45:09 +02:00
|
|
|
<para>
|
2017-11-23 15:39:47 +01:00
|
|
|
In many situations, turning off <xref linkend="guc-synchronous-commit"/>
|
2007-08-02 00:45:09 +02:00
|
|
|
for noncritical transactions can provide much of the potential
|
|
|
|
performance benefit of turning off <varname>fsync</varname>, without
|
2008-02-03 00:29:12 +01:00
|
|
|
the attendant risks of data corruption.
|
2007-08-02 00:45:09 +02:00
|
|
|
</para>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>fsync</varname> can only be set in the <filename>postgresql.conf</filename>
|
2006-01-23 19:16:41 +01:00
|
|
|
file or on the server command line.
|
2010-11-23 21:27:50 +01:00
|
|
|
If you turn this parameter off, also consider turning off
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="guc-full-page-writes"/>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
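<para>
A hedged sketch of the combination mentioned above, assuming a
disposable cluster whose entire contents can be recreated from
external data (for example, a one-off bulk-load or batch-processing
instance). These settings are unsafe for any data you cannot rebuild.
</para>
<programlisting>
# postgresql.conf (throwaway cluster only)
fsync = off               # do not force updates out to disk
full_page_writes = off    # skip full page images in WAL as well
</programlisting>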
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-08-02 00:45:09 +02:00
|
|
|
|
|
|
|
<varlistentry id="guc-synchronous-commit" xreflabel="synchronous_commit">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>synchronous_commit</varname> (<type>enum</type>)
|
2007-08-02 00:45:09 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>synchronous_commit</varname> configuration parameter</primary>
|
2007-08-02 00:45:09 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-08-02 00:45:09 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2020-10-15 21:15:29 +02:00
|
|
|
Specifies how much WAL processing must complete before
|
|
|
|
the database server returns a <quote>success</quote>
|
|
|
|
indication to the client. Valid values are
|
|
|
|
<literal>remote_apply</literal>, <literal>on</literal>
|
|
|
|
(the default), <literal>remote_write</literal>,
|
|
|
|
<literal>local</literal>, and <literal>off</literal>.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
If <varname>synchronous_standby_names</varname> is empty,
|
|
|
|
the only meaningful settings are <literal>on</literal> and
|
|
|
|
<literal>off</literal>; <literal>remote_apply</literal>,
|
|
|
|
<literal>remote_write</literal> and <literal>local</literal>
|
|
|
|
all provide the same local synchronization level
|
|
|
|
as <literal>on</literal>. The local behavior of all
|
|
|
|
non-<literal>off</literal> modes is to wait for local flush of WAL
|
|
|
|
to disk. In <literal>off</literal> mode, there is no waiting,
|
|
|
|
so there can be a delay between when success is reported to the
|
|
|
|
client and when the transaction is later guaranteed to be safe
|
|
|
|
against a server crash. (The maximum
|
2017-11-23 15:39:47 +01:00
|
|
|
delay is three times <xref linkend="guc-wal-writer-delay"/>.) Unlike
|
|
|
|
<xref linkend="guc-fsync"/>, setting this parameter to <literal>off</literal>
|
2010-06-28 23:57:17 +02:00
|
|
|
does not create any risk of database inconsistency: an operating
|
2010-06-29 00:46:11 +02:00
|
|
|
system or database crash might
|
2007-08-02 00:45:09 +02:00
|
|
|
result in some recent allegedly-committed transactions being lost, but
|
|
|
|
the database state will be just the same as if those transactions had
|
2017-10-09 03:44:17 +02:00
|
|
|
been aborted cleanly. So, turning <varname>synchronous_commit</varname> off
|
2007-08-02 00:45:09 +02:00
|
|
|
can be a useful alternative when performance is more important than
|
|
|
|
exact certainty about the durability of a transaction. For more
|
2017-11-23 15:39:47 +01:00
|
|
|
discussion see <xref linkend="wal-async-commit"/>.
|
2007-08-02 00:45:09 +02:00
|
|
|
</para>
|
2020-10-15 21:15:29 +02:00
|
|
|
|
2011-04-04 22:13:01 +02:00
|
|
|
<para>
|
2020-10-15 21:15:29 +02:00
|
|
|
If <xref linkend="guc-synchronous-standby-names"/> is non-empty,
|
|
|
|
<varname>synchronous_commit</varname> also controls whether
|
|
|
|
transaction commits will wait for their WAL records to be
|
|
|
|
processed on the standby server(s).
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
When set to <literal>remote_apply</literal>, commits will wait
|
|
|
|
until replies from the current synchronous standby(s) indicate they
|
|
|
|
have received the commit record of the transaction and applied
|
|
|
|
it, so that it has become visible to queries on the standby(s),
|
|
|
|
and also written to durable storage on the standbys. This will
|
|
|
|
cause much larger commit delays than the other settings since
|
|
|
|
it waits for WAL replay. When set to <literal>on</literal>,
|
|
|
|
commits wait until replies
|
2016-04-30 16:54:45 +02:00
|
|
|
from the current synchronous standby(s) indicate they have received
|
2020-10-15 21:15:29 +02:00
|
|
|
the commit record of the transaction and flushed it to durable storage. This
|
2016-04-30 16:54:45 +02:00
|
|
|
ensures the transaction will not be lost unless both the primary and
|
|
|
|
all synchronous standbys suffer corruption of their database storage.
|
2017-10-09 03:44:17 +02:00
|
|
|
When set to <literal>remote_write</literal>, commits will wait until replies
|
2016-04-30 16:54:45 +02:00
|
|
|
from the current synchronous standby(s) indicate they have
|
2020-10-15 21:15:29 +02:00
|
|
|
received the commit record of the transaction and written it to
|
|
|
|
their file systems. This setting ensures data preservation if a standby instance of
|
|
|
|
<productname>PostgreSQL</productname> crashes, but not if the standby
|
|
|
|
suffers an operating-system-level crash because the data has not
|
2020-08-31 21:23:19 +02:00
|
|
|
necessarily reached durable storage on the standby.
|
2020-10-15 21:15:29 +02:00
|
|
|
The setting <literal>local</literal> causes commits to wait for
|
|
|
|
local flush to disk, but not for replication. This is usually not
|
2016-04-30 16:54:45 +02:00
|
|
|
desirable when synchronous replication is in use, but is provided for
|
|
|
|
completeness.
|
2012-08-22 20:04:02 +02:00
|
|
|
</para>
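<para>
As an illustration of the modes described above, the following
sketch makes commits wait until the synchronous standby has replayed
the commit record. The standby name <literal>standby1</literal> is
hypothetical and stands in for whatever name your standby reports.
</para>
<programlisting>
# postgresql.conf on the primary
synchronous_standby_names = 'standby1'   # hypothetical standby name
synchronous_commit = remote_apply        # wait for replay on the standby
</programlisting>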
|
2020-10-15 21:15:29 +02:00
|
|
|
|
2007-08-02 00:45:09 +02:00
|
|
|
<para>
|
|
|
|
This parameter can be changed at any time; the behavior for any
|
|
|
|
one transaction is determined by the setting in effect when it
|
|
|
|
commits. It is therefore possible, and useful, to have some
|
|
|
|
transactions commit synchronously and others asynchronously.
|
2010-08-17 06:37:21 +02:00
|
|
|
For example, to make a single multistatement transaction commit
|
2008-02-03 00:29:12 +01:00
|
|
|
asynchronously when the default is the opposite, issue <command>SET
|
2017-10-09 03:44:17 +02:00
|
|
|
LOCAL synchronous_commit TO OFF</command> within the transaction.
|
2007-08-02 00:45:09 +02:00
|
|
|
</para>
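<para>
The per-transaction usage described above can be sketched as
follows. The table <literal>audit_log</literal> and its column are
hypothetical. The setting reverts automatically when the
transaction ends.
</para>
<programlisting>
BEGIN;
-- Applies only to this transaction.
SET LOCAL synchronous_commit TO OFF;
INSERT INTO audit_log (message) VALUES ('noncritical event');
COMMIT;  -- returns without waiting for the local WAL flush
</programlisting>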
|
2020-10-15 21:15:29 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
<xref linkend="synchronous-commit-matrix"/> summarizes the
|
|
|
|
capabilities of the <varname>synchronous_commit</varname> settings.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<table id="synchronous-commit-matrix">
|
|
|
|
<title>synchronous_commit Modes</title>
|
|
|
|
<tgroup cols="5">
|
2020-10-16 17:36:34 +02:00
|
|
|
<colspec colname="col1" colwidth="1.5*"/>
|
2020-10-15 21:15:29 +02:00
|
|
|
<colspec colname="col2" colwidth="1*"/>
|
|
|
|
<colspec colname="col3" colwidth="1*"/>
|
|
|
|
<colspec colname="col4" colwidth="1*"/>
|
|
|
|
<colspec colname="col5" colwidth="1*"/>
|
|
|
|
<thead>
|
|
|
|
<row>
|
|
|
|
<entry>synchronous_commit setting</entry>
|
|
|
|
<entry>local durable commit</entry>
|
|
|
|
<entry>standby durable commit after PG crash</entry>
|
|
|
|
<entry>standby durable commit after OS crash</entry>
|
|
|
|
<entry>standby query consistency</entry>
|
|
|
|
</row>
|
|
|
|
</thead>
|
|
|
|
|
|
|
|
<tbody>
|
|
|
|
|
|
|
|
<row>
|
|
|
|
<entry>remote_apply</entry>
|
|
|
|
<entry align="center">•</entry>
|
|
|
|
<entry align="center">•</entry>
|
|
|
|
<entry align="center">•</entry>
|
|
|
|
<entry align="center">•</entry>
|
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
|
|
|
<entry>on</entry>
|
|
|
|
<entry align="center">•</entry>
|
|
|
|
<entry align="center">•</entry>
|
|
|
|
<entry align="center">•</entry>
|
|
|
|
<entry align="center"></entry>
|
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
|
|
|
<entry>remote_write</entry>
|
|
|
|
<entry align="center">•</entry>
|
|
|
|
<entry align="center">•</entry>
|
|
|
|
<entry align="center"></entry>
|
|
|
|
<entry align="center"></entry>
|
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
|
|
|
<entry>local</entry>
|
|
|
|
<entry align="center">•</entry>
|
|
|
|
<entry align="center"></entry>
|
|
|
|
<entry align="center"></entry>
|
|
|
|
<entry align="center"></entry>
|
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
|
|
|
<entry>off</entry>
|
|
|
|
<entry align="center"></entry>
|
|
|
|
<entry align="center"></entry>
|
|
|
|
<entry align="center"></entry>
|
|
|
|
<entry align="center"></entry>
|
|
|
|
</row>
|
|
|
|
|
|
|
|
</tbody>
|
|
|
|
</tgroup>
|
|
|
|
</table>
|
|
|
|
|
2007-08-02 00:45:09 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2010-11-23 21:27:50 +01:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-wal-sync-method" xreflabel="wal_sync_method">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>wal_sync_method</varname> (<type>enum</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>wal_sync_method</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2005-10-22 23:56:07 +02:00
|
|
|
Method used for forcing WAL updates out to disk.
|
|
|
|
If <varname>fsync</varname> is off then this setting is irrelevant,
|
2010-02-03 18:25:06 +01:00
|
|
|
since WAL file updates will not be forced out at all.
|
2005-10-22 23:56:07 +02:00
|
|
|
Possible values are:
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
<itemizedlist>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>open_datasync</literal> (write WAL files with <function>open()</function> option <symbol>O_DSYNC</symbol>)
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>fdatasync</literal> (call <function>fdatasync()</function> at each commit)
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>fsync</literal> (call <function>fsync()</function> at each commit)
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>fsync_writethrough</literal> (call <function>fsync()</function> at each commit, forcing write-through of any disk write cache)
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>open_sync</literal> (write WAL files with <function>open()</function> option <symbol>O_SYNC</symbol>)
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The <literal>open_</literal>* options also use <literal>O_DIRECT</literal> if available.
|
2010-12-09 02:01:09 +01:00
|
|
|
Not all of these choices are available on all platforms.
|
2010-10-19 16:56:33 +02:00
|
|
|
The default is the first method in the above list that is supported
|
2017-10-09 03:44:17 +02:00
|
|
|
by the platform, except that <literal>fdatasync</literal> is the default on
|
2021-02-15 03:43:39 +01:00
|
|
|
Linux and FreeBSD. The default is not necessarily ideal; it might be
|
2010-10-19 16:56:33 +02:00
|
|
|
necessary to change this setting or other aspects of your system
|
|
|
|
configuration in order to create a crash-safe configuration or
|
|
|
|
achieve optimal performance.
|
2017-11-23 15:39:47 +01:00
|
|
|
These aspects are discussed in <xref linkend="wal-reliability"/>.
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2006-01-23 19:16:41 +01:00
|
|
|
file or on the server command line.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
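<para>
For illustration only, a method from the list above is selected
explicitly like this. Whether <literal>fdatasync</literal> is
available, and whether it is the best choice, depends on the
platform.
</para>
<programlisting>
# postgresql.conf
wal_sync_method = fdatasync   # irrelevant if fsync is off
</programlisting>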
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2010-11-23 21:27:50 +01:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-full-page-writes" xreflabel="full_page_writes">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>full_page_writes</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>full_page_writes</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
When this parameter is on, the <productname>PostgreSQL</productname> server
|
2005-10-22 23:56:07 +02:00
|
|
|
writes the entire content of each disk page to WAL during the
|
|
|
|
first modification of that page after a checkpoint.
|
|
|
|
This is needed because
|
|
|
|
a page write that is in process during an operating system crash might
|
|
|
|
be only partially completed, leading to an on-disk page
|
|
|
|
that contains a mix of old and new data. The row-level change data
|
|
|
|
normally stored in WAL will not be enough to completely restore
|
|
|
|
such a page during post-crash recovery. Storing the full page image
|
2010-02-03 18:25:06 +01:00
|
|
|
guarantees that the page can be correctly restored, but at the price
|
|
|
|
of increasing the amount of data that must be written to WAL.
|
2005-10-22 23:56:07 +02:00
|
|
|
(Because WAL replay always starts from a checkpoint, it is sufficient
|
|
|
|
to do this during the first change of each page after a checkpoint.
|
|
|
|
Therefore, one way to reduce the cost of full-page writes is to
|
|
|
|
increase the checkpoint interval parameters.)
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2006-01-23 19:16:41 +01:00
|
|
|
Turning this parameter off speeds up normal operation, but
|
2010-05-31 17:50:48 +02:00
|
|
|
might lead to either unrecoverable data corruption, or silent
|
|
|
|
data corruption, after a system failure. The risks are similar to turning off
|
|
|
|
<varname>fsync</varname>, though smaller, and it should be turned off
|
|
|
|
only under the same circumstances recommended for that parameter.
|
2005-10-22 23:56:07 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2006-01-23 19:16:41 +01:00
|
|
|
Turning off this parameter does not affect use of
|
2005-10-22 23:56:07 +02:00
|
|
|
WAL archiving for point-in-time recovery (PITR)
|
2017-11-23 15:39:47 +01:00
|
|
|
(see <xref linkend="continuous-archiving"/>).
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2006-01-23 19:16:41 +01:00
|
|
|
file or on the server command line.
|
2017-10-09 03:44:17 +02:00
|
|
|
The default is <literal>on</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
2013-12-13 15:26:14 +01:00
|
|
|
</varlistentry>
|
|
|
|
|
2013-12-20 19:33:16 +01:00
|
|
|
<varlistentry id="guc-wal-log-hints" xreflabel="wal_log_hints">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>wal_log_hints</varname> (<type>boolean</type>)
|
2013-12-13 15:26:14 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>wal_log_hints</varname> configuration parameter</primary>
|
2013-12-13 15:26:14 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2013-12-13 15:26:14 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
When this parameter is <literal>on</literal>, the <productname>PostgreSQL</productname>
|
2013-12-13 15:26:14 +01:00
|
|
|
server writes the entire content of each disk page to WAL during the
|
|
|
|
first modification of that page after a checkpoint, even for
|
|
|
|
non-critical modifications of so-called hint bits.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
If data checksums are enabled, hint bit updates are always WAL-logged
|
|
|
|
and this setting is ignored. You can use this setting to test how much
|
|
|
|
extra WAL-logging would occur if your database had data checksums
|
|
|
|
enabled.
|
|
|
|
</para>
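<para>
One possible way to carry out such a test, sketched under the
assumption of a <application>psql</application> session on
PostgreSQL 10 or later (where <function>pg_current_wal_lsn()</function>
and <function>pg_wal_lsn_diff()</function> are available), is to
measure the WAL generated by the same workload with the setting off
and then on:
</para>
<programlisting>
-- Remember the current WAL position in a psql variable.
SELECT pg_current_wal_lsn() AS start_lsn \gset

-- ... run the representative workload here ...

-- Report how much WAL was generated since the saved position.
SELECT pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), :'start_lsn')
       ) AS wal_generated;
</programlisting>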
|
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set at server start. The default value is <literal>off</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-wal-compression" xreflabel="wal_compression">
<term><varname>wal_compression</varname> (<type>enum</type>)
<indexterm>
<primary><varname>wal_compression</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This parameter enables compression of WAL using the specified
compression method.
When enabled, the <productname>PostgreSQL</productname>
server compresses full page images written to WAL when
<xref linkend="guc-full-page-writes"/> is on or during a base backup.
A compressed page image will be decompressed during WAL replay.
The supported methods are <literal>pglz</literal> and
<literal>lz4</literal> (if <productname>PostgreSQL</productname> was
compiled with <option>--with-lz4</option>). The default value is
<literal>off</literal>. Only superusers can change this setting.
</para>

<para>
Enabling compression can reduce the WAL volume without
increasing the risk of unrecoverable data corruption,
but at the cost of some extra CPU spent on the compression during
WAL logging and on the decompression during WAL replay.
</para>
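<para>
As an illustrative sketch rather than a recommended setting, the following
<filename>postgresql.conf</filename> fragment enables LZ4 compression of
full page images. It assumes a server that was compiled with
<option>--with-lz4</option>; on other builds <literal>pglz</literal> is the
available method.
</para>
<programlisting>
# Assumption: the server was compiled with --with-lz4
full_page_writes = on      # full page images are written to WAL
wal_compression = lz4      # compress those images with LZ4
</programlisting>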
</listitem>
</varlistentry>

<varlistentry id="guc-wal-init-zero" xreflabel="wal_init_zero">
<term><varname>wal_init_zero</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>wal_init_zero</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
If set to <literal>on</literal> (the default), this option causes new
WAL files to be filled with zeroes. On some file systems, this ensures
that space is allocated before we need to write WAL records. However,
<firstterm>Copy-On-Write</firstterm> (COW) file systems may not benefit
from this technique, so the option is given to skip the unnecessary
work. If set to <literal>off</literal>, only the final byte is written
when the file is created so that it has the expected size.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-wal-recycle" xreflabel="wal_recycle">
<term><varname>wal_recycle</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>wal_recycle</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
If set to <literal>on</literal> (the default), this option causes WAL
files to be recycled by renaming them, avoiding the need to create new
ones. On COW file systems, it may be faster to create new ones, so the
option is given to disable this behavior.
</para>
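<para>
A minimal sketch of how this setting might be combined with
<xref linkend="guc-wal-init-zero"/> on a copy-on-write file system.
Whether either change helps depends on the file system and the workload,
so the values below are an assumption to be benchmarked, not a
recommendation.
</para>
<programlisting>
# Illustrative settings for a copy-on-write file system; benchmark before adopting
wal_init_zero = off        # write only the final byte of each new WAL file
wal_recycle = off          # create new WAL files instead of renaming old ones
</programlisting>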
</listitem>
</varlistentry>

<varlistentry id="guc-wal-buffers" xreflabel="wal_buffers">
<term><varname>wal_buffers</varname> (<type>integer</type>)
<indexterm>
<primary><varname>wal_buffers</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The amount of shared memory used for WAL data that has not yet been
written to disk. The default setting of -1 selects a size equal to
1/32nd (about 3%) of <xref linkend="guc-shared-buffers"/>, but not less
than <literal>64kB</literal> nor more than the size of one WAL
segment, typically <literal>16MB</literal>. This value can be set
manually if the automatic choice is too large or too small,
but any positive value less than <literal>32kB</literal> will be
treated as <literal>32kB</literal>.
If this value is specified without units, it is taken as WAL blocks,
that is <symbol>XLOG_BLCKSZ</symbol> bytes, typically 8kB.
This parameter can only be set at server start.
</para>

<para>
The contents of the WAL buffers are written out to disk at every
transaction commit, so extremely large values are unlikely to
provide a significant benefit. However, setting this value to at
least a few megabytes can improve write performance on a busy
server where many clients are committing at once. The auto-tuning
selected by the default setting of -1 should give reasonable
results in most cases.
</para>
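<para>
The auto-tuning rule can be worked through by hand. The sketch below
assumes the typical 16MB WAL segment size; the
<varname>shared_buffers</varname> values are chosen only for illustration.
</para>
<programlisting>
# wal_buffers = -1 (auto-tuning), assuming 16MB WAL segments:
#   shared_buffers = 1MB:    1MB / 32   = 32kB, raised to the 64kB minimum
#   shared_buffers = 128MB:  128MB / 32 = 4MB
#   shared_buffers = 1GB:    1GB / 32   = 32MB, capped at one segment (16MB)
wal_buffers = -1

# Manual override (illustrative value); takes effect only at server start
#wal_buffers = 16MB
</programlisting>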
</listitem>
</varlistentry>

<varlistentry id="guc-wal-writer-delay" xreflabel="wal_writer_delay">
<term><varname>wal_writer_delay</varname> (<type>integer</type>)
<indexterm>
<primary><varname>wal_writer_delay</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies how often the WAL writer flushes WAL, in time terms.
After flushing WAL, the writer sleeps for the length of time given
by <varname>wal_writer_delay</varname>, unless woken up sooner
by an asynchronously committing transaction. If the last flush
happened less than <varname>wal_writer_delay</varname> ago and less
than <varname>wal_writer_flush_after</varname> worth of WAL has been
produced since, then WAL is only written to the operating system, not
flushed to disk.
If this value is specified without units, it is taken as milliseconds.
The default value is 200 milliseconds (<literal>200ms</literal>). Note that
on many systems, the effective resolution of sleep delays is 10
milliseconds; setting <varname>wal_writer_delay</varname> to a value that is
not a multiple of 10 might have the same results as setting it to the
next higher multiple of 10. This parameter can only be set in the
<filename>postgresql.conf</filename> file or on the server command line.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-wal-writer-flush-after" xreflabel="wal_writer_flush_after">
<term><varname>wal_writer_flush_after</varname> (<type>integer</type>)
<indexterm>
<primary><varname>wal_writer_flush_after</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies how often the WAL writer flushes WAL, in volume terms.
If the last flush happened less
than <varname>wal_writer_delay</varname> ago and less
than <varname>wal_writer_flush_after</varname> worth of WAL has been
produced since, then WAL is only written to the operating system, not
flushed to disk. If <varname>wal_writer_flush_after</varname> is set
to <literal>0</literal>, then WAL data is always flushed immediately.
If this value is specified without units, it is taken as WAL blocks,
that is <symbol>XLOG_BLCKSZ</symbol> bytes, typically 8kB.
The default is <literal>1MB</literal>.
This parameter can only be set in the
<filename>postgresql.conf</filename> file or on the server command line.
</para>
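<para>
To make the interplay between the time-based and volume-based limits
concrete, here is a hedged <filename>postgresql.conf</filename> sketch.
The active lines restate the defaults; the commented alternatives are
illustrative only, and the block arithmetic assumes the typical 8kB
<symbol>XLOG_BLCKSZ</symbol>.
</para>
<programlisting>
wal_writer_delay = 200ms         # default; sleep time between WAL writer rounds
wal_writer_flush_after = 1MB     # default; 1MB / 8kB = 128 WAL blocks

# The same volume written without units is taken as WAL blocks:
#wal_writer_flush_after = 128

# Always flush WAL data immediately:
#wal_writer_flush_after = 0
</programlisting>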
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
Skip WAL for new relfilenodes, under wal_level=minimal.
Until now, only selected bulk operations (e.g. COPY) did this. If a
given relfilenode received both a WAL-skipping COPY and a WAL-logged
operation (e.g. INSERT), recovery could lose tuples from the COPY. See
src/backend/access/transam/README section "Skipping WAL for New
RelFileNode" for the new coding rules. Maintainers of table access
methods should examine that section.
To maintain data durability, just before commit, we choose between an
fsync of the relfilenode and copying its contents to WAL. A new GUC,
wal_skip_threshold, guides that choice. If this change slows a workload
that creates small, permanent relfilenodes under wal_level=minimal, try
adjusting wal_skip_threshold. Users setting a timeout on COMMIT may
need to adjust that timeout, and log_min_duration_statement analysis
will reflect time consumption moving to COMMIT from commands like COPY.
Internally, this requires a reliable determination of whether
RollbackAndReleaseCurrentSubTransaction() would unlink a relation's
current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the
specification of rd_createSubid such that the field is zero when a new
rel has an old rd_node. Make relcache.c retain entries for certain
dropped relations until end of transaction.
Bump XLOG_PAGE_MAGIC, since this introduces XLOG_GIST_ASSIGN_LSN.
Future servers accept older WAL, so this bump is discretionary.
Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert
Haas. Heikki Linnakangas and Michael Paquier implemented earlier
designs that materially clarified the problem. Reviewed, in earlier
designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane,
Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout.
Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
2020-04-04 21:25:34 +02:00
|
|
|
<varlistentry id="guc-wal-skip-threshold" xreflabel="wal_skip_threshold">
|
|
|
|
<term><varname>wal_skip_threshold</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>wal_skip_threshold</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
When <varname>wal_level</varname> is <literal>minimal</literal> and a
|
|
|
|
transaction commits after creating or rewriting a permanent relation,
|
|
|
|
this setting determines how to persist the new data. If the data is
|
|
|
|
smaller than this setting, write it to WAL; otherwise, use an
|
|
|
|
fsync of affected files. Depending on the properties of your storage,
|
|
|
|
raising or lowering this value might help if such commits are slowing
|
|
|
|
concurrent transactions. If this value is specified without units, it
|
|
|
|
is taken as kilobytes. The default is two megabytes
|
|
|
|
(<literal>2MB</literal>).
|
|
|
|
</para>
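<para>
As an illustration only, a <filename>postgresql.conf</filename> fragment
along the following lines raises the threshold so that newly created or
rewritten permanent relations smaller than 4MB are copied into WAL at
commit, while larger ones are fsynced instead. The value
<literal>4MB</literal> is an arbitrary assumption chosen for this sketch,
not a recommendation; a suitable value depends on the storage in use.
</para>
<programlisting>
wal_level = minimal          # the threshold is consulted only at this WAL level
max_wal_senders = 0          # required when wal_level is minimal
wal_skip_threshold = 4MB     # illustrative value; the default is 2MB
</programlisting>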
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-commit-delay" xreflabel="commit_delay">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>commit_delay</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>commit_delay</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
Doc: improve documentation of configuration settings that have units.
When we added the GUC units feature, we didn't make any great effort
to adjust the documentation of individual GUCs; they tended to still
say things like "this is the number of milliseconds that ...", even
though users might prefer to write some other units, and SHOW might
even show the value in other units. Commit 6c9fb69f2 made an effort
to improve this situation, but I thought it made things less readable
by injecting units information in mid-sentence. It also wasn't very
consistent, and did not touch all the GUCs that have units.
To improve matters, standardize on the phrasing "If this value is
specified without units, it is taken as <units>". Also, try to
standardize where this is mentioned, right before the specification
of the default. (In a couple of places, doing that would've required
more rewriting than seemed justified, so I wasn't 100% consistent
about that.) I also tried to use the phrases "amount of time",
"amount of memory", etc rather than describing the contents of GUCs
in other ways, as those were the majority usage in places that weren't
overcommitting to a particular unit. (I left "length of time" alone
in a couple of places, though.)
I failed to resist the temptation to copy-edit some awkward text, too.
Backpatch to v12, like 6c9fb69f2, mainly because v12 hasn't diverged
much from HEAD yet.
Discussion: https://postgr.es/m/15882.1571942223@sss.pgh.pa.us
2019-10-26 18:30:41 +02:00
|
|
|
Setting <varname>commit_delay</varname> adds a time delay
|
|
|
|
before a WAL flush is initiated. This can improve
|
Make commit_delay much smarter.
Instead of letting every backend participating in a group commit wait
independently, have the first one that becomes ready to flush WAL wait
for the configured delay, and let all the others wait just long enough
for that first process to complete its flush. This greatly increases
the chances of being able to configure a commit_delay setting that
actually improves performance.
As a side consequence of this change, commit_delay now affects all WAL
flushes, rather than just commits. There was some discussion on
pgsql-hackers about whether to rename the GUC to, say, wal_flush_delay,
but in the absence of consensus I am leaving it alone for now.
Peter Geoghegan, with some changes, mostly to the documentation, by me.
2012-07-02 16:26:31 +02:00
|
|
|
group commit throughput by allowing a larger number of transactions
|
|
|
|
to commit via a single WAL flush, if system load is high enough
|
|
|
|
that additional transactions become ready to commit within the
|
|
|
|
given interval. However, it also increases latency by up to the
|
|
|
|
<varname>commit_delay</varname> for each WAL
|
|
|
|
flush. Because the delay is just wasted if no other transactions
|
2013-03-15 22:41:47 +01:00
|
|
|
become ready to commit, a delay is only performed if at least
|
|
|
|
<varname>commit_siblings</varname> other transactions are active
|
2013-03-22 16:39:15 +01:00
|
|
|
when a flush is about to be initiated. Also, no delays are
|
|
|
|
performed if <varname>fsync</varname> is disabled.
|
|
|
|
If this value is specified without units, it is taken as microseconds.
|
2017-10-09 03:44:17 +02:00
|
|
|
The default <varname>commit_delay</varname> is zero (no delay).
|
2013-03-22 16:39:15 +01:00
|
|
|
Only superusers can change this setting.
|
|
|
|
</para>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
In <productname>PostgreSQL</productname> releases prior to 9.3,
|
|
|
|
<varname>commit_delay</varname> behaved differently and was much
|
|
|
|
less effective: it affected only commits, rather than all WAL flushes,
|
|
|
|
and waited for the entire configured delay even if the WAL flush
|
2017-10-09 03:44:17 +02:00
|
|
|
was completed sooner. Beginning in <productname>PostgreSQL</productname> 9.3,
|
|
|
|
the first process that becomes ready to flush waits for the configured
|
|
|
|
interval, while subsequent processes wait only until the leader
|
2013-03-22 16:39:15 +01:00
|
|
|
completes the flush operation.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-commit-siblings" xreflabel="commit_siblings">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>commit_siblings</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>commit_siblings</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Minimum number of concurrent open transactions to require
|
2017-10-09 03:44:17 +02:00
|
|
|
before performing the <varname>commit_delay</varname> delay. A larger
|
2005-09-13 00:11:38 +02:00
|
|
|
value makes it more probable that at least one other
|
|
|
|
transaction will become ready to commit during the delay
|
2007-01-20 22:30:26 +01:00
|
|
|
interval. The default is five transactions.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
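<para>
A minimal sketch of how this setting combines with
<varname>commit_delay</varname>, using values picked purely for
illustration: with the fragment below, a process about to flush WAL waits
up to 1000 microseconds, but only if at least ten other transactions are
active at that moment. Whether such a delay helps depends on the workload,
so these numbers are assumptions rather than tuning advice.
</para>
<programlisting>
commit_delay = 1000       # no units given, so taken as microseconds
commit_siblings = 10      # illustrative value; the default is 5
</programlisting>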
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="runtime-config-wal-checkpoints">
|
|
|
|
<title>Checkpoints</title>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry id="guc-checkpoint-timeout" xreflabel="checkpoint_timeout">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>checkpoint_timeout</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>checkpoint_timeout</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Maximum time between automatic WAL checkpoints.
|
|
|
|
If this value is specified without units, it is taken as seconds.
|
2016-09-12 00:26:18 +02:00
|
|
|
The valid range is between 30 seconds and one day.
|
2017-10-09 03:44:17 +02:00
|
|
|
The default is five minutes (<literal>5min</literal>).
|
2008-03-05 17:59:10 +01:00
|
|
|
Increasing this parameter can increase the amount of time needed
|
|
|
|
for crash recovery.
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2006-01-23 19:16:41 +01:00
|
|
|
file or on the server command line.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
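<para>
For example, either of the following lines requests a 15-minute interval.
The figure is an arbitrary illustration within the valid range, not a
tuning recommendation; only one of the two lines would be used in practice.
</para>
<programlisting>
checkpoint_timeout = 15min    # explicit unit
checkpoint_timeout = 900      # no unit, so taken as seconds
</programlisting>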
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2007-06-28 02:02:40 +02:00
|
|
|
<varlistentry id="guc-checkpoint-completion-target" xreflabel="checkpoint_completion_target">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>checkpoint_completion_target</varname> (<type>floating point</type>)
|
2007-06-28 02:02:40 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>checkpoint_completion_target</varname> configuration parameter</primary>
|
2007-06-28 02:02:40 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-06-28 02:02:40 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2010-02-03 18:25:06 +01:00
|
|
|
Specifies the target of checkpoint completion, as a fraction of
|
2021-03-24 18:07:51 +01:00
|
|
|
total time between checkpoints. The default is 0.9, which spreads the
|
|
|
|
checkpoint across almost all of the available interval, providing fairly
|
|
|
|
consistent I/O load while also leaving some time for checkpoint
|
|
|
|
completion overhead. Reducing this parameter is not recommended because
|
|
|
|
it causes the checkpoint to complete faster. This results in a higher
|
|
|
|
rate of I/O during the checkpoint followed by a period of less I/O between
|
|
|
|
the checkpoint completion and the next scheduled checkpoint. This
|
|
|
|
parameter can only be set in the <filename>postgresql.conf</filename> file
|
|
|
|
or on the server command line.
|
2007-06-28 02:02:40 +02:00
|
|
|
</para>
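<para>
As a rough worked example, assume that a checkpoint is triggered by
<varname>checkpoint_timeout</varname> and that both settings are at their
defaults. The checkpoint's writes are then paced to finish after about
0.9 times 300 seconds, that is, about 270 seconds, leaving roughly
30 seconds before the next scheduled checkpoint. This is a sketch of the
arithmetic only; actual timing varies with system load.
</para>
<programlisting>
checkpoint_timeout = 5min             # 300 seconds between scheduled checkpoints
checkpoint_completion_target = 0.9    # aim to finish writing after about 270 seconds
</programlisting>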
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
Allow to trigger kernel writeback after a configurable number of writes.
Currently writes to the main data files of postgres all go through the
OS page cache. This means that some operating systems can end up
collecting a large number of dirty buffers in their respective page
caches. When these dirty buffers are flushed to storage rapidly, be it
because of fsync(), timeouts, or dirty ratios, latency for other reads
and writes can increase massively. This is the primary reason for
regular massive stalls observed in real world scenarios and artificial
benchmarks; on rotating disks stalls on the order of hundreds of seconds
have been observed.
On linux it is possible to control this by reducing the global dirty
limits significantly, reducing the above problem. But global
configuration is rather problematic because it'll affect other
applications; also PostgreSQL itself doesn't always generally want this
behavior, e.g. for temporary files it's undesirable.
Several operating systems allow some control over the kernel page
cache. Linux has sync_file_range(2), several posix systems have msync(2)
and posix_fadvise(2). sync_file_range(2) is preferable because it
requires no special setup, whereas msync() requires the to-be-flushed
range to be mmap'ed. For the purpose of flushing dirty data
posix_fadvise(2) is the worst alternative, as flushing dirty data is
just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages
from the page cache. Thus the feature is enabled by default only on
linux, but can be enabled on all systems that have any of the above
APIs.
While desirable and likely possible this patch does not contain an
implementation for windows.
With the infrastructure added, writes made via checkpointer, bgwriter
and normal user backends can be flushed after a configurable number of
writes. Each of these sources of writes controlled by a separate GUC,
checkpointer_flush_after, bgwriter_flush_after and backend_flush_after
respectively; they're separate because the number of flushes that are
good are separate, and because the performance considerations of
controlled flushing for each of these are different.
A later patch will add checkpoint sorting - after that flushes from the
checkpoint will almost always be desirable. Bgwriter flushes are most of
the time going to be random, which are slow on lots of storage hardware.
Flushing in backends works well if the storage and bgwriter can keep up,
but if not it can have negative consequences. This patch is likely to
have negative performance consequences without checkpoint sorting, but
unfortunately so has sorting without flush control.
Discussion: alpine.DEB.2.10.1506011320000.28433@sto
Author: Fabien Coelho and Andres Freund
2016-02-19 21:13:05 +01:00
|
|
|
<varlistentry id="guc-checkpoint-flush-after" xreflabel="checkpoint_flush_after">
|
2016-04-24 21:26:55 +02:00
|
|
|
<term><varname>checkpoint_flush_after</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>checkpoint_flush_after</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Whenever more than this amount of data has been
|
|
|
|
written while performing a checkpoint, attempt to force the
|
|
|
|
OS to issue these writes to the underlying storage. Doing so will
|
|
|
|
limit the amount of dirty data in the kernel's page cache, reducing
|
2017-06-18 20:01:45 +02:00
|
|
|
the likelihood of stalls when an <function>fsync</function> is issued at the end of the
|
Allow to trigger kernel writeback after a configurable number of writes.
Currently writes to the main data files of postgres all go through the
OS page cache. This means that some operating systems can end up
collecting a large number of dirty buffers in their respective page
caches. When these dirty buffers are flushed to storage rapidly, be it
because of fsync(), timeouts, or dirty ratios, latency for other reads
and writes can increase massively. This is the primary reason for
regular massive stalls observed in real world scenarios and artificial
benchmarks; on rotating disks stalls on the order of hundreds of seconds
have been observed.
On linux it is possible to control this by reducing the global dirty
limits significantly, reducing the above problem. But global
configuration is rather problematic because it'll affect other
applications; also PostgreSQL itself doesn't always generally want this
behavior, e.g. for temporary files it's undesirable.
Several operating systems allow some control over the kernel page
cache. Linux has sync_file_range(2), several posix systems have msync(2)
and posix_fadvise(2). sync_file_range(2) is preferable because it
requires no special setup, whereas msync() requires the to-be-flushed
range to be mmap'ed. For the purpose of flushing dirty data
posix_fadvise(2) is the worst alternative, as flushing dirty data is
just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages
from the page cache. Thus the feature is enabled by default only on
linux, but can be enabled on all systems that have any of the above
APIs.
While desirable and likely possible this patch does not contain an
implementation for windows.
With the infrastructure added, writes made via checkpointer, bgwriter
and normal user backends can be flushed after a configurable number of
writes. Each of these sources of writes is controlled by a separate GUC,
checkpointer_flush_after, bgwriter_flush_after and backend_flush_after
respectively; they're separate because the number of flushes that are
good are separate, and because the performance considerations of
controlled flushing for each of these are different.
A later patch will add checkpoint sorting - after that flushes from the
checkpoint will almost always be desirable. Bgwriter flushes are most of
the time going to be random, which are slow on lots of storage hardware.
Flushing in backends works well if the storage and bgwriter can keep up,
but if not it can have negative consequences. This patch is likely to
have negative performance consequences without checkpoint sorting, but
unfortunately so has sorting without flush control.
Discussion: alpine.DEB.2.10.1506011320000.28433@sto
Author: Fabien Coelho and Andres Freund
2016-02-19 21:13:05 +01:00
|
|
|
checkpoint, or when the OS writes data back in larger batches in the
|
|
|
|
background. Often that will result in greatly reduced transaction
|
|
|
|
latency, but there are also some cases, especially with workloads
|
2017-11-23 15:39:47 +01:00
|
|
|
that are bigger than <xref linkend="guc-shared-buffers"/>, but smaller
|
2016-02-19 21:13:05 +01:00
|
|
|
than the OS's page cache, where performance might degrade. This
|
Doc: improve documentation of configuration settings that have units.
When we added the GUC units feature, we didn't make any great effort
to adjust the documentation of individual GUCs; they tended to still
say things like "this is the number of milliseconds that ...", even
though users might prefer to write some other units, and SHOW might
even show the value in other units. Commit 6c9fb69f2 made an effort
to improve this situation, but I thought it made things less readable
by injecting units information in mid-sentence. It also wasn't very
consistent, and did not touch all the GUCs that have units.
To improve matters, standardize on the phrasing "If this value is
specified without units, it is taken as <units>". Also, try to
standardize where this is mentioned, right before the specification
of the default. (In a couple of places, doing that would've required
more rewriting than seemed justified, so I wasn't 100% consistent
about that.) I also tried to use the phrases "amount of time",
"amount of memory", etc rather than describing the contents of GUCs
in other ways, as those were the majority usage in places that weren't
overcommitting to a particular unit. (I left "length of time" alone
in a couple of places, though.)
I failed to resist the temptation to copy-edit some awkward text, too.
Backpatch to v12, like 6c9fb69f2, mainly because v12 hasn't diverged
much from HEAD yet.
Discussion: https://postgr.es/m/15882.1571942223@sss.pgh.pa.us
2019-10-26 18:30:41 +02:00
|
|
|
setting may have no effect on some platforms.
|
|
|
|
If this value is specified without units, it is taken as blocks,
|
|
|
|
that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
|
|
|
|
The valid range is
|
2016-11-26 00:36:10 +01:00
|
|
|
between <literal>0</literal>, which disables forced writeback,
|
2017-10-09 03:44:17 +02:00
|
|
|
and <literal>2MB</literal>. The default is <literal>256kB</literal> on
|
|
|
|
Linux, <literal>0</literal> elsewhere. (If <symbol>BLCKSZ</symbol> is not
|
2016-11-26 00:36:10 +01:00
|
|
|
8kB, the default and maximum values scale proportionally to it.)
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2016-02-19 21:13:05 +01:00
|
|
|
file or on the server command line.
|
|
|
|
</para>
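<para>
 As an illustrative sketch only, assuming this entry describes
 <varname>checkpoint_flush_after</varname> and that
 <symbol>BLCKSZ</symbol> is 8kB, the setting could be written in
 <filename>postgresql.conf</filename> in any of the following ways.
 The values shown are examples, not recommendations:
</para>
<programlisting>
checkpoint_flush_after = 256kB   # explicit units (the stated default on Linux)
#checkpoint_flush_after = 32     # same amount in blocks, if BLCKSZ is 8kB
#checkpoint_flush_after = 0      # disables forced writeback
</programlisting>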
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-checkpoint-warning" xreflabel="checkpoint_warning">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>checkpoint_warning</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>checkpoint_warning</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Write a message to the server log if checkpoints caused by
|
2018-03-03 20:23:13 +01:00
|
|
|
the filling of WAL segment files happen closer together
|
2019-10-26 18:30:41 +02:00
|
|
|
than this amount of time (which suggests that
|
|
|
|
<varname>max_wal_size</varname> ought to be raised).
|
|
|
|
If this value is specified without units, it is taken as seconds.
|
|
|
|
The default is 30 seconds (<literal>30s</literal>).
|
|
|
|
Zero disables the warning.
|
2013-03-10 22:16:18 +01:00
|
|
|
No warnings will be generated if <varname>checkpoint_timeout</varname>
|
|
|
|
is less than <varname>checkpoint_warning</varname>.
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2006-01-23 19:16:41 +01:00
|
|
|
file or on the server command line.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
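<para>
 For illustration, the following <filename>postgresql.conf</filename>
 lines show the default written with explicit units and the value that
 disables the warning. This is a sketch of the syntax rather than a
 tuning recommendation:
</para>
<programlisting>
checkpoint_warning = 30s   # the default, with explicit units
#checkpoint_warning = 30   # equivalent; a bare number is taken as seconds
#checkpoint_warning = 0    # disables the warning
</programlisting>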
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2015-10-20 19:33:39 +02:00
|
|
|
<varlistentry id="guc-max-wal-size" xreflabel="max_wal_size">
|
|
|
|
<term><varname>max_wal_size</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>max_wal_size</varname> configuration parameter</primary>
|
2015-10-20 19:33:39 +02:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2020-09-01 23:00:10 +02:00
|
|
|
Maximum size to let the WAL grow during automatic
|
2015-10-20 19:33:39 +02:00
|
|
|
checkpoints. This is a soft limit; WAL size can exceed
|
2019-10-26 18:30:41 +02:00
|
|
|
<varname>max_wal_size</varname> under special circumstances, such as
|
|
|
|
heavy load, a failing <varname>archive_command</varname>, or a high
|
2020-07-20 06:30:18 +02:00
|
|
|
<varname>wal_keep_size</varname> setting.
|
2019-10-26 18:30:41 +02:00
|
|
|
If this value is specified without units, it is taken as megabytes.
|
|
|
|
The default is 1 GB.
|
2015-10-20 19:33:39 +02:00
|
|
|
Increasing this parameter can increase the amount of time needed for
|
|
|
|
crash recovery.
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2015-10-20 19:33:39 +02:00
|
|
|
file or on the server command line.
|
|
|
|
</para>
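<para>
 As a sketch of the syntax, a larger soft limit could be configured as
 shown below. The value <literal>2GB</literal> is a hypothetical figure
 chosen for the example; a suitable value depends on the workload and on
 how long a crash recovery is acceptable:
</para>
<programlisting>
max_wal_size = 2GB      # hypothetical value, with explicit units
#max_wal_size = 2048    # the same limit; a bare number is taken as megabytes
</programlisting>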
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2015-02-23 17:53:02 +01:00
|
|
|
<varlistentry id="guc-min-wal-size" xreflabel="min_wal_size">
|
2015-02-23 22:57:54 +01:00
|
|
|
<term><varname>min_wal_size</varname> (<type>integer</type>)
|
2015-02-23 17:53:02 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>min_wal_size</varname> configuration parameter</primary>
|
2015-02-23 17:53:02 +01:00
|
|
|
</indexterm>
|
2015-02-23 22:57:54 +01:00
|
|
|
</term>
|
2015-02-23 17:53:02 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
As long as WAL disk usage stays below this setting, old WAL files are
|
|
|
|
always recycled for future use at a checkpoint, rather than removed.
|
|
|
|
This can be used to ensure that enough WAL space is reserved to
|
|
|
|
handle spikes in WAL usage, for example when running large batch
|
2019-10-26 18:30:41 +02:00
|
|
|
jobs.
|
|
|
|
If this value is specified without units, it is taken as megabytes.
|
|
|
|
The default is 80 MB.
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2015-02-23 17:53:02 +01:00
|
|
|
file or on the server command line.
|
|
|
|
</para>
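<para>
 For example, a system that periodically runs large batch jobs might
 reserve more WAL space than the default. The figure below is an
 assumption made for illustration, not a recommendation:
</para>
<programlisting>
min_wal_size = 1GB      # hypothetical value, with explicit units
#min_wal_size = 1024    # the same amount; a bare number is taken as megabytes
</programlisting>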
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="runtime-config-wal-archiving">
|
|
|
|
<title>Archiving</title>
|
|
|
|
|
|
|
|
<variablelist>
|
2007-09-27 00:36:30 +02:00
|
|
|
<varlistentry id="guc-archive-mode" xreflabel="archive_mode">
|
2015-05-15 17:55:24 +02:00
|
|
|
<term><varname>archive_mode</varname> (<type>enum</type>)
|
2007-09-27 00:36:30 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>archive_mode</varname> configuration parameter</primary>
|
2007-09-27 00:36:30 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-09-27 00:36:30 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
When <varname>archive_mode</varname> is enabled, completed WAL segments
|
2010-02-03 18:25:06 +01:00
|
|
|
are sent to archive storage by setting
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="guc-archive-command"/>. In addition to <literal>off</literal>,
|
2017-10-09 03:44:17 +02:00
|
|
|
to disable, there are two modes: <literal>on</literal> and
|
|
|
|
<literal>always</literal>. During normal operation, there is no
|
|
|
|
difference between the two modes, but when set to <literal>always</literal>
|
2015-05-15 17:55:24 +02:00
|
|
|
the WAL archiver is also enabled during archive recovery or standby
|
2017-10-09 03:44:17 +02:00
|
|
|
mode. In <literal>always</literal> mode, all files restored from the archive
|
2015-05-15 17:55:24 +02:00
|
|
|
or streamed with streaming replication will be archived (again). See
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="continuous-archiving-in-standby"/> for details.
|
2015-08-21 04:34:35 +02:00
|
|
|
</para>
|
2015-05-15 17:55:24 +02:00
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>archive_mode</varname> and <varname>archive_command</varname> are
|
|
|
|
separate variables so that <varname>archive_command</varname> can be
|
2007-09-27 00:36:30 +02:00
|
|
|
changed without leaving archiving mode.
|
2011-07-07 21:10:32 +02:00
|
|
|
This parameter can only be set at server start.
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>archive_mode</varname> cannot be enabled when
|
|
|
|
<varname>wal_level</varname> is set to <literal>minimal</literal>.
|
2007-09-27 00:36:30 +02:00
|
|
|
</para>
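<para>
 A minimal sketch of enabling archiving in
 <filename>postgresql.conf</filename> might look like the following.
 The choice of <literal>replica</literal> is an assumption for the
 example; any <varname>wal_level</varname> other than
 <literal>minimal</literal> satisfies the restriction described above:
</para>
<programlisting>
wal_level = replica     # must not be minimal when archiving is enabled
archive_mode = on       # takes effect only at server start
</programlisting>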
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-archive-command" xreflabel="archive_command">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>archive_command</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>archive_command</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2014-04-19 20:59:47 +02:00
|
|
|
The local shell command to execute to archive a completed WAL file
|
2017-10-09 03:44:17 +02:00
|
|
|
segment. Any <literal>%p</literal> in the string is
|
2006-11-04 19:20:27 +01:00
|
|
|
replaced by the path name of the file to archive, and any
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>%f</literal> is replaced by only the file name.
|
2006-11-04 19:20:27 +01:00
|
|
|
(The path name is relative to the working directory of the server,
|
|
|
|
i.e., the cluster's data directory.)
|
2017-10-09 03:44:17 +02:00
|
|
|
Use <literal>%%</literal> to embed an actual <literal>%</literal> character in the
|
2010-04-13 00:09:58 +02:00
|
|
|
command. It is important for the command to return a zero
|
|
|
|
exit status only if it succeeds. For more information see
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="backup-archiving-wal"/>.
|
2010-04-13 00:09:58 +02:00
|
|
|
</para>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2007-09-27 00:36:30 +02:00
|
|
|
file or on the server command line. It is ignored unless
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>archive_mode</varname> was enabled at server start.
|
|
|
|
If <varname>archive_command</varname> is an empty string (the default) while
|
|
|
|
<varname>archive_mode</varname> is enabled, WAL archiving is temporarily
|
2007-09-27 00:36:30 +02:00
|
|
|
disabled, but the server continues to accumulate WAL segment files in
|
2010-02-03 18:25:06 +01:00
|
|
|
the expectation that a command will soon be provided. Setting
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>archive_command</varname> to a command that does nothing but
|
2020-09-01 00:33:37 +02:00
|
|
|
return true, e.g., <literal>/bin/true</literal> (<literal>REM</literal> on
|
2010-06-30 04:43:10 +02:00
|
|
|
Windows), effectively disables
|
2010-02-03 18:25:06 +01:00
|
|
|
archiving, but also breaks the chain of WAL files needed for
|
|
|
|
archive recovery, so it should only be used in unusual circumstances.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
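<para>
 As an illustrative sketch for a Unix-like system, the command below
 copies each completed segment into an archive directory and refuses to
 overwrite a file that already exists there. The directory
 <filename>/mnt/server/archivedir</filename> is a hypothetical location
 used only for this example:
</para>
<programlisting>
archive_command = 'test ! -f /mnt/server/archivedir/%f &amp;&amp; cp %p /mnt/server/archivedir/%f'
</programlisting>
<para>
 In this sketch, <literal>test ! -f</literal> fails when the target file
 is already present, so the command returns a nonzero exit status
 instead of silently replacing an archived segment.
</para>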
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2010-11-23 21:27:50 +01:00
|
|
|
|
2006-08-18 01:04:10 +02:00
|
|
|
<varlistentry id="guc-archive-timeout" xreflabel="archive_timeout">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>archive_timeout</varname> (<type>integer</type>)
|
2006-08-18 01:04:10 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>archive_timeout</varname> configuration parameter</primary>
|
2006-08-18 01:04:10 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2006-08-18 01:04:10 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-11-23 15:39:47 +01:00
|
|
|
The <xref linkend="guc-archive-command"/> is only invoked for
|
2006-11-12 06:12:42 +01:00
|
|
|
completed WAL segments. Hence, if your server generates little WAL
|
|
|
|
traffic (or has slack periods where it does so), there could be a
|
|
|
|
long delay between the completion of a transaction and its safe
|
2010-02-03 18:25:06 +01:00
|
|
|
recording in archive storage. To limit how old unarchived
|
2017-10-09 03:44:17 +02:00
|
|
|
data can be, you can set <varname>archive_timeout</varname> to force the
|
2006-11-12 06:12:42 +01:00
|
|
|
server to switch to a new WAL segment file periodically. When this
|
|
|
|
parameter is greater than zero, the server will switch to a new
|
2019-10-26 18:30:41 +02:00
|
|
|
segment file whenever this amount of time has elapsed since the last
|
2010-02-06 00:37:43 +01:00
|
|
|
segment file switch, and there has been any database activity,
|
Skip checkpoints, archiving on idle systems.
Some background activity (like checkpoints, archive timeout, standby
snapshots) is not supposed to happen on an idle system. Unfortunately
so far it was not easy to determine when a system is idle, which
defeated some of the attempts to avoid redundant activity on an idle
system.
To make that easier, allow to make individual WAL insertions as not
being "important". By checking whether any important activity happened
since the last time an activity was performed, it now is easy to check
whether some action needs to be repeated.
Use the new facility for checkpoints, archive timeout and standby
snapshots.
The lack of a facility causes some issues in older releases, but in my
opinion the consequences (superfluous checkpoints / archived segments)
aren't grave enough to warrant backpatching.
Author: Michael Paquier, editorialized by Andres Freund
Reviewed-By: Andres Freund, David Steele, Amit Kapila, Kyotaro HORIGUCHI
Bug: #13685
Discussion:
https://www.postgresql.org/message-id/20151016203031.3019.72930@wrigleys.postgresql.org
https://www.postgresql.org/message-id/CAB7nPqQcPqxEM3S735Bd2RzApNqSNJVietAC=6kfkYv_45dKwA@mail.gmail.com
Backpatch: -
2016-12-22 20:31:50 +01:00
|
|
|
including a single checkpoint (checkpoints are skipped if there is
|
|
|
|
no database activity). Note that archived files that are closed
|
|
|
|
early due to a forced switch are still the same length as completely
|
|
|
|
full files. Therefore, it is unwise to use a very short
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>archive_timeout</varname> — it will bloat your archive
|
|
|
|
storage. <varname>archive_timeout</varname> settings of a minute or so are
|
2011-07-07 21:10:32 +02:00
|
|
|
usually reasonable. You should consider using streaming replication,
|
2020-06-15 19:12:58 +02:00
|
|
|
instead of archiving, if you want data to be copied off the primary
|
2011-07-07 21:10:32 +02:00
|
|
|
server more quickly than that.
|
2019-10-26 18:30:41 +02:00
|
|
|
If this value is specified without units, it is taken as seconds.
|
2011-07-07 21:10:32 +02:00
|
|
|
This parameter can only be set in the
|
2017-10-09 03:44:17 +02:00
|
|
|
<filename>postgresql.conf</filename> file or on the server command line.
|
2006-08-18 01:04:10 +02:00
|
|
|
</para>
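<para>
 For illustration, a timeout of about a minute, as suggested above,
 could be written in either of the first two forms below. This is a
 sketch of the syntax; the appropriate interval depends on how much
 unarchived data can be tolerated:
</para>
<programlisting>
archive_timeout = 60      # a bare number is taken as seconds
#archive_timeout = 1min   # the same interval with explicit units
#archive_timeout = 0      # no forced segment switches
</programlisting>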
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2010-04-29 23:36:19 +02:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
Allow read only connections during recovery, known as Hot Standby.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
2009-12-19 02:32:45 +01:00
|
|
|
|
2018-11-25 16:31:16 +01:00
|
|
|
<sect2 id="runtime-config-wal-archive-recovery">
|
|
|
|
|
|
|
|
<title>Archive Recovery</title>
|
|
|
|
|
|
|
|
<indexterm>
|
|
|
|
<primary>configuration</primary>
|
|
|
|
<secondary>of recovery</secondary>
|
|
|
|
<tertiary>of a standby server</tertiary>
|
|
|
|
</indexterm>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
This section describes the settings that apply only for the duration of
|
|
|
|
the recovery. They must be reset for any subsequent recovery you wish to
|
2019-02-04 09:28:17 +01:00
|
|
|
perform.
|
2018-11-25 16:31:16 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
<quote>Recovery</quote> covers using the server as a standby or for
|
|
|
|
executing a targeted recovery. Typically, standby mode would be used to
|
|
|
|
provide high availability and/or read scalability, whereas a targeted
|
|
|
|
recovery is used to recover from data loss.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2019-08-20 05:36:31 +02:00
|
|
|
To start the server in standby mode, create a file called
|
2018-11-25 16:31:16 +01:00
|
|
|
<filename>standby.signal</filename><indexterm><primary>standby.signal</primary></indexterm>
|
|
|
|
in the data directory. The server will enter recovery and will not stop
|
|
|
|
recovery when the end of archived WAL is reached, but will keep trying to
|
|
|
|
continue recovery by connecting to the sending server as specified by the
|
|
|
|
<varname>primary_conninfo</varname> setting and/or by fetching new WAL
|
2019-09-29 23:07:22 +02:00
|
|
|
segments using <varname>restore_command</varname>. For this mode, the
|
|
|
|
parameters from this section and <xref
|
|
|
|
linkend="runtime-config-replication-standby"/> are of interest.
|
|
|
|
Parameters from <xref linkend="runtime-config-wal-recovery-target"/> will
|
|
|
|
also be applied but are typically not useful in this mode.
|
2018-11-25 16:31:16 +01:00
|
|
|
</para>
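<para>
As a minimal sketch of a standby configuration, the following settings could
accompany an empty <filename>standby.signal</filename> file in the data
directory. The host name, user name, and archive path are hypothetical
placeholders, not recommended values:
<programlisting>
# postgresql.conf on the standby; all values are illustrative assumptions
primary_conninfo = 'host=primary.example.com port=5432 user=replicator'
restore_command = 'cp /mnt/server/archivedir/%f "%p"'
</programlisting>
With both settings present, the standby can obtain WAL by streaming as well
as from the archive; as noted above, either source alone also suffices.
</para>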
|
|
|
|
|
|
|
|
<para>
|
2019-08-20 05:36:31 +02:00
|
|
|
To start the server in targeted recovery mode, create a file called
|
2018-11-25 16:31:16 +01:00
|
|
|
<filename>recovery.signal</filename><indexterm><primary>recovery.signal</primary></indexterm>
|
|
|
|
in the data directory. If both <filename>standby.signal</filename> and
|
|
|
|
<filename>recovery.signal</filename> files are created, standby mode
|
2019-08-20 05:36:31 +02:00
|
|
|
takes precedence. Targeted recovery mode ends when the archived WAL is
|
|
|
|
fully replayed, or when <varname>recovery_target</varname> is reached.
|
2019-09-29 23:07:22 +02:00
|
|
|
In this mode, the parameters from both this section and <xref
|
2019-11-09 09:35:21 +01:00
|
|
|
linkend="runtime-config-wal-recovery-target"/> will be used.
|
2018-11-25 16:31:16 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry id="guc-restore-command" xreflabel="restore_command">
|
|
|
|
<term><varname>restore_command</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>restore_command</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
The local shell command to execute to retrieve an archived segment of
|
|
|
|
the WAL file series. This parameter is required for archive recovery,
|
|
|
|
but optional for streaming replication.
|
|
|
|
Any <literal>%f</literal> in the string is
|
|
|
|
replaced by the name of the file to retrieve from the archive,
|
|
|
|
and any <literal>%p</literal> is replaced by the copy destination path name
|
|
|
|
on the server.
|
|
|
|
(The path name is relative to the current working directory,
|
|
|
|
i.e., the cluster's data directory.)
|
|
|
|
Any <literal>%r</literal> is replaced by the name of the file containing the
|
|
|
|
last valid restart point. That is the earliest file that must be kept
|
|
|
|
to allow a restore to be restartable, so this information can be used
|
|
|
|
to truncate the archive to just the minimum required to support
|
|
|
|
restarting from the current restore. <literal>%r</literal> is typically only
|
|
|
|
used by warm-standby configurations
|
|
|
|
(see <xref linkend="warm-standby"/>).
|
|
|
|
Write <literal>%%</literal> to embed an actual <literal>%</literal> character.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
It is important for the command to return a zero exit status
|
|
|
|
only if it succeeds. The command <emphasis>will</emphasis> be asked for file
|
|
|
|
names that are not present in the archive; it must return nonzero
|
|
|
|
when so asked. Examples:
|
|
|
|
<programlisting>
|
|
|
|
restore_command = 'cp /mnt/server/archivedir/%f "%p"'
|
|
|
|
restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
|
|
|
|
</programlisting>
|
|
|
|
An exception is that if the command was terminated by a signal (other
|
|
|
|
than <systemitem>SIGTERM</systemitem>, which is used as part of a
|
|
|
|
database server shutdown) or an error by the shell (such as command
|
|
|
|
not found), then recovery will abort and the server will not start up.
|
|
|
|
</para>
|
2019-02-04 09:28:17 +01:00
|
|
|
|
|
|
|
<para>
|
2020-12-02 03:00:15 +01:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
|
|
|
file or on the server command line.
|
2019-02-04 09:28:17 +01:00
|
|
|
</para>
|
2018-11-25 16:31:16 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-archive-cleanup-command" xreflabel="archive_cleanup_command">
|
|
|
|
<term><varname>archive_cleanup_command</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>archive_cleanup_command</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This optional parameter specifies a shell command that will be executed
|
|
|
|
at every restartpoint. The purpose of
|
|
|
|
<varname>archive_cleanup_command</varname> is to provide a mechanism for
|
|
|
|
cleaning up old archived WAL files that are no longer needed by the
|
|
|
|
standby server.
|
|
|
|
Any <literal>%r</literal> is replaced by the name of the file containing the
|
|
|
|
last valid restart point.
|
|
|
|
That is the earliest file that must be <emphasis>kept</emphasis> to allow a
|
|
|
|
restore to be restartable, and so all files earlier than <literal>%r</literal>
|
|
|
|
may be safely removed.
|
|
|
|
This information can be used to truncate the archive to just the
|
|
|
|
minimum required to support restart from the current restore.
|
|
|
|
The <xref linkend="pgarchivecleanup"/> module
|
|
|
|
is often used in <varname>archive_cleanup_command</varname> for
|
|
|
|
single-standby configurations, for example:
|
|
|
|
<programlisting>archive_cleanup_command = 'pg_archivecleanup /mnt/server/archivedir %r'</programlisting>
|
|
|
|
Note however that if multiple standby servers are restoring from the
|
|
|
|
same archive directory, you will need to ensure that you do not delete
|
|
|
|
WAL files until they are no longer needed by any of the servers.
|
|
|
|
<varname>archive_cleanup_command</varname> would typically be used in a
|
|
|
|
warm-standby configuration (see <xref linkend="warm-standby"/>).
|
|
|
|
Write <literal>%%</literal> to embed an actual <literal>%</literal> character in the
|
|
|
|
command.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
If the command returns a nonzero exit status then a warning log
|
|
|
|
message will be written. An exception is that if the command was
|
|
|
|
terminated by a signal or an error by the shell (such as command not
|
|
|
|
found), a fatal error will be raised.
|
|
|
|
</para>
|
2019-02-04 09:28:17 +01:00
|
|
|
<para>
|
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
|
|
|
file or on the server command line.
|
|
|
|
</para>
|
2018-11-25 16:31:16 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-recovery-end-command" xreflabel="recovery_end_command">
|
|
|
|
<term><varname>recovery_end_command</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>recovery_end_command</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This parameter specifies a shell command that will be executed once only
|
|
|
|
at the end of recovery. This parameter is optional. The purpose of
|
|
|
|
<varname>recovery_end_command</varname> is to provide a mechanism for cleanup
|
|
|
|
following replication or recovery.
|
|
|
|
Any <literal>%r</literal> is replaced by the name of the file containing the
|
|
|
|
last valid restart point, as in <xref linkend="guc-archive-cleanup-command"/>.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
If the command returns a nonzero exit status then a warning log
|
|
|
|
message will be written and the database will proceed to start up
|
|
|
|
anyway. An exception is that if the command was terminated by a
|
|
|
|
signal or an error by the shell (such as command not found), the
|
|
|
|
database will not proceed with startup.
|
|
|
|
</para>
|
2019-02-04 09:28:17 +01:00
|
|
|
<para>
|
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
|
|
|
file or on the server command line.
|
|
|
|
</para>
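<para>
For illustration only, a command that records the end of recovery by
creating a marker file might look like the following. The path is a
hypothetical placeholder:
<programlisting>
recovery_end_command = 'touch /var/lib/postgresql/recovery_finished'
</programlisting>
</para>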
|
2018-11-25 16:31:16 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
</variablelist>
|
|
|
|
|
|
|
|
</sect2>
|
|
|
|
|
|
|
|
<sect2 id="runtime-config-wal-recovery-target">
|
|
|
|
|
|
|
|
<title>Recovery Target</title>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
By default, recovery will proceed to the end of the WAL. The
|
|
|
|
following parameters can be used to specify an earlier stopping point.
|
|
|
|
At most one of <varname>recovery_target</varname>,
|
|
|
|
<varname>recovery_target_lsn</varname>, <varname>recovery_target_name</varname>,
|
|
|
|
<varname>recovery_target_time</varname>, or <varname>recovery_target_xid</varname>
|
|
|
|
can be used; if more than one of these is specified in the configuration
|
2018-11-28 12:36:49 +01:00
|
|
|
file, an error will be raised.
|
2018-11-25 16:31:16 +01:00
|
|
|
These parameters can only be set at server start.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry id="guc-recovery-target" xreflabel="recovery_target">
|
|
|
|
<term><varname>recovery_target</varname><literal> = 'immediate'</literal>
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>recovery_target</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This parameter specifies that recovery should end as soon as a
|
2020-09-01 00:33:37 +02:00
|
|
|
consistent state is reached, i.e., as early as possible. When restoring
|
2018-11-25 16:31:16 +01:00
|
|
|
from an online backup, this means the point where taking the backup
|
|
|
|
ended.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
Technically, this is a string parameter, but <literal>'immediate'</literal>
|
|
|
|
is currently the only allowed value.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-recovery-target-name" xreflabel="recovery_target_name">
|
|
|
|
<term><varname>recovery_target_name</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>recovery_target_name</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This parameter specifies the named restore point (created with
|
|
|
|
<function>pg_create_restore_point()</function>) to which recovery will proceed.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-recovery-target-time" xreflabel="recovery_target_time">
|
|
|
|
<term><varname>recovery_target_time</varname> (<type>timestamp</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>recovery_target_time</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This parameter specifies the time stamp up to which recovery
|
|
|
|
will proceed.
|
|
|
|
The precise stopping point is also influenced by
|
|
|
|
<xref linkend="guc-recovery-target-inclusive"/>.
|
|
|
|
</para>
|
2020-05-05 22:06:49 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
The value of this parameter is a time stamp in the same format
|
|
|
|
accepted by the <type>timestamp with time zone</type> data type,
|
|
|
|
except that you cannot use a time zone abbreviation (unless the
|
|
|
|
<xref linkend="guc-timezone-abbreviations"/> variable has been set
|
|
|
|
earlier in the configuration file). The preferred style is to use a
|
|
|
|
numeric offset from UTC, or you can write a full time zone name,
|
2020-09-01 00:33:37 +02:00
|
|
|
e.g., <literal>Europe/Helsinki</literal> not <literal>EEST</literal>.
|
2020-05-05 22:06:49 +02:00
|
|
|
</para>
|
2018-11-25 16:31:16 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-recovery-target-xid" xreflabel="recovery_target_xid">
|
|
|
|
<term><varname>recovery_target_xid</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>recovery_target_xid</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This parameter specifies the transaction ID up to which recovery
|
|
|
|
will proceed. Keep in mind
|
|
|
|
that while transaction IDs are assigned sequentially at transaction
|
|
|
|
start, transactions can complete in a different numeric order.
|
|
|
|
The transactions that will be recovered are those that committed
|
|
|
|
before (and optionally including) the specified one.
|
|
|
|
The precise stopping point is also influenced by
|
|
|
|
<xref linkend="guc-recovery-target-inclusive"/>.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-recovery-target-lsn" xreflabel="recovery_target_lsn">
|
|
|
|
<term><varname>recovery_target_lsn</varname> (<type>pg_lsn</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>recovery_target_lsn</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This parameter specifies the LSN of the write-ahead log location up
|
|
|
|
to which recovery will proceed. The precise stopping point is also
|
|
|
|
influenced by <xref linkend="guc-recovery-target-inclusive"/>. This
|
|
|
|
parameter is parsed using the system data type
|
|
|
|
<link linkend="datatype-pg-lsn"><type>pg_lsn</type></link>.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
</variablelist>
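<para>
The following sketch shows one example value for each kind of recovery
target. All values are hypothetical. Because at most one target may be
specified, every line but one is commented out:
<programlisting>
#recovery_target = 'immediate'
#recovery_target_name = 'before_upgrade'          # hypothetical restore point name
recovery_target_time = '2024-01-15 12:00:00+00'   # numeric offset from UTC
#recovery_target_xid = '1234567'                  # hypothetical transaction ID
#recovery_target_lsn = '0/3000060'                # hypothetical WAL location
</programlisting>
</para>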
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The following options further specify the recovery target, and affect
|
|
|
|
what happens when the target is reached:
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry id="guc-recovery-target-inclusive"
|
|
|
|
xreflabel="recovery_target_inclusive">
|
|
|
|
<term><varname>recovery_target_inclusive</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>recovery_target_inclusive</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies whether to stop just after the specified recovery target
|
2018-12-06 04:15:15 +01:00
|
|
|
(<literal>on</literal>), or just before the recovery target
|
|
|
|
(<literal>off</literal>).
|
2018-11-25 16:31:16 +01:00
|
|
|
Applies when <xref linkend="guc-recovery-target-lsn"/>,
|
|
|
|
<xref linkend="guc-recovery-target-time"/>, or
|
|
|
|
<xref linkend="guc-recovery-target-xid"/> is specified.
|
|
|
|
This setting controls whether transactions
|
|
|
|
having exactly the target WAL location (LSN), commit time, or transaction ID, respectively, will
|
2018-12-06 04:15:15 +01:00
|
|
|
be included in the recovery. Default is <literal>on</literal>.
|
2018-11-25 16:31:16 +01:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-recovery-target-timeline"
|
|
|
|
xreflabel="recovery_target_timeline">
|
|
|
|
<term><varname>recovery_target_timeline</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>recovery_target_timeline</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2019-01-11 10:25:06 +01:00
|
|
|
Specifies the timeline to recover into. The value can be a
|
|
|
|
numeric timeline ID or a special value. The value
|
|
|
|
<literal>current</literal> recovers along the same timeline that was
|
2019-01-11 10:36:10 +01:00
|
|
|
current when the base backup was taken. The
|
2019-01-11 10:25:06 +01:00
|
|
|
value <literal>latest</literal> recovers
|
2018-11-25 16:31:16 +01:00
|
|
|
to the latest timeline found in the archive, which is useful in
|
2019-01-11 10:36:10 +01:00
|
|
|
a standby server. <literal>latest</literal> is the default.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
You usually only need to set this parameter
|
2018-11-25 16:31:16 +01:00
|
|
|
in complex re-recovery situations, where you need to return to
|
|
|
|
a state that itself was reached after a point-in-time recovery.
|
|
|
|
See <xref linkend="backup-timelines"/> for discussion.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-recovery-target-action"
|
|
|
|
xreflabel="recovery_target_action">
|
|
|
|
<term><varname>recovery_target_action</varname> (<type>enum</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>recovery_target_action</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies what action the server should take once the recovery target is
|
|
|
|
reached. The default is <literal>pause</literal>, which means recovery will
|
|
|
|
be paused. <literal>promote</literal> means the recovery process will finish
|
|
|
|
and the server will start to accept connections.
|
|
|
|
Finally, <literal>shutdown</literal> will stop the server after reaching the
|
|
|
|
recovery target.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The intended use of the <literal>pause</literal> setting is to allow queries
|
|
|
|
to be executed against the database to check if this recovery target
|
|
|
|
is the most desirable point for recovery.
|
|
|
|
The paused state can be resumed by
|
|
|
|
using <function>pg_wal_replay_resume()</function> (see
|
|
|
|
<xref linkend="functions-recovery-control-table"/>), which then
|
|
|
|
causes recovery to end. If this recovery target is not the
|
|
|
|
desired stopping point, then shut down the server, change the
|
|
|
|
recovery target settings to a later target and restart to
|
|
|
|
continue recovery.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The <literal>shutdown</literal> setting is useful to have the instance ready
|
|
|
|
at the exact replay point desired. The instance will still be able to
|
|
|
|
replay more WAL records (and in fact will have to replay WAL records
|
|
|
|
since the last checkpoint next time it is started).
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
Note that because <filename>recovery.signal</filename> will not be
|
|
|
|
removed when <varname>recovery_target_action</varname> is set to <literal>shutdown</literal>,
|
|
|
|
any subsequent start will end with immediate shutdown unless the
|
|
|
|
configuration is changed or the <filename>recovery.signal</filename>
|
|
|
|
file is removed manually.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This setting has no effect if no recovery target is set.
|
|
|
|
If <xref linkend="guc-hot-standby"/> is not enabled, a setting of
|
|
|
|
<literal>pause</literal> will act the same as <literal>shutdown</literal>.
|
2020-03-24 04:46:48 +01:00
|
|
|
If the recovery target is reached while a promotion is ongoing,
|
|
|
|
a setting of <literal>pause</literal> will act the same as
|
|
|
|
<literal>promote</literal>.
|
2018-11-25 16:31:16 +01:00
|
|
|
</para>
|
2020-01-29 15:43:32 +01:00
|
|
|
<para>
|
|
|
|
In any case, if a recovery target is configured but the archive
|
|
|
|
recovery ends before the target is reached, the server will shut down
|
|
|
|
with a fatal error.
|
|
|
|
</para>
|
2018-11-25 16:31:16 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
</variablelist>
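<para>
As an illustrative sketch, the options above might be combined with a
recovery target as follows. The LSN is a hypothetical value, and the other
settings are chosen only to show the syntax:
<programlisting>
recovery_target_lsn = '0/3000060'        # hypothetical target
recovery_target_inclusive = off          # stop just before the target
recovery_target_timeline = 'current'     # stay on the timeline of the base backup
recovery_target_action = 'promote'       # finish recovery and accept connections
</programlisting>
</para>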
|
|
|
|
</sect2>
|
|
|
|
|
2011-07-07 21:10:32 +02:00
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="runtime-config-replication">
|
|
|
|
<title>Replication</title>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
These settings control the behavior of the built-in
|
2017-10-09 03:44:17 +02:00
|
|
|
<firstterm>streaming replication</firstterm> feature (see
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="streaming-replication"/>). Servers will be either a
|
2020-06-15 19:12:58 +02:00
|
|
|
primary or a standby server. Primaries can send data, while standbys
|
2011-07-19 04:40:03 +02:00
|
|
|
are always receivers of replicated data. When cascading replication
|
2018-05-07 17:05:19 +02:00
|
|
|
(see <xref linkend="cascading-replication"/>) is used, standby servers
|
2011-07-19 04:40:03 +02:00
|
|
|
can also be senders, as well as receivers.
|
2018-05-07 17:05:19 +02:00
|
|
|
Parameters are mainly for sending and standby servers, though some
|
2020-06-15 19:12:58 +02:00
|
|
|
parameters have meaning only on the primary server. Settings may vary
|
2011-07-19 04:40:03 +02:00
|
|
|
across the cluster without problems if that is required.
|
2011-07-07 21:10:32 +02:00
|
|
|
</para>
|
|
|
|
|
2011-07-19 04:40:03 +02:00
|
|
|
<sect2 id="runtime-config-replication-sender">
|
2018-05-07 17:05:19 +02:00
|
|
|
<title>Sending Servers</title>
|
2010-01-15 10:19:10 +01:00
|
|
|
|
|
|
|
<para>
|
2011-07-19 04:40:03 +02:00
|
|
|
These parameters can be set on any server that is
|
2010-07-03 22:43:58 +02:00
|
|
|
to send replication data to one or more standby servers.
|
2020-06-15 19:12:58 +02:00
|
|
|
The primary is always a sending server, so these parameters must
|
|
|
|
always be set on the primary.
|
2011-07-19 04:40:03 +02:00
|
|
|
The role and meaning of these parameters do not change after a
|
2020-06-15 19:12:58 +02:00
|
|
|
standby becomes the primary.
|
2010-01-15 10:19:10 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry id="guc-max-wal-senders" xreflabel="max_wal_senders">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>max_wal_senders</varname> (<type>integer</type>)
|
2010-01-15 10:19:10 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>max_wal_senders</varname> configuration parameter</primary>
|
2010-01-15 10:19:10 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2010-01-15 10:19:10 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
Move max_wal_senders out of max_connections for connection slot handling
Since its introduction, max_wal_senders is counted as part of
max_connections when it comes to define how many connection slots can be
used for replication connections with a WAL sender context. This can
lead to confusion for some users, as it could be possible to block a
base backup or replication from happening because other backend sessions
are already taken for other purposes by an application, and
superuser-only connection slots are not a correct solution to handle
that case.
This commit makes max_wal_senders independent of max_connections for its
handling of PGPROC entries in ProcGlobal, meaning that connection slots
for WAL senders are handled using their own free queue, like autovacuum
workers and bgworkers.
One compatibility issue that this change creates is that a standby now
requires to have a value of max_wal_senders at least equal to its
primary. So, if a standby created enforces the value of
max_wal_senders to be lower than that, then this could break failovers.
Normally this should not be an issue though, as any settings of a
standby are inherited from its primary as postgresql.conf gets normally
copied as part of a base backup, so parameters would be consistent.
Author: Alexander Kukushkin
Reviewed-by: Kyotaro Horiguchi, Petr Jelínek, Masahiko Sawada, Oleksii
Kliukin
Discussion: https://postgr.es/m/CAFh8B=nBzHQeYAu0b8fjK-AF1X4+_p6GRtwG+cCgs6Vci2uRuQ@mail.gmail.com
2019-02-12 02:07:56 +01:00
|
|
|
Specifies the maximum number of concurrent connections from standby
|
|
|
|
servers or streaming base backup clients (i.e., the maximum number of
|
|
|
|
simultaneously running WAL sender processes). The default is
|
|
|
|
<literal>10</literal>. The value <literal>0</literal> means
|
2020-09-21 18:43:42 +02:00
|
|
|
replication is disabled. Abrupt disconnection of a streaming client might
|
2019-02-12 02:07:56 +01:00
|
|
|
leave an orphaned connection slot behind until a timeout is reached,
|
|
|
|
so this parameter should be set slightly higher than the maximum
|
|
|
|
number of expected clients so disconnected clients can immediately
|
|
|
|
reconnect. This parameter can only be set at server start. Also,
|
|
|
|
<varname>wal_level</varname> must be set to
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>replica</literal> or higher to allow connections from standby
|
Add new wal_level, logical, sufficient for logical decoding.
When wal_level=logical, we'll log columns from the old tuple as
configured by the REPLICA IDENTITY facility added in commit
07cacba983ef79be4a84fcd0e0ca3b5fcb85dd65. This makes it possible
a properly-configured logical replication solution to correctly
follow table updates even if they change the chosen key columns,
or, with REPLICA IDENTITY FULL, even if the table has no key at
all. Note that updates which do not modify the replica identity
column won't log anything extra, making the choice of a good key
(i.e. one that will rarely be changed) important to performance
when wal_level=logical is configured.
Each insert, update, or delete to a catalog table will also log
the CMIN and/or CMAX values of stamped by the current transaction.
This is necessary because logical decoding will require access to
historical snapshots of the catalog in order to decode some data
types, and the CMIN/CMAX values that we may need in order to judge
row visibility may have been overwritten by the time we need them.
Andres Freund, reviewed in various versions by myself, Heikki
Linnakangas, KONDO Mitsumasa, and many others.
2013-12-11 00:33:45 +01:00
|
|
|
servers.
|
2010-01-15 10:19:10 +01:00
|
|
|
</para>
|
2019-02-12 02:07:56 +01:00
|
|
|
|
|
|
|
<para>
|
|
|
|
When running a standby server, you must set this parameter to the
|
2020-06-15 19:12:58 +02:00
|
|
|
same or higher value than on the primary server. Otherwise, queries
|
2019-02-12 02:07:56 +01:00
|
|
|
will not be allowed in the standby server.
|
|
|
|
</para>
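<para>
As a rough sizing sketch, assume a primary that serves two streaming
standbys and occasionally one <application>pg_basebackup</application> run
that streams WAL in parallel, which uses two connections. That workload is
an assumption made for this example. It needs four WAL sender processes at
peak, so a setting with some headroom for orphaned connection slots might be:
<programlisting>
wal_level = replica
max_wal_senders = 6       # 2 standbys + 2 for a base backup + 2 spare (assumed workload)
</programlisting>
Standby servers of this primary would then need
<varname>max_wal_senders</varname> set to at least the same value.
</para>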
|
2010-01-15 10:19:10 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>

<varlistentry id="guc-max-replication-slots" xreflabel="max_replication_slots">
<term><varname>max_replication_slots</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_replication_slots</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the maximum number of replication slots
(see <xref linkend="streaming-replication-slots"/>) that the server
can support. The default is 10. This parameter can only be set at
server start.
Setting it to a lower value than the number of currently
existing replication slots will prevent the server from starting.
Also, <varname>wal_level</varname> must be set
to <literal>replica</literal> or higher to allow replication slots to
be used.
</para>

<para>
On the subscriber side, specifies how many replication origins (see
<xref linkend="replication-origins"/>) can be tracked simultaneously,
effectively limiting how many logical replication subscriptions can
be created on the server. Setting it to a lower value than the current
number of tracked replication origins (reflected in
<link linkend="view-pg-replication-origin-status">pg_replication_origin_status</link>,
not <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
will prevent the server from starting.
</para>
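<para>
As an illustrative sketch only (the value <literal>20</literal> is an
arbitrary assumption, not a recommendation), a
<filename>postgresql.conf</filename> fragment that raises the limit
while satisfying the <varname>wal_level</varname> requirement might
look like this:
<programlisting>
wal_level = replica            # replication slots need replica or higher
max_replication_slots = 20     # can only be set at server start
</programlisting>
Before lowering the setting, it may be worth counting what currently
exists, since a value below either count prevents the server from
starting:
<programlisting>
SELECT count(*) FROM pg_replication_slots;          -- existing replication slots
SELECT count(*) FROM pg_replication_origin_status;  -- tracked replication origins
</programlisting>
</para>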
</listitem>
</varlistentry>

<varlistentry id="guc-wal-keep-size" xreflabel="wal_keep_size">
<term><varname>wal_keep_size</varname> (<type>integer</type>)
<indexterm>
<primary><varname>wal_keep_size</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the minimum size of past log file segments kept in the
<filename>pg_wal</filename>
directory, in case a standby server needs to fetch them for streaming
replication. If a standby
server connected to the sending server falls behind by more than
<varname>wal_keep_size</varname> megabytes, the sending server might
remove a WAL segment still needed by the standby, in which case the
replication connection will be terminated. Downstream connections
will also eventually fail as a result. (However, the standby
server can recover by fetching the segment from the archive, if WAL
archiving is in use.)
</para>

<para>
This sets only the minimum size of segments retained in
<filename>pg_wal</filename>; the system might need to retain more segments
for WAL archival or to recover from a checkpoint. If
<varname>wal_keep_size</varname> is zero (the default), the system
doesn't keep any extra segments for standby purposes, so the number
of old WAL segments available to standby servers is a function of
the location of the previous checkpoint and the status of WAL
archiving.
If this value is specified without units, it is taken as megabytes.
This parameter can only be set in the
<filename>postgresql.conf</filename> file or on the server command line.
</para>
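<para>
For instance, assuming a standby that is expected to lag by at most
about one gigabyte of WAL (an illustrative figure, not a tuning
recommendation), either of the following equivalent lines could be
used in <filename>postgresql.conf</filename>:
<programlisting>
wal_keep_size = 1GB     # explicit unit
wal_keep_size = 1024    # alternative spelling: no unit, so taken as megabytes
</programlisting>
</para>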
</listitem>
</varlistentry>

<varlistentry id="guc-max-slot-wal-keep-size" xreflabel="max_slot_wal_keep_size">
<term><varname>max_slot_wal_keep_size</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_slot_wal_keep_size</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the maximum size of WAL files
that <link linkend="streaming-replication-slots">replication
slots</link> are allowed to retain in the <filename>pg_wal</filename>
directory at checkpoint time.
If <varname>max_slot_wal_keep_size</varname> is -1 (the default),
replication slots may retain an unlimited amount of WAL files. Otherwise, if
the <literal>restart_lsn</literal> of a replication slot falls behind the
current LSN by more
than the given size, the standby using the slot may no longer be able
to continue replication due to removal of required WAL files. You
can see the WAL availability of replication slots
in <link linkend="view-pg-replication-slots">pg_replication_slots</link>.
</para>
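<para>
As a hedged example, a cap of 10 gigabytes (an arbitrary figure chosen
for illustration) could be written as:
<programlisting>
max_slot_wal_keep_size = 10GB
</programlisting>
The WAL availability of each slot can then be inspected with a query
along these lines; the <structfield>wal_status</structfield> and
<structfield>safe_wal_size</structfield> columns are assumed to be
present in the server version in use:
<programlisting>
SELECT slot_name, restart_lsn, wal_status, safe_wal_size
  FROM pg_replication_slots;
</programlisting>
</para>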
</listitem>
</varlistentry>

<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
<primary><varname>wal_sender_timeout</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Terminate replication connections that are inactive for longer
than this amount of time. This is useful for
the sending server to detect a standby crash or network outage.
If this value is specified without units, it is taken as milliseconds.
The default value is 60 seconds.
A value of zero disables the timeout mechanism.
</para>

<para>
With a cluster distributed across multiple geographic
locations, using different values per location brings more flexibility
in the cluster management. A smaller value is useful for faster
failure detection with a standby having a low-latency network
connection, and a larger value helps in better judging the health
of a standby located at a remote site with a high-latency
network connection.
</para>
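<para>
A minimal sketch of the settings described above, with illustrative
values only; the three lines are alternatives, not meant to be used
together:
<programlisting>
wal_sender_timeout = 60s      # the default, written with an explicit unit
wal_sender_timeout = 30000    # no unit given, so taken as milliseconds (30 seconds)
wal_sender_timeout = 0        # disables the timeout mechanism
</programlisting>
One possible way to apply a different value for a particular remote
standby, assuming the parameter can be changed per session in the
server version in use, is to pass it through the
<literal>options</literal> keyword of that standby's
<varname>primary_conninfo</varname>. The host name below is
hypothetical:
<programlisting>
primary_conninfo = 'host=primary.example.com options=''-c wal_sender_timeout=120s'''
</programlisting>
</para>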
</listitem>
</varlistentry>

<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>track_commit_timestamp</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Records the commit time of transactions. This parameter
can only be set in the <filename>postgresql.conf</filename> file or on the server
command line. The default value is <literal>off</literal>.
</para>
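<para>
As a sketch, enabling the feature and reading back a commit time might
look like the following; <literal>my_table</literal> is a hypothetical
table name used only for illustration:
<programlisting>
# postgresql.conf
track_commit_timestamp = on
</programlisting>
<programlisting>
-- commit time of the transaction that last wrote each row
SELECT xmin, pg_xact_commit_timestamp(xmin) FROM my_table;
</programlisting>
</para>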
</listitem>
</varlistentry>

</variablelist>
</sect2>

<sect2 id="runtime-config-replication-primary">
<title>Primary Server</title>

<para>
These parameters can be set on the primary server that is
to send replication data to one or more standby servers.
Note that in addition to these parameters,
<xref linkend="guc-wal-level"/> must be set appropriately on the primary
server, and optionally WAL archiving can be enabled as
well (see <xref linkend="runtime-config-wal-archiving"/>).
The values of these parameters on standby servers are irrelevant,
although you may wish to set them there in preparation for the
possibility of a standby becoming the primary.
</para>

<variablelist>

<varlistentry id="guc-synchronous-standby-names" xreflabel="synchronous_standby_names">
<term><varname>synchronous_standby_names</varname> (<type>string</type>)
<indexterm>
<primary><varname>synchronous_standby_names</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies a list of standby servers that can support
<firstterm>synchronous replication</firstterm>, as described in
<xref linkend="synchronous-replication"/>.
There will be one or more active synchronous standbys;
transactions waiting for commit will be allowed to proceed after
these standby servers confirm receipt of their data.
The synchronous standbys will be those whose names appear
in this list, and
that are both currently connected and streaming data in real-time
(as shown by a state of <literal>streaming</literal> in the
<link linkend="monitoring-pg-stat-replication-view">
<structname>pg_stat_replication</structname></link> view).
Specifying more than one synchronous standby can allow for very high
availability and protection against data loss.
</para>
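<para>
To see which standbys currently qualify, a query such as the following
sketch can be run on the primary. It relies only on columns of the
<structname>pg_stat_replication</structname> view:
<programlisting>
SELECT application_name, state, sync_state
  FROM pg_stat_replication;
</programlisting>
</para>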

<para>
The name of a standby server for this purpose is the
<varname>application_name</varname> setting of the standby, as set in the
standby's connection information. In the case of a physical replication
standby, this should be set in the <varname>primary_conninfo</varname>
setting; the default is the setting of <xref linkend="guc-cluster-name"/>
if set, else <literal>walreceiver</literal>.
For logical replication, this can be set in the connection
information of the subscription, and it defaults to the
subscription name. For other replication stream consumers,
consult their documentation.
</para>
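<para>
For example, under the assumption of a physical standby that should be
known as <literal>standby1</literal> (both the name and the host are
hypothetical), the two sides could be configured roughly as follows:
<programlisting>
# on the standby
primary_conninfo = 'host=primary.example.com application_name=standby1'

# on the primary
synchronous_standby_names = 'FIRST 1 (standby1)'
</programlisting>
</para>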
|
Support multiple synchronous standby servers.
Previously synchronous replication offered only the ability to confirm
that all changes made by a transaction had been transferred to at most
one synchronous standby server.
This commit extends synchronous replication so that it supports multiple
synchronous standby servers. It enables users to consider one or more
standby servers as synchronous, and increase the level of transaction
durability by ensuring that transaction commits wait for replies from
all of those synchronous standbys.
Multiple synchronous standby servers are configured in
synchronous_standby_names which is extended to support new syntax of
'num_sync ( standby_name [ , ... ] )', where num_sync specifies
the number of synchronous standbys that transaction commits need to
wait for replies from and standby_name is the name of a standby
server.
The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before
is also still supported. It's the same as new syntax with num_sync=1.
This commit doesn't include "quorum commit" feature which was discussed
in pgsql-hackers. Synchronous standbys are chosen based on their priorities.
synchronous_standby_names determines the priority of each standby for
being chosen as a synchronous standby. The standbys whose names appear
earlier in the list are given higher priority and will be considered as
synchronous. Other standby servers appearing later in this list
represent potential synchronous standbys.
The regression test for multiple synchronous standbys is not included
in this commit. It should come later.
Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao
Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs,
Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen,
Rajeev Rastogi
Many thanks to the various individuals who were involved in
discussing and developing this feature.
2016-04-06 10:18:25 +02:00
|
|
|
<para>
|
Clean up parsing of synchronous_standby_names GUC variable.
Commit 989be0810dffd08b added a flex/bison lexer/parser to interpret
synchronous_standby_names. It was done in a pretty crufty way, though,
making assorted end-use sites responsible for calling the parser at the
right times. That was not only vulnerable to errors of omission, but made
it possible that lexer/parser errors occur at very undesirable times,
and created memory leakages even if there was no error.
Instead, perform the parsing once during check_synchronous_standby_names
and let guc.c manage the resulting data. To do that, we have to flatten
the parsed representation into a single hunk of malloc'd memory, but that
is not very hard.
While at it, work a little harder on making useful error reports for
parsing problems; the previous code felt that "synchronous_standby_names
parser returned 1" was an appropriate user-facing error message. (To
be fair, it did also log a syntax error message, but separately from the
GUC problem report, which is at best confusing.) It had some outright
bugs in the face of invalid input, too.
I (tgl) also concluded that we need to restrict unquoted names in
synchronous_standby_names to be just SQL identifiers. The previous coding
would accept darn near anything, which (1) makes the quoting convention
both nearly-unnecessary and formally ambiguous, (2) makes it very hard to
understand what is a syntax error and what is a creative interpretation of
the input as a standby name, and (3) makes it impossible to further extend
the syntax in future without a compatibility break. I presume that we're
intending future extensions of the syntax, else this parsing infrastructure
is massive overkill, so (3) is an important objection. Since we've taken
a compatibility hit for non-identifier names with this change anyway, we
might as well lock things down now and insist that users use double quotes
for standby names that aren't identifiers.
Kyotaro Horiguchi and Tom Lane
2016-04-27 23:55:19 +02:00
|
|
|
This parameter specifies a list of standby servers using
|
Support multiple synchronous standby servers.
Previously synchronous replication offered only the ability to confirm
that all changes made by a transaction had been transferred to at most
one synchronous standby server.
This commit extends synchronous replication so that it supports multiple
synchronous standby servers. It enables users to consider one or more
standby servers as synchronous, and increase the level of transaction
durability by ensuring that transaction commits wait for replies from
all of those synchronous standbys.
Multiple synchronous standby servers are configured in
synchronous_standby_names which is extended to support new syntax of
'num_sync ( standby_name [ , ... ] )', where num_sync specifies
the number of synchronous standbys that transaction commits need to
wait for replies from and standby_name is the name of a standby
server.
The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before
is also still supported. It's the same as new syntax with num_sync=1.
This commit doesn't include "quorum commit" feature which was discussed
in pgsql-hackers. Synchronous standbys are chosen based on their priorities.
synchronous_standby_names determines the priority of each standby for
being chosen as a synchronous standby. The standbys whose names appear
earlier in the list are given higher priority and will be considered as
synchronous. Other standby servers appearing later in this list
represent potential synchronous standbys.
The regression test for multiple synchronous standbys is not included
in this commit. It should come later.
Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao
Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs,
Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen,
Rajeev Rastogi
Many thanks to the various individuals who were involved in
discussing and developing this feature.
2016-04-06 10:18:25 +02:00
|
|
|
either of the following syntaxes:
|
|
|
|
<synopsis>
|
2016-12-19 13:15:30 +01:00
|
|
|
[FIRST] <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="parameter">standby_name</replaceable> [, ...] )
|
|
|
|
ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="parameter">standby_name</replaceable> [, ...] )
|
2016-04-06 10:18:25 +02:00
|
|
|
<replaceable class="parameter">standby_name</replaceable> [, ...]
|
|
|
|
</synopsis>
|
|
|
|
where <replaceable class="parameter">num_sync</replaceable> is
|
|
|
|
the number of synchronous standbys that transactions need to
|
|
|
|
wait for replies from,
|
|
|
|
and <replaceable class="parameter">standby_name</replaceable>
|
2016-12-19 13:15:30 +01:00
|
|
|
is the name of a standby server.
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>FIRST</literal> and <literal>ANY</literal> specify the method to choose
|
2016-12-19 13:15:30 +01:00
|
|
|
synchronous standbys from the listed servers.
|
|
|
|
</para>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The keyword <literal>FIRST</literal>, coupled with
|
2016-12-19 13:15:30 +01:00
|
|
|
<replaceable class="parameter">num_sync</replaceable>, specifies a
|
|
|
|
priority-based synchronous replication and makes transaction commits
|
|
|
|
wait until their WAL records are replicated to
|
|
|
|
<replaceable class="parameter">num_sync</replaceable> synchronous
|
|
|
|
standbys chosen based on their priorities. For example, a setting of
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>FIRST 3 (s1, s2, s3, s4)</literal> will cause each commit to wait for
|
2016-12-19 13:15:30 +01:00
|
|
|
replies from the three highest-priority standbys chosen from standby servers
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>s1</literal>, <literal>s2</literal>, <literal>s3</literal> and <literal>s4</literal>.
|
2016-12-19 13:15:30 +01:00
|
|
|
The standbys whose names appear earlier in the list are given higher
|
|
|
|
priority and will be considered as synchronous. Other standby servers
|
|
|
|
appearing later in this list represent potential synchronous standbys.
|
|
|
|
If any of the current synchronous standbys disconnects for whatever
|
|
|
|
reason, it will be replaced immediately with the next-highest-priority
|
2017-10-09 03:44:17 +02:00
|
|
|
standby. The keyword <literal>FIRST</literal> is optional.
|
2016-12-19 13:15:30 +01:00
|
|
|
</para>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The keyword <literal>ANY</literal>, coupled with
|
2016-12-19 13:15:30 +01:00
|
|
|
<replaceable class="parameter">num_sync</replaceable>, specifies a
|
|
|
|
quorum-based synchronous replication and makes transaction commits
|
2017-10-09 03:44:17 +02:00
|
|
|
wait until their WAL records are replicated to <emphasis>at least</emphasis>
|
2016-12-19 13:15:30 +01:00
|
|
|
<replaceable class="parameter">num_sync</replaceable> listed standbys.
|
2017-10-09 03:44:17 +02:00
|
|
|
For example, a setting of <literal>ANY 3 (s1, s2, s3, s4)</literal> will cause
|
2016-12-19 13:15:30 +01:00
|
|
|
each commit to proceed as soon as at least three of the standbys
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>s1</literal>, <literal>s2</literal>, <literal>s3</literal> and <literal>s4</literal>
|
2016-12-19 13:15:30 +01:00
|
|
|
reply.
|
|
|
|
</para>
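<para>
As an illustrative sketch, the two methods might be written in
<filename>postgresql.conf</filename> as shown below. The standby names
<literal>s1</literal>, <literal>s2</literal> and <literal>s3</literal> are
placeholders, and only one of the two settings would be active at a time.
</para>
<programlisting>
# Priority-based: each commit waits for the two highest-priority
# connected standbys among s1, s2 and s3 (placeholder names).
synchronous_standby_names = 'FIRST 2 (s1, s2, s3)'

# Quorum-based alternative: each commit waits for replies from any two
# of the three listed standbys.
#synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
</programlisting>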
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>FIRST</literal> and <literal>ANY</literal> are case-insensitive. If these
|
2016-12-19 13:15:30 +01:00
|
|
|
keywords are used as the name of a standby server,
|
|
|
|
its <replaceable class="parameter">standby_name</replaceable> must
|
|
|
|
be double-quoted.
|
|
|
|
</para>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The third syntax was used before <productname>PostgreSQL</productname>
|
2016-04-06 10:18:25 +02:00
|
|
|
version 9.6 and is still supported. It's the same as the first syntax
|
2017-10-09 03:44:17 +02:00
|
|
|
with <literal>FIRST</literal> and
|
2016-12-19 13:15:30 +01:00
|
|
|
<replaceable class="parameter">num_sync</replaceable> equal to 1.
|
2017-10-09 03:44:17 +02:00
|
|
|
For example, <literal>FIRST 1 (s1, s2)</literal> and <literal>s1, s2</literal> have
|
|
|
|
the same meaning: either <literal>s1</literal> or <literal>s2</literal> is chosen
|
2016-12-19 13:15:30 +01:00
|
|
|
as a synchronous standby.
|
2016-04-06 10:18:25 +02:00
|
|
|
</para>
|
2011-03-06 23:49:16 +01:00
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The special entry <literal>*</literal> matches any standby name.
|
2017-07-02 06:10:57 +02:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
There is no mechanism to enforce uniqueness of standby names. In case
|
|
|
|
of duplicates, one of the matching standbys will be considered as
|
|
|
|
higher priority, though exactly which one is indeterminate.
|
2011-03-06 23:49:16 +01:00
|
|
|
</para>
|
2016-04-06 10:18:25 +02:00
|
|
|
<note>
|
|
|
|
<para>
|
2016-04-27 23:55:19 +02:00
|
|
|
Each <replaceable class="parameter">standby_name</replaceable>
|
|
|
|
should have the form of a valid SQL identifier, unless it
|
2017-10-09 03:44:17 +02:00
|
|
|
is <literal>*</literal>. You can use double-quoting if necessary. But note
|
2016-04-27 23:55:19 +02:00
|
|
|
that <replaceable class="parameter">standby_name</replaceable>s are
|
|
|
|
compared to standby application names case-insensitively, whether
|
|
|
|
double-quoted or not.
|
2016-04-06 10:18:25 +02:00
|
|
|
</para>
|
|
|
|
</note>
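<para>
A minimal sketch of the quoting rule, assuming two hypothetical standbys
whose application names are <literal>standby-east</literal> (not a valid
SQL identifier because of the hyphen) and <literal>any</literal> (which
collides with a keyword):
</para>
<programlisting>
# Both names are hypothetical. Each is double-quoted inside the
# single-quoted parameter value.
synchronous_standby_names = 'FIRST 1 ("standby-east", "any")'
</programlisting>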
|
2011-03-06 23:49:16 +01:00
|
|
|
<para>
|
2011-07-07 21:10:32 +02:00
|
|
|
If no synchronous standby names are specified here, then synchronous
|
|
|
|
replication is not enabled and transaction commits will not wait for
|
2011-04-04 22:13:01 +02:00
|
|
|
replication. This is the default configuration. Even when
|
|
|
|
synchronous replication is enabled, individual transactions can be
|
|
|
|
configured not to wait for replication by setting the
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="guc-synchronous-commit"/> parameter to
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>local</literal> or <literal>off</literal>.
|
2011-03-06 23:49:16 +01:00
|
|
|
</para>
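<para>
For instance, a transaction whose durability on the standbys is not
critical might opt out of waiting as sketched here. The table
<literal>audit_log</literal> is a hypothetical name used only for
illustration.
</para>
<programlisting>
BEGIN;
-- Applies to this transaction only; other sessions are unaffected.
SET LOCAL synchronous_commit TO off;
INSERT INTO audit_log (message) VALUES ('non-critical event');  -- hypothetical table
COMMIT;
</programlisting>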
|
2011-06-13 18:23:42 +02:00
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2011-06-13 18:23:42 +02:00
|
|
|
file or on the server command line.
|
|
|
|
</para>
|
2011-03-06 23:49:16 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2011-07-19 04:40:03 +02:00
|
|
|
<varlistentry id="guc-vacuum-defer-cleanup-age" xreflabel="vacuum_defer_cleanup_age">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>vacuum_defer_cleanup_age</varname> (<type>integer</type>)
|
2011-07-19 04:40:03 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>vacuum_defer_cleanup_age</varname> configuration parameter</primary>
|
2011-07-19 04:40:03 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2011-07-19 04:40:03 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Specifies the number of transactions by which <command>VACUUM</command> and
|
|
|
|
<acronym>HOT</acronym> updates will defer cleanup of dead row versions. The
|
2011-07-19 04:40:03 +02:00
|
|
|
default is zero transactions, meaning that dead row versions can be
|
|
|
|
removed as soon as possible, that is, as soon as they are no longer
|
|
|
|
visible to any open transaction. You may wish to set this to a
|
|
|
|
non-zero value on a primary server that is supporting hot standby
|
2017-11-23 15:39:47 +01:00
|
|
|
servers, as described in <xref linkend="hot-standby"/>. This allows
|
2011-07-19 04:40:03 +02:00
|
|
|
more time for queries on the standby to complete without incurring
|
|
|
|
conflicts due to early cleanup of rows. However, since the value
|
|
|
|
is measured in terms of number of write transactions occurring on the
|
|
|
|
primary server, it is difficult to predict just how much additional
|
|
|
|
grace time will be made available to standby queries.
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2013-02-04 17:39:55 +01:00
|
|
|
file or on the server command line.
|
2011-07-19 04:40:03 +02:00
|
|
|
</para>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
You should also consider setting <varname>hot_standby_feedback</varname>
|
2011-07-19 04:40:03 +02:00
|
|
|
on standby server(s) as an alternative to using this parameter.
|
|
|
|
</para>
|
2016-04-08 21:36:30 +02:00
|
|
|
<para>
|
|
|
|
This does not prevent cleanup of dead rows which have reached the age
|
2017-10-09 03:44:17 +02:00
|
|
|
specified by <varname>old_snapshot_threshold</varname>.
|
2016-04-08 21:36:30 +02:00
|
|
|
</para>
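<para>
The two approaches mentioned above could be configured roughly as
follows. The value <literal>10000</literal> is an arbitrary
illustration, not a recommendation.
</para>
<programlisting>
# On the primary: defer cleanup of dead row versions by 10000
# transactions (arbitrary example value).
vacuum_defer_cleanup_age = 10000

# Alternatively, on each standby: report the oldest running query
# back to the primary instead.
#hot_standby_feedback = on
</programlisting>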
|
2011-07-19 04:40:03 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2011-03-06 23:49:16 +01:00
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
|
|
|
|
2011-07-07 21:10:32 +02:00
|
|
|
<sect2 id="runtime-config-replication-standby">
|
|
|
|
<title>Standby Servers</title>
|
Allow read only connections during recovery, known as Hot Standby.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
2009-12-19 02:32:45 +01:00
|
|
|
|
2010-07-03 22:43:58 +02:00
|
|
|
<para>
|
2021-03-31 22:23:25 +02:00
|
|
|
These settings control the behavior of a
|
|
|
|
<link linkend="standby-server-operation">standby server</link>
|
|
|
|
that is
|
2020-06-15 19:12:58 +02:00
|
|
|
to receive replication data. Their values on the primary server
|
2011-07-07 21:10:32 +02:00
|
|
|
are irrelevant.
|
2010-07-03 22:43:58 +02:00
|
|
|
</para>
|
|
|
|
|
2009-12-19 02:32:45 +01:00
|
|
|
<variablelist>
|
|
|
|
|
2018-11-25 16:31:16 +01:00
|
|
|
<varlistentry id="guc-primary-conninfo" xreflabel="primary_conninfo">
|
|
|
|
<term><varname>primary_conninfo</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>primary_conninfo</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies a connection string to be used for the standby server
|
|
|
|
to connect with a sending server. This string is in the format
|
|
|
|
described in <xref linkend="libpq-connstring"/>. If any option is
|
|
|
|
unspecified in this string, then the corresponding environment
|
|
|
|
variable (see <xref linkend="libpq-envars"/>) is checked. If the
|
|
|
|
environment variable is not set either, then
|
|
|
|
defaults are used.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The connection string should specify the host name (or address)
|
|
|
|
of the sending server, as well as the port number if it is not
|
|
|
|
the same as the standby server's default.
|
|
|
|
Also specify a user name corresponding to a suitably-privileged role
|
|
|
|
on the sending server (see
|
|
|
|
<xref linkend="streaming-replication-authentication"/>).
|
|
|
|
A password needs to be provided too, if the sender demands password
|
|
|
|
authentication. It can be provided in the
|
|
|
|
<varname>primary_conninfo</varname> string, or in a separate
|
|
|
|
<filename>~/.pgpass</filename> file on the standby server (use
|
|
|
|
<literal>replication</literal> as the database name).
|
|
|
|
Do not specify a database name in the
|
|
|
|
<varname>primary_conninfo</varname> string.
|
|
|
|
</para>
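<para>
A sketch of such a setup is shown below. The address
<literal>192.0.2.10</literal>, the role <literal>replicator</literal> and
the password are placeholders chosen for illustration.
</para>
<programlisting>
# postgresql.conf on the standby (host and role are placeholders).
# No dbname is given, as advised above.
primary_conninfo = 'host=192.0.2.10 port=5432 user=replicator'
</programlisting>
<programlisting>
# ~/.pgpass on the standby, in the format
# hostname:port:database:username:password
192.0.2.10:5432:replication:replicator:placeholder-password
</programlisting>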
|
|
|
|
<para>
|
2020-03-27 23:43:41 +01:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
|
|
|
file or on the server command line.
|
|
|
|
If this parameter is changed while the WAL receiver process is
|
2020-06-07 15:06:51 +02:00
|
|
|
running, that process is signaled to shut down and expected to
|
2020-03-27 23:43:41 +01:00
|
|
|
restart with the new setting (except if <varname>primary_conninfo</varname>
|
|
|
|
is an empty string).
|
2018-11-25 16:31:16 +01:00
|
|
|
This setting has no effect if the server is not in standby mode.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
<varlistentry id="guc-primary-slot-name" xreflabel="primary_slot_name">
|
|
|
|
<term><varname>primary_slot_name</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>primary_slot_name</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Optionally specifies an existing replication slot to be used when
|
|
|
|
connecting to the sending server via streaming replication to control
|
|
|
|
resource removal on the upstream node
|
|
|
|
(see <xref linkend="streaming-replication-slots"/>).
|
2020-03-27 23:43:41 +01:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
|
|
|
file or on the server command line.
|
|
|
|
If this parameter is changed while the WAL receiver process is running,
|
2020-06-07 15:06:51 +02:00
|
|
|
that process is signaled to shut down and expected to restart with the
|
2020-03-27 23:43:41 +01:00
|
|
|
new setting.
|
2018-11-25 16:31:16 +01:00
|
|
|
This setting has no effect if <varname>primary_conninfo</varname> is not
|
2020-03-27 23:43:41 +01:00
|
|
|
set or the server is not in standby mode.
|
2018-11-25 16:31:16 +01:00
|
|
|
</para>
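<para>
As a sketch, assuming a hypothetical slot name
<literal>standby1_slot</literal>, the slot would first be created on the
sending server:
</para>
<programlisting>
-- Run on the sending server; the slot name is a placeholder.
SELECT pg_create_physical_replication_slot('standby1_slot');
</programlisting>
<para>
The standby would then refer to it in
<filename>postgresql.conf</filename>:
</para>
<programlisting>
primary_slot_name = 'standby1_slot'
</programlisting>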
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-promote-trigger-file" xreflabel="promote_trigger_file">
|
|
|
|
<term><varname>promote_trigger_file</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>promote_trigger_file</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies a trigger file whose presence ends recovery in the
|
|
|
|
standby. Even if this value is not set, you can still promote
|
|
|
|
the standby using <command>pg_ctl promote</command> or calling
|
2020-04-21 07:05:43 +02:00
|
|
|
<function>pg_promote()</function>.
|
2019-02-04 09:28:17 +01:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
|
|
|
file or on the server command line.
|
2018-11-25 16:31:16 +01:00
|
|
|
</para>
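<para>
For illustration, assuming a hypothetical trigger path, the setting and
a later promotion might look like this:
</para>
<programlisting>
# postgresql.conf on the standby (the path is a placeholder)
promote_trigger_file = '/var/lib/postgresql/promote.trigger'
</programlisting>
<programlisting>
# Creating the file later, from a shell on the standby host,
# ends recovery.
touch /var/lib/postgresql/promote.trigger
</programlisting>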
</listitem>
</varlistentry>

<varlistentry id="guc-hot-standby" xreflabel="hot_standby">
<term><varname>hot_standby</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>hot_standby</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies whether or not you can connect and run queries during
recovery, as described in <xref linkend="hot-standby"/>.
The default value is <literal>on</literal>.
This parameter can only be set at server start. It only has effect
during archive recovery or in standby mode.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-max-standby-archive-delay" xreflabel="max_standby_archive_delay">
<term><varname>max_standby_archive_delay</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_standby_archive_delay</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When Hot Standby is active, this parameter determines how long the
standby server should wait before canceling standby queries that
conflict with about-to-be-applied WAL entries, as described in
<xref linkend="hot-standby-conflict"/>.
<varname>max_standby_archive_delay</varname> applies when WAL data is
being read from WAL archive (and is therefore not current).
If this value is specified without units, it is taken as milliseconds.
The default is 30 seconds.
A value of -1 allows the standby to wait forever for conflicting
queries to complete.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
<para>
Note that <varname>max_standby_archive_delay</varname> is not the same as the
maximum length of time a query can run before cancellation; rather it
is the maximum total time allowed to apply any one WAL segment's data.
Thus, if one query has resulted in significant delay earlier in the
WAL segment, subsequent conflicting queries will have much less grace
time.
</para>
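<para>
As a worked example with assumed numbers: under the setting below, if
replay of one WAL segment has already waited 25 seconds for a
conflicting query, a later query that conflicts within the same segment
would have at most about 5 seconds of grace time before being canceled.
</para>
<programlisting>
# value assumed for illustration; it matches the documented default
max_standby_archive_delay = 30s
</programlisting>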
</listitem>
</varlistentry>

<varlistentry id="guc-max-standby-streaming-delay" xreflabel="max_standby_streaming_delay">
<term><varname>max_standby_streaming_delay</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_standby_streaming_delay</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When Hot Standby is active, this parameter determines how long the
standby server should wait before canceling standby queries that
conflict with about-to-be-applied WAL entries, as described in
<xref linkend="hot-standby-conflict"/>.
<varname>max_standby_streaming_delay</varname> applies when WAL data is
being received via streaming replication.
If this value is specified without units, it is taken as milliseconds.
The default is 30 seconds.
A value of -1 allows the standby to wait forever for conflicting
queries to complete.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
<para>
Note that <varname>max_standby_streaming_delay</varname> is not the same as
the maximum length of time a query can run before cancellation; rather
it is the maximum total time allowed to apply WAL data once it has
been received from the primary server. Thus, if one query has
resulted in significant delay, subsequent conflicting queries will
have much less grace time until the standby server has caught up
again.
</para>
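<para>
A sketch of the accepted value forms, using assumed values: an explicit
unit, a bare number (taken as milliseconds), and
<literal>-1</literal> to wait indefinitely. Only one of these lines
would be active in a real configuration file.
</para>
<programlisting>
max_standby_streaming_delay = 2min      # explicit unit
#max_standby_streaming_delay = 120000   # no unit: 120000 milliseconds, the same as 2min
#max_standby_streaming_delay = -1       # wait forever for conflicting queries
</programlisting>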
</listitem>
</varlistentry>

<varlistentry id="guc-wal-receiver-create-temp-slot" xreflabel="wal_receiver_create_temp_slot">
<term><varname>wal_receiver_create_temp_slot</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>wal_receiver_create_temp_slot</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies whether the WAL receiver process should create a temporary replication
slot on the remote instance when no permanent replication slot to use
has been configured (using <xref linkend="guc-primary-slot-name"/>).
The default is off. This parameter can only be set in the
<filename>postgresql.conf</filename> file or on the server command line.
If this parameter is changed while the WAL receiver process is running,
that process is signaled to shut down and expected to restart with
the new setting.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-wal-receiver-status-interval" xreflabel="wal_receiver_status_interval">
<term><varname>wal_receiver_status_interval</varname> (<type>integer</type>)
<indexterm>
<primary><varname>wal_receiver_status_interval</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the minimum frequency for the WAL receiver
process on the standby to send information about replication progress
to the primary or upstream standby, where it can be seen using the
<link linkend="monitoring-pg-stat-replication-view">
<structname>pg_stat_replication</structname></link>
view. The standby will report
the last write-ahead log location it has written, the last position it
has flushed to disk, and the last position it has applied.
This parameter's value is the maximum amount of time between reports.
Updates are sent each time the write or flush positions change, or as
often as specified by this parameter if set to a non-zero value.
There are additional cases where updates are sent while ignoring this
parameter; for example, when processing of the existing WAL completes
or when <varname>synchronous_commit</varname> is set to
<literal>remote_apply</literal>.
Thus, the apply position may lag slightly behind the true position.
If this value is specified without units, it is taken as seconds.
The default value is 10 seconds. This parameter can only be set in
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
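<para>
For example, assuming more frequent progress reports are wanted, the
interval could be shortened as sketched below; the value shown is
illustrative, and a bare number is read as seconds:
</para>
<programlisting>
wal_receiver_status_interval = 5    # assumed value: at most 5 seconds between reports
</programlisting>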
</listitem>
</varlistentry>

<varlistentry id="guc-hot-standby-feedback" xreflabel="hot_standby_feedback">
<term><varname>hot_standby_feedback</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>hot_standby_feedback</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies whether or not a hot standby will send feedback to the primary
or upstream standby
about queries currently executing on the standby. This parameter can
be used to eliminate query cancels caused by cleanup records, but
can cause database bloat on the primary for some workloads.
Feedback messages will not be sent more frequently than once per
<varname>wal_receiver_status_interval</varname>. The default value is
<literal>off</literal>. This parameter can only be set in the
<filename>postgresql.conf</filename> file or on the server command line.
</para>
<para>
If cascaded replication is in use, the feedback is passed upstream
until it eventually reaches the primary. Standbys make no use of the
feedback they receive other than to pass it upstream.
</para>
<para>
This setting does not override the behavior of
<varname>old_snapshot_threshold</varname> on the primary; a snapshot on the
standby which exceeds the primary's age threshold can become invalid,
resulting in cancellation of transactions on the standby. This is
because <varname>old_snapshot_threshold</varname> is intended to provide an
absolute limit on the time which dead rows can contribute to bloat,
which would otherwise be violated because of the configuration of a
standby.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-wal-receiver-timeout" xreflabel="wal_receiver_timeout">
<term><varname>wal_receiver_timeout</varname> (<type>integer</type>)
<indexterm>
<primary><varname>wal_receiver_timeout</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Terminate replication connections that are inactive for longer
than this amount of time. This is useful for
the receiving standby server to detect a primary node crash or network
outage.
If this value is specified without units, it is taken as milliseconds.
The default value is 60 seconds.
A value of zero disables the timeout mechanism.
This parameter can only be set in
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
</listitem>
</varlistentry>
|
|
|
|
|
2015-02-23 12:55:17 +01:00
|
|
|
<varlistentry id="guc-wal-retrieve-retry-interval" xreflabel="wal_retrieve_retry_interval">
<term><varname>wal_retrieve_retry_interval</varname> (<type>integer</type>)
<indexterm>
<primary><varname>wal_retrieve_retry_interval</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies how long the standby server should wait when WAL data is not
available from any source (streaming replication,
local <filename>pg_wal</filename> or WAL archive) before trying
again to retrieve WAL data.
If this value is specified without units, it is taken as milliseconds.
The default value is 5 seconds.
This parameter can only be set in
the <filename>postgresql.conf</filename> file or on the server
command line.
</para>
<para>
This parameter is useful in configurations where a node in recovery
needs to control the amount of time to wait for new WAL data to be
available. For example, in archive recovery, it is possible to
make the recovery more responsive in the detection of a new WAL
file by reducing the value of this parameter. On a system with
low WAL activity, increasing it reduces the number of requests necessary
to access WAL archives, which is useful for example in cloud
environments where the number of times the infrastructure is accessed
is taken into account.
</para>
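<para>
As an illustrative sketch only (the values shown are assumptions chosen
for the example, not recommendations), a standby that polls a
pay-per-request WAL archive might lengthen the interval, while one that
must notice new WAL quickly might shorten it:
<programlisting>
# postgresql.conf on the standby (hypothetical values)
wal_retrieve_retry_interval = 30s       # fewer archive requests on a quiet system
#wal_retrieve_retry_interval = 500ms    # alternative: faster detection of new WAL
</programlisting>
</para>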
</listitem>
</varlistentry>

<varlistentry id="guc-recovery-min-apply-delay" xreflabel="recovery_min_apply_delay">
<term><varname>recovery_min_apply_delay</varname> (<type>integer</type>)
<indexterm>
<primary><varname>recovery_min_apply_delay</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
By default, a standby server restores WAL records from the
sending server as soon as possible. It may be useful to have a time-delayed
copy of the data, offering opportunities to correct data loss errors.
This parameter allows you to delay recovery by a specified amount
of time. For example, if
you set this parameter to <literal>5min</literal>, the standby will
replay each transaction commit only when the system time on the standby
is at least five minutes past the commit time reported by the primary.
If this value is specified without units, it is taken as milliseconds.
The default is zero, adding no delay.
</para>
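<para>
A minimal sketch of the five-minute example, assuming the setting is
placed in the standby's <filename>postgresql.conf</filename>; the
commented-out line shows the same delay written without units:
<programlisting>
# postgresql.conf on the standby
recovery_min_apply_delay = 5min       # replay each commit at least five minutes late
#recovery_min_apply_delay = 300000    # same delay; a bare number is taken as milliseconds
</programlisting>
</para>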
<para>
It is possible that the replication delay between servers exceeds the
value of this parameter, in which case no delay is added.
Note that the delay is calculated between the WAL time stamp as written
on the primary and the current time on the standby. Delays in transfer
because of network lag or cascading replication configurations
may reduce the actual wait time significantly. If the system
clocks on the primary and standby are not synchronized, this may lead to
recovery applying records earlier than expected; but that is not a
major issue because useful settings of this parameter are much larger
than typical time deviations between servers.
</para>
<para>
The delay occurs only on WAL records for transaction commits.
Other records are replayed as quickly as possible, which
is not a problem because MVCC visibility rules ensure their effects
are not visible until the corresponding commit record is applied.
</para>
<para>
The delay occurs once the database in recovery has reached a consistent
state, until the standby is promoted or triggered. After that the standby
will end recovery without further waiting.
</para>
<para>
This parameter is intended for use with streaming replication deployments;
however, if the parameter is specified it will be honored in all cases
except crash recovery.

<varname>hot_standby_feedback</varname> will be delayed by use of this feature,
which could lead to bloat on the primary; use both together with care.

<warning>
<para>
Synchronous replication is affected by this setting when <varname>synchronous_commit</varname>
is set to <literal>remote_apply</literal>; every <literal>COMMIT</literal>
will need to wait to be applied.
</para>
</warning>
</para>
<para>
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
</listitem>
</varlistentry>

</variablelist>
</sect2>

<sect2 id="runtime-config-replication-subscriber">
<title>Subscribers</title>

<para>
These settings control the behavior of a logical replication subscriber.
Their values on the publisher are irrelevant.
</para>

<para>
Note that the <varname>wal_receiver_timeout</varname>,
<varname>wal_receiver_status_interval</varname> and
<varname>wal_retrieve_retry_interval</varname> configuration parameters
affect the logical replication workers as well.
</para>

<variablelist>

<varlistentry id="guc-max-logical-replication-workers" xreflabel="max_logical_replication_workers">
<term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_logical_replication_workers</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the maximum number of logical replication workers. This includes
both apply workers and table synchronization workers.
</para>
<para>
Logical replication workers are taken from the pool defined by
<varname>max_worker_processes</varname>.
</para>
<para>
The default value is 4. This parameter can only be set at server
start.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-max-sync-workers-per-subscription" xreflabel="max_sync_workers_per_subscription">
<term><varname>max_sync_workers_per_subscription</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_sync_workers_per_subscription</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Maximum number of synchronization workers per subscription. This
parameter controls the amount of parallelism of the initial data copy
during the subscription initialization or when new tables are added.
</para>
<para>
Currently, there can be only one synchronization worker per table.
</para>
<para>
The synchronization workers are taken from the pool defined by
<varname>max_logical_replication_workers</varname>.
</para>
<para>
The default value is 2. This parameter can only be set in the
<filename>postgresql.conf</filename> file or on the server command
line.
</para>
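<para>
Because the pools nest, the limits are best chosen together. A hedged
sketch with illustrative (not recommended) values, assuming a subscriber
with three subscriptions and one apply worker each:
<programlisting>
# postgresql.conf on the subscriber (hypothetical values)
max_worker_processes = 16                # outer pool for background workers
max_logical_replication_workers = 8      # apply workers plus table synchronization workers
max_sync_workers_per_subscription = 2    # at most 2 tables copied in parallel per subscription
</programlisting>
With these values the three apply workers leave five slots for
synchronization workers, one fewer than the six that three subscriptions
could request at once.
</para>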
</listitem>
</varlistentry>

</variablelist>
</sect2>

</sect1>

<sect1 id="runtime-config-query">
<title>Query Planning</title>

<sect2 id="runtime-config-query-enable">
<title>Planner Method Configuration</title>

<para>
These configuration parameters provide a crude method of
influencing the query plans chosen by the query optimizer. If
the default plan chosen by the optimizer for a particular query
is not optimal, a <emphasis>temporary</emphasis> solution is to use one
of these configuration parameters to force the optimizer to
choose a different plan.
Better ways to improve the quality of the
plans chosen by the optimizer include adjusting the planner cost
constants (see <xref linkend="runtime-config-query-constants"/>),
running <link linkend="sql-analyze"><command>ANALYZE</command></link> manually, increasing
the value of the <xref
linkend="guc-default-statistics-target"/> configuration parameter,
and increasing the amount of statistics collected for
specific columns using <command>ALTER TABLE SET
STATISTICS</command>.
</para>
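<para>
As a sketch of such a temporary experiment (the table and column names
are hypothetical), a planner method can be turned off for a single
transaction so that the setting does not outlive the test:
<programlisting>
BEGIN;
SET LOCAL enable_hashjoin = off;   -- reverts automatically at transaction end
EXPLAIN ANALYZE
SELECT * FROM orders o JOIN customers c ON c.id = o.customer_id;
ROLLBACK;
</programlisting>
The longer-term alternatives listed above might look like this, again
with hypothetical names and an arbitrary statistics target:
<programlisting>
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 500;
ANALYZE orders;
</programlisting>
</para>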

<variablelist>
<varlistentry id="guc-enable-async-append" xreflabel="enable_async_append">
<term><varname>enable_async_append</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>enable_async_append</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables or disables the query planner's use of async-aware
append plan types. The default is <literal>on</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-enable-bitmapscan" xreflabel="enable_bitmapscan">
<term><varname>enable_bitmapscan</varname> (<type>boolean</type>)
<indexterm>
<primary>bitmap scan</primary>
</indexterm>
<indexterm>
<primary><varname>enable_bitmapscan</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables or disables the query planner's use of bitmap-scan plan
types. The default is <literal>on</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
<term><varname>enable_gathermerge</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>enable_gathermerge</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables or disables the query planner's use of gather
merge plan types. The default is <literal>on</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-enable-hashagg" xreflabel="enable_hashagg">
<term><varname>enable_hashagg</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>enable_hashagg</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables or disables the query planner's use of hashed
aggregation plan types. The default is <literal>on</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-enable-hashjoin" xreflabel="enable_hashjoin">
<term><varname>enable_hashjoin</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>enable_hashjoin</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables or disables the query planner's use of hash-join plan
types. The default is <literal>on</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-enable-incremental-sort" xreflabel="enable_incremental_sort">
|
|
|
|
<term><varname>enable_incremental_sort</varname> (<type>boolean</type>)
|
Implement Incremental Sort
Incremental Sort is an optimized variant of multikey sort for cases when
the input is already sorted by a prefix of the requested sort keys. For
example when the relation is already sorted by (key1, key2) and we need
to sort it by (key1, key2, key3) we can simply split the input rows into
groups having equal values in (key1, key2), and only sort/compare the
remaining column key3.
This has a number of benefits:
- Reduced memory consumption, because only a single group (determined by
values in the sorted prefix) needs to be kept in memory. This may also
eliminate the need to spill to disk.
- Lower startup cost, because Incremental Sort produce results after each
prefix group, which is beneficial for plans where startup cost matters
(like for example queries with LIMIT clause).
We consider both Sort and Incremental Sort, and decide based on costing.
The implemented algorithm operates in two different modes:
- Fetching a minimum number of tuples without check of equality on the
prefix keys, and sorting on all columns when safe.
- Fetching all tuples for a single prefix group and then sorting by
comparing only the remaining (non-prefix) keys.
We always start in the first mode, and employ a heuristic to switch into
the second mode if we believe it's beneficial - the goal is to minimize
the number of unnecessary comparions while keeping memory consumption
below work_mem.
This is a very old patch series. The idea was originally proposed by
Alexander Korotkov back in 2013, and then revived in 2017. In 2018 the
patch was taken over by James Coleman, who wrote and rewrote most of the
current code.
There were many reviewers/contributors since 2013 - I've done my best to
pick the most active ones, and listed them in this commit message.
Author: James Coleman, Alexander Korotkov
Reviewed-by: Tomas Vondra, Andreas Karlsson, Marti Raudsepp, Peter Geoghegan, Robert Haas, Thomas Munro, Antonin Houska, Andres Freund, Alexander Kuzmenkov
Discussion: https://postgr.es/m/CAPpHfdscOX5an71nHd8WSUH6GNOCf=V7wgDaTXdDd9=goN-gfA@mail.gmail.com
Discussion: https://postgr.es/m/CAPpHfds1waRZ=NOmueYq0sx1ZSCnt+5QJvizT8ndT2=etZEeAQ@mail.gmail.com
2020-04-06 21:33:28 +02:00
|
|
|
<indexterm>
|
2020-07-05 11:41:52 +02:00
|
|
|
<primary><varname>enable_incremental_sort</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of incremental sort steps.
|
|
|
|
The default is <literal>on</literal>.
|
|
|
|
</para>
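<para>
As a rough illustration of how this setting might be exercised in a session,
the sketch below compares plans with the parameter on and off. The table
<literal>events</literal>, its columns, and its index are hypothetical, and
whether the planner actually chooses an incremental sort depends on its cost
estimates.
</para>
<programlisting>
-- Hypothetical table with an index that provides ordering on (a) only.
CREATE TABLE events (a int, b int, payload text);
CREATE INDEX events_a_idx ON events (a);

-- The input can arrive sorted by the prefix (a), so the planner may
-- consider an incremental sort to order by (a, b).
EXPLAIN SELECT * FROM events ORDER BY a, b LIMIT 10;

-- Discourage incremental sort for this session and compare the plan.
SET enable_incremental_sort = off;
EXPLAIN SELECT * FROM events ORDER BY a, b LIMIT 10;

-- Restore the configured default.
RESET enable_incremental_sort;
</programlisting>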
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-enable-indexscan" xreflabel="enable_indexscan">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>enable_indexscan</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
|
|
|
<primary>index scan</primary>
|
|
|
|
</indexterm>
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>enable_indexscan</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of index-scan plan
|
2017-10-09 03:44:17 +02:00
|
|
|
types. The default is <literal>on</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2011-10-08 02:13:02 +02:00
|
|
|
<varlistentry id="guc-enable-indexonlyscan" xreflabel="enable_indexonlyscan">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>enable_indexonlyscan</varname> (<type>boolean</type>)
|
2011-10-08 02:13:02 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>enable_indexonlyscan</varname> configuration parameter</primary>
|
2011-10-08 02:13:02 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2011-10-08 02:13:02 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of index-only-scan plan
|
2017-11-23 15:39:47 +01:00
|
|
|
types (see <xref linkend="indexes-index-only-scans"/>).
|
2017-10-09 03:44:17 +02:00
|
|
|
The default is <literal>on</literal>.
|
2011-10-08 02:13:02 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2010-04-19 02:55:26 +02:00
|
|
|
<varlistentry id="guc-enable-material" xreflabel="enable_material">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>enable_material</varname> (<type>boolean</type>)
|
2010-04-19 02:55:26 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>enable_material</varname> configuration parameter</primary>
|
2010-04-19 02:55:26 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2010-04-19 02:55:26 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of materialization.
|
|
|
|
It is impossible to suppress materialization entirely,
|
|
|
|
but turning this variable off prevents the planner from inserting
|
|
|
|
materialize nodes except in cases where it is required for correctness.
|
2017-10-09 03:44:17 +02:00
|
|
|
The default is <literal>on</literal>.
|
2010-04-19 02:55:26 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2021-07-14 02:43:58 +02:00
|
|
|
<varlistentry id="guc-enable-memoize" xreflabel="enable_memoize">
|
|
|
|
<term><varname>enable_memoize</varname> (<type>boolean</type>)
|
Add Result Cache executor node (take 2)
Here we add a new executor node type named "Result Cache". The planner
can include this node type in the plan to have the executor cache the
results from the inner side of parameterized nested loop joins. This
allows caching of tuples for sets of parameters so that in the event that
the node sees the same parameter values again, it can just return the
cached tuples instead of rescanning the inner side of the join all over
again. Internally, result cache uses a hash table in order to quickly
find tuples that have been previously cached.
For certain data sets, this can significantly improve the performance of
joins. The best cases for using this new node type are for join problems
where a large portion of the tuples from the inner side of the join have
no join partner on the outer side of the join. In such cases, hash join
would have to hash values that are never looked up, thus bloating the hash
table and possibly causing it to multi-batch. Merge joins would have to
skip over all of the unmatched rows. If we use a nested loop join with a
result cache, then we only cache tuples that have at least one join
partner on the outer side of the join. The benefits of using a
parameterized nested loop with a result cache increase when there are
fewer distinct values being looked up and the number of lookups of each
value is large. Also, hash probes to look up the cache can be much faster
than the hash probe in a hash join as it's common that the result cache's
hash table is much smaller than the hash join's due to result cache only
caching useful tuples rather than all tuples from the inner side of the
join. This variation in hash probe performance is more significant when
the hash join's hash table no longer fits into the CPU's L3 cache, but the
result cache's hash table does. The apparent "random" access of hash
buckets with each hash probe can cause a poor L3 cache hit ratio for large
hash tables. Smaller hash tables generally perform better.
The hash table used for the cache limits itself to not exceeding work_mem
* hash_mem_multiplier in size. We maintain a dlist of keys for this cache
and when we're adding new tuples and realize we've exceeded the memory
budget, we evict cache entries starting with the least recently used ones
until we have enough memory to add the new tuples to the cache.
For parameterized nested loop joins, we now consider using one of these
result cache nodes in between the nested loop node and its inner node. We
determine when this might be useful based on cost, which is primarily
driven off of what the expected cache hit ratio will be. Estimating the
cache hit ratio relies on having good distinct estimates on the nested
loop's parameters.
For now, the planner will only consider using a result cache for
parameterized nested loop joins. This works for both normal joins and
also for LATERAL type joins to subqueries. It is possible to use this new
node for other uses in the future. For example, to cache results from
correlated subqueries. However, that's not done here due to some
difficulties obtaining a distinct estimation on the outer plan to
calculate the estimated cache hit ratio. Currently we plan the inner plan
before planning the outer plan so there is no good way to know if a result
cache would be useful or not since we can't estimate the number of times
the subplan will be called until the outer plan is generated.
The functionality being added here is newly introducing a dependency on
the return value of estimate_num_groups() during the join search.
Previously, during the join search, we only ever needed to perform
selectivity estimations. With this commit, we need to use
estimate_num_groups() in order to estimate what the hit ratio on the
result cache will be. In simple terms, if we expect 10 distinct values
and we expect 1000 outer rows, then we'll estimate the hit ratio to be
99%. Since cache hits are very cheap compared to scanning the underlying
nodes on the inner side of the nested loop join, then this will
significantly reduce the planner's cost for the join. However, it's
fairly easy to see here that things will go bad when estimate_num_groups()
incorrectly returns a value that's significantly lower than the actual
number of distinct values. If this happens then that may cause us to make
use of a nested loop join with a result cache instead of some other join
type, such as a merge or hash join. Our distinct estimations have been
known to be a source of trouble in the past, so the extra reliance on them
here could cause the planner to choose slower plans than it did previous
to having this feature. Distinct estimations are also fairly hard to
estimate accurately when several tables have been joined already or when a
WHERE clause filters out a set of values that are correlated to the
expressions we're estimating the number of distinct values for.
For now, the costing we perform during query planning for result caches
does put quite a bit of faith in the distinct estimations being accurate.
When these are accurate then we should generally see faster execution
times for plans containing a result cache. However, in the real world, we
may find that we need to either change the costings to put less trust in
the distinct estimations being accurate or perhaps even disable this
feature by default. There's always an element of risk when we teach the
query planner to do new tricks that it decides to use that new trick at
the wrong time and causes a regression. Users may opt to get the old
behavior by turning the feature off using the enable_resultcache GUC.
Currently, this is enabled by default. It remains to be seen if we'll
maintain that setting for the release.
Additionally, the name "Result Cache" is the best name I could think of
for this new node at the time I started writing the patch. Nobody seems
to strongly dislike the name. A few people did suggest other names but no
other name seemed to dominate in the brief discussion that there was about
names. Let's allow the beta period to see if the current name pleases
enough people. If there's some consensus on a better name, then we can
change it before the release. Please see the 2nd discussion link below
for the discussion on the "Result Cache" name.
Author: David Rowley
Reviewed-by: Andy Fan, Justin Pryzby, Zhihong Yu, Hou Zhijie
Tested-By: Konstantin Knizhnik
Discussion: https://postgr.es/m/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvq=yQXr5kqhRviT2RhNKwToaWr9JAN5t+5_PzhuRJ3wvg@mail.gmail.com
2021-04-02 03:10:56 +02:00
|
|
|
<indexterm>
|
2021-07-14 02:43:58 +02:00
|
|
|
<primary><varname>enable_memoize</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2021-07-14 02:43:58 +02:00
|
|
|
Enables or disables the query planner's use of memoize plans for
|
|
|
|
caching results from parameterized scans inside nested-loop joins.
|
|
|
|
This plan type allows scans of the underlying plans to be skipped when
|
|
|
|
the results for the current parameters are already in the cache. Less
|
|
|
|
commonly looked up results may be evicted from the cache when more
|
|
|
|
space is required for new entries. The default is
|
|
|
|
<literal>on</literal>.
|
|
|
|
</para>
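<para>
A minimal sketch of turning memoization off for a single transaction, for
example to check whether a plan improves without it. The tables
<literal>orders</literal> and <literal>customers</literal> are hypothetical,
and the plan shown by <command>EXPLAIN</command> will depend on the data and
statistics.
</para>
<programlisting>
BEGIN;
-- SET LOCAL lasts only until the end of the current transaction.
SET LOCAL enable_memoize = off;
EXPLAIN SELECT *
  FROM orders o
  JOIN customers c ON c.id = o.customer_id;
COMMIT;

-- Outside the transaction the previous value applies again.
SHOW enable_memoize;
</programlisting>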
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-enable-mergejoin" xreflabel="enable_mergejoin">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>enable_mergejoin</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>enable_mergejoin</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of merge-join plan
|
2017-10-09 03:44:17 +02:00
|
|
|
types. The default is <literal>on</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-enable-nestloop" xreflabel="enable_nestloop">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>enable_nestloop</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>enable_nestloop</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of nested-loop join
|
2010-02-03 18:25:06 +01:00
|
|
|
plans. It is impossible to suppress nested-loop joins entirely,
|
2005-09-13 00:11:38 +02:00
|
|
|
but turning this variable off discourages the planner from using
|
|
|
|
one if there are other methods available. The default is
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>on</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
Support Parallel Append plan nodes.
When we create an Append node, we can spread out the workers over the
subplans instead of piling on to each subplan one at a time, which
should typically be a bit more efficient, both because the startup
cost of any plan executed entirely by one worker is paid only once and
also because of reduced contention. We can also construct Append
plans using a mix of partial and non-partial subplans, which may allow
for parallelism in places that otherwise couldn't support it.
Unfortunately, this patch doesn't handle the important case of
parallelizing UNION ALL by running each branch in a separate worker;
the executor infrastructure is added here, but more planner work is
needed.
Amit Khandekar, Robert Haas, Amul Sul, reviewed and tested by
Ashutosh Bapat, Amit Langote, Rafia Sabih, Amit Kapila, and
Rajkumar Raghuwanshi.
Discussion: http://postgr.es/m/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com
2017-12-05 23:28:39 +01:00
|
|
|
<varlistentry id="guc-enable-parallel-append" xreflabel="enable_parallel_append">
|
|
|
|
<term><varname>enable_parallel_append</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
2017-12-06 00:53:32 +01:00
|
|
|
<primary><varname>enable_parallel_append</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of parallel-aware
|
2017-12-06 00:53:32 +01:00
|
|
|
append plan types. The default is <literal>on</literal>.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
Add parallel-aware hash joins.
Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel
Hash Join with Parallel Hash. While hash joins could already appear in
parallel queries, they were previously always parallel-oblivious and had a
partial subplan only on the outer side, meaning that the work of the inner
subplan was duplicated in every worker.
After this commit, the planner will consider using a partial subplan on the
inner side too, using the Parallel Hash node to divide the work over the
available CPU cores and combine its results in shared memory. If the join
needs to be split into multiple batches in order to respect work_mem, then
workers process different batches as much as possible and then work together
on the remaining batches.
The advantages of a parallel-aware hash join over a parallel-oblivious hash
join used in a parallel query are that it:
* avoids wasting memory on duplicated hash tables
* avoids wasting disk space on duplicated batch files
* divides the work of building the hash table over the CPUs
One disadvantage is that there is some communication between the participating
CPUs which might outweigh the benefits of parallelism in the case of small
hash tables. This is avoided by the planner's existing reluctance to supply
partial plans for small scans, but it may be necessary to estimate
synchronization costs in future if that situation changes. Another is that
outer batch 0 must be written to disk if multiple batches are required.
A potential future advantage of parallel-aware hash joins is that right and
full outer joins could be supported, since there is a single set of matched
bits for each hashtable, but that is not yet implemented.
A new GUC enable_parallel_hash is defined to control the feature, defaulting
to on.
Author: Thomas Munro
Reviewed-By: Andres Freund, Robert Haas
Tested-By: Rafia Sabih, Prabhat Sahu
Discussion:
https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com
2017-12-21 08:39:21 +01:00
|
|
|
<varlistentry id="guc-enable-parallel-hash" xreflabel="enable_parallel_hash">
|
|
|
|
<term><varname>enable_parallel_hash</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>enable_parallel_hash</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of hash-join plan
|
|
|
|
types with parallel hash. Has no effect if hash-join plans are not
|
|
|
|
also enabled. The default is <literal>on</literal>.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2018-04-23 22:57:43 +02:00
|
|
|
<varlistentry id="guc-enable-partition-pruning" xreflabel="enable_partition_pruning">
|
|
|
|
<term><varname>enable_partition_pruning</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>enable_partition_pruning</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's ability to eliminate a
|
|
|
|
partitioned table's partitions from query plans. This also controls
|
|
|
|
the planner's ability to generate query plans which allow the query
|
|
|
|
executor to remove (ignore) partitions during query execution. The
|
|
|
|
default is <literal>on</literal>.
|
2018-05-12 17:08:17 +02:00
|
|
|
See <xref linkend="ddl-partition-pruning"/> for details.
|
2018-04-23 22:57:43 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2018-02-16 16:33:59 +01:00
|
|
|
<varlistentry id="guc-enable-partitionwise-join" xreflabel="enable_partitionwise_join">
|
|
|
|
<term><varname>enable_partitionwise_join</varname> (<type>boolean</type>)
|
Basic partition-wise join functionality.
Instead of joining two partitioned tables in their entirety we can, if
it is an equi-join on the partition keys, join the matching partitions
individually. This involves teaching the planner about "other join"
rels, which are related to regular join rels in the same way that
other member rels are related to baserels. This can use significantly
more CPU time and memory than regular join planning, because there may
now be a set of "other" rels not only for every base relation but also
for every join relation. In most practical cases, this probably
shouldn't be a problem, because (1) it's probably unusual to join many
tables each with many partitions using the partition keys for all
joins and (2) if you do that scenario then you probably have a big
enough machine to handle the increased memory cost of planning and (3)
the resulting plan is highly likely to be better, so what you spend in
planning you'll make up on the execution side. All the same, for now,
turn this feature off by default.
Currently, we can only perform joins between two tables whose
partitioning schemes are absolutely identical. It would be nice to
cope with other scenarios, such as extra partitions on one side or the
other with no match on the other side, but that will have to wait for
a future patch.
Ashutosh Bapat, reviewed and tested by Rajkumar Raghuwanshi, Amit
Langote, Rafia Sabih, Thomas Munro, Dilip Kumar, Antonin Houska, Amit
Khandekar, and by me. A few final adjustments by me.
Discussion: http://postgr.es/m/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com
Discussion: http://postgr.es/m/CAFjFpRcitjfrULr5jfuKWRPsGUX0LQ0k8-yG0Qw2+1LBGNpMdw@mail.gmail.com
2017-10-06 17:11:10 +02:00
|
|
|
<indexterm>
|
2018-02-16 16:33:59 +01:00
|
|
|
<primary><varname>enable_partitionwise_join</varname> configuration parameter</primary>
|
2017-10-06 17:11:10 +02:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-02-16 16:33:59 +01:00
|
|
|
Enables or disables the query planner's use of partitionwise join,
|
2017-10-06 17:11:10 +02:00
|
|
|
which allows a join between partitioned tables to be performed by
|
2018-02-16 16:33:59 +01:00
|
|
|
joining the matching partitions. Partitionwise join currently applies
|
2017-10-06 17:11:10 +02:00
|
|
|
only when the join conditions include all the partition keys, which
|
Allow partitionwise joins in more cases.
Previously, the partitionwise join technique only allowed partitionwise
join when input partitioned tables had exactly the same partition
bounds. This commit extends the technique to some cases when the tables
have different partition bounds, by using an advanced partition-matching
algorithm introduced by this commit. For both the input partitioned
tables, the algorithm checks whether every partition of one input
partitioned table only matches one partition of the other input
partitioned table at most, and vice versa. In such a case the join
between the tables can be broken down into joins between the matching
partitions, so the algorithm produces the pairs of the matching
partitions, plus the partition bounds for the join relation, to allow
partitionwise join for computing the join. Currently, the algorithm
works for list-partitioned and range-partitioned tables, but not
hash-partitioned tables. See comments in partition_bounds_merge().
Ashutosh Bapat and Etsuro Fujita, most of regression tests by Rajkumar
Raghuwanshi, some of the tests by Mark Dilger and Amul Sul, reviewed by
Dmitry Dolgov and Amul Sul, with additional review at various points by
Ashutosh Bapat, Mark Dilger, Robert Haas, Antonin Houska, Amit Langote,
Justin Pryzby, and Tomas Vondra
Discussion: https://postgr.es/m/CAFjFpRdjQvaUEV5DJX3TW6pU5eq54NCkadtxHX2JiJG_GvbrCA@mail.gmail.com
2020-04-08 03:25:00 +02:00
|
|
|
must be of the same data type and have one-to-one matching sets of
|
|
|
|
child partitions. Because partitionwise join planning can use
|
|
|
|
significantly more CPU time and memory during planning, the default is
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>off</literal>.
|
2017-10-06 17:11:10 +02:00
|
|
|
</para>
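<para>
A minimal sketch of how this setting might be tried in a session. The
tables <literal>t1</literal> and <literal>t2</literal> are hypothetical and
are assumed to be partitioned identically on <literal>id</literal>:
</para>
<programlisting>
-- Hypothetical tables t1 and t2, both partitioned the same way on id.
SET enable_partitionwise_join = on;
EXPLAIN (COSTS OFF)
SELECT * FROM t1 JOIN t2 ON t1.id = t2.id;
</programlisting>
<para>
If partitionwise join is chosen, the plan would be expected to show an
Append over per-partition joins rather than a single join over two Appends.
</para>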
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
Implement partition-wise grouping/aggregation.
If the partition keys of input relation are part of the GROUP BY
clause, all the rows belonging to a given group come from a single
partition. This allows aggregation/grouping over a partitioned
relation to be broken down into aggregation/grouping on each
partition. This should be no worse, and often better, than the normal
approach.
If the GROUP BY clause does not contain all the partition keys, we can
still perform partial aggregation for each partition and then finalize
aggregation after appending the partial results. This is less certain
to be a win, but it's still useful.
Jeevan Chalke, Ashutosh Bapat, Robert Haas. The larger patch series
of which this patch is a part was also reviewed and tested by Antonin
Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin
Knizhnik, Pascal Legrand, and Rafia Sabih.
Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
2018-03-22 17:49:48 +01:00
|
|
|
<varlistentry id="guc-enable-partitionwise-aggregate" xreflabel="enable_partitionwise_aggregate">
|
|
|
|
<term><varname>enable_partitionwise_aggregate</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>enable_partitionwise_aggregate</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of partitionwise grouping
|
|
|
|
or aggregation, which allows grouping or aggregation on partitioned
|
|
|
|
tables to be performed separately for each partition. If the <literal>GROUP
|
|
|
|
BY</literal> clause does not include the partition keys, only partial
|
|
|
|
aggregation can be performed on a per-partition basis, and
|
|
|
|
finalization must be performed later. Because partitionwise grouping
|
|
|
|
or aggregation can use significantly more CPU time and memory during
|
|
|
|
planning, the default is <literal>off</literal>.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-enable-seqscan" xreflabel="enable_seqscan">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>enable_seqscan</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
|
|
|
<primary>sequential scan</primary>
|
|
|
|
</indexterm>
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>enable_seqscan</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of sequential scan
|
2010-02-03 18:25:06 +01:00
|
|
|
plan types. It is impossible to suppress sequential scans
|
2005-09-13 00:11:38 +02:00
|
|
|
entirely, but turning this variable off discourages the planner
|
|
|
|
from using one if there are other methods available. The
|
2017-10-09 03:44:17 +02:00
|
|
|
default is <literal>on</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
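<para>
As an illustration, the setting can be turned off for a single session to
see which plan the planner falls back to. The table
<literal>orders</literal> and column <literal>customer_id</literal> are
hypothetical:
</para>
<programlisting>
-- Discourage sequential scans for this session only.
SET enable_seqscan = off;
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- Restore the default afterwards.
RESET enable_seqscan;
</programlisting>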
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-enable-sort" xreflabel="enable_sort">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>enable_sort</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>enable_sort</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables or disables the query planner's use of explicit sort
|
2010-02-03 18:25:06 +01:00
|
|
|
steps. It is impossible to suppress explicit sorts entirely,
|
2005-09-13 00:11:38 +02:00
|
|
|
but turning this variable off discourages the planner from
|
|
|
|
using one if there are other methods available. The default
|
2017-10-09 03:44:17 +02:00
|
|
|
is <literal>on</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-enable-tidscan" xreflabel="enable_tidscan">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>enable_tidscan</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>enable_tidscan</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Enables or disables the query planner's use of <acronym>TID</acronym>
|
|
|
|
scan plan types. The default is <literal>on</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2010-11-23 21:27:50 +01:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="runtime-config-query-constants">
|
2010-04-03 09:23:02 +02:00
|
|
|
<title>Planner Cost Constants</title>
|
2005-09-13 00:11:38 +02:00
|
|
|
|
2006-06-05 04:49:58 +02:00
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The <firstterm>cost</firstterm> variables described in this section are measured
|
2006-06-05 04:49:58 +02:00
|
|
|
on an arbitrary scale. Only their relative values matter, hence
|
|
|
|
scaling them all up or down by the same factor will result in no change
|
2010-02-03 18:25:06 +01:00
|
|
|
in the planner's choices. By default, these cost variables are based on
|
|
|
|
the cost of sequential page fetches; that is,
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>seq_page_cost</varname> is conventionally set to <literal>1.0</literal>
|
2006-06-05 04:49:58 +02:00
|
|
|
and the other cost variables are set with reference to that. But
|
|
|
|
you can use a different scale if you prefer, such as actual execution
|
|
|
|
times in milliseconds on a particular machine.
|
|
|
|
</para>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<note>
|
|
|
|
<para>
|
2006-06-05 04:49:58 +02:00
|
|
|
Unfortunately, there is no well-defined method for determining ideal
|
|
|
|
values for the cost variables. They are best treated as averages over
|
2010-02-03 18:25:06 +01:00
|
|
|
the entire mix of queries that a particular installation will receive. This
|
2006-06-05 04:49:58 +02:00
|
|
|
means that changing them on the basis of just a few experiments is very
|
|
|
|
risky.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</note>
|
|
|
|
|
|
|
|
<variablelist>
|
2006-06-05 04:49:58 +02:00
|
|
|
|
|
|
|
<varlistentry id="guc-seq-page-cost" xreflabel="seq_page_cost">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>seq_page_cost</varname> (<type>floating point</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>seq_page_cost</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2006-06-05 04:49:58 +02:00
|
|
|
Sets the planner's estimate of the cost of a disk page fetch
|
|
|
|
that is part of a series of sequential fetches. The default is 1.0.
|
2011-09-29 01:39:54 +02:00
|
|
|
This value can be overridden for tables and indexes in a particular
|
|
|
|
tablespace by setting the tablespace parameter of the same name
|
2017-11-23 15:39:47 +01:00
|
|
|
(see <xref linkend="sql-altertablespace"/>).
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
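<para>
A sketch of the per-tablespace override mentioned above. The tablespace
name <literal>fast_ssd</literal> and the value shown are illustrative
assumptions, not recommendations:
</para>
<programlisting>
-- Override the planner's sequential page cost for one tablespace.
ALTER TABLESPACE fast_ssd SET (seq_page_cost = 0.5);
-- Return to the server-wide setting.
ALTER TABLESPACE fast_ssd RESET (seq_page_cost);
</programlisting>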
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-random-page-cost" xreflabel="random_page_cost">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>random_page_cost</varname> (<type>floating point</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>random_page_cost</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the planner's estimate of the cost of a
|
2006-06-05 04:49:58 +02:00
|
|
|
non-sequentially-fetched disk page. The default is 4.0.
|
2011-09-29 01:39:54 +02:00
|
|
|
This value can be overridden for tables and indexes in a particular
|
|
|
|
tablespace by setting the tablespace parameter of the same name
|
2017-11-23 15:39:47 +01:00
|
|
|
(see <xref linkend="sql-altertablespace"/>).
|
2010-01-05 22:54:00 +01:00
|
|
|
</para>
|
|
|
|
|
2010-01-06 03:41:37 +01:00
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Reducing this value relative to <varname>seq_page_cost</varname>
|
2006-06-05 04:49:58 +02:00
|
|
|
will cause the system to prefer index scans; raising it will
|
|
|
|
make index scans look relatively more expensive. You can raise
|
|
|
|
or lower both values together to change the importance of disk I/O
|
|
|
|
costs relative to CPU costs, which are described by the following
|
|
|
|
parameters.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
2006-06-05 04:49:58 +02:00
|
|
|
|
2012-02-14 22:54:54 +01:00
|
|
|
<para>
|
|
|
|
Random access to mechanical disk storage is normally much more expensive
|
2013-10-08 18:12:24 +02:00
|
|
|
than four times sequential access. However, a lower default is used
|
2012-02-14 22:54:54 +01:00
|
|
|
(4.0) because the majority of random accesses to disk, such as indexed
|
|
|
|
reads, are assumed to be in cache. The default value can be thought of
|
|
|
|
as modeling random access as 40 times slower than sequential, while
|
|
|
|
expecting 90% of random reads to be cached.
|
|
|
|
</para>
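<para>
The arithmetic behind that reading can be sketched as follows, under the
simplifying assumption that a cached read costs nothing. With 10% of random
reads missing the cache at 40 times the sequential cost, the expected cost
works out to the default of 4.0:
</para>
<programlisting>
-- Simplified model: cache misses cost 40 sequential fetches, hits are free.
SELECT (1 - 0.90) * 40 AS modeled_random_page_cost;
</programlisting>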
|
|
|
|
|
|
|
|
<para>
|
|
|
|
If you believe a 90% cache rate is an incorrect assumption
|
|
|
|
for your workload, you can increase <varname>random_page_cost</varname> to better
|
|
|
|
reflect the true cost of random storage reads. Correspondingly,
|
|
|
|
if your data is likely to be completely in cache, such as when
|
|
|
|
the database is smaller than the total server memory, decreasing
|
|
|
|
<varname>random_page_cost</varname> can be appropriate. Storage that has a low random
|
2020-09-01 00:33:37 +02:00
|
|
|
read cost relative to sequential, e.g., solid-state drives, might
|
2020-05-22 02:28:38 +02:00
|
|
|
also be better modeled with a lower value for <varname>random_page_cost</varname>,
|
|
|
|
e.g., <literal>1.1</literal>.
|
2012-02-14 22:54:54 +01:00
|
|
|
</para>
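<para>
One way such a change might be tried and then persisted is sketched below.
The value <literal>1.1</literal> is the example given above, and whether it
suits a particular installation is an assumption to verify against the
actual workload:
</para>
<programlisting>
-- Try the value in the current session first.
SET random_page_cost = 1.1;
-- To persist it server-wide (requires appropriate privileges), then reload.
ALTER SYSTEM SET random_page_cost = 1.1;
SELECT pg_reload_conf();
</programlisting>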
|
|
|
|
|
2006-06-05 04:49:58 +02:00
|
|
|
<tip>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Although the system will let you set <varname>random_page_cost</varname> to
|
|
|
|
less than <varname>seq_page_cost</varname>, it is not physically sensible
|
2006-06-05 04:49:58 +02:00
|
|
|
to do so. However, setting them equal makes sense if the database
|
|
|
|
is entirely cached in RAM, since in that case there is no penalty
|
|
|
|
for touching pages out of sequence. Also, in a heavily-cached
|
|
|
|
database you should lower both values relative to the CPU parameters,
|
|
|
|
since the cost of fetching a page already in RAM is much smaller
|
|
|
|
than it would normally be.
|
|
|
|
</para>
|
|
|
|
</tip>
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-cpu-tuple-cost" xreflabel="cpu_tuple_cost">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>cpu_tuple_cost</varname> (<type>floating point</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>cpu_tuple_cost</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the planner's estimate of the cost of processing
|
2006-06-05 04:49:58 +02:00
|
|
|
each row during a query.
|
|
|
|
The default is 0.01.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-cpu-index-tuple-cost" xreflabel="cpu_index_tuple_cost">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>cpu_index_tuple_cost</varname> (<type>floating point</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>cpu_index_tuple_cost</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the planner's estimate of the cost of processing
|
2006-06-05 04:49:58 +02:00
|
|
|
each index entry during an index scan.
|
2006-06-05 05:03:42 +02:00
|
|
|
The default is 0.005.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2010-11-23 21:27:50 +01:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-cpu-operator-cost" xreflabel="cpu_operator_cost">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>cpu_operator_cost</varname> (<type>floating point</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>cpu_operator_cost</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the planner's estimate of the cost of processing each
|
2006-06-05 04:49:58 +02:00
|
|
|
operator or function executed during a query.
|
|
|
|
The default is 0.0025.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2010-11-23 21:27:50 +01:00
|
|
|
|
2016-05-05 19:27:59 +02:00
|
|
|
<varlistentry id="guc-parallel-setup-cost" xreflabel="parallel_setup_cost">
|
|
|
|
<term><varname>parallel_setup_cost</varname> (<type>floating point</type>)
|
Add a Gather executor node.
A Gather executor node runs any number of copies of a plan in an equal
number of workers and merges all of the results into a single tuple
stream. It can also run the plan itself, if the workers are
unavailable or haven't started up yet. It is intended to work with
the Partial Seq Scan node which will be added in future commits.
It could also be used to implement parallel query of a different sort
by itself, without help from Partial Seq Scan, if the single_copy mode
is used. In that mode, a worker executes the plan, and the parallel
leader does not, merely collecting the worker's results. So, a Gather
node could be inserted into a plan to split the execution of that plan
across two processes. Nested Gather nodes aren't currently supported,
but we might want to add support for that in the future.
There's nothing in the planner to actually generate Gather nodes yet,
so it's not quite time to break out the champagne. But we're getting
close.
Amit Kapila. Some designs suggestions were provided by me, and I also
reviewed the patch. Single-copy mode, documentation, and other minor
changes also by me.
2015-10-01 01:23:36 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>parallel_setup_cost</varname> configuration parameter</primary>
|
2015-10-01 01:23:36 +02:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2016-05-05 19:27:59 +02:00
|
|
|
Sets the planner's estimate of the cost of launching parallel worker
|
|
|
|
processes.
|
|
|
|
The default is 1000.
|
2015-10-01 01:23:36 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2016-05-05 19:27:59 +02:00
|
|
|
<varlistentry id="guc-parallel-tuple-cost" xreflabel="parallel_tuple_cost">
|
|
|
|
<term><varname>parallel_tuple_cost</varname> (<type>floating point</type>)
|
2015-10-01 01:23:36 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>parallel_tuple_cost</varname> configuration parameter</primary>
|
2015-10-01 01:23:36 +02:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2016-05-05 19:27:59 +02:00
|
|
|
Sets the planner's estimate of the cost of transferring one tuple
|
|
|
|
from a parallel worker process to another process.
|
|
|
|
The default is 0.1.
|
Add a Gather executor node.
A Gather executor node runs any number of copies of a plan in an equal
number of workers and merges all of the results into a single tuple
stream. It can also run the plan itself, if the workers are
unavailable or haven't started up yet. It is intended to work with
the Partial Seq Scan node which will be added in future commits.
It could also be used to implement parallel query of a different sort
by itself, without help from Partial Seq Scan, if the single_copy mode
is used. In that mode, a worker executes the plan, and the parallel
leader does not, merely collecting the worker's results. So, a Gather
node could be inserted into a plan to split the execution of that plan
across two processes. Nested Gather nodes aren't currently supported,
but we might want to add support for that in the future.
There's nothing in the planner to actually generate Gather nodes yet,
so it's not quite time to break out the champagne. But we're getting
close.
Amit Kapila. Some designs suggestions were provided by me, and I also
reviewed the patch. Single-copy mode, documentation, and other minor
changes also by me.
2015-10-01 01:23:36 +02:00
|
|
|
</para>
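<para>
As a rough sketch of how this setting might be explored in a session,
the plan for a query can be compared before and after raising the value.
The table <literal>measurements</literal> and its column
<literal>reading</literal> are hypothetical, and the value shown is an
illustration rather than a recommendation:
</para>
<programlisting>
-- Hypothetical table; compare the plan before and after the change.
EXPLAIN SELECT * FROM measurements WHERE reading &gt; 100;
SET parallel_tuple_cost = 0.5;
EXPLAIN SELECT * FROM measurements WHERE reading &gt; 100;
RESET parallel_tuple_cost;
</programlisting>
<para>
Since the cost is charged per transferred tuple, a higher value would be
expected to make parallel plans that return many rows from workers look
more expensive to the planner.
</para>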
</listitem>
</varlistentry>

<varlistentry id="guc-min-parallel-table-scan-size" xreflabel="min_parallel_table_scan_size">
<term><varname>min_parallel_table_scan_size</varname> (<type>integer</type>)
<indexterm>
<primary><varname>min_parallel_table_scan_size</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the minimum amount of table data that must be scanned in order
for a parallel scan to be considered. For a parallel sequential scan,
the amount of table data scanned is always equal to the size of the
table, but when indexes are used the amount of table data
scanned will normally be less.
If this value is specified without units, it is taken as blocks,
that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
The default is 8 megabytes (<literal>8MB</literal>).
</para>
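<para>
To illustrate the unit handling described above, the following sketch
assumes the typical block size of 8kB. The value of 16MB is chosen only
for illustration:
</para>
<programlisting>
-- With units: 16 megabytes.
SET min_parallel_table_scan_size = '16MB';
-- Without units the value is taken as blocks: 2048 blocks * 8kB = 16MB.
SET min_parallel_table_scan_size = 2048;
SHOW min_parallel_table_scan_size;
</programlisting>
<para>
Under that block-size assumption, both <command>SET</command> commands
request the same threshold.
</para>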
</listitem>
</varlistentry>

<varlistentry id="guc-min-parallel-index-scan-size" xreflabel="min_parallel_index_scan_size">
<term><varname>min_parallel_index_scan_size</varname> (<type>integer</type>)
<indexterm>
<primary><varname>min_parallel_index_scan_size</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the minimum amount of index data that must be scanned in order
for a parallel scan to be considered. Note that a parallel index scan
typically won't touch the entire index; what matters is the number of
pages that the planner believes the scan will actually touch.
This parameter is also used to decide whether a
particular index can participate in a parallel vacuum. See
<xref linkend="sql-vacuum"/>.
If this value is specified without units, it is taken as blocks,
that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
The default is 512 kilobytes (<literal>512kB</literal>).
</para>
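<para>
The effect on parallel vacuum can be sketched as follows. The table name
<literal>measurements</literal> is hypothetical, the threshold is an
illustrative value, and the block arithmetic assumes the typical 8kB
block size:
</para>
<programlisting>
-- 1MB is the same as 128 blocks when BLCKSZ is 8kB (128 * 8kB = 1024kB).
SET min_parallel_index_scan_size = '1MB';
-- Request up to two parallel workers for index vacuuming of a
-- hypothetical table.
VACUUM (PARALLEL 2) measurements;
RESET min_parallel_index_scan_size;
</programlisting>
<para>
With this setting, indexes on the table that are not larger than the
threshold would not be expected to take part in the parallel phase.
</para>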
</listitem>
</varlistentry>

<varlistentry id="guc-effective-cache-size" xreflabel="effective_cache_size">
<term><varname>effective_cache_size</varname> (<type>integer</type>)
<indexterm>
<primary><varname>effective_cache_size</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the planner's assumption about the effective size of the
disk cache that is available to a single query. This is
factored into estimates of the cost of using an index; a
higher value makes it more likely that index scans will be used, while a
lower value makes it more likely that sequential scans will be
used. When setting this parameter you should consider both
<productname>PostgreSQL</productname>'s shared buffers and the
portion of the kernel's disk cache that will be used for
<productname>PostgreSQL</productname> data files, though some
data might exist in both places. Also, take
into account the expected number of concurrent queries on different
tables, since they will have to share the available
space. This parameter has no effect on the size of shared
memory allocated by <productname>PostgreSQL</productname>, nor
does it reserve kernel disk cache; it is used only for estimation
purposes. The system also does not assume data remains in
the disk cache between queries.
If this value is specified without units, it is taken as blocks,
that is <symbol>BLCKSZ</symbol> bytes, typically 8kB.
The default is 4 gigabytes (<literal>4GB</literal>).
(If <symbol>BLCKSZ</symbol> is not 8kB, the default value scales
proportionally to it.)
</para>
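<para>
As a minimal sketch of the reasoning above, assume a server where
<varname>shared_buffers</varname> is 4GB and about 8GB of the kernel's disk
cache is expected to hold <productname>PostgreSQL</productname> data files
(both figures are assumptions chosen for illustration). Adding the two
gives a starting estimate, which could be tried in a single session before
changing the server-wide configuration:
</para>
<programlisting>
-- Assumed: 4GB shared buffers + roughly 8GB kernel disk cache.
SET effective_cache_size = '12GB';
SHOW effective_cache_size;
</programlisting>
<para>
Because some data may be cached in both places and concurrent queries share
the space, a figure lower than the simple sum may be more realistic.
</para>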
</listitem>
</varlistentry>

<varlistentry id="guc-jit-above-cost" xreflabel="jit_above_cost">
<term><varname>jit_above_cost</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>jit_above_cost</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the query cost above which JIT compilation is activated, if
enabled (see <xref linkend="jit"/>).
Performing <acronym>JIT</acronym> costs planning time but can
accelerate query execution.
Setting this to <literal>-1</literal> disables JIT compilation.
The default is <literal>100000</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-jit-inline-above-cost" xreflabel="jit_inline_above_cost">
<term><varname>jit_inline_above_cost</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>jit_inline_above_cost</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the query cost above which JIT compilation attempts to inline
functions and operators. Inlining adds planning time, but can
improve execution speed. It is not meaningful to set this to less
than <varname>jit_above_cost</varname>.
Setting this to <literal>-1</literal> disables inlining.
The default is <literal>500000</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-jit-optimize-above-cost" xreflabel="jit_optimize_above_cost">
<term><varname>jit_optimize_above_cost</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>jit_optimize_above_cost</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the query cost above which JIT compilation applies expensive
optimizations. Such optimization adds planning time, but can improve
execution speed. It is not meaningful to set this to less
than <varname>jit_above_cost</varname>, and it is unlikely to be
beneficial to set it to more
than <varname>jit_inline_above_cost</varname>.
Setting this to <literal>-1</literal> disables expensive optimizations.
The default is <literal>500000</literal>.
</para>
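<para>
The relationship between the three JIT cost thresholds can be sketched as
follows. The numbers are illustrative only and are not tuning advice:
</para>
<programlisting>
SET jit_above_cost = 200000;           -- JIT is considered above this cost
SET jit_optimize_above_cost = 600000;  -- not less than jit_above_cost
SET jit_inline_above_cost = 600000;    -- not less than jit_optimize_above_cost
-- Setting a threshold to -1 disables the corresponding step, for example:
SET jit_inline_above_cost = -1;
</programlisting>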
</listitem>
</varlistentry>

</variablelist>
</sect2>
<sect2 id="runtime-config-query-geqo">
<title>Genetic Query Optimizer</title>

<para>
The genetic query optimizer (GEQO) is an algorithm that does query
planning using heuristic searching. This reduces planning time for
complex queries (those joining many relations), at the cost of producing
plans that are sometimes inferior to those found by the normal
exhaustive-search algorithm.
For more information see <xref linkend="geqo"/>.
</para>

<variablelist>

<varlistentry id="guc-geqo" xreflabel="geqo">
<term><varname>geqo</varname> (<type>boolean</type>)
<indexterm>
<primary>genetic query optimization</primary>
</indexterm>
<indexterm>
<primary>GEQO</primary>
<see>genetic query optimization</see>
</indexterm>
<indexterm>
<primary><varname>geqo</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables or disables genetic query optimization.
This is on by default. It is usually best not to turn it off in
production; the <varname>geqo_threshold</varname> variable provides
more granular control of GEQO.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-geqo-threshold" xreflabel="geqo_threshold">
<term><varname>geqo_threshold</varname> (<type>integer</type>)
<indexterm>
<primary><varname>geqo_threshold</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Use genetic query optimization to plan queries with at least
this many <literal>FROM</literal> items involved. (Note that a
<literal>FULL OUTER JOIN</literal> construct counts as only one <literal>FROM</literal>
item.) The default is 12. For simpler queries it is usually best
to use the regular, exhaustive-search planner, but for queries with
many tables the exhaustive search takes too long, often
longer than the penalty of executing a suboptimal plan. Thus,
a threshold on the size of the query is a convenient way to manage
use of GEQO.
</para>
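<para>
As a small sketch of how the threshold might be adjusted for one session
(the value 14 is only an example), raising it keeps the exhaustive-search
planner in use for somewhat larger queries:
</para>
<programlisting>
-- Queries with 14 or more FROM items use GEQO; smaller ones do not.
SET geqo_threshold = 14;
SHOW geqo_threshold;
RESET geqo_threshold;
</programlisting>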
</listitem>
</varlistentry>

<varlistentry id="guc-geqo-effort" xreflabel="geqo_effort">
<term><varname>geqo_effort</varname> (<type>integer</type>)
<indexterm>
<primary><varname>geqo_effort</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls the trade-off between planning time and query plan
quality in GEQO. This variable must be an integer in the
range from 1 to 10. The default value is five. Larger values
increase the time spent doing query planning, but also
increase the likelihood that an efficient query plan will be
chosen.
</para>

<para>
<varname>geqo_effort</varname> doesn't actually do anything
directly; it is only used to compute the default values for
the other variables that influence GEQO behavior (described
below). If you prefer, you can set the other parameters by
hand instead.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-geqo-pool-size" xreflabel="geqo_pool_size">
<term><varname>geqo_pool_size</varname> (<type>integer</type>)
<indexterm>
<primary><varname>geqo_pool_size</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls the pool size used by GEQO, that is, the
number of individuals in the genetic population. It must be
at least two, and useful values are typically 100 to 1000. If
it is set to zero (the default setting) then a suitable
value is chosen based on <varname>geqo_effort</varname> and
the number of tables in the query.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-geqo-generations" xreflabel="geqo_generations">
<term><varname>geqo_generations</varname> (<type>integer</type>)
<indexterm>
<primary><varname>geqo_generations</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls the number of generations used by GEQO, that is,
the number of iterations of the algorithm. It must
be at least one, and useful values are in the same range as
the pool size. If it is set to zero (the default setting)
then a suitable value is chosen based on
<varname>geqo_pool_size</varname>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-geqo-selection-bias" xreflabel="geqo_selection_bias">
<term><varname>geqo_selection_bias</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>geqo_selection_bias</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls the selection bias used by GEQO. The selection bias
is the selective pressure within the population. Values can be
from 1.50 to 2.00; the latter is the default.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-geqo-seed" xreflabel="geqo_seed">
<term><varname>geqo_seed</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>geqo_seed</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls the initial value of the random number generator used
by GEQO to select random paths through the join order search space.
The value can range from zero (the default) to one. Varying the
value changes the set of join paths explored, and may result in a
better or worse best path being found.
</para>
</listitem>
</varlistentry>

</variablelist>
</sect2>
<sect2 id="runtime-config-query-other">
<title>Other Planner Options</title>

<variablelist>

<varlistentry id="guc-default-statistics-target" xreflabel="default_statistics_target">
<term><varname>default_statistics_target</varname> (<type>integer</type>)
<indexterm>
<primary><varname>default_statistics_target</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the default statistics target for table columns without
a column-specific target set via <command>ALTER TABLE
SET STATISTICS</command>. Larger values increase the time needed to
do <command>ANALYZE</command>, but might improve the quality of the
planner's estimates. The default is 100. For more information
on the use of statistics by the <productname>PostgreSQL</productname>
query planner, refer to <xref linkend="planner-stats"/>.
</para>
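<para>
The interplay between the default and a column-specific target can be
sketched as follows. The table <literal>measurements</literal> and column
<literal>sensor_id</literal> are hypothetical, and the target values are
chosen only for illustration:
</para>
<programlisting>
-- Give one column its own target; it no longer follows the default.
ALTER TABLE measurements ALTER COLUMN sensor_id SET STATISTICS 500;
-- Columns without a column-specific target use the default in effect
-- when ANALYZE runs.
SET default_statistics_target = 200;
ANALYZE measurements;
</programlisting>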
</listitem>
</varlistentry>

<varlistentry id="guc-constraint-exclusion" xreflabel="constraint_exclusion">
<term><varname>constraint_exclusion</varname> (<type>enum</type>)
<indexterm>
<primary>constraint exclusion</primary>
</indexterm>
<indexterm>
<primary><varname>constraint_exclusion</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls the query planner's use of table constraints to
optimize queries.
The allowed values of <varname>constraint_exclusion</varname> are
<literal>on</literal> (examine constraints for all tables),
<literal>off</literal> (never examine constraints), and
<literal>partition</literal> (examine constraints only for inheritance
child tables and <literal>UNION ALL</literal> subqueries).
<literal>partition</literal> is the default setting.
2019-04-30 21:03:35 +02:00
|
|
|
It is often used with traditional inheritance trees to improve
|
|
|
|
performance.
|
2010-02-03 18:25:06 +01:00
|
|
|
</para>
|
2005-09-13 00:11:38 +02:00
|
|
|
|
|
|
|
<para>
When this parameter allows it for a particular table, the planner
compares query conditions with the table's <literal>CHECK</literal>
constraints, and omits scanning tables for which the conditions
contradict the constraints. For example:

<programlisting>
CREATE TABLE parent(key integer, ...);
CREATE TABLE child1000(check (key between 1000 and 1999)) INHERITS(parent);
CREATE TABLE child2000(check (key between 2000 and 2999)) INHERITS(parent);
...
SELECT * FROM parent WHERE key = 2400;
</programlisting>

With constraint exclusion enabled, this <command>SELECT</command>
will not scan <structname>child1000</structname> at all, improving performance.
</para>
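<para>
As a rough sketch of how this can be observed, assuming the
<structname>parent</structname>, <structname>child1000</structname> and
<structname>child2000</structname> tables from the example above exist,
one can compare the plans produced under different settings:
</para>
<programlisting>
-- Session-level settings; the tables are those of the example above.
SET constraint_exclusion = off;
-- With exclusion disabled, child1000 is expected to appear in the plan.
EXPLAIN SELECT * FROM parent WHERE key = 2400;

SET constraint_exclusion = partition;
-- With exclusion enabled for inheritance children, child1000 should
-- no longer be scanned.
EXPLAIN SELECT * FROM parent WHERE key = 2400;
</programlisting>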

<para>
Currently, constraint exclusion is enabled by default
only for cases that are often used to implement table partitioning via
inheritance trees. Turning it on for all tables imposes extra
planning overhead that is quite noticeable on simple queries, and most
often yields no benefit for them. If you have no
tables that are partitioned using traditional inheritance, you might
prefer to turn it off entirely. (Note that the equivalent feature for
partitioned tables is controlled by a separate parameter,
<xref linkend="guc-enable-partition-pruning"/>.)
</para>

<para>
Refer to <xref linkend="ddl-partitioning-constraint-exclusion"/> for
more information on using constraint exclusion to implement
partitioning.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-cursor-tuple-fraction" xreflabel="cursor_tuple_fraction">
<term><varname>cursor_tuple_fraction</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>cursor_tuple_fraction</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the planner's estimate of the fraction of a cursor's rows that
will be retrieved. The default is 0.1. Smaller values of this
setting bias the planner towards using <quote>fast start</quote> plans
for cursors, which will retrieve the first few rows quickly while
perhaps taking a long time to fetch all rows. Larger values
put more emphasis on the total estimated time. At the maximum
setting of 1.0, cursors are planned exactly like regular queries,
considering only the total estimated time and not how soon the
first rows might be delivered.
</para>
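<para>
A minimal sketch of adjusting this setting for a single transaction; the
table <structname>big_table</structname> and its column
<structfield>id</structfield> are hypothetical names used only for
illustration:
</para>
<programlisting>
BEGIN;
-- Plan this cursor like a regular query, optimizing for total run time,
-- on the assumption that most of its rows will be fetched.
SET LOCAL cursor_tuple_fraction = 1.0;
DECLARE all_rows CURSOR FOR SELECT * FROM big_table ORDER BY id;
FETCH 100 FROM all_rows;
CLOSE all_rows;
COMMIT;
</programlisting>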
</listitem>
</varlistentry>

<varlistentry id="guc-from-collapse-limit" xreflabel="from_collapse_limit">
<term><varname>from_collapse_limit</varname> (<type>integer</type>)
<indexterm>
<primary><varname>from_collapse_limit</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The planner will merge sub-queries into upper queries if the
resulting <literal>FROM</literal> list would have no more than
this many items. Smaller values reduce planning time but might
yield inferior query plans. The default is eight.
For more information see <xref linkend="explicit-joins"/>.
</para>

<para>
Setting this value to <xref linkend="guc-geqo-threshold"/> or more
might trigger use of the GEQO planner, resulting in non-optimal
plans. See <xref linkend="runtime-config-query-geqo"/>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-jit" xreflabel="jit">
<term><varname>jit</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>jit</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Determines whether <acronym>JIT</acronym> compilation may be used by
<productname>PostgreSQL</productname>, if available (see <xref
linkend="jit"/>).
The default is <literal>on</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-join-collapse-limit" xreflabel="join_collapse_limit">
<term><varname>join_collapse_limit</varname> (<type>integer</type>)
<indexterm>
<primary><varname>join_collapse_limit</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The planner will rewrite explicit <literal>JOIN</literal>
constructs (except <literal>FULL JOIN</literal>s) into lists of
<literal>FROM</literal> items whenever a list of no more than this many items
would result. Smaller values reduce planning time but might
yield inferior query plans.
</para>

<para>
By default, this variable is set to the same value as
<varname>from_collapse_limit</varname>, which is appropriate
for most uses. Setting it to 1 prevents any reordering of
explicit <literal>JOIN</literal>s. Thus, the explicit join order
specified in the query will be the actual order in which the
relations are joined. Because the query planner does not always choose
the optimal join order, advanced users can elect to
temporarily set this variable to 1, and then specify the join
order they desire explicitly.
For more information see <xref linkend="explicit-joins"/>.
</para>
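<para>
For illustration, a session might pin the join order as sketched below.
The tables <structname>a</structname>, <structname>b</structname> and
<structname>c</structname> and their columns are hypothetical:
</para>
<programlisting>
-- Keep explicit JOINs in the order written for the rest of this session.
SET join_collapse_limit = 1;

-- With the setting above, b is joined to c first, and a is then joined
-- to that result, as the parentheses specify.
SELECT *
FROM a
JOIN (b JOIN c ON b.ref = c.id) ON a.id = b.id;

-- Return to the configured default afterwards.
RESET join_collapse_limit;
</programlisting>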

<para>
Setting this value to <xref linkend="guc-geqo-threshold"/> or more
might trigger use of the GEQO planner, resulting in non-optimal
plans. See <xref linkend="runtime-config-query-geqo"/>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-plan-cache_mode" xreflabel="plan_cache_mode">
<term><varname>plan_cache_mode</varname> (<type>enum</type>)
<indexterm>
<primary><varname>plan_cache_mode</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Prepared statements (either explicitly prepared or implicitly
generated, for example by PL/pgSQL) can be executed using custom or
generic plans. Custom plans are made afresh for each execution
using its specific set of parameter values, while generic plans do
not rely on the parameter values and can be re-used across
executions. Thus, use of a generic plan saves planning time, but if
the ideal plan depends strongly on the parameter values then a
generic plan might be inefficient. The choice between these options
is normally made automatically, but it can be overridden
with <varname>plan_cache_mode</varname>.
The allowed values are <literal>auto</literal> (the default),
<literal>force_custom_plan</literal> and
<literal>force_generic_plan</literal>.
This setting is considered when a cached plan is to be executed,
not when it is prepared.
For more information see <xref linkend="sql-prepare"/>.
</para>
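<para>
A brief sketch of overriding the automatic choice for one prepared
statement; the table <structname>orders</structname> and its column
<structfield>customer_id</structfield> are hypothetical:
</para>
<programlisting>
PREPARE orders_by_customer(integer) AS
    SELECT * FROM orders WHERE customer_id = $1;

-- The setting is consulted at execution time, so it can be changed
-- after the statement has been prepared.
SET plan_cache_mode = force_custom_plan;
EXECUTE orders_by_customer(42);    -- planned using the value 42

SET plan_cache_mode = force_generic_plan;
EXECUTE orders_by_customer(42);    -- uses a plan that does not depend on the value

RESET plan_cache_mode;
DEALLOCATE orders_by_customer;
</programlisting>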
</listitem>
</varlistentry>

</variablelist>
</sect2>
</sect1>

<sect1 id="runtime-config-logging">
<title>Error Reporting and Logging</title>

<indexterm zone="runtime-config-logging">
<primary>server log</primary>
</indexterm>

<sect2 id="runtime-config-logging-where">
<title>Where to Log</title>

<indexterm zone="runtime-config-logging-where">
<primary>where to log</primary>
</indexterm>

<indexterm>
<primary>current_logfiles</primary>
<secondary>and the log_destination configuration parameter</secondary>
</indexterm>

<variablelist>

<varlistentry id="guc-log-destination" xreflabel="log_destination">
<term><varname>log_destination</varname> (<type>string</type>)
<indexterm>
<primary><varname>log_destination</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
<productname>PostgreSQL</productname> supports several methods
for logging server messages, including
<systemitem>stderr</systemitem>, <systemitem>csvlog</systemitem> and
<systemitem>syslog</systemitem>. On Windows,
<systemitem>eventlog</systemitem> is also supported. Set this
parameter to a list of desired log destinations separated by
commas. The default is to log to <systemitem>stderr</systemitem>
only.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
<para>
If <systemitem>csvlog</systemitem> is included in <varname>log_destination</varname>,
log entries are output in <quote>comma-separated
value</quote> (<acronym>CSV</acronym>) format, which is convenient for
loading logs into programs.
See <xref linkend="runtime-config-logging-csvlog"/> for details.
<xref linkend="guc-logging-collector"/> must be enabled to generate
CSV-format log output.
</para>
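<para>
As an illustration only, a <filename>postgresql.conf</filename> fragment
that sends log output to both plain-text and CSV-format files could look
like the following sketch:
</para>
<programlisting>
# Write both a plain-text log and a CSV-format log.
log_destination = 'stderr,csvlog'
# CSV-format output requires the logging collector.
logging_collector = on
</programlisting>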
<para>
When either <systemitem>stderr</systemitem> or
<systemitem>csvlog</systemitem> is included, the file
<filename>current_logfiles</filename> is created to record the location
of the log file(s) currently in use by the logging collector and the
associated logging destination. This provides a convenient way to
find the logs currently in use by the instance. Here is an example of
this file's content:
<programlisting>
stderr log/postgresql.log
csvlog log/postgresql.csv
</programlisting>

<filename>current_logfiles</filename> is recreated when a new log file
is created as an effect of rotation, and
when <varname>log_destination</varname> is reloaded. It is removed when
neither <systemitem>stderr</systemitem>
nor <systemitem>csvlog</systemitem> is included
in <varname>log_destination</varname>, and when the logging collector is
disabled.
</para>

<note>
<para>
On most Unix systems, you will need to alter the configuration of
your system's <application>syslog</application> daemon in order
to make use of the <systemitem>syslog</systemitem> option for
<varname>log_destination</varname>. <productname>PostgreSQL</productname>
can log to <application>syslog</application> facilities
<literal>LOCAL0</literal> through <literal>LOCAL7</literal> (see <xref
linkend="guc-syslog-facility"/>), but the default
<application>syslog</application> configuration on most platforms
will discard all such messages. You will need to add something like:
<programlisting>
local0.* /var/log/postgresql
</programlisting>
to the <application>syslog</application> daemon's configuration file
to make it work.
</para>
<para>
On Windows, when you use the <literal>eventlog</literal>
option for <varname>log_destination</varname>, you should
register an event source and its library with the operating
system so that the Windows Event Viewer can display event
log messages cleanly.
See <xref linkend="event-log-registration"/> for details.
</para>
</note>
</listitem>
</varlistentry>

<varlistentry id="guc-logging-collector" xreflabel="logging_collector">
<term><varname>logging_collector</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>logging_collector</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This parameter enables the <firstterm>logging collector</firstterm>, which
is a background process that captures log messages
sent to <systemitem>stderr</systemitem> and redirects them into log files.
This approach is often more useful than
logging to <application>syslog</application>, since some types of messages
might not appear in <application>syslog</application> output. (One common
example is dynamic-linker failure messages; another is error messages
produced by scripts such as <varname>archive_command</varname>.)
This parameter can only be set at server start.
</para>

<note>
<para>
It is possible to log to <systemitem>stderr</systemitem> without using the
logging collector; the log messages will just go to wherever the
server's <systemitem>stderr</systemitem> is directed. However, that method is
only suitable for low log volumes, since it provides no convenient
way to rotate log files. Also, on some platforms not using the
logging collector can result in lost or garbled log output, because
multiple processes writing concurrently to the same log file can
overwrite each other's output.
</para>
</note>

<note>
<para>
The logging collector is designed to never lose messages. This means
that in case of extremely high load, server processes could be
blocked while trying to send additional log messages when the
collector has fallen behind. In contrast, <application>syslog</application>
prefers to drop messages if it cannot write them, which means it
might fail to log some messages in such cases, but it will not block
the rest of the system.
</para>
</note>

</listitem>
</varlistentry>

<varlistentry id="guc-log-directory" xreflabel="log_directory">
<term><varname>log_directory</varname> (<type>string</type>)
<indexterm>
<primary><varname>log_directory</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When <varname>logging_collector</varname> is enabled,
this parameter determines the directory in which log files will be created.
It can be specified as an absolute path, or relative to the
cluster data directory.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
The default is <literal>log</literal>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-log-filename" xreflabel="log_filename">
<term><varname>log_filename</varname> (<type>string</type>)
<indexterm>
<primary><varname>log_filename</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When <varname>logging_collector</varname> is enabled,
this parameter sets the file names of the created log files. The value
is treated as a <function>strftime</function> pattern,
so <literal>%</literal>-escapes can be used to specify time-varying
file names. (Note that if there are
any time-zone-dependent <literal>%</literal>-escapes, the computation
is done in the zone specified
by <xref linkend="guc-log-timezone"/>.)
The supported <literal>%</literal>-escapes are similar to those
listed in the Open Group's <ulink
url="https://pubs.opengroup.org/onlinepubs/009695399/functions/strftime.html">strftime
</ulink> specification.
Note that the system's <function>strftime</function> is not used
directly, so platform-specific (nonstandard) extensions do not work.
The default is <literal>postgresql-%Y-%m-%d_%H%M%S.log</literal>.
</para>
<para>
If you specify a file name without escapes, you should plan to
use a log rotation utility to avoid eventually filling the
entire disk. In releases prior to 8.4, if
no <literal>%</literal> escapes were
present, <productname>PostgreSQL</productname> would append
the epoch of the new log file's creation time, but this is no
longer the case.
</para>
<para>
If CSV-format output is enabled in <varname>log_destination</varname>,
<literal>.csv</literal> will be appended to the timestamped
log file name to create the file name for CSV-format output.
(If <varname>log_filename</varname> ends in <literal>.log</literal>, the suffix is
replaced instead.)
</para>
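<para>
For instance, the following hypothetical setting yields file names that
contain only the date; the <literal>server-</literal> prefix is an
arbitrary choice made for this sketch:
</para>
<programlisting>
logging_collector = on
# The date is filled in when the file is opened, for example
# server-2024-01-15.log
log_filename = 'server-%Y-%m-%d.log'
</programlisting>
<para>
With CSV-format output also enabled, the CSV file opened at the same time
would be named <filename>server-2024-01-15.csv</filename>, because the
<literal>.log</literal> suffix is replaced.
</para>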
<para>
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-log-file-mode" xreflabel="log_file_mode">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_file_mode</varname> (<type>integer</type>)
|
2010-07-17 00:25:51 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_file_mode</varname> configuration parameter</primary>
|
2010-07-17 00:25:51 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2010-07-17 00:25:51 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
On Unix systems this parameter sets the permissions for log files
|
|
|
|
when <varname>logging_collector</varname> is enabled. (On Microsoft
|
|
|
|
Windows this parameter is ignored.)
|
|
|
|
The parameter value is expected to be a numeric mode
|
|
|
|
specified in the format accepted by the
|
|
|
|
<function>chmod</function> and <function>umask</function>
|
|
|
|
system calls. (To use the customary octal format the number
|
|
|
|
must start with a <literal>0</literal> (zero).)
|
|
|
|
</para>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The default permissions are <literal>0600</literal>, meaning only the
|
2010-07-17 00:25:51 +02:00
|
|
|
server owner can read or write the log files. The other commonly
|
2017-10-09 03:44:17 +02:00
|
|
|
useful setting is <literal>0640</literal>, allowing members of the owner's
|
2010-07-17 00:25:51 +02:00
|
|
|
group to read the files. Note however that to make use of such a
|
2017-11-23 15:39:47 +01:00
|
|
|
setting, you'll need to alter <xref linkend="guc-log-directory"/> to
|
2010-07-17 00:25:51 +02:00
|
|
|
store the files somewhere outside the cluster data directory. In
|
|
|
|
any case, it's unwise to make the log files world-readable, since
|
|
|
|
they might contain sensitive data.
|
|
|
|
</para>
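<para>
For illustration, the group-readable setup described above could be
sketched as follows. The directory path is a hypothetical placeholder;
substitute a location outside the data directory that the server
owner can write to:
</para>
<programlisting>
log_file_mode = 0640                     # owner read/write, group read
log_directory = '/var/log/postgresql'    # hypothetical path outside the data directory
</programlisting>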
<para>
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-log-rotation-age" xreflabel="log_rotation_age">
<term><varname>log_rotation_age</varname> (<type>integer</type>)
<indexterm>
<primary><varname>log_rotation_age</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When <varname>logging_collector</varname> is enabled,
this parameter determines the maximum amount of time to use an
individual log file, after which a new log file will be created.
If this value is specified without units, it is taken as minutes.
The default is 24 hours.
Set to zero to disable time-based creation of new log files.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
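<para>
The lines below sketch alternative ways of writing this setting,
assuming the usual time-unit suffixes accepted for configuration
parameters. The first two are equivalent; only one such line would
appear in an actual <filename>postgresql.conf</filename>:
</para>
<programlisting>
log_rotation_age = 1d      # explicit unit: one day (the default)
log_rotation_age = 1440    # no unit, so taken as minutes: 1440 min = 24 h
log_rotation_age = 0       # disables time-based rotation
</programlisting>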
</listitem>
</varlistentry>

<varlistentry id="guc-log-rotation-size" xreflabel="log_rotation_size">
<term><varname>log_rotation_size</varname> (<type>integer</type>)
<indexterm>
<primary><varname>log_rotation_size</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When <varname>logging_collector</varname> is enabled,
this parameter determines the maximum size of an individual log file.
After this amount of data has been emitted into a log file,
a new log file will be created.
If this value is specified without units, it is taken as kilobytes.
The default is 10 megabytes.
Set to zero to disable size-based creation of new log files.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
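<para>
As a brief sketch of the unit handling, assuming the usual memory-unit
suffixes accepted for configuration parameters, the first two lines
below express the same limit; only one such line would appear in an
actual <filename>postgresql.conf</filename>:
</para>
<programlisting>
log_rotation_size = 10MB     # explicit unit (the default)
log_rotation_size = 10240    # no unit, so taken as kilobytes: 10240 kB = 10 MB
log_rotation_size = 0        # disables size-based rotation
</programlisting>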
</listitem>
</varlistentry>

<varlistentry id="guc-log-truncate-on-rotation" xreflabel="log_truncate_on_rotation">
<term><varname>log_truncate_on_rotation</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>log_truncate_on_rotation</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When <varname>logging_collector</varname> is enabled,
this parameter will cause <productname>PostgreSQL</productname> to truncate (overwrite),
rather than append to, any existing log file of the same name.
However, truncation will occur only when a new file is being opened
due to time-based rotation, not during server startup or size-based
rotation. When off, pre-existing files will be appended to in
all cases. For example, using this setting in combination with
a <varname>log_filename</varname> like <literal>postgresql-%H.log</literal>
would result in generating twenty-four hourly log files and then
cyclically overwriting them.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
<para>
Example: To keep 7 days of logs, one log file per day named
<literal>server_log.Mon</literal>, <literal>server_log.Tue</literal>,
etc., and automatically overwrite last week's log with this week's log,
set <varname>log_filename</varname> to <literal>server_log.%a</literal>,
<varname>log_truncate_on_rotation</varname> to <literal>on</literal>, and
<varname>log_rotation_age</varname> to <literal>1440</literal>.
</para>
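<para>
Written out as <filename>postgresql.conf</filename> lines, the settings
named in this example are as follows (a direct transcription, assuming
<varname>logging_collector</varname> is also enabled):
</para>
<programlisting>
log_filename = 'server_log.%a'     # %a expands to the abbreviated weekday name
log_truncate_on_rotation = on
log_rotation_age = 1440            # no unit, so minutes: one day
</programlisting>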
<para>
Example: To keep 24 hours of logs, one log file per hour, but
also rotate sooner if the log file size exceeds 1GB, set
<varname>log_filename</varname> to <literal>server_log.%H%M</literal>,
<varname>log_truncate_on_rotation</varname> to <literal>on</literal>,
<varname>log_rotation_age</varname> to <literal>60</literal>, and
<varname>log_rotation_size</varname> to <literal>1000000</literal>.
Including <literal>%M</literal> in <varname>log_filename</varname> allows
any size-driven rotations that might occur to select a file name
different from the hour's initial file name.
</para>
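<para>
Written out as <filename>postgresql.conf</filename> lines, the settings
named in this example are as follows (a direct transcription, assuming
<varname>logging_collector</varname> is also enabled):
</para>
<programlisting>
log_filename = 'server_log.%H%M'
log_truncate_on_rotation = on
log_rotation_age = 60              # no unit, so minutes: one hour
log_rotation_size = 1000000        # no unit, so kilobytes: roughly 1GB
</programlisting>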
</listitem>
</varlistentry>

<varlistentry id="guc-syslog-facility" xreflabel="syslog_facility">
<term><varname>syslog_facility</varname> (<type>enum</type>)
<indexterm>
<primary><varname>syslog_facility</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When logging to <application>syslog</application> is enabled, this parameter
determines the <application>syslog</application>
<quote>facility</quote> to be used. You can choose
from <literal>LOCAL0</literal>, <literal>LOCAL1</literal>,
<literal>LOCAL2</literal>, <literal>LOCAL3</literal>, <literal>LOCAL4</literal>,
<literal>LOCAL5</literal>, <literal>LOCAL6</literal>, <literal>LOCAL7</literal>;
the default is <literal>LOCAL0</literal>. See also the
documentation of your system's
<application>syslog</application> daemon.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
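<para>
A minimal sketch of the corresponding <filename>postgresql.conf</filename>
lines follows. The facility chosen here is only an example, and the
<application>syslog</application> daemon must be configured separately
to route that facility:
</para>
<programlisting>
log_destination = 'syslog'
syslog_facility = 'LOCAL1'    # example choice; the default is LOCAL0
</programlisting>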
</listitem>
</varlistentry>

<varlistentry id="guc-syslog-ident" xreflabel="syslog_ident">
<term><varname>syslog_ident</varname> (<type>string</type>)
<indexterm>
<primary><varname>syslog_ident</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When logging to <application>syslog</application> is enabled, this parameter
determines the program name used to identify
<productname>PostgreSQL</productname> messages in
<application>syslog</application> logs. The default is
<literal>postgres</literal>.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-syslog-sequence-numbers" xreflabel="syslog_sequence_numbers">
<term><varname>syslog_sequence_numbers</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>syslog_sequence_numbers</varname> configuration parameter</primary>
</indexterm>
</term>

<listitem>
<para>
When logging to <application>syslog</application> is enabled and this
parameter is on (the default), each message will be prefixed by an
increasing sequence number (such as <literal>[2]</literal>). This circumvents
the <quote>--- last message repeated N times ---</quote> suppression
that many syslog implementations perform by default. In more modern
syslog implementations, repeated message suppression can be configured
(for example, <literal>$RepeatedMsgReduction</literal>
in <productname>rsyslog</productname>), so this might not be
necessary. Also, you could turn this off if you actually want to
suppress repeated messages.
</para>

<para>
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-syslog-split-messages" xreflabel="syslog_split_messages">
<term><varname>syslog_split_messages</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>syslog_split_messages</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When logging to <application>syslog</application> is enabled, this parameter
determines how messages are delivered to syslog. When on (the
default), messages are split by lines, and long lines are split so
that they will fit into 1024 bytes, which is a typical size limit for
traditional syslog implementations. When off,
<productname>PostgreSQL</productname> server log
messages are delivered to the syslog service as is, and it is up to
the syslog service to cope with the potentially bulky messages.
</para>

<para>
If syslog is ultimately logging to a text file, then the effect will
be the same either way, and it is best to leave the setting on, since
most syslog implementations either cannot handle large messages or
would need to be specially configured to handle them. But if syslog
is ultimately writing into some other medium, it might be necessary or
more useful to keep messages logically together.
</para>
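<para>
As a sketch of the second case, where syslog writes to some medium
other than a text file and is assumed to be configured to accept long
messages, the relevant <filename>postgresql.conf</filename> lines
might be:
</para>
<programlisting>
log_destination = 'syslog'
syslog_split_messages = off    # deliver each message whole, without splitting
</programlisting>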
<para>
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-event-source" xreflabel="event_source">
<term><varname>event_source</varname> (<type>string</type>)
<indexterm>
<primary><varname>event_source</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When logging to <application>event log</application> is enabled, this parameter
determines the program name used to identify
<productname>PostgreSQL</productname> messages in
the log. The default is <literal>PostgreSQL</literal>.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
</listitem>
</varlistentry>

</variablelist>
</sect2>
<sect2 id="runtime-config-logging-when">
<title>When to Log</title>

<variablelist>

<varlistentry id="guc-log-min-messages" xreflabel="log_min_messages">
<term><varname>log_min_messages</varname> (<type>enum</type>)
<indexterm>
<primary><varname>log_min_messages</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls which <link linkend="runtime-config-severity-levels">message
levels</link> are written to the server log.
Valid values are <literal>DEBUG5</literal>, <literal>DEBUG4</literal>,
<literal>DEBUG3</literal>, <literal>DEBUG2</literal>, <literal>DEBUG1</literal>,
<literal>INFO</literal>, <literal>NOTICE</literal>, <literal>WARNING</literal>,
<literal>ERROR</literal>, <literal>LOG</literal>, <literal>FATAL</literal>, and
<literal>PANIC</literal>. Each level includes all the levels that
follow it. The later the level, the fewer messages are sent
to the log. The default is <literal>WARNING</literal>. Note that
<literal>LOG</literal> has a different rank here than in
<xref linkend="guc-client-min-messages"/>.
Only superusers can change this setting.
</para>
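<para>
To illustrate the ranking with an example value (not a recommendation),
the following setting sends <literal>NOTICE</literal> and every level
listed after it to the server log, while <literal>INFO</literal> and
the <literal>DEBUG</literal> levels are omitted:
</para>
<programlisting>
log_min_messages = notice    # logs NOTICE, WARNING, ERROR, LOG, FATAL, and PANIC
</programlisting>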
</listitem>
</varlistentry>

<varlistentry id="guc-log-min-error-statement" xreflabel="log_min_error_statement">
<term><varname>log_min_error_statement</varname> (<type>enum</type>)
<indexterm>
<primary><varname>log_min_error_statement</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls which SQL statements that cause an error
condition are recorded in the server log. The current
SQL statement is included in the log entry for any message of
the specified
<link linkend="runtime-config-severity-levels">severity</link>
or higher.
Valid values are <literal>DEBUG5</literal>,
<literal>DEBUG4</literal>, <literal>DEBUG3</literal>,
<literal>DEBUG2</literal>, <literal>DEBUG1</literal>,
<literal>INFO</literal>, <literal>NOTICE</literal>,
<literal>WARNING</literal>, <literal>ERROR</literal>,
<literal>LOG</literal>,
<literal>FATAL</literal>, and <literal>PANIC</literal>.
The default is <literal>ERROR</literal>, which means statements
causing errors, log messages, fatal errors, or panics will be logged.
To effectively turn off logging of failing statements,
set this parameter to <literal>PANIC</literal>.
Only superusers can change this setting.
</para>
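<para>
The two settings discussed above can be sketched as follows; only one
such line would appear in an actual <filename>postgresql.conf</filename>:
</para>
<programlisting>
log_min_error_statement = error    # the default
log_min_error_statement = panic    # effectively turns off logging of failing statements
</programlisting>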
</listitem>
</varlistentry>

<varlistentry id="guc-log-min-duration-statement" xreflabel="log_min_duration_statement">
<term><varname>log_min_duration_statement</varname> (<type>integer</type>)
<indexterm>
<primary><varname>log_min_duration_statement</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Causes the duration of each completed statement to be logged
if the statement ran for at least the specified amount of time.
For example, if you set it to <literal>250ms</literal>
then all SQL statements that run 250ms or longer will be
logged. Enabling this parameter can be helpful in tracking down
unoptimized queries in your applications.
If this value is specified without units, it is taken as milliseconds.
Setting this to zero prints all statement durations.
<literal>-1</literal> (the default) disables logging statement
durations. Only superusers can change this setting.
</para>
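<para>
The values mentioned above can be sketched as
<filename>postgresql.conf</filename> lines as follows. The first two
are equivalent, and only one such line would appear in an actual
configuration file:
</para>
<programlisting>
log_min_duration_statement = 250ms    # log statements that run 250ms or longer
log_min_duration_statement = 250      # same threshold; no unit means milliseconds
log_min_duration_statement = 0        # log the duration of every statement
log_min_duration_statement = -1       # the default: disabled
</programlisting>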
<para>
This overrides <xref linkend="guc-log-min-duration-sample"/>,
meaning that queries with duration exceeding this setting are not
subject to sampling and are always logged.
</para>

<para>
For clients using extended query protocol, durations of the Parse,
Bind, and Execute steps are logged independently.
</para>

<note>
<para>
When using this option together with
<xref linkend="guc-log-statement"/>,
the text of statements that are logged because of
<varname>log_statement</varname> will not be repeated in the
duration log message.
If you are not using <application>syslog</application>, it is recommended
that you log the PID or session ID using
<xref linkend="guc-log-line-prefix"/>
so that you can link the statement message to the later
duration message using the process ID or session ID.
</para>
</note>
</listitem>
</varlistentry>

<varlistentry id="guc-log-min-duration-sample" xreflabel="log_min_duration_sample">
<term><varname>log_min_duration_sample</varname> (<type>integer</type>)
<indexterm>
<primary><varname>log_min_duration_sample</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Allows sampling the duration of completed statements that ran for
at least the specified amount of time. This produces the same
kind of log entries as
<xref linkend="guc-log-min-duration-statement"/>, but only for a
subset of the executed statements, with sample rate controlled by
<xref linkend="guc-log-statement-sample-rate"/>.
For example, if you set it to <literal>100ms</literal> then all
SQL statements that run 100ms or longer will be considered for
sampling. Enabling this parameter can be helpful when the
traffic is too high to log all queries.
If this value is specified without units, it is taken as milliseconds.
Setting this to zero samples all statement durations.
<literal>-1</literal> (the default) disables sampling statement
durations. Only superusers can change this setting.
</para>

<para>
This setting has lower priority
than <varname>log_min_duration_statement</varname>, meaning that
statements with durations
exceeding <varname>log_min_duration_statement</varname> are not
subject to sampling and are always logged.
</para>

<para>
Other notes for <varname>log_min_duration_statement</varname>
apply also to this setting.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-log-statement-sample-rate" xreflabel="log_statement_sample_rate">
<term><varname>log_statement_sample_rate</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>log_statement_sample_rate</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Determines the fraction of statements with duration exceeding
<xref linkend="guc-log-min-duration-sample"/> that will be logged.
Sampling is stochastic; for example, <literal>0.5</literal> means
there is statistically one chance in two that any given statement
will be logged.
The default is <literal>1.0</literal>, meaning to log all sampled
statements.
Setting this to zero disables sampled statement-duration logging,
the same as setting
<varname>log_min_duration_sample</varname> to
<literal>-1</literal>.
Only superusers can change this setting.
</para>
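<para>
As a sketch of how the duration settings combine (the thresholds and
rate below are arbitrary example values): statements that run for at
least 1 second are always logged, while statements that run for at
least 100ms but less than 1 second each have about one chance in two
of being logged.
</para>
<programlisting>
log_min_duration_statement = 1s      # always log statements at or above this duration
log_min_duration_sample = 100ms      # consider statements at or above this duration for sampling
log_statement_sample_rate = 0.5      # log roughly half of the sampled candidates
</programlisting>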
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2019-04-03 23:43:59 +02:00
|
|
|
<varlistentry id="guc-log-transaction-sample-rate" xreflabel="log_transaction_sample_rate">
|
2020-03-10 22:34:01 +01:00
|
|
|
<term><varname>log_transaction_sample_rate</varname> (<type>floating point</type>)
|
2019-04-03 23:43:59 +02:00
|
|
|
<indexterm>
|
|
|
|
<primary><varname>log_transaction_sample_rate</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2020-03-10 22:34:01 +01:00
|
|
|
Sets the fraction of transactions whose statements are all logged,
|
2019-04-03 23:43:59 +02:00
|
|
|
in addition to statements logged for other reasons. It applies to
|
|
|
|
each new transaction regardless of its statements' durations.
|
2020-03-10 22:34:01 +01:00
|
|
|
Sampling is stochastic; for example, <literal>0.1</literal> means
|
|
|
|
there is statistically one chance in ten that any given transaction
|
|
|
|
will be logged.
|
|
|
|
<varname>log_transaction_sample_rate</varname> can be helpful to
|
|
|
|
construct a sample of transactions.
|
|
|
|
The default is <literal>0</literal>, meaning not to log
|
|
|
|
statements from any additional transactions. Setting this
|
|
|
|
to <literal>1</literal> logs all statements of all transactions.
|
2019-11-04 02:00:26 +01:00
|
|
|
Only superusers can change this setting.
|
2019-04-03 23:43:59 +02:00
|
|
|
</para>
|
|
|
|
<note>
|
|
|
|
<para>
|
|
|
|
Like all statement-logging options, this option can add significant
|
|
|
|
overhead.
|
|
|
|
</para>
|
|
|
|
</note>
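<para>
For instance, a setting such as the following (the rate is an arbitrary,
illustrative choice) would log all statements of about one transaction
in a hundred:
</para>
<programlisting>
log_transaction_sample_rate = 0.01  # sample about 1% of transactions
</programlisting>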
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
Report progress of startup operations that take a long time.
Users sometimes get concerned when they start the server and it
emits a few messages and then doesn't emit any more messages for
a long time. Generally, what's happening is either that the
system is taking a long time to apply WAL, or it's taking a
long time to reset unlogged relations, or it's taking a long
time to fsync the data directory, but it's not easy to tell
which is the case.
To fix that, add a new 'log_startup_progress_interval' setting,
by default 10s. When an operation that is known to be potentially
long-running takes more than this amount of time, we'll log a
status update each time this interval elapses.
To avoid undesirable log chatter, don't log anything about WAL
replay when in standby mode.
Nitin Jadhav and Robert Haas, reviewed by Amul Sul, Bharath
Rupireddy, Justin Pryzby, Michael Paquier, and Álvaro Herrera.
Discussion: https://postgr.es/m/CA+TgmoaHQrgDFOBwgY16XCoMtXxsrVGFB2jNCvb7-ubuEe1MGg@mail.gmail.com
Discussion: https://postgr.es/m/CAMm1aWaHF7VE69572_OLQ+MgpT5RUiUDgF1x5RrtkJBLdpRj3Q@mail.gmail.com
2021-10-25 17:51:57 +02:00
|
|
|
<varlistentry id="guc-log-startup-progress-interval" xreflabel="log_startup_progress_interval">
|
|
|
|
<term><varname>log_startup_progress_interval</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>log_startup_progress_interval</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the amount of time after which the startup process will log
|
|
|
|
a message about a long-running operation that is still in progress,
|
|
|
|
as well as the interval between further progress messages for that
|
|
|
|
operation. This setting is applied separately to each operation.
|
|
|
|
For example, if syncing the data directory takes 25 seconds and
|
|
|
|
thereafter resetting unlogged relations takes 8 seconds, and if this
|
|
|
|
setting has the default value of 10 seconds, then a message will be
|
|
|
|
logged for syncing the data directory after it has been in progress
|
|
|
|
for 10 seconds and again after it has been in progress for 20 seconds,
|
|
|
|
but nothing will be logged for resetting unlogged relations.
|
|
|
|
A setting of <literal>0</literal> disables the feature. If this value
|
|
|
|
is specified without units, it is taken as milliseconds.
|
|
|
|
</para>
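<para>
As a sketch of how a non-default value changes the example above (the
5-second interval is an assumption chosen only for illustration): with
the setting below, syncing the data directory for 25 seconds would be
expected to produce progress messages after roughly 5, 10, 15, and 20
seconds, and resetting unlogged relations for 8 seconds would be expected
to produce one message after roughly 5 seconds.
</para>
<programlisting>
log_startup_progress_interval = 5s  # applied separately to each long-running startup operation
</programlisting>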
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
|
2007-09-22 21:10:44 +02:00
|
|
|
<para>
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="runtime-config-severity-levels"/> explains the message
|
2017-10-09 03:44:17 +02:00
|
|
|
severity levels used by <productname>PostgreSQL</productname>. If logging output
|
2007-09-22 21:10:44 +02:00
|
|
|
is sent to <systemitem>syslog</systemitem> or Windows'
|
|
|
|
<systemitem>eventlog</systemitem>, the severity levels are translated
|
|
|
|
as shown in the table.
|
|
|
|
</para>
|
2005-09-13 00:11:38 +02:00
|
|
|
|
2007-09-22 21:10:44 +02:00
|
|
|
<table id="runtime-config-severity-levels">
|
2011-01-29 19:00:18 +01:00
|
|
|
<title>Message Severity Levels</title>
|
2007-09-22 21:10:44 +02:00
|
|
|
<tgroup cols="4">
|
2020-05-06 18:23:43 +02:00
|
|
|
<colspec colname="col1" colwidth="1*"/>
|
|
|
|
<colspec colname="col2" colwidth="2*"/>
|
|
|
|
<colspec colname="col3" colwidth="1*"/>
|
|
|
|
<colspec colname="col4" colwidth="1*"/>
|
2007-09-22 21:10:44 +02:00
|
|
|
<thead>
|
|
|
|
<row>
|
|
|
|
<entry>Severity</entry>
|
|
|
|
<entry>Usage</entry>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><systemitem>syslog</systemitem></entry>
|
|
|
|
<entry><systemitem>eventlog</systemitem></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
</row>
|
|
|
|
</thead>
|
|
|
|
|
|
|
|
<tbody>
|
|
|
|
<row>
|
2020-05-06 18:23:43 +02:00
|
|
|
<entry><literal>DEBUG1 .. DEBUG5</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
<entry>Provides successively-more-detailed information for use by
|
|
|
|
developers.</entry>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>DEBUG</literal></entry>
|
|
|
|
<entry><literal>INFORMATION</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>INFO</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
<entry>Provides information implicitly requested by the user,
|
2017-10-09 03:44:17 +02:00
|
|
|
e.g., output from <command>VACUUM VERBOSE</command>.</entry>
|
|
|
|
<entry><literal>INFO</literal></entry>
|
|
|
|
<entry><literal>INFORMATION</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>NOTICE</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
<entry>Provides information that might be helpful to users, e.g.,
|
|
|
|
notice of truncation of long identifiers.</entry>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>NOTICE</literal></entry>
|
|
|
|
<entry><literal>INFORMATION</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>WARNING</literal></entry>
|
|
|
|
<entry>Provides warnings of likely problems, e.g., <command>COMMIT</command>
|
2007-09-22 21:10:44 +02:00
|
|
|
outside a transaction block.</entry>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>NOTICE</literal></entry>
|
|
|
|
<entry><literal>WARNING</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>ERROR</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
<entry>Reports an error that caused the current command to
|
|
|
|
abort.</entry>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>WARNING</literal></entry>
|
|
|
|
<entry><literal>ERROR</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>LOG</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
<entry>Reports information of interest to administrators, e.g.,
|
|
|
|
checkpoint activity.</entry>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>INFO</literal></entry>
|
|
|
|
<entry><literal>INFORMATION</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>FATAL</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
<entry>Reports an error that caused the current session to
|
|
|
|
abort.</entry>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>ERR</literal></entry>
|
|
|
|
<entry><literal>ERROR</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
</row>
|
|
|
|
|
|
|
|
<row>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>PANIC</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
<entry>Reports an error that caused all database sessions to abort.</entry>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry><literal>CRIT</literal></entry>
|
|
|
|
<entry><literal>ERROR</literal></entry>
|
2007-09-22 21:10:44 +02:00
|
|
|
</row>
|
|
|
|
</tbody>
|
|
|
|
</tgroup>
|
|
|
|
</table>
|
2005-09-13 00:11:38 +02:00
|
|
|
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="runtime-config-logging-what">
|
2019-09-08 10:26:35 +02:00
|
|
|
<title>What to Log</title>
|
2005-09-13 00:11:38 +02:00
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
|
2009-11-29 00:38:08 +01:00
|
|
|
<varlistentry id="guc-application-name" xreflabel="application_name">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>application_name</varname> (<type>string</type>)
|
2009-11-29 00:38:08 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>application_name</varname> configuration parameter</primary>
|
2009-11-29 00:38:08 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2009-11-29 00:38:08 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
The <varname>application_name</varname> can be any string of fewer than
|
2017-10-09 03:44:17 +02:00
|
|
|
<symbol>NAMEDATALEN</symbol> characters (64 characters in a standard build).
|
2009-11-29 00:38:08 +01:00
|
|
|
It is typically set by an application upon connection to the server.
|
2017-10-09 03:44:17 +02:00
|
|
|
The name will be displayed in the <structname>pg_stat_activity</structname> view
|
2009-11-29 00:38:08 +01:00
|
|
|
and included in CSV log entries. It can also be included in regular
|
2017-11-23 15:39:47 +01:00
|
|
|
log entries via the <xref linkend="guc-log-line-prefix"/> parameter.
|
2009-11-29 00:38:08 +01:00
|
|
|
Only printable ASCII characters may be used in the
|
|
|
|
<varname>application_name</varname> value. Other characters will be
|
|
|
|
replaced with question marks (<literal>?</literal>).
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry>
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>debug_print_parse</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>debug_print_parse</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
|
|
|
<term><varname>debug_print_rewritten</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>debug_print_rewritten</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
|
|
|
<term><varname>debug_print_plan</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>debug_print_plan</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-08-19 20:30:04 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
These parameters enable various debugging output to be emitted.
|
|
|
|
When set, they print the resulting parse tree, the query rewriter
|
|
|
|
output, or the execution plan for each executed query.
|
2017-10-09 03:44:17 +02:00
|
|
|
These messages are emitted at <literal>LOG</literal> message level, so by
|
2008-08-19 20:30:04 +02:00
|
|
|
default they will appear in the server log but will not be sent to the
|
|
|
|
client. You can change that by adjusting
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="guc-client-min-messages"/> and/or
|
|
|
|
<xref linkend="guc-log-min-messages"/>.
|
2008-08-19 20:30:04 +02:00
|
|
|
These parameters are off by default.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry>
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>debug_pretty_print</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>debug_pretty_print</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2008-08-19 20:30:04 +02:00
|
|
|
When set, <varname>debug_pretty_print</varname> indents the messages
|
|
|
|
produced by <varname>debug_print_parse</varname>,
|
|
|
|
<varname>debug_print_rewritten</varname>, or
|
|
|
|
<varname>debug_print_plan</varname>. This results in more readable
|
2017-10-09 03:44:17 +02:00
|
|
|
but much longer output than the <quote>compact</quote> format used when
|
2008-08-19 20:30:04 +02:00
|
|
|
it is off. It is on by default.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2021-04-12 06:53:17 +02:00
|
|
|
<varlistentry id="guc-log-autovacuum-min-duration" xreflabel="log_autovacuum_min_duration">
|
|
|
|
<term><varname>log_autovacuum_min_duration</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>log_autovacuum_min_duration</varname></primary>
|
|
|
|
<secondary>configuration parameter</secondary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Causes each action executed by autovacuum to be logged if it ran for at
|
|
|
|
least the specified amount of time. Setting this to zero logs
|
|
|
|
all autovacuum actions. <literal>-1</literal> (the default) disables
|
|
|
|
logging autovacuum actions.
|
|
|
|
If this value is specified without units, it is taken as milliseconds.
|
|
|
|
For example, if you set this to
|
|
|
|
<literal>250ms</literal> then all automatic vacuums and analyzes that run
|
|
|
|
250ms or longer will be logged. In addition, when this parameter is
|
|
|
|
set to any value other than <literal>-1</literal>, a message will be
|
|
|
|
logged if an autovacuum action is skipped due to a conflicting lock or a
|
|
|
|
concurrently dropped relation. Enabling this parameter can be helpful
|
|
|
|
in tracking autovacuum activity. This parameter can only be set in
|
|
|
|
the <filename>postgresql.conf</filename> file or on the server command line,
|
|
|
|
but the setting can be overridden for individual tables by
|
|
|
|
changing table storage parameters.
|
|
|
|
</para>
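<para>
A minimal sketch of a server-wide setting that uses an explicit unit
(the 10-second threshold is an illustrative assumption, not a
recommendation):
</para>
<programlisting>
log_autovacuum_min_duration = 10s   # log autovacuum actions that run for at least 10 seconds
                                    # (0 logs all actions, -1 disables this logging)
</programlisting>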
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2007-06-30 21:12:02 +02:00
|
|
|
<varlistentry id="guc-log-checkpoints" xreflabel="log_checkpoints">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_checkpoints</varname> (<type>boolean</type>)
|
2007-06-30 21:12:02 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_checkpoints</varname> configuration parameter</primary>
|
2007-06-30 21:12:02 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-06-30 21:12:02 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2011-02-03 03:08:53 +01:00
|
|
|
Causes checkpoints and restartpoints to be logged in the server log.
|
|
|
|
Some statistics are included in the log messages, including the number
|
|
|
|
of buffers written and the time spent writing them.
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2007-06-30 21:12:02 +02:00
|
|
|
file or on the server command line. The default is off.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-log-connections" xreflabel="log_connections">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_connections</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_connections</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2007-06-30 21:12:02 +02:00
|
|
|
Causes each attempted connection to the server to be logged,
|
Add some information about authenticated identity via log_connections
The "authenticated identity" is the string used by an authentication
method to identify a particular user. In many common cases, this is the
same as the PostgreSQL username, but for some third-party authentication
methods, the identifier in use may be shortened or otherwise translated
(e.g. through pg_ident user mappings) before the server stores it.
To help administrators see who has actually interacted with the system,
this commit adds the capability to store the original identity when
authentication succeeds within the backend's Port, and generates a log
entry when log_connections is enabled. The log entries generated look
something like this (where a local user named "foouser" is connecting to
the database as the database user called "admin"):
LOG: connection received: host=[local]
LOG: connection authenticated: identity="foouser" method=peer (/data/pg_hba.conf:88)
LOG: connection authorized: user=admin database=postgres application_name=psql
Port->authn_id is set according to the authentication method:
bsd: the PostgreSQL username (aka the local username)
cert: the client's Subject DN
gss: the user principal
ident: the remote username
ldap: the final bind DN
pam: the PostgreSQL username (aka PAM username)
password (and all pw-challenge methods): the PostgreSQL username
peer: the peer's pw_name
radius: the PostgreSQL username (aka the RADIUS username)
sspi: either the down-level (SAM-compatible) logon name, if
compat_realm=1, or the User Principal Name if compat_realm=0
The trust auth method does not set an authenticated identity. Neither
does clientcert=verify-full.
Port->authn_id could be used for other purposes, like a superuser-only
extra column in pg_stat_activity, but this is left as future work.
PostgresNode::connect_{ok,fails}() have been modified to let tests check
the backend log files for required or prohibited patterns, using the
new log_like and log_unlike parameters. This uses a method based on a
truncation of the existing server log file, like issues_sql_like().
Tests are added to the ldap, kerberos, authentication and SSL test
suites.
Author: Jacob Champion
Reviewed-by: Stephen Frost, Magnus Hagander, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/c55788dd1773c521c862e8e0dddb367df51222be.camel@vmware.com
2021-04-07 03:16:39 +02:00
|
|
|
as well as successful completion of both client authentication (if
|
|
|
|
necessary) and authorization.
|
2014-09-14 03:01:49 +02:00
|
|
|
Only superusers can change this parameter at session start,
|
|
|
|
and it cannot be changed at all within a session.
|
2017-10-09 03:44:17 +02:00
|
|
|
The default is <literal>off</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
2007-06-30 21:12:02 +02:00
|
|
|
|
|
|
|
<note>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Some client programs, like <application>psql</application>, attempt
|
2010-11-23 21:27:50 +01:00
|
|
|
to connect twice while determining if a password is required, so
|
2017-10-09 03:44:17 +02:00
|
|
|
duplicate <quote>connection received</quote> messages do not
|
2007-06-30 21:12:02 +02:00
|
|
|
necessarily indicate a problem.
|
|
|
|
</para>
|
|
|
|
</note>
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-log-disconnections" xreflabel="log_disconnections">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_disconnections</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_disconnections</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2014-09-14 03:01:49 +02:00
|
|
|
Causes session terminations to be logged. The log output
|
|
|
|
provides information similar to <varname>log_connections</varname>,
|
|
|
|
plus the duration of the session.
|
|
|
|
Only superusers can change this parameter at session start,
|
|
|
|
and it cannot be changed at all within a session.
|
2017-10-09 03:44:17 +02:00
|
|
|
The default is <literal>off</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
|
|
|
|
<varlistentry id="guc-log-duration" xreflabel="log_duration">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_duration</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_duration</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2006-09-08 00:52:01 +02:00
|
|
|
Causes the duration of every completed statement to be logged.
|
2017-10-09 03:44:17 +02:00
|
|
|
The default is <literal>off</literal>.
|
2006-09-08 00:52:01 +02:00
|
|
|
Only superusers can change this setting.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
2006-09-08 00:52:01 +02:00
|
|
|
|
2006-09-08 17:55:53 +02:00
|
|
|
<para>
|
|
|
|
For clients using extended query protocol, durations of the Parse,
|
|
|
|
Bind, and Execute steps are logged independently.
|
|
|
|
</para>
|
2006-09-08 00:52:01 +02:00
|
|
|
|
|
|
|
<note>
|
|
|
|
<para>
|
2019-04-03 23:54:02 +02:00
|
|
|
The difference between enabling <varname>log_duration</varname> and setting
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="guc-log-min-duration-statement"/> to zero is that
|
2017-10-09 03:44:17 +02:00
|
|
|
exceeding <varname>log_min_duration_statement</varname> forces the text of
|
2006-09-08 17:55:53 +02:00
|
|
|
the query to be logged, but this option doesn't. Thus, if
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>log_duration</varname> is <literal>on</literal> and
|
|
|
|
<varname>log_min_duration_statement</varname> has a positive value, all
|
2006-09-08 17:55:53 +02:00
|
|
|
durations are logged but the query text is included only for
|
|
|
|
statements exceeding the threshold. This behavior can be useful for
|
|
|
|
gathering statistics in high-load installations.
|
2006-09-08 00:52:01 +02:00
|
|
|
</para>
|
|
|
|
</note>
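<para>
The combination described in the note might be configured as in the
following sketch, where the 500-millisecond threshold is an arbitrary
assumption:
</para>
<programlisting>
log_duration = on                    # log the duration of every completed statement
log_min_duration_statement = 500ms   # include the query text only for statements exceeding 500 ms
</programlisting>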
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-30 21:12:02 +02:00
|
|
|
|
2010-02-16 22:35:51 +01:00
|
|
|
<varlistentry id="guc-log-error-verbosity" xreflabel="log_error_verbosity">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_error_verbosity</varname> (<type>enum</type>)
|
2010-02-16 22:35:51 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_error_verbosity</varname> configuration parameter</primary>
|
2010-02-16 22:35:51 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2010-02-16 22:35:51 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Controls the amount of detail written in the server log for each
|
2017-10-09 03:44:17 +02:00
|
|
|
message that is logged. Valid values are <literal>TERSE</literal>,
|
|
|
|
<literal>DEFAULT</literal>, and <literal>VERBOSE</literal>, each adding more
|
|
|
|
fields to displayed messages. <literal>TERSE</literal> excludes
|
|
|
|
the logging of <literal>DETAIL</literal>, <literal>HINT</literal>,
|
|
|
|
<literal>QUERY</literal>, and <literal>CONTEXT</literal> error information.
|
|
|
|
<literal>VERBOSE</literal> output includes the <symbol>SQLSTATE</symbol> error
|
2017-11-23 15:39:47 +01:00
|
|
|
code (see also <xref linkend="errcodes-appendix"/>) and the source code file name, function name,
|
2010-02-16 22:35:51 +01:00
|
|
|
and line number that generated the error.
|
|
|
|
Only superusers can change this setting.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2007-06-30 21:12:02 +02:00
|
|
|
<varlistentry id="guc-log-hostname" xreflabel="log_hostname">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_hostname</varname> (<type>boolean</type>)
|
2007-06-30 21:12:02 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_hostname</varname> configuration parameter</primary>
|
2007-06-30 21:12:02 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-06-30 21:12:02 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
By default, connection log messages only show the IP address of the
|
2010-02-03 18:25:06 +01:00
|
|
|
connecting host. Turning this parameter on causes logging of the
|
2007-06-30 21:12:02 +02:00
|
|
|
host name as well. Note that depending on your host name resolution
|
|
|
|
setup this might impose a non-negligible performance penalty.
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2007-06-30 21:12:02 +02:00
|
|
|
file or on the server command line.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2010-11-23 21:27:50 +01:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-log-line-prefix" xreflabel="log_line_prefix">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_line_prefix</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_line_prefix</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
This is a <function>printf</function>-style string that is output at the
|
2007-06-22 18:15:23 +02:00
|
|
|
beginning of each log line.
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>%</literal> characters begin <quote>escape sequences</quote>
|
2007-06-22 18:15:23 +02:00
|
|
|
that are replaced with status information as outlined below.
|
|
|
|
Unrecognized escapes are ignored. Other
|
2005-09-13 00:11:38 +02:00
|
|
|
characters are copied straight to the log line. Some escapes are
|
2013-09-26 23:54:20 +02:00
|
|
|
only recognized by session processes, and will be treated as empty by
|
|
|
|
background processes such as the main server process. Status
|
|
|
|
information may be aligned either left or right by specifying a
|
|
|
|
numeric literal after the <literal>%</literal> and before the option. A negative
|
|
|
|
value will cause the status information to be padded on the
|
|
|
|
right with spaces to give it a minimum width, whereas a positive
|
|
|
|
value will pad on the left. Padding can be useful to aid human
|
|
|
|
readability in log files.
|
2020-03-15 11:20:21 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2016-10-17 22:31:13 +02:00
|
|
|
file or on the server command line. The default is
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>'%m [%p] '</literal> which logs a time stamp and the process ID.
|
2020-03-15 11:20:21 +01:00
|
|
|
</para>
|
2005-09-13 00:11:38 +02:00
|
|
|
|
|
|
|
<informaltable>
|
|
|
|
<tgroup cols="3">
|
|
|
|
<thead>
|
|
|
|
<row>
|
|
|
|
<entry>Escape</entry>
|
|
|
|
<entry>Effect</entry>
|
|
|
|
<entry>Session only</entry>
|
|
|
|
</row>
|
|
|
|
</thead>
|
|
|
|
<tbody>
|
2009-11-29 00:38:08 +01:00
|
|
|
<row>
|
|
|
|
<entry><literal>%a</literal></entry>
|
|
|
|
<entry>Application name</entry>
|
|
|
|
<entry>yes</entry>
|
|
|
|
</row>
|
2005-09-13 00:11:38 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%u</literal></entry>
|
2011-01-12 17:59:21 +01:00
|
|
|
<entry>User name</entry>
|
2005-09-13 00:11:38 +02:00
|
|
|
<entry>yes</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>%d</literal></entry>
|
|
|
|
<entry>Database name</entry>
|
|
|
|
<entry>yes</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>%r</literal></entry>
|
|
|
|
<entry>Remote host name or IP address, and remote port</entry>
|
|
|
|
<entry>yes</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>%h</literal></entry>
|
2005-11-05 00:14:02 +01:00
|
|
|
<entry>Remote host name or IP address</entry>
|
2005-09-13 00:11:38 +02:00
|
|
|
<entry>yes</entry>
|
|
|
|
</row>
|
2020-03-15 11:20:21 +01:00
|
|
|
<row>
|
|
|
|
<entry><literal>%b</literal></entry>
|
|
|
|
<entry>Backend type</entry>
|
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
2005-09-13 00:11:38 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%p</literal></entry>
|
|
|
|
<entry>Process ID</entry>
|
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
2020-08-03 06:38:48 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%P</literal></entry>
|
|
|
|
<entry>Process ID of the parallel group leader, if this process
|
|
|
|
is a parallel query worker</entry>
|
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
2005-09-13 00:11:38 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%t</literal></entry>
|
2007-08-04 03:26:54 +02:00
|
|
|
<entry>Time stamp without milliseconds</entry>
|
2005-09-13 00:11:38 +02:00
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>%m</literal></entry>
|
|
|
|
<entry>Time stamp with milliseconds</entry>
|
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
2015-09-07 22:46:31 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%n</literal></entry>
|
|
|
|
<entry>Time stamp with milliseconds (as a Unix epoch)</entry>
|
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
2005-09-13 00:11:38 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%i</literal></entry>
|
2007-06-22 18:15:23 +02:00
|
|
|
<entry>Command tag: type of session's current command</entry>
|
2005-09-13 00:11:38 +02:00
|
|
|
<entry>yes</entry>
|
|
|
|
</row>
|
2009-07-03 21:14:25 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%e</literal></entry>
|
2010-08-23 04:43:25 +02:00
|
|
|
<entry>SQLSTATE error code</entry>
|
2009-07-03 21:14:25 +02:00
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
2005-09-13 00:11:38 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%c</literal></entry>
|
2007-06-22 18:15:23 +02:00
|
|
|
<entry>Session ID: see below</entry>
|
2007-08-03 01:39:45 +02:00
|
|
|
<entry>no</entry>
|
2005-09-13 00:11:38 +02:00
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>%l</literal></entry>
|
2007-12-11 21:07:31 +01:00
|
|
|
<entry>Number of the log line for each session or process, starting at 1</entry>
|
2005-09-13 00:11:38 +02:00
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>%s</literal></entry>
|
2007-08-04 03:26:54 +02:00
|
|
|
<entry>Process start time stamp</entry>
|
2007-08-03 01:39:45 +02:00
|
|
|
<entry>no</entry>
|
2005-09-13 00:11:38 +02:00
|
|
|
</row>
|
2007-09-05 20:10:48 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%v</literal></entry>
|
|
|
|
<entry>Virtual transaction ID (backendID/localXID)</entry>
|
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
2005-09-13 00:11:38 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%x</literal></entry>
|
2007-09-05 20:10:48 +02:00
|
|
|
<entry>Transaction ID (0 if none is assigned)</entry>
|
|
|
|
<entry>no</entry>
|
2005-09-13 00:11:38 +02:00
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry><literal>%q</literal></entry>
|
2007-06-22 18:15:23 +02:00
|
|
|
<entry>Produces no output, but tells non-session
|
|
|
|
processes to stop at this point in the string; ignored by
|
|
|
|
session processes</entry>
|
2005-09-13 00:11:38 +02:00
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
Make use of in-core query id added by commit 5fd9dfa5f5
Use the in-core query id computation for pg_stat_activity,
log_line_prefix, and EXPLAIN VERBOSE.
Similar to other fields in pg_stat_activity, only the queryid from the
top level statements are exposed, and if the backend's status isn't
active then the queryid from the last executed statements is displayed.
Add a %Q placeholder to include the queryid in log_line_prefix, which
will also only expose top level statements.
For EXPLAIN VERBOSE, if a query identifier has been computed, either by
enabling compute_query_id or using a third-party module, display it.
Bump catalog version.
Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol
Author: Julien Rouhaud
Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu
2021-04-07 20:03:56 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%Q</literal></entry>
|
2021-07-15 23:22:58 +02:00
|
|
|
<entry>Query identifier of the current query. Query
|
|
|
|
identifiers are not computed by default, so this field
|
|
|
|
will be zero unless <xref linkend="guc-compute-query-id"/>
|
|
|
|
parameter is enabled or a third-party module that computes
|
|
|
|
query identifiers is configured.</entry>
|
|
|
|
<entry>yes</entry>
|
|
|
|
</row>
|
2005-09-13 00:11:38 +02:00
|
|
|
<row>
|
|
|
|
<entry><literal>%%</literal></entry>
|
2017-10-09 03:44:17 +02:00
|
|
|
<entry>Literal <literal>%</literal></entry>
|
2005-09-13 00:11:38 +02:00
|
|
|
<entry>no</entry>
|
|
|
|
</row>
|
|
|
|
</tbody>
|
|
|
|
</tgroup>
|
|
|
|
</informaltable>
|
2007-06-22 18:15:23 +02:00
|
|
|
|
2020-03-15 11:20:21 +01:00
|
|
|
<para>
|
|
|
|
The backend type corresponds to the column
|
2020-05-29 10:14:33 +02:00
|
|
|
<structfield>backend_type</structfield> in the view
|
|
|
|
<link linkend="monitoring-pg-stat-activity-view">
|
|
|
|
<structname>pg_stat_activity</structname></link>,
|
|
|
|
but additional types can appear
|
2020-03-15 11:20:21 +01:00
|
|
|
in the log that do not appear in that view.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The <literal>%c</literal> escape prints a quasi-unique session identifier,
|
2013-05-06 14:59:39 +02:00
|
|
|
consisting of two 4-byte hexadecimal numbers (without leading zeros)
|
|
|
|
separated by a dot. The numbers are the process start time and the
|
2017-10-09 03:44:17 +02:00
|
|
|
process ID, so <literal>%c</literal> can also be used as a space-saving way
|
2009-06-03 02:38:34 +02:00
|
|
|
of printing those items. For example, to generate the session
|
2017-10-09 03:44:17 +02:00
|
|
|
identifier from <literal>pg_stat_activity</literal>, use this query:
|
2009-06-03 02:38:34 +02:00
|
|
|
<programlisting>
|
2015-06-04 23:57:39 +02:00
|
|
|
SELECT to_hex(trunc(EXTRACT(EPOCH FROM backend_start))::integer) || '.' ||
|
2013-05-06 14:59:39 +02:00
|
|
|
to_hex(pid)
|
2009-06-03 02:38:34 +02:00
|
|
|
FROM pg_stat_activity;
|
|
|
|
</programlisting>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
2007-06-22 18:15:23 +02:00
|
|
|
|
|
|
|
<tip>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
If you set a nonempty value for <varname>log_line_prefix</varname>,
|
2007-06-22 18:15:23 +02:00
|
|
|
you should usually make its last character be a space, to provide
|
|
|
|
visual separation from the rest of the log line. A punctuation
|
2010-02-03 18:25:06 +01:00
|
|
|
character can be used too.
|
2007-06-22 18:15:23 +02:00
|
|
|
</para>
|
|
|
|
</tip>
|
|
|
|
|
|
|
|
<tip>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<application>Syslog</application> produces its own
|
2007-06-22 18:15:23 +02:00
|
|
|
time stamp and process ID information, so you probably do not want to
|
2017-10-09 03:44:17 +02:00
|
|
|
include those escapes if you are logging to <application>syslog</application>.
|
2007-06-22 18:15:23 +02:00
|
|
|
</para>
|
|
|
|
</tip>
|
2016-10-17 22:31:13 +02:00
|
|
|
|
|
|
|
<tip>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The <literal>%q</literal> escape is useful when including information that is
|
2016-10-17 22:31:13 +02:00
|
|
|
only available in session (backend) context, such as the user or database
|
|
|
|
name. For example:
|
|
|
|
<programlisting>
|
|
|
|
log_line_prefix = '%m [%p] %q%u@%d/%a '
|
|
|
|
</programlisting>
|
|
|
|
</para>
|
|
|
|
</tip>
|
2021-04-20 18:57:59 +02:00
|
|
|
|
|
|
|
<note>
|
|
|
|
<para>
|
|
|
|
The <literal>%Q</literal> escape always reports a zero identifier
|
|
|
|
for lines output by <xref linkend="guc-log-statement"/> because
|
|
|
|
<varname>log_statement</varname> generates output before an
|
|
|
|
identifier can be calculated, including invalid statements for
|
|
|
|
which an identifier cannot be calculated.
|
|
|
|
</para>
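<para>
As a minimal sketch of how <literal>%Q</literal> might be put to use,
a superuser could enable query identifier computation and add the
escape to the prefix. The prefix layout and the
<literal>query_id=</literal> label below are illustrative assumptions,
not a recommended format:
<programlisting>
-- Compute query identifiers in core so that %Q is not always zero.
ALTER SYSTEM SET compute_query_id = on;
-- Illustrative prefix: time stamp, process ID, then session-only fields.
ALTER SYSTEM SET log_line_prefix = '%m [%p] %q%u@%d query_id=%Q ';
-- Make the running server re-read its configuration.
SELECT pg_reload_conf();
</programlisting>
Lines produced by <varname>log_statement</varname> would still show a
zero identifier, for the reason given above.
</para>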
|
|
|
|
</note>
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2007-06-30 21:12:02 +02:00
|
|
|
<varlistentry id="guc-log-lock-waits" xreflabel="log_lock_waits">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_lock_waits</varname> (<type>boolean</type>)
|
2007-06-30 21:12:02 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_lock_waits</varname> configuration parameter</primary>
|
2007-06-30 21:12:02 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-06-30 21:12:02 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Controls whether a log message is produced when a session waits
|
2017-11-23 15:39:47 +01:00
|
|
|
longer than <xref linkend="guc-deadlock-timeout"/> to acquire a
|
2007-06-30 21:12:02 +02:00
|
|
|
lock. This is useful in determining if lock waits are causing
|
2017-10-09 03:44:17 +02:00
|
|
|
poor performance. The default is <literal>off</literal>.
|
2017-07-19 18:58:36 +02:00
|
|
|
Only superusers can change this setting.
|
2007-06-30 21:12:02 +02:00
|
|
|
</para>
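<para>
A sketch of enabling this setting cluster-wide, assuming a superuser
session. The <varname>deadlock_timeout</varname> value shown is only an
example threshold:
<programlisting>
-- Report sessions that wait longer than deadlock_timeout for a lock.
ALTER SYSTEM SET log_lock_waits = on;
-- Example threshold; lock waits shorter than this are not logged.
ALTER SYSTEM SET deadlock_timeout = '1s';
SELECT pg_reload_conf();
</programlisting>
</para>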
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2021-01-07 16:47:03 +01:00
|
|
|
<varlistentry id="guc-log-recovery-conflict-waits" xreflabel="log_recovery_conflict_waits">
|
|
|
|
<term><varname>log_recovery_conflict_waits</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>log_recovery_conflict_waits</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Controls whether a log message is produced when the startup process
|
2021-01-13 14:59:17 +01:00
|
|
|
waits longer than <varname>deadlock_timeout</varname>
|
2021-01-07 16:47:03 +01:00
|
|
|
for recovery conflicts. This is useful in determining if recovery
|
|
|
|
conflicts prevent the recovery from applying WAL.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The default is <literal>off</literal>. This parameter can only be set
|
|
|
|
in the <filename>postgresql.conf</filename> file or on the server
|
|
|
|
command line.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2020-04-02 21:04:51 +02:00
|
|
|
<varlistentry id="guc-log-parameter-max-length" xreflabel="log_parameter_max_length">
|
|
|
|
<term><varname>log_parameter_max_length</varname> (<type>integer</type>)
|
2019-12-11 22:03:35 +01:00
|
|
|
<indexterm>
|
2020-04-02 21:04:51 +02:00
|
|
|
<primary><varname>log_parameter_max_length</varname> configuration parameter</primary>
|
2019-12-11 22:03:35 +01:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2020-09-21 18:43:42 +02:00
|
|
|
If greater than zero, each bind parameter value logged with a
|
|
|
|
non-error statement-logging message is trimmed to this many bytes.
|
|
|
|
Zero disables logging of bind parameters for non-error statement logs.
|
2020-04-02 21:04:51 +02:00
|
|
|
<literal>-1</literal> (the default) allows bind parameters to be
|
|
|
|
logged in full.
|
|
|
|
If this value is specified without units, it is taken as bytes.
|
2019-12-11 22:03:35 +01:00
|
|
|
Only superusers can change this setting.
|
|
|
|
</para>
|
2020-04-02 21:04:51 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
This setting only affects log messages printed as a result of
|
|
|
|
<xref linkend="guc-log-statement"/>,
|
|
|
|
<xref linkend="guc-log-duration"/>, and related settings. Non-zero
|
|
|
|
values of this setting add some overhead, particularly if parameters
|
|
|
|
are sent in binary form, since then conversion to text is required.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-log-parameter-max-length-on-error" xreflabel="log_parameter_max_length_on_error">
|
|
|
|
<term><varname>log_parameter_max_length_on_error</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>log_parameter_max_length_on_error</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
If greater than zero, each bind parameter value reported in error
|
|
|
|
messages is trimmed to this many bytes.
|
|
|
|
Zero (the default) disables including bind parameters in error
|
|
|
|
messages.
|
|
|
|
<literal>-1</literal> allows bind parameters to be printed in full.
|
|
|
|
If this value is specified without units, it is taken as bytes.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Non-zero values of this setting add overhead, as
|
|
|
|
<productname>PostgreSQL</productname> will need to store textual
|
|
|
|
representations of parameter values in memory at the start of each
|
|
|
|
statement, whether or not an error eventually occurs. The overhead
|
|
|
|
is greater when bind parameters are sent in binary form than when
|
|
|
|
they are sent as text, since the former case requires data
|
|
|
|
conversion while the latter only requires copying the string.
|
|
|
|
</para>
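<para>
The following sketch shows how this parameter and
<varname>log_parameter_max_length</varname> might be set together.
The limits are arbitrary example values, and a superuser session is
assumed:
<programlisting>
-- Trim each bind parameter value in non-error statement logs to 1kB.
ALTER SYSTEM SET log_parameter_max_length = '1kB';
-- Include bind parameters in error messages, trimmed to 256 bytes each.
ALTER SYSTEM SET log_parameter_max_length_on_error = 256;
SELECT pg_reload_conf();
</programlisting>
</para>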
|
2019-12-11 22:03:35 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-log-statement" xreflabel="log_statement">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_statement</varname> (<type>enum</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_statement</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Controls which SQL statements are logged. Valid values are
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>none</literal> (off), <literal>ddl</literal>, <literal>mod</literal>, and
|
|
|
|
<literal>all</literal> (all statements). <literal>ddl</literal> logs all data definition
|
|
|
|
statements, such as <command>CREATE</command>, <command>ALTER</command>, and
|
|
|
|
<command>DROP</command> statements. <literal>mod</literal> logs all
|
|
|
|
<literal>ddl</literal> statements, plus data-modifying statements
|
|
|
|
such as <command>INSERT</command>,
|
|
|
|
<command>UPDATE</command>, <command>DELETE</command>, <command>TRUNCATE</command>,
|
|
|
|
and <command>COPY FROM</command>.
|
|
|
|
<command>PREPARE</command>, <command>EXECUTE</command>, and
|
|
|
|
<command>EXPLAIN ANALYZE</command> statements are also logged if their
|
2006-09-08 00:52:01 +02:00
|
|
|
contained command is of an appropriate type. For clients using
|
|
|
|
extended query protocol, logging occurs when an Execute message
|
|
|
|
is received, and values of the Bind parameters are included
|
|
|
|
(with any embedded single-quote marks doubled).
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
2006-09-08 00:52:01 +02:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The default is <literal>none</literal>. Only superusers can change this
|
2005-09-13 00:11:38 +02:00
|
|
|
setting.
|
|
|
|
</para>
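<para>
As an illustration, a superuser might limit statement logging to one
database or raise it temporarily in a single session. The database
name <literal>appdb</literal> is hypothetical:
<programlisting>
-- Log DDL and data-modifying statements for new sessions in one
-- database only ("appdb" is a hypothetical name).
ALTER DATABASE appdb SET log_statement = 'mod';

-- In the current superuser session, log every statement.
SET log_statement = 'all';
</programlisting>
</para>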
|
|
|
|
|
|
|
|
<note>
|
|
|
|
<para>
|
2006-10-20 00:55:25 +02:00
|
|
|
Statements that contain simple syntax errors are not logged
|
2017-10-09 03:44:17 +02:00
|
|
|
even by the <varname>log_statement</varname> = <literal>all</literal> setting,
|
2006-10-20 00:55:25 +02:00
|
|
|
because the log message is emitted only after basic parsing has
|
|
|
|
been done to determine the statement type. In the case of extended
|
|
|
|
query protocol, this setting likewise does not log statements that
|
|
|
|
fail before the Execute phase (i.e., during parse analysis or
|
2017-10-09 03:44:17 +02:00
|
|
|
planning). Set <varname>log_min_error_statement</varname> to
|
|
|
|
<literal>ERROR</literal> (or lower) to log such statements.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</note>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2014-09-12 19:55:45 +02:00
|
|
|
<varlistentry id="guc-log-replication-commands" xreflabel="log_replication_commands">
|
|
|
|
<term><varname>log_replication_commands</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_replication_commands</varname> configuration parameter</primary>
|
2014-09-12 19:55:45 +02:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Causes each replication command to be logged in the server log.
|
2017-11-23 15:39:47 +01:00
|
|
|
See <xref linkend="protocol-replication"/> for more information about
|
2017-10-09 03:44:17 +02:00
|
|
|
replication commands. The default value is <literal>off</literal>.
|
2014-09-12 19:55:45 +02:00
|
|
|
Only superusers can change this setting.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2007-01-09 22:31:17 +01:00
|
|
|
<varlistentry id="guc-log-temp-files" xreflabel="log_temp_files">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_temp_files</varname> (<type>integer</type>)
|
2007-01-09 22:31:17 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_temp_files</varname> configuration parameter</primary>
|
2007-01-09 22:31:17 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-01-09 22:31:17 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2010-02-03 18:25:06 +01:00
|
|
|
Controls logging of temporary file names and sizes.
|
2007-06-30 21:12:02 +02:00
|
|
|
Temporary files can be
|
|
|
|
created for sorts, hashes, and temporary query results.
|
|
|
|
If enabled by this setting, a log entry is emitted for each
|
|
|
|
temporary file when it is deleted.
|
2010-02-03 18:25:06 +01:00
|
|
|
A value of zero logs all temporary file information, while positive
|
2009-02-15 19:28:48 +01:00
|
|
|
values log only files whose size is greater than or equal to
|
|
|
|
the specified amount of data.
|
|
|
|
If this value is specified without units, it is taken as kilobytes.
|
|
|
|
The default setting is -1, which disables such logging.
|
2008-08-22 20:47:07 +02:00
|
|
|
Only superusers can change this setting.
|
2007-01-09 22:31:17 +01:00
|
|
|
</para>
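<para>
For example, the following sketch (assuming a superuser session; the
threshold is an arbitrary example) would log only temporary files of
at least 10 megabytes:
<programlisting>
-- Log temporary files whose size is 10MB or more when they are deleted.
ALTER SYSTEM SET log_temp_files = '10MB';
SELECT pg_reload_conf();
</programlisting>
</para>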
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2007-08-04 03:26:54 +02:00
|
|
|
<varlistentry id="guc-log-timezone" xreflabel="log_timezone">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_timezone</varname> (<type>string</type>)
|
2007-08-04 03:26:54 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_timezone</varname> configuration parameter</primary>
|
2007-08-04 03:26:54 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-08-04 03:26:54 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2011-05-10 19:48:40 +02:00
|
|
|
Sets the time zone used for timestamps written in the server log.
|
2017-11-23 15:39:47 +01:00
|
|
|
Unlike <xref linkend="guc-timezone"/>, this value is cluster-wide,
|
2007-08-04 03:26:54 +02:00
|
|
|
so that all sessions will report timestamps consistently.
|
2017-10-09 03:44:17 +02:00
|
|
|
The built-in default is <literal>GMT</literal>, but that is typically
|
|
|
|
overridden in <filename>postgresql.conf</filename>; <application>initdb</application>
|
2011-09-09 23:59:11 +02:00
|
|
|
will install a setting there corresponding to its system environment.
|
2017-11-23 15:39:47 +01:00
|
|
|
See <xref linkend="datatype-timezones"/> for more information.
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2007-08-04 03:26:54 +02:00
|
|
|
file or on the server command line.
|
|
|
|
</para>
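<para>
As a sketch, a superuser could switch log timestamps to UTC through
<command>ALTER SYSTEM</command>, which writes the setting to a
configuration file read along with
<filename>postgresql.conf</filename>:
<programlisting>
-- Write all server log timestamps in UTC (example choice of zone).
ALTER SYSTEM SET log_timezone = 'UTC';
SELECT pg_reload_conf();
-- Confirm the value now in effect.
SHOW log_timezone;
</programlisting>
</para>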
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
2007-08-19 03:41:25 +02:00
|
|
|
</sect2>
|
|
|
|
<sect2 id="runtime-config-logging-csvlog">
|
2007-09-22 21:10:44 +02:00
|
|
|
<title>Using CSV-Format Log Output</title>
|
2007-08-19 03:41:25 +02:00
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Including <literal>csvlog</literal> in the <varname>log_destination</varname> list
|
2009-11-29 00:38:08 +01:00
|
|
|
provides a convenient way to import log files into a database table.
|
2010-02-03 18:25:06 +01:00
|
|
|
This option emits log lines in comma-separated-values
|
2017-10-09 03:44:17 +02:00
|
|
|
(<acronym>CSV</acronym>) format,
|
2009-11-29 00:38:08 +01:00
|
|
|
with these columns:
|
2011-05-19 00:14:45 +02:00
|
|
|
time stamp with milliseconds,
|
2009-11-29 00:38:08 +01:00
|
|
|
user name,
|
|
|
|
database name,
|
|
|
|
process ID,
|
|
|
|
client host:port number,
|
|
|
|
session ID,
|
|
|
|
per-session line number,
|
|
|
|
command tag,
|
|
|
|
session start time,
|
|
|
|
virtual transaction ID,
|
|
|
|
regular transaction ID,
|
|
|
|
error severity,
|
2010-08-23 04:43:25 +02:00
|
|
|
SQLSTATE code,
|
2009-11-29 00:38:08 +01:00
|
|
|
error message,
|
|
|
|
error message detail,
|
|
|
|
hint,
|
|
|
|
internal query that led to the error (if any),
|
|
|
|
character count of the error position therein,
|
|
|
|
error context,
|
2007-12-11 16:19:05 +01:00
|
|
|
user query that led to the error (if any and enabled by
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>log_min_error_statement</varname>),
|
2009-11-29 00:38:08 +01:00
|
|
|
character count of the error position therein,
|
|
|
|
location of the error in the PostgreSQL source code
|
2017-10-09 03:44:17 +02:00
|
|
|
(if <varname>log_error_verbosity</varname> is set to <literal>verbose</literal>),
|
2021-04-08 04:30:30 +02:00
|
|
|
application name, backend type, process ID of parallel group leader,
|
|
|
|
and query id.
|
2007-09-22 21:10:44 +02:00
|
|
|
Here is a sample table definition for storing CSV-format log output:
|
2007-08-19 03:41:25 +02:00
|
|
|
|
|
|
|
<programlisting>
|
|
|
|
CREATE TABLE postgres_log
|
|
|
|
(
|
2007-12-11 21:07:31 +01:00
|
|
|
log_time timestamp(3) with time zone,
|
2007-09-27 20:15:36 +02:00
|
|
|
user_name text,
|
2007-08-19 03:41:25 +02:00
|
|
|
database_name text,
|
2007-09-22 21:10:44 +02:00
|
|
|
process_id integer,
|
2007-12-11 21:07:31 +01:00
|
|
|
connection_from text,
|
|
|
|
session_id text,
|
|
|
|
session_line_num bigint,
|
2007-08-19 03:41:25 +02:00
|
|
|
command_tag text,
|
2007-09-22 21:10:44 +02:00
|
|
|
session_start_time timestamp with time zone,
|
2007-09-27 20:15:36 +02:00
|
|
|
virtual_transaction_id text,
|
2007-09-22 21:10:44 +02:00
|
|
|
transaction_id bigint,
|
2007-08-19 03:41:25 +02:00
|
|
|
error_severity text,
|
|
|
|
sql_state_code text,
|
2007-09-27 20:15:36 +02:00
|
|
|
message text,
|
2007-12-11 16:19:05 +01:00
|
|
|
detail text,
|
|
|
|
hint text,
|
|
|
|
internal_query text,
|
|
|
|
internal_query_pos integer,
|
|
|
|
context text,
|
|
|
|
query text,
|
|
|
|
query_pos integer,
|
|
|
|
location text,
|
2009-11-29 00:38:08 +01:00
|
|
|
application_name text,
|
2020-03-15 11:20:21 +01:00
|
|
|
backend_type text,
|
2020-08-03 06:38:48 +02:00
|
|
|
leader_pid integer,
|
2021-04-08 04:30:30 +02:00
|
|
|
query_id bigint,
|
2007-12-11 21:07:31 +01:00
|
|
|
PRIMARY KEY (session_id, session_line_num)
|
2007-08-19 03:41:25 +02:00
|
|
|
);
|
|
|
|
</programlisting>
|
2007-11-28 16:42:31 +01:00
|
|
|
</para>
|
2007-08-19 03:41:25 +02:00
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
To import a log file into this table, use the <command>COPY FROM</command>
|
2007-09-22 21:10:44 +02:00
|
|
|
command:
|
2007-08-19 03:41:25 +02:00
|
|
|
|
|
|
|
<programlisting>
|
|
|
|
COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
|
|
|
|
</programlisting>
|
2020-08-31 22:59:59 +02:00
|
|
|
It is also possible to access the file as a foreign table, using
|
|
|
|
the supplied <xref linkend="file-fdw"/> module.
|
2007-11-28 16:42:31 +01:00
|
|
|
</para>
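<para>
Once a log file has been imported, the table can be queried like any
other. As a small illustration that uses only columns from the sample
definition above, this query counts log entries by severity:
<programlisting>
-- Summarize the imported log by error severity, most frequent first.
SELECT error_severity, count(*) AS entries
FROM postgres_log
GROUP BY error_severity
ORDER BY entries DESC;
</programlisting>
</para>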
|
2007-08-19 03:41:25 +02:00
|
|
|
|
|
|
|
<para>
|
2007-09-22 21:10:44 +02:00
|
|
|
There are a few things you need to do to simplify importing CSV log
|
2010-02-03 18:25:06 +01:00
|
|
|
files:
|
2007-08-19 03:41:25 +02:00
|
|
|
|
|
|
|
<orderedlist>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2007-09-22 21:10:44 +02:00
|
|
|
Set <varname>log_filename</varname> and
|
2017-10-09 03:44:17 +02:00
|
|
|
<varname>log_rotation_age</varname> to provide a consistent,
|
2007-09-22 21:10:44 +02:00
|
|
|
predictable naming scheme for your log files. This lets you
|
|
|
|
predict what the file name will be and know when an individual log
|
|
|
|
file is complete and therefore ready to be imported.
|
2007-08-19 03:41:25 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2010-11-23 21:27:50 +01:00
|
|
|
Set <varname>log_rotation_size</varname> to 0 to disable
|
|
|
|
size-based log rotation, as it makes the log file name difficult
|
|
|
|
to predict.
|
2007-08-19 03:41:25 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Set <varname>log_truncate_on_rotation</varname> to <literal>on</literal> so
|
2007-09-22 21:10:44 +02:00
|
|
|
that old log data isn't mixed with the new in the same file.
|
2007-08-19 03:41:25 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2007-09-22 21:10:44 +02:00
|
|
|
The table definition above includes a primary key specification.
|
|
|
|
This is useful to protect against accidentally importing the same
|
2017-10-09 03:44:17 +02:00
|
|
|
information twice. The <command>COPY</command> command commits all of the
|
2007-09-22 21:10:44 +02:00
|
|
|
data it imports at one time, so any error will cause the entire
|
|
|
|
import to fail. If you import a partial log file and later import
|
|
|
|
the file again when it is complete, the primary key violation will
|
|
|
|
cause the import to fail. Wait until the log is complete and
|
|
|
|
closed before importing. This procedure will also protect against
|
|
|
|
accidentally importing a partial line that hasn't been completely
|
2017-10-09 03:44:17 +02:00
|
|
|
written, which would also cause <command>COPY</command> to fail.
|
2007-08-19 03:41:25 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</orderedlist>
|
|
|
|
</para>
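<para>
The recommendations above might be applied as in the following sketch,
assuming a superuser session. The file name pattern and rotation
interval are example choices, and <varname>logging_collector</varname>
is included because <literal>csvlog</literal> output requires it:
<programlisting>
-- Emit CSV-format logs through the logging collector.
ALTER SYSTEM SET log_destination = 'csvlog';
ALTER SYSTEM SET logging_collector = on;  -- takes effect only after a restart
-- Predictable, time-based file names (example pattern: one file per day).
ALTER SYSTEM SET log_filename = 'postgresql-%Y-%m-%d.log';
ALTER SYSTEM SET log_rotation_age = '1d';
-- Disable size-based rotation so file names stay predictable.
ALTER SYSTEM SET log_rotation_size = 0;
-- Overwrite, rather than append to, an existing file of the same name.
ALTER SYSTEM SET log_truncate_on_rotation = on;
</programlisting>
</para>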
|
2005-09-13 00:11:38 +02:00
|
|
|
</sect2>
|
2015-10-04 17:14:28 +02:00
|
|
|
|
|
|
|
<sect2>
|
|
|
|
<title>Process Title</title>
|
|
|
|
|
|
|
|
<para>
|
2016-08-28 18:37:23 +02:00
|
|
|
These settings control how process titles of server processes are
|
|
|
|
modified. Process titles are typically viewed using programs like
|
2017-10-09 03:44:17 +02:00
|
|
|
<application>ps</application> or, on Windows, <application>Process Explorer</application>.
|
2017-11-23 15:39:47 +01:00
|
|
|
See <xref linkend="monitoring-ps"/> for details.
|
2015-10-04 17:14:28 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry id="guc-cluster-name" xreflabel="cluster_name">
|
|
|
|
<term><varname>cluster_name</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>cluster_name</varname> configuration parameter</primary>
|
2015-10-04 17:14:28 +02:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2019-02-08 08:17:21 +01:00
|
|
|
Sets a name that identifies this database cluster (instance) for
|
|
|
|
various purposes. The cluster name appears in the process title for
|
|
|
|
all server processes in this cluster. Moreover, it is the default
|
|
|
|
application name for a standby connection (see <xref
|
|
|
|
linkend="guc-synchronous-standby-names"/>).
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The name can be any string of fewer
|
2017-10-09 03:44:17 +02:00
|
|
|
than <symbol>NAMEDATALEN</symbol> characters (64 characters in a standard
|
2015-10-04 17:14:28 +02:00
|
|
|
build). Only printable ASCII characters may be used in the
|
|
|
|
<varname>cluster_name</varname> value. Other characters will be
|
|
|
|
replaced with question marks (<literal>?</literal>). No name is shown
|
2017-10-09 03:44:17 +02:00
|
|
|
if this parameter is set to the empty string <literal>''</literal> (which is
|
2015-10-04 17:14:28 +02:00
|
|
|
the default). This parameter can only be set at server start.
|
|
|
|
</para>
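<para>
A minimal sketch of setting the name, assuming a superuser session.
The name <literal>main-primary</literal> is hypothetical:
<programlisting>
-- Record the new name; it takes effect at the next server start.
ALTER SYSTEM SET cluster_name = 'main-primary';  -- hypothetical name

-- After the server has been restarted, confirm the value.
SHOW cluster_name;
</programlisting>
</para>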
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-update-process-title" xreflabel="update_process_title">
|
|
|
|
<term><varname>update_process_title</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>update_process_title</varname> configuration parameter</primary>
|
2015-10-04 17:14:28 +02:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables updating of the process title every time a new SQL command
|
2016-08-28 18:37:23 +02:00
|
|
|
is received by the server.
|
2017-10-09 03:44:17 +02:00
|
|
|
This setting defaults to <literal>on</literal> on most platforms, but it
|
|
|
|
defaults to <literal>off</literal> on Windows due to that platform's larger
|
2016-08-28 18:37:23 +02:00
|
|
|
overhead for updating the process title.
|
2015-10-04 17:14:28 +02:00
|
|
|
Only superusers can change this setting.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
2005-09-13 00:11:38 +02:00
|
|
|
|
|
|
|
<sect1 id="runtime-config-statistics">
|
2011-02-01 23:00:26 +01:00
|
|
|
<title>Run-time Statistics</title>
|
2005-09-13 00:11:38 +02:00
|
|
|
|
|
|
|
<sect2 id="runtime-config-statistics-collector">
|
|
|
|
<title>Query and Index Statistics Collector</title>
|
2006-01-23 19:16:41 +01:00
|
|
|
|
|
|
|
<para>
|
2007-09-24 05:12:23 +02:00
|
|
|
These parameters control server-wide statistics collection features.
|
2006-01-23 19:16:41 +01:00
|
|
|
When statistics collection is enabled, the data that is produced can be
|
|
|
|
accessed via the <structname>pg_stat</structname> and
|
|
|
|
<structname>pg_statio</structname> family of system views.
|
2017-11-23 15:39:47 +01:00
|
|
|
Refer to <xref linkend="monitoring"/> for more information.
|
2006-01-23 19:16:41 +01:00
|
|
|
</para>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<variablelist>
|
|
|
|
|
2007-09-24 05:12:23 +02:00
|
|
|
<varlistentry id="guc-track-activities" xreflabel="track_activities">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>track_activities</varname> (<type>boolean</type>)
|
2006-06-19 03:51:22 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>track_activities</varname> configuration parameter</primary>
|
2006-06-19 03:51:22 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2006-06-19 03:51:22 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables the collection of information on the currently
|
|
|
|
executing command of each session, along with its identifier and the
|
|
|
|
time when that command began execution. This parameter is on by
|
2006-06-19 03:51:22 +02:00
|
|
|
default. Note that even when enabled, this information is not
|
|
|
|
visible to all users, only to superusers and the user owning
|
2010-02-03 18:25:06 +01:00
|
|
|
the session being reported on, so it should not represent a
|
2006-06-19 03:51:22 +02:00
|
|
|
security risk.
|
|
|
|
Only superusers can change this setting.
|
|
|
|
</para>
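<para>
With this parameter enabled, the collected information can be read
from <structname>pg_stat_activity</structname>. The query below is an
illustrative sketch. The column selection and the 60-character
truncation are arbitrary choices:
<programlisting>
-- Show the command each active session is running and when it began.
SELECT pid, usename, query_id, query_start,
       left(query, 60) AS query_text
FROM pg_stat_activity
WHERE state = 'active';
</programlisting>
</para>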
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2008-06-30 12:58:47 +02:00
|
|
|
<varlistentry id="guc-track-activity-query-size" xreflabel="track_activity_query_size">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>track_activity_query_size</varname> (<type>integer</type>)
|
2008-06-30 12:58:47 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>track_activity_query_size</varname> configuration parameter</primary>
|
2008-06-30 12:58:47 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-06-30 12:58:47 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the amount of memory reserved to store the text of the
|
|
|
|
currently executing command for each active session, for the
|
2017-10-09 03:44:17 +02:00
|
|
|
<structname>pg_stat_activity</structname>.<structfield>query</structfield> field.
|
Doc: improve documentation of configuration settings that have units.
When we added the GUC units feature, we didn't make any great effort
to adjust the documentation of individual GUCs; they tended to still
say things like "this is the number of milliseconds that ...", even
though users might prefer to write some other units, and SHOW might
even show the value in other units. Commit 6c9fb69f2 made an effort
to improve this situation, but I thought it made things less readable
by injecting units information in mid-sentence. It also wasn't very
consistent, and did not touch all the GUCs that have units.
To improve matters, standardize on the phrasing "If this value is
specified without units, it is taken as <units>". Also, try to
standardize where this is mentioned, right before the specification
of the default. (In a couple of places, doing that would've required
more rewriting than seemed justified, so I wasn't 100% consistent
about that.) I also tried to use the phrases "amount of time",
"amount of memory", etc rather than describing the contents of GUCs
in other ways, as those were the majority usage in places that weren't
overcommitting to a particular unit. (I left "length of time" alone
in a couple of places, though.)
I failed to resist the temptation to copy-edit some awkward text, too.
Backpatch to v12, like 6c9fb69f2, mainly because v12 hasn't diverged
much from HEAD yet.
Discussion: https://postgr.es/m/15882.1571942223@sss.pgh.pa.us
2019-10-26 18:30:41 +02:00
|
|
|
If this value is specified without units, it is taken as bytes.
|
|
|
|
The default value is 1024 bytes.
|
|
|
|
This parameter can only be set at server start.
|
2008-06-30 12:58:47 +02:00
|
|
|
</para>
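<para>
As a minimal sketch, assuming a superuser session and an illustrative
value of 4096 bytes (not a recommendation), the setting could be changed
as shown here. Because the parameter can only be set at server start,
the new value applies only after a restart.
</para>
<programlisting>
-- Recorded in postgresql.auto.conf; takes effect at the next server start.
ALTER SYSTEM SET track_activity_query_size = 4096;

-- After the restart, check the value in effect.
SHOW track_activity_query_size;
</programlisting>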
</listitem>
</varlistentry>

<varlistentry id="guc-track-counts" xreflabel="track_counts">
<term><varname>track_counts</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>track_counts</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables collection of statistics on database activity.
This parameter is on by default, because the autovacuum
daemon needs the collected information.
Only superusers can change this setting.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-track-io-timing" xreflabel="track_io_timing">
<term><varname>track_io_timing</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>track_io_timing</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables timing of database I/O calls. This parameter is off by
default, as it will repeatedly query the operating system for
the current time, which may cause significant overhead on some
platforms. You can use the <xref linkend="pgtesttiming"/> tool to
measure the overhead of timing on your system.
I/O timing information is
displayed in <link linkend="monitoring-pg-stat-database-view">
<structname>pg_stat_database</structname></link>, in the output of
<xref linkend="sql-explain"/> when the <literal>BUFFERS</literal> option
is used, by autovacuum for auto-vacuums and auto-analyzes when
<xref linkend="guc-log-autovacuum-min-duration"/> is set, and by
<xref linkend="pgstatstatements"/>. Only superusers can change this
setting.
</para>
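<para>
For illustration only, a superuser could enable timing for the current
session and inspect a single statement. The query below is an arbitrary
example, and I/O timing figures are expected to appear in the plan only
when blocks actually had to be read or written.
</para>
<programlisting>
-- Enable I/O timing for this session only.
SET track_io_timing = on;

-- BUFFERS is required for buffer and I/O timing details to be reported.
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM pg_class;
</programlisting>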
</listitem>
</varlistentry>

<varlistentry id="guc-track-wal-io-timing" xreflabel="track_wal_io_timing">
<term><varname>track_wal_io_timing</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>track_wal_io_timing</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables timing of WAL I/O calls. This parameter is off by default,
as it will repeatedly query the operating system for the current time,
which may cause significant overhead on some platforms.
You can use the <application>pg_test_timing</application> tool to
measure the overhead of timing on your system.
I/O timing information is
displayed in <link linkend="monitoring-pg-stat-wal-view">
<structname>pg_stat_wal</structname></link>. Only superusers can
change this setting.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-track-functions" xreflabel="track_functions">
<term><varname>track_functions</varname> (<type>enum</type>)
<indexterm>
<primary><varname>track_functions</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables tracking of function call counts and time used. Specify
<literal>pl</literal> to track only procedural-language functions,
or <literal>all</literal> to also track SQL and C language functions.
The default is <literal>none</literal>, which disables function
statistics tracking. Only superusers can change this setting.
</para>

<note>
<para>
SQL-language functions that are simple enough to be <quote>inlined</quote>
into the calling query will not be tracked, regardless of this
setting.
</para>
</note>
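<para>
A brief sketch of how the collected figures might be inspected, assuming
a superuser session and that some procedural-language functions are
called after tracking is enabled. The counters are read from the
<structname>pg_stat_user_functions</structname> view and may lag
slightly behind the calls themselves.
</para>
<programlisting>
-- Track procedural-language functions in this session.
SET track_functions = 'pl';

-- ... call the functions of interest here ...

-- Review call counts and accumulated times.
SELECT funcname, calls, total_time, self_time
  FROM pg_stat_user_functions
 ORDER BY total_time DESC;
</programlisting>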
</listitem>
</varlistentry>

<varlistentry id="guc-stats-temp-directory" xreflabel="stats_temp_directory">
<term><varname>stats_temp_directory</varname> (<type>string</type>)
<indexterm>
<primary><varname>stats_temp_directory</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Sets the directory to store temporary statistics data in. This can be
a path relative to the data directory or an absolute path. The default
is <filename>pg_stat_tmp</filename>. Pointing this at a RAM-based
file system will decrease physical I/O requirements and can lead to
improved performance.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
</listitem>
</varlistentry>

</variablelist>
</sect2>

<sect2 id="runtime-config-statistics-monitor">
<title>Statistics Monitoring</title>
<variablelist>

<varlistentry id="guc-compute-query-id" xreflabel="compute_query_id">
<term><varname>compute_query_id</varname> (<type>enum</type>)
<indexterm>
<primary><varname>compute_query_id</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Enables in-core computation of a query identifier.
Query identifiers can be displayed in the <link
linkend="monitoring-pg-stat-activity-view"><structname>pg_stat_activity</structname></link>
view, using <command>EXPLAIN</command>, or emitted in the log if
configured via the <xref linkend="guc-log-line-prefix"/> parameter.
The <xref linkend="pgstatstatements"/> extension also requires a query
identifier to be computed. Note that an external module can
alternatively be used if the in-core query identifier computation
method is not acceptable. In this case, in-core computation
must always be disabled.
Valid values are <literal>off</literal> (always disabled),
<literal>on</literal> (always enabled) and <literal>auto</literal>,
which lets modules such as <xref linkend="pgstatstatements"/>
automatically enable it.
The default is <literal>auto</literal>.
</para>
<note>
<para>
To ensure that only one query identifier is calculated and
displayed, extensions that calculate query identifiers should
throw an error if a query identifier has already been computed.
</para>
</note>
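<para>
The following sketch shows one way the identifier might be observed,
assuming a superuser session and that the
<structname>pg_stat_activity</structname> column holding the identifier
is named <structfield>query_id</structfield> (an assumption; the column
name is not stated in this section).
</para>
<programlisting>
-- Force in-core computation for this session.
SET compute_query_id = on;

-- Identifier of the top-level statement of each active session.
SELECT pid, query_id, query
  FROM pg_stat_activity
 WHERE state = 'active';

-- EXPLAIN displays the identifier when VERBOSE is given.
EXPLAIN (VERBOSE) SELECT 1;
</programlisting>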
</listitem>
</varlistentry>

<varlistentry>
<term><varname>log_statement_stats</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>log_statement_stats</varname> configuration parameter</primary>
</indexterm>
</term>
<term><varname>log_parser_stats</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>log_parser_stats</varname> configuration parameter</primary>
</indexterm>
</term>
<term><varname>log_planner_stats</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>log_planner_stats</varname> configuration parameter</primary>
</indexterm>
</term>
<term><varname>log_executor_stats</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>log_executor_stats</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
For each query, output performance statistics of the respective
module to the server log. This is a crude profiling
instrument, similar to the Unix <function>getrusage()</function> operating
system facility. <varname>log_statement_stats</varname> reports total
statement statistics, while the others report per-module statistics.
<varname>log_statement_stats</varname> cannot be enabled together with
any of the per-module options. All of these options are disabled by
default. Only superusers can change these settings.
</para>
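<para>
As a small sketch of the per-module variant, assuming a superuser
session, the statement-level option is left off while two per-module
options are enabled. The final query is an arbitrary example; its
statistics go to the server log, not to the client.
</para>
<programlisting>
-- The statement-level option must stay off when per-module options are used.
SET log_statement_stats = off;
SET log_planner_stats = on;
SET log_executor_stats = on;

-- Planner and executor statistics for this query are written to the server log.
SELECT count(*) FROM pg_class;
</programlisting>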
</listitem>
</varlistentry>

</variablelist>

</sect2>
</sect1>

<sect1 id="runtime-config-autovacuum">
<title>Automatic Vacuuming</title>

<indexterm>
<primary>autovacuum</primary>
<secondary>configuration parameters</secondary>
</indexterm>

<para>
These settings control the behavior of the <firstterm>autovacuum</firstterm>
feature. Refer to <xref linkend="autovacuum"/> for more information.
Note that many of these settings can be overridden on a per-table
basis; see <xref linkend="sql-createtable-storage-parameters"/>.
</para>
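<para>
A per-table override might look like the following sketch, in which
<literal>my_table</literal> is a hypothetical table name and the values
are purely illustrative.
</para>
<programlisting>
-- Override two autovacuum settings for one table only.
ALTER TABLE my_table
  SET (autovacuum_vacuum_threshold = 100,
       autovacuum_vacuum_scale_factor = 0.05);

-- Return the table to the server-wide settings.
ALTER TABLE my_table
  RESET (autovacuum_vacuum_threshold,
         autovacuum_vacuum_scale_factor);
</programlisting>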
<variablelist>

<varlistentry id="guc-autovacuum" xreflabel="autovacuum">
<term><varname>autovacuum</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>autovacuum</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls whether the server should run the
autovacuum launcher daemon. This is on by default; however,
<xref linkend="guc-track-counts"/> must also be enabled for
autovacuum to work.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line; however, autovacuuming can be
disabled for individual tables by changing table storage parameters.
</para>
<para>
Note that even when this parameter is disabled, the system
will launch autovacuum processes if necessary to
prevent transaction ID wraparound. See <xref
linkend="vacuum-for-wraparound"/> for more information.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
<term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
<indexterm>
<primary><varname>autovacuum_max_workers</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
is three. This parameter can only be set at server start.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-autovacuum-naptime" xreflabel="autovacuum_naptime">
<term><varname>autovacuum_naptime</varname> (<type>integer</type>)
<indexterm>
<primary><varname>autovacuum_naptime</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Specifies the minimum delay between autovacuum runs on any given
database. In each round the daemon examines the
database and issues <command>VACUUM</command> and <command>ANALYZE</command> commands
as needed for tables in that database.
If this value is specified without units, it is taken as seconds.
The default is one minute (<literal>1min</literal>).
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
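<para>
As an illustrative sketch, assuming a superuser session and an arbitrary
value of 30 seconds, the delay could be changed without a restart by
rewriting the configuration and reloading it.
</para>
<programlisting>
-- Recorded in postgresql.auto.conf, which is read along with postgresql.conf.
ALTER SYSTEM SET autovacuum_naptime = '30s';

-- Ask the server to re-read its configuration files.
SELECT pg_reload_conf();

-- Check the value now in effect.
SHOW autovacuum_naptime;
</programlisting>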
</listitem>
</varlistentry>

<varlistentry id="guc-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><varname>autovacuum_vacuum_threshold</varname> (<type>integer</type>)
<indexterm>
<primary><varname>autovacuum_vacuum_threshold</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies the minimum number of updated or deleted tuples needed
to trigger a <command>VACUUM</command> in any one table.
The default is 50 tuples.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line;
but the setting can be overridden for individual tables by
changing table storage parameters.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-autovacuum-vacuum-insert-threshold" xreflabel="autovacuum_vacuum_insert_threshold">
<term><varname>autovacuum_vacuum_insert_threshold</varname> (<type>integer</type>)
<indexterm>
<primary><varname>autovacuum_vacuum_insert_threshold</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies the number of inserted tuples needed to trigger a
<command>VACUUM</command> in any one table.
The default is 1000 tuples. If -1 is specified, autovacuum will not
trigger a <command>VACUUM</command> operation on any tables based on
the number of inserts.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line;
but the setting can be overridden for individual tables by
changing table storage parameters.
</para>
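<para>
For example, insert-triggered vacuuming could be switched off for a
single table through its storage parameters. This is a sketch only:
<literal>my_table</literal> is a hypothetical table name, and it is
assumed that the storage parameter accepts -1 with the same meaning as
the server-wide setting.
</para>
<programlisting>
-- Disable insert-based autovacuum for this table; other triggers still apply.
ALTER TABLE my_table SET (autovacuum_vacuum_insert_threshold = -1);
</programlisting>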
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-autovacuum-analyze-threshold" xreflabel="autovacuum_analyze_threshold">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>autovacuum_analyze_threshold</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2019-04-16 16:16:20 +02:00
|
|
|
<primary><varname>autovacuum_analyze_threshold</varname></primary>
|
|
|
|
<secondary>configuration parameter</secondary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the minimum number of inserted, updated or deleted tuples
|
2017-10-09 03:44:17 +02:00
|
|
|
needed to trigger an <command>ANALYZE</command> in any one table.
|
2007-07-24 03:53:56 +02:00
|
|
|
The default is 50 tuples.
|
2017-10-09 03:44:17 +02:00
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
2015-11-11 23:13:38 +01:00
|
|
|
file or on the server command line;
|
|
|
|
but the setting can be overridden for individual tables by
|
|
|
|
changing table storage parameters.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-autovacuum-vacuum-scale-factor" xreflabel="autovacuum_vacuum_scale_factor">
<term><varname>autovacuum_vacuum_scale_factor</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>autovacuum_vacuum_scale_factor</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies a fraction of the table size to add to
<varname>autovacuum_vacuum_threshold</varname>
when deciding whether to trigger a <command>VACUUM</command>.
The default is 0.2 (20% of table size).
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line;
but the setting can be overridden for individual tables by
changing table storage parameters.
</para>
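<para>
To illustrate how the two settings combine: the effective vacuum threshold
for a table is <varname>autovacuum_vacuum_threshold</varname> plus
<varname>autovacuum_vacuum_scale_factor</varname> times the number of
tuples in the table. With a base threshold of 50 and the default scale
factor of 0.2, a table of 1,000,000 tuples would need about
50 + 0.2 * 1,000,000 = 200,050 obsolete tuples before a
<command>VACUUM</command> is triggered. The query below is a rough sketch
of that calculation. It assumes that
<structname>pg_class</structname>.<structfield>reltuples</structfield>
is an acceptable estimate of the table size, which may be stale or unset
for tables that have never been vacuumed or analyzed, and it ignores any
per-table storage parameters.
<programlisting>
-- Estimate only: uses the server-wide settings and the planner's row estimate.
SELECT c.oid::regclass AS table_name,
       current_setting('autovacuum_vacuum_threshold')::integer
         + current_setting('autovacuum_vacuum_scale_factor')::float8 * c.reltuples
         AS estimated_vacuum_threshold
FROM pg_class AS c
WHERE c.relkind = 'r';
</programlisting>
</para>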
</listitem>
</varlistentry>

Trigger autovacuum based on number of INSERTs
Traditionally autovacuum has only ever invoked a worker based on the
estimated number of dead tuples in a table and for anti-wraparound
purposes. For the latter, with certain classes of tables such as
insert-only tables, anti-wraparound vacuums could be the first vacuum that
the table ever receives. This could often lead to autovacuum workers being
busy for extended periods of time due to having to potentially freeze
every page in the table. This could be particularly bad for very large
tables. New clusters, or recently pg_restored clusters could suffer even
more as many large tables may have the same relfrozenxid, which could
result in large numbers of tables requiring an anti-wraparound vacuum all
at once.
Here we aim to reduce the work required by anti-wraparound and aggressive
vacuums in general, by triggering autovacuum when the table has received
enough INSERTs. This is controlled by adding two new GUCs and reloptions;
autovacuum_vacuum_insert_threshold and
autovacuum_vacuum_insert_scale_factor. These work exactly the same as the
existing scale factor and threshold controls, only base themselves off the
number of inserts since the last vacuum, rather than the number of dead
tuples. New controls were added rather than reusing the existing
controls, to allow these new vacuums to be tuned independently and perhaps
even completely disabled altogether, which can be done by setting
autovacuum_vacuum_insert_threshold to -1.
We make no attempt to skip index cleanup operations on these vacuums as
they may trigger for an insert-mostly table which continually doesn't have
enough dead tuples to trigger an autovacuum for the purpose of removing
those dead tuples. If we were to skip cleaning the indexes in this case,
then it is possible for the index(es) to become bloated over time.
There are additional benefits to triggering autovacuums based on inserts,
as tables which never contain enough dead tuples to trigger an autovacuum
are now more likely to receive a vacuum, which can mark more of the table
as "allvisible" and encourage the query planner to make use of Index Only
Scans.
Currently, we still obey vacuum_freeze_min_age when triggering these new
autovacuums based on INSERTs. For large insert-only tables, it may be
beneficial to lower the table's autovacuum_freeze_min_age so that tuples
are eligible to be frozen sooner. Here we've opted not to zero that for
these types of vacuums, since the table may just be insert-mostly and we
may otherwise freeze tuples that are still destined to be updated or
removed in the near future.
There was some debate to what exactly the new scale factor and threshold
should default to. For now, these are set to 0.2 and 1000, respectively.
There may be some motivation to adjust these before the release.
Author: Laurenz Albe, Darafei Praliaskouski
Reviewed-by: Alvaro Herrera, Masahiko Sawada, Chris Travers, Andres Freund, Justin Pryzby
Discussion: https://postgr.es/m/CAC8Q8t%2Bj36G_bLF%3D%2B0iMo6jGNWnLnWb1tujXuJr-%2Bx8ZCCTqoQ%40mail.gmail.com
2020-03-28 07:20:12 +01:00
<varlistentry id="guc-autovacuum-vacuum-insert-scale-factor" xreflabel="autovacuum_vacuum_insert_scale_factor">
<term><varname>autovacuum_vacuum_insert_scale_factor</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>autovacuum_vacuum_insert_scale_factor</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies a fraction of the table size to add to
<varname>autovacuum_vacuum_insert_threshold</varname>
when deciding whether to trigger a <command>VACUUM</command>.
The default is 0.2 (20% of table size).
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line;
but the setting can be overridden for individual tables by
changing table storage parameters.
</para>
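<para>
As a sketch of how this might be tuned for a large insert-only table (the
table name <literal>event_log</literal> and the value 0.05 are
hypothetical), the scale factor can be lowered for that table alone. The
second statement assumes a server version in which
<structname>pg_stat_user_tables</structname> provides the
<structfield>n_ins_since_vacuum</structfield> column, and shows the insert
count that is compared against the threshold.
<programlisting>
-- Hypothetical table: trigger insert-driven vacuums after about 5% growth.
ALTER TABLE event_log SET (autovacuum_vacuum_insert_scale_factor = 0.05);

-- Inspect how many tuples have been inserted since the last vacuum.
SELECT relname, n_ins_since_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'event_log';
</programlisting>
</para>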
</listitem>
</varlistentry>

<varlistentry id="guc-autovacuum-analyze-scale-factor" xreflabel="autovacuum_analyze_scale_factor">
<term><varname>autovacuum_analyze_scale_factor</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>autovacuum_analyze_scale_factor</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies a fraction of the table size to add to
<varname>autovacuum_analyze_threshold</varname>
when deciding whether to trigger an <command>ANALYZE</command>.
The default is 0.1 (10% of table size).
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line;
but the setting can be overridden for individual tables by
changing table storage parameters.
</para>
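<para>
For example, with <varname>autovacuum_analyze_threshold</varname> at 50
and the default scale factor of 0.1, a table of 1,000,000 tuples would be
analyzed after about 50 + 0.1 * 1,000,000 = 100,050 tuples have changed.
The following query is a rough sketch for comparing each table's change
count with that estimate. It assumes the server-wide settings apply
(per-table storage parameters are ignored) and relies on
<structname>pg_class</structname>.<structfield>reltuples</structfield>,
which is itself only an estimate.
<programlisting>
-- Estimate only: compares changes since the last ANALYZE with the
-- threshold derived from the server-wide settings.
SELECT s.relname,
       s.n_mod_since_analyze,
       current_setting('autovacuum_analyze_threshold')::integer
         + current_setting('autovacuum_analyze_scale_factor')::float8 * c.reltuples
         AS estimated_analyze_threshold
FROM pg_stat_user_tables AS s
JOIN pg_class AS c ON c.oid = s.relid;
</programlisting>
</para>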
</listitem>
</varlistentry>

Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios. We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases. Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId. Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done. Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database. initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 23:42:10 +01:00
<varlistentry id="guc-autovacuum-freeze-max-age" xreflabel="autovacuum_freeze_max_age">
<term><varname>autovacuum_freeze_max_age</varname> (<type>integer</type>)
<indexterm>
<primary><varname>autovacuum_freeze_max_age</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies the maximum age (in transactions) that a table's
<structname>pg_class</structname>.<structfield>relfrozenxid</structfield> field can
attain before a <command>VACUUM</command> operation is forced
to prevent transaction ID wraparound within the table.
Note that the system will launch autovacuum processes to
prevent wraparound even when autovacuum is otherwise disabled.
</para>

<para>
Vacuum also allows removal of old files from the
<filename>pg_xact</filename> subdirectory, which is why the default
is a relatively low 200 million transactions.
This parameter can only be set at server start, but the setting
can be reduced for individual tables by
changing table storage parameters.
For more information see <xref linkend="vacuum-for-wraparound"/>.
</para>
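<para>
As an illustrative sketch, the query below lists the tables whose
<structfield>relfrozenxid</structfield> is oldest, next to the current
setting, so that tables approaching a forced <command>VACUUM</command>
can be spotted. The choice of relation kinds and the limit of ten rows
are arbitrary assumptions, and the table name
<literal>archive_data</literal> and the value 100000000 in the second
statement are hypothetical.
<programlisting>
-- Tables closest to a forced anti-wraparound vacuum.
SELECT c.oid::regclass AS table_name,
       age(c.relfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::integer AS freeze_max_age
FROM pg_class AS c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY age(c.relfrozenxid) DESC
LIMIT 10;

-- Reduce the limit for one table (hypothetical name and value).
ALTER TABLE archive_data SET (autovacuum_freeze_max_age = 100000000);
</programlisting>
</para>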
</listitem>
</varlistentry>

Separate multixact freezing parameters from xid's
Previously we were piggybacking on transaction ID parameters to freeze
multixacts; but since there isn't necessarily any relationship between
rates of Xid and multixact consumption, this turns out not to be a good
idea.
Therefore, we now have multixact-specific freezing parameters:
vacuum_multixact_freeze_min_age: when to remove multis as we come across
them in vacuum (default to 5 million, i.e. early in comparison to Xid's
default of 50 million)
vacuum_multixact_freeze_table_age: when to force whole-table scans
instead of scanning only the pages marked as not all visible in
visibility map (default to 150 million, same as for Xids). Whichever of
both which reaches the 150 million mark earlier will cause a whole-table
scan.
autovacuum_multixact_freeze_max_age: when for cause emergency,
uninterruptible whole-table scans (default to 400 million, double as
that for Xids). This means there shouldn't be more frequent emergency
vacuuming than previously, unless multixacts are being used very
rapidly.
Backpatch to 9.3 where multixacts were made to persist enough to require
freezing. To avoid an ABI break in 9.3, VacuumStmt has a couple of
fields in an unnatural place, and StdRdOptions is split in two so that
the newly added fields can go at the end.
Patch by me, reviewed by Robert Haas, with additional input from Andres
Freund and Tom Lane.
2014-02-13 23:30:30 +01:00
<varlistentry id="guc-autovacuum-multixact-freeze-max-age" xreflabel="autovacuum_multixact_freeze_max_age">
<term><varname>autovacuum_multixact_freeze_max_age</varname> (<type>integer</type>)
<indexterm>
<primary><varname>autovacuum_multixact_freeze_max_age</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies the maximum age (in multixacts) that a table's
<structname>pg_class</structname>.<structfield>relminmxid</structfield> field can
attain before a <command>VACUUM</command> operation is forced to
prevent multixact ID wraparound within the table.
Note that the system will launch autovacuum processes to
prevent wraparound even when autovacuum is otherwise disabled.
</para>

<para>
Vacuuming multixacts also allows removal of old files from the
<filename>pg_multixact/members</filename> and <filename>pg_multixact/offsets</filename>
subdirectories, which is why the default is a relatively low
400 million multixacts.
This parameter can only be set at server start, but the setting can
be reduced for individual tables by changing table storage parameters.
For more information see <xref linkend="vacuum-for-multixact-wraparound"/>.
</para>
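<para>
A comparable sketch for multixacts, assuming a server version that
provides the <function>mxid_age</function> function, lists the tables
whose <structfield>relminmxid</structfield> is oldest. The choice of
relation kinds and the limit of ten rows are arbitrary assumptions.
<programlisting>
-- Tables closest to a forced multixact anti-wraparound vacuum.
SELECT c.oid::regclass AS table_name,
       mxid_age(c.relminmxid) AS multixact_age,
       current_setting('autovacuum_multixact_freeze_max_age')::integer
         AS multixact_freeze_max_age
FROM pg_class AS c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY mxid_age(c.relminmxid) DESC
LIMIT 10;
</programlisting>
</para>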
</listitem>
</varlistentry>

<varlistentry id="guc-autovacuum-vacuum-cost-delay" xreflabel="autovacuum_vacuum_cost_delay">
<term><varname>autovacuum_vacuum_cost_delay</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>autovacuum_vacuum_cost_delay</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies the cost delay value that will be used in automatic
<command>VACUUM</command> operations. If -1 is specified, the regular
<xref linkend="guc-vacuum-cost-delay"/> value will be used.
Doc: improve documentation of configuration settings that have units.
When we added the GUC units feature, we didn't make any great effort
to adjust the documentation of individual GUCs; they tended to still
say things like "this is the number of milliseconds that ...", even
though users might prefer to write some other units, and SHOW might
even show the value in other units. Commit 6c9fb69f2 made an effort
to improve this situation, but I thought it made things less readable
by injecting units information in mid-sentence. It also wasn't very
consistent, and did not touch all the GUCs that have units.
To improve matters, standardize on the phrasing "If this value is
specified without units, it is taken as <units>". Also, try to
standardize where this is mentioned, right before the specification
of the default. (In a couple of places, doing that would've required
more rewriting than seemed justified, so I wasn't 100% consistent
about that.) I also tried to use the phrases "amount of time",
"amount of memory", etc rather than describing the contents of GUCs
in other ways, as those were the majority usage in places that weren't
overcommitting to a particular unit. (I left "length of time" alone
in a couple of places, though.)
I failed to resist the temptation to copy-edit some awkward text, too.
Backpatch to v12, like 6c9fb69f2, mainly because v12 hasn't diverged
much from HEAD yet.
Discussion: https://postgr.es/m/15882.1571942223@sss.pgh.pa.us
2019-10-26 18:30:41 +02:00
If this value is specified without units, it is taken as milliseconds.
The default value is 2 milliseconds.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line;
but the setting can be overridden for individual tables by
changing table storage parameters.
</para>
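<para>
As a minimal sketch (the table name <literal>big_table</literal> and the
value 1 are hypothetical), the current server-wide value can be inspected
and a shorter delay applied to a single table. The storage parameter is
given here as a plain number, which is taken as milliseconds.
<programlisting>
-- Show the server-wide setting.
SHOW autovacuum_vacuum_cost_delay;

-- Hypothetical table: let autovacuum pause for only 1 millisecond here.
ALTER TABLE big_table SET (autovacuum_vacuum_cost_delay = 1);
</programlisting>
</para>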
</listitem>
</varlistentry>

<varlistentry id="guc-autovacuum-vacuum-cost-limit" xreflabel="autovacuum_vacuum_cost_limit">
<term><varname>autovacuum_vacuum_cost_limit</varname> (<type>integer</type>)
<indexterm>
<primary><varname>autovacuum_vacuum_cost_limit</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies the cost limit value that will be used in automatic
<command>VACUUM</command> operations. If -1 is specified (which is the
default), the regular
<xref linkend="guc-vacuum-cost-limit"/> value will be used. Note that
the value is distributed proportionally among the running autovacuum
workers, if there is more than one, so that the sum of the limits for
each worker does not exceed the value of this variable.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line;
but the setting can be overridden for individual tables by
changing table storage parameters.
</para>
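<para>
To make the distribution concrete, the sketch below assumes that
<varname>autovacuum_vacuum_cost_limit</varname> is left at -1, so that
<varname>vacuum_cost_limit</varname> applies, and that three autovacuum
workers with identical settings are running at once. Both the worker
count and the even split are assumptions made for illustration. Under
them, each worker would receive roughly one third of the limit.
<programlisting>
-- Approximate per-worker share of the cost limit for three active workers.
SELECT current_setting('vacuum_cost_limit')::integer / 3
         AS approx_limit_per_worker;
</programlisting>
</para>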
</listitem>
</varlistentry>

</variablelist>
</sect1>

<sect1 id="runtime-config-client">
<title>Client Connection Defaults</title>

<sect2 id="runtime-config-client-statement">
<title>Statement Behavior</title>
<variablelist>

<varlistentry id="guc-client-min-messages" xreflabel="client_min_messages">
<term><varname>client_min_messages</varname> (<type>enum</type>)
<indexterm>
<primary><varname>client_min_messages</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls which
<link linkend="runtime-config-severity-levels">message levels</link>
are sent to the client.
Valid values are <literal>DEBUG5</literal>,
<literal>DEBUG4</literal>, <literal>DEBUG3</literal>, <literal>DEBUG2</literal>,
<literal>DEBUG1</literal>, <literal>LOG</literal>, <literal>NOTICE</literal>,
<literal>WARNING</literal>, and <literal>ERROR</literal>.
Each level includes all the levels that follow it. The later the level,
the fewer messages are sent. The default is
<literal>NOTICE</literal>. Note that <literal>LOG</literal> has a different
rank here than in <xref linkend="guc-log-min-messages"/>.
</para>
<para>
<literal>INFO</literal> level messages are always sent to the client.
</para>
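<para>
As a brief illustration (a sketch only; the levels chosen here are
arbitrary), the setting can be changed for the current session with
<command>SET</command>:
<programlisting>
-- Suppress NOTICE messages; WARNING and ERROR are still sent.
SET client_min_messages = warning;

-- Send first-level debugging output to the client as well.
SET client_min_messages = debug1;

-- Return to the session's default value.
RESET client_min_messages;
</programlisting>
</para>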
</listitem>
</varlistentry>

<varlistentry id="guc-search-path" xreflabel="search_path">
<term><varname>search_path</varname> (<type>string</type>)
<indexterm>
<primary><varname>search_path</varname> configuration parameter</primary>
</indexterm>
<indexterm><primary>path</primary><secondary>for schemas</secondary></indexterm>
</term>
<listitem>
<para>
This variable specifies the order in which schemas are searched
when an object (table, data type, function, etc.) is referenced by a
simple name with no schema specified. When there are objects of
identical names in different schemas, the one found first
in the search path is used. An object that is not in any of the
schemas in the search path can only be referenced by specifying
its containing schema with a qualified (dotted) name.
</para>

<para>
The value for <varname>search_path</varname> must be a comma-separated
list of schema names. Any name that is not an existing schema, or is
a schema for which the user does not have <literal>USAGE</literal>
permission, is silently ignored.
</para>

<para>
If one of the list items is the special name
<literal>$user</literal>, then the schema having the name returned by
<function>CURRENT_USER</function> is substituted, if there is such a schema
and the user has <literal>USAGE</literal> permission for it.
(If not, <literal>$user</literal> is ignored.)
</para>
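<para>
For instance (a sketch; <literal>myschema</literal> is a hypothetical
schema name), the following puts the current user's own schema first,
then <literal>myschema</literal>, then <literal>public</literal>:
<programlisting>
-- "$user" is written in double quotes, as in the default value.
SET search_path TO "$user", myschema, public;

-- Display the setting as written, with "$user" unexpanded.
SHOW search_path;
</programlisting>
</para>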

<para>
The system catalog schema, <literal>pg_catalog</literal>, is always
searched, whether it is mentioned in the path or not. If it is
mentioned in the path then it will be searched in the specified
order. If <literal>pg_catalog</literal> is not in the path then it will
be searched <emphasis>before</emphasis> searching any of the path items.
</para>

<!-- To further split hairs, funcname('foo') does not use the temporary
schema, even when it considers typname='funcname'. This paragraph
refers to function names in a loose sense, "pg_proc.proname or
func_name grammar production". -->
<para>
Likewise, the current session's temporary-table schema,
<literal>pg_temp_<replaceable>nnn</replaceable></literal>, is always searched if it
exists. It can be explicitly listed in the path by using the
alias <literal>pg_temp</literal><indexterm><primary>pg_temp</primary></indexterm>. If it is not listed in the path then
it is searched first (even before <literal>pg_catalog</literal>). However,
the temporary schema is only searched for relation (table, view,
sequence, etc.) and data type names. It is never searched for
function or operator names.
</para>
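<para>
A common precaution, sketched here under the assumption that temporary
objects should never shadow permanent ones, is to list
<literal>pg_temp</literal> explicitly at the end of the path
(<literal>myschema</literal> is a hypothetical schema name):
<programlisting>
-- The temporary schema is now searched only after pg_catalog and myschema.
SET search_path TO pg_catalog, myschema, pg_temp;
</programlisting>
</para>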

<para>
When objects are created without specifying a particular target
schema, they will be placed in the first valid schema named in
<varname>search_path</varname>. An error is reported if the search
path is empty.
</para>
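<para>
To illustrate (a sketch; <literal>myschema</literal> and
<literal>mytable</literal> are hypothetical names, and the current user
is assumed to be allowed to create objects in
<literal>myschema</literal>):
<programlisting>
SET search_path TO myschema, public;

-- No schema is given, so the table is created as myschema.mytable.
CREATE TABLE mytable (id integer);
</programlisting>
</para>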

<para>
The default value for this parameter is
<literal>"$user", public</literal>.
This setting supports shared use of a database (where no users
have private schemas, and all share use of <literal>public</literal>),
private per-user schemas, and combinations of these. Other
effects can be obtained by altering the default search path
setting, either globally or per-user.
</para>

<para>
For more information on schema handling, see
<xref linkend="ddl-schemas"/>. In particular, the default
configuration is suitable only when the database has a single user or
a few mutually-trusting users.
</para>
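<para>
Where users do not all trust one another, two commonly cited defenses
can be sketched as follows. This is an outline rather than a complete
hardening procedure, and <xref linkend="ddl-schemas"/> remains the
authoritative discussion:
<programlisting>
-- For an application: clear the search path at the start of the session
-- and schema-qualify every object name afterwards.
SELECT pg_catalog.set_config('search_path', '', false);

-- For a database: stop ordinary users from creating objects in public.
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
</programlisting>
</para>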

<para>
The current effective value of the search path can be examined
via the <acronym>SQL</acronym> function
<function>current_schemas</function>
(see <xref linkend="functions-info"/>).
This is not quite the same as
examining the value of <varname>search_path</varname>, since
<function>current_schemas</function> shows how the items
appearing in <varname>search_path</varname> were resolved.
</para>
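<para>
The difference can be seen by running both side by side (a sketch; the
results depend on the session's role and on which schemas exist):
<programlisting>
-- The setting exactly as written, for example "$user", public.
SHOW search_path;

-- The schemas actually searched; passing true also lists implicitly
-- searched schemas such as pg_catalog.
SELECT current_schemas(true);
</programlisting>
</para>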
</listitem>
</varlistentry>

<varlistentry id="guc-row-security" xreflabel="row_security">
<term><varname>row_security</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>row_security</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This variable controls whether to raise an error in lieu of applying a
row security policy. When set to <literal>on</literal>, policies apply
normally. When set to <literal>off</literal>, queries that would
otherwise apply at least one policy fail. The default is <literal>on</literal>.
Change to <literal>off</literal> where limited row visibility could cause
incorrect results; for example, <application>pg_dump</application> makes that
change by default. This variable has no effect on roles which bypass
every row security policy, to wit, superusers and roles with
the <literal>BYPASSRLS</literal> attribute.
</para>
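<para>
A short sketch of the effect, assuming a hypothetical table
<literal>accounts</literal> that has row security enabled and a policy
that applies to the current role:
<programlisting>
SET row_security = off;

-- For a role subject to the policy, this now raises an error rather than
-- returning a filtered subset of rows.
SELECT * FROM accounts;
</programlisting>
</para>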

<para>
For more information on row security policies,
see <xref linkend="sql-createpolicy"/>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-default-table-access-method" xreflabel="default_table_access_method">
<term><varname>default_table_access_method</varname> (<type>string</type>)
<indexterm>
<primary><varname>default_table_access_method</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This parameter specifies the default table access method to use when
creating tables or materialized views if the <command>CREATE</command>
command does not explicitly specify an access method, or when
<command>SELECT ... INTO</command> is used, which does not allow
specifying a table access method. The default is <literal>heap</literal>.
</para>
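<para>
For illustration (a sketch with hypothetical table names;
<literal>heap</literal> is the only access method assumed to be
installed):
<programlisting>
SET default_table_access_method = heap;

-- No USING clause, so the parameter's value (heap) is used.
CREATE TABLE measurements (id integer, reading numeric);

-- An explicit USING clause takes precedence over the parameter.
CREATE TABLE measurements_archive (id integer, reading numeric) USING heap;
</programlisting>
</para>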
</listitem>
</varlistentry>

<varlistentry id="guc-default-tablespace" xreflabel="default_tablespace">
<term><varname>default_tablespace</varname> (<type>string</type>)
<indexterm>
<primary><varname>default_tablespace</varname> configuration parameter</primary>
</indexterm>
<indexterm><primary>tablespace</primary><secondary>default</secondary></indexterm>
</term>
<listitem>
<para>
This variable specifies the default tablespace in which to create
objects (tables and indexes) when a <command>CREATE</command> command does
not explicitly specify a tablespace.
</para>

<para>
The value is either the name of a tablespace, or an empty string
to specify using the default tablespace of the current database.
If the value does not match the name of any existing tablespace,
<productname>PostgreSQL</productname> will automatically use the default
tablespace of the current database. If a nondefault tablespace
is specified, the user must have <literal>CREATE</literal> privilege
for it, or creation attempts will fail.
</para>
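<para>
As a sketch, assuming a tablespace named <literal>space1</literal>
(a hypothetical name) already exists and the current user has
<literal>CREATE</literal> privilege for it:
<programlisting>
SET default_tablespace = space1;

-- Created in space1 because no TABLESPACE clause is given.
CREATE TABLE foo (i integer);

-- An empty string selects the current database's default tablespace again.
SET default_tablespace = '';
</programlisting>
</para>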

<para>
This variable is not used for temporary tables; for them,
<xref linkend="guc-temp-tablespaces"/> is consulted instead.
</para>

<para>
This variable is also not used when creating databases.
By default, a new database inherits its tablespace setting from
the template database it is copied from.
</para>

<para>
If this parameter is set to a value other than the empty string
when a partitioned table is created, the partitioned table's
tablespace will be set to that value, which will be used as
the default tablespace for partitions created in the future,
even if <varname>default_tablespace</varname> has changed since then.
</para>

<para>
For more information on tablespaces,
see <xref linkend="manage-ag-tablespaces"/>.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-default-toast-compression" xreflabel="default_toast_compression">
<term><varname>default_toast_compression</varname> (<type>enum</type>)
<indexterm>
<primary><varname>default_toast_compression</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This variable sets the default
<link linkend="storage-toast">TOAST</link>
compression method for values of compressible columns.
(This can be overridden for individual columns by setting
the <literal>COMPRESSION</literal> column option in
<command>CREATE TABLE</command> or
<command>ALTER TABLE</command>.)
The supported compression methods are <literal>pglz</literal> and
(if <productname>PostgreSQL</productname> was compiled with
<option>--with-lz4</option>) <literal>lz4</literal>.
The default is <literal>pglz</literal>.
</para>
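<para>
A sketch of both levels of control, using hypothetical table and column
names and assuming a server built with <option>--with-lz4</option>:
<programlisting>
SET default_toast_compression = lz4;

-- body follows default_toast_compression; summary is pinned to pglz.
CREATE TABLE articles (
    body    text,
    summary text COMPRESSION pglz
);

-- Change the method used for newly stored values of an existing column.
ALTER TABLE articles ALTER COLUMN body SET COMPRESSION pglz;
</programlisting>
</para>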
</listitem>
</varlistentry>

<varlistentry id="guc-temp-tablespaces" xreflabel="temp_tablespaces">
<term><varname>temp_tablespaces</varname> (<type>string</type>)
<indexterm>
<primary><varname>temp_tablespaces</varname> configuration parameter</primary>
</indexterm>
<indexterm><primary>tablespace</primary><secondary>temporary</secondary></indexterm>
</term>
<listitem>
<para>
This variable specifies tablespaces in which to create temporary
objects (temp tables and indexes on temp tables) when a
<command>CREATE</command> command does not explicitly specify a tablespace.
Temporary files for purposes such as sorting large data sets
are also created in these tablespaces.
</para>

<para>
The value is a list of names of tablespaces. When there is more than
one name in the list, <productname>PostgreSQL</productname> chooses a random
member of the list each time a temporary object is to be created;
except that within a transaction, successively created temporary
objects are placed in successive tablespaces from the list.
If the selected element of the list is an empty string,
<productname>PostgreSQL</productname> will automatically use the default
tablespace of the current database instead.
</para>
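<para>
For example (a sketch; <literal>temp1</literal> and
<literal>temp2</literal> are hypothetical tablespaces on which the
current user is assumed to hold <literal>CREATE</literal> privilege):
<programlisting>
-- Spread temporary objects and temporary files over two tablespaces.
SET temp_tablespaces = temp1, temp2;

-- Placed in one of the listed tablespaces, since none is named here.
CREATE TEMPORARY TABLE scratch (id integer);
</programlisting>
</para>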

<para>
When <varname>temp_tablespaces</varname> is set interactively, specifying a
nonexistent tablespace is an error, as is specifying a tablespace for
which the user does not have <literal>CREATE</literal> privilege. However,
when using a previously set value, nonexistent tablespaces are
ignored, as are tablespaces for which the user lacks
<literal>CREATE</literal> privilege. In particular, this rule applies when
using a value set in <filename>postgresql.conf</filename>.
</para>

<para>
The default value is an empty string, which results in all temporary
objects being created in the default tablespace of the current
database.
</para>

<para>
See also <xref linkend="guc-default-tablespace"/>.
</para>
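
<para>
As an illustration only, the following sketch sets the parameter for
the current session. The tablespace names <literal>fast_tmp_a</literal>
and <literal>fast_tmp_b</literal> and the table name
<literal>scratch</literal> are hypothetical; the tablespaces are assumed
to exist already and the current user is assumed to hold
<literal>CREATE</literal> privilege on them.
</para>
<programlisting>
-- Hypothetical tablespaces; an interactive SET fails if either is missing
-- or if the user lacks CREATE privilege on it.
SET temp_tablespaces = 'fast_tmp_a, fast_tmp_b';

-- No TABLESPACE clause is given, so the table is placed in one of the
-- tablespaces listed above.
CREATE TEMPORARY TABLE scratch (id integer);
</programlisting>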
</listitem>
</varlistentry>

<varlistentry id="guc-check-function-bodies" xreflabel="check_function_bodies">
<term><varname>check_function_bodies</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>check_function_bodies</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This parameter is normally on. When set to <literal>off</literal>, it
disables validation of the routine body string during <xref
linkend="sql-createfunction"/> and <xref
linkend="sql-createprocedure"/>. Disabling validation avoids side
effects of the validation process, in particular preventing false
positives due to problems such as forward references.
Set this parameter
to <literal>off</literal> before loading functions on behalf of other
users; <application>pg_dump</application> does so automatically.
</para>
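
<para>
A minimal sketch of the forward-reference case follows. The function
<literal>count_orders</literal> and the table <literal>orders</literal>
are hypothetical names chosen for illustration, and the function body is
written as a string literal in the <literal>sql</literal> language. With
the parameter at its normal setting of <literal>on</literal>, the
<command>CREATE FUNCTION</command> below would be expected to fail
because <literal>orders</literal> does not exist yet.
</para>
<programlisting>
SET check_function_bodies = off;

-- The body refers to a table that has not been created yet; with
-- validation disabled the body string is stored without being checked.
CREATE FUNCTION count_orders() RETURNS bigint
    LANGUAGE sql
    AS 'SELECT count(*) FROM orders';

-- The referenced table is created afterwards, before the function is called.
CREATE TABLE orders (id integer);
</programlisting>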
</listitem>
</varlistentry>

<varlistentry id="guc-default-transaction-isolation" xreflabel="default_transaction_isolation">
<term><varname>default_transaction_isolation</varname> (<type>enum</type>)
<indexterm>
<primary>transaction isolation level</primary>
<secondary>setting default</secondary>
</indexterm>
<indexterm>
<primary><varname>default_transaction_isolation</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Each SQL transaction has an isolation level, which can be
either <quote>read uncommitted</quote>, <quote>read
committed</quote>, <quote>repeatable read</quote>, or
<quote>serializable</quote>. This parameter controls the
default isolation level of each new transaction. The default
is <quote>read committed</quote>.
</para>

<para>
Consult <xref linkend="mvcc"/> and <xref
linkend="sql-set-transaction"/> for more information.
</para>
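
<para>
For example, a session could change its own default as sketched below;
the choice of <literal>repeatable read</literal> is arbitrary and only
for illustration.
</para>
<programlisting>
-- Applies to transactions started later in this session.
SET default_transaction_isolation = 'repeatable read';

BEGIN;
-- The new transaction starts at the session default.
SHOW transaction_isolation;
COMMIT;
</programlisting>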
</listitem>
</varlistentry>

<varlistentry id="guc-default-transaction-read-only" xreflabel="default_transaction_read_only">
<term><varname>default_transaction_read_only</varname> (<type>boolean</type>)
<indexterm>
<primary>read-only transaction</primary>
<secondary>setting default</secondary>
</indexterm>
<indexterm>
<primary><varname>default_transaction_read_only</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
A read-only SQL transaction cannot alter non-temporary tables.
This parameter controls the default read-only status of each new
transaction. The default is <literal>off</literal> (read/write).
</para>

<para>
Consult <xref linkend="sql-set-transaction"/> for more information.
</para>
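
<para>
The sketch below shows one way the default might be combined with an
explicit access mode on a single transaction. It is an illustration,
not a recommended configuration.
</para>
<programlisting>
SET default_transaction_read_only = on;

BEGIN;
SHOW transaction_read_only;   -- reports on
COMMIT;

-- An explicit access mode overrides the default for this transaction only.
BEGIN READ WRITE;
SHOW transaction_read_only;   -- reports off
COMMIT;
</programlisting>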
</listitem>
</varlistentry>

<varlistentry id="guc-default-transaction-deferrable" xreflabel="default_transaction_deferrable">
<term><varname>default_transaction_deferrable</varname> (<type>boolean</type>)
<indexterm>
<primary>deferrable transaction</primary>
<secondary>setting default</secondary>
</indexterm>
<indexterm>
<primary><varname>default_transaction_deferrable</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When running at the <literal>serializable</literal> isolation level,
a deferrable read-only SQL transaction may be delayed before
it is allowed to proceed. However, once it begins executing
it does not incur any of the overhead required to ensure
serializability, so the serialization code has no reason to
force it to abort because of concurrent updates. This makes the
option suitable for long-running read-only transactions.
</para>

<para>
This parameter controls the default deferrable status of each
new transaction. It currently has no effect on read-write
transactions or those operating at isolation levels lower
than <literal>serializable</literal>. The default is <literal>off</literal>.
</para>

<para>
Consult <xref linkend="sql-set-transaction"/> for more information.
</para>
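
<para>
As a sketch, a reporting session might set all three defaults so that
every plain <command>BEGIN</command> starts a serializable, read-only,
deferrable transaction. The query against the
<structname>pg_class</structname> catalog merely stands in for a
long-running report.
</para>
<programlisting>
SET default_transaction_isolation = 'serializable';
SET default_transaction_read_only = on;
SET default_transaction_deferrable = on;

-- Behaves like BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE.
BEGIN;
SELECT count(*) FROM pg_class;
COMMIT;
</programlisting>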
</listitem>
</varlistentry>

<varlistentry id="guc-transaction-isolation" xreflabel="transaction_isolation">
<term><varname>transaction_isolation</varname> (<type>enum</type>)
<indexterm>
<primary>transaction isolation level</primary>
</indexterm>
<indexterm>
<primary><varname>transaction_isolation</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This parameter reflects the current transaction's isolation level.
At the beginning of each transaction, it is set to the current value
of <xref linkend="guc-default-transaction-isolation"/>.
Any subsequent attempt to change it is equivalent to a <xref
linkend="sql-set-transaction"/> command.
</para>
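
<para>
A short sketch of changing the level inside a transaction follows. As
with <command>SET TRANSACTION</command>, the change has to be made
before the transaction's first query or data-modification statement.
</para>
<programlisting>
BEGIN;
-- Same effect as SET TRANSACTION ISOLATION LEVEL SERIALIZABLE.
SET transaction_isolation = 'serializable';
SHOW transaction_isolation;
COMMIT;
</programlisting>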
</listitem>
</varlistentry>

<varlistentry id="guc-transaction-read-only" xreflabel="transaction_read_only">
<term><varname>transaction_read_only</varname> (<type>boolean</type>)
<indexterm>
<primary>read-only transaction</primary>
</indexterm>
<indexterm>
<primary><varname>transaction_read_only</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This parameter reflects the current transaction's read-only status.
At the beginning of each transaction, it is set to the current value
of <xref linkend="guc-default-transaction-read-only"/>.
Any subsequent attempt to change it is equivalent to a <xref
linkend="sql-set-transaction"/> command.
</para>
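
<para>
For illustration, the following sketch marks a single transaction
read-only without touching the session default.
</para>
<programlisting>
BEGIN;
-- Same effect as SET TRANSACTION READ ONLY.
SET transaction_read_only = on;
SHOW transaction_read_only;
COMMIT;
-- The next transaction starts again from default_transaction_read_only.
</programlisting>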
</listitem>
</varlistentry>

<varlistentry id="guc-transaction-deferrable" xreflabel="transaction_deferrable">
<term><varname>transaction_deferrable</varname> (<type>boolean</type>)
<indexterm>
<primary>deferrable transaction</primary>
</indexterm>
<indexterm>
<primary><varname>transaction_deferrable</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This parameter reflects the current transaction's deferrability status.
At the beginning of each transaction, it is set to the current value
of <xref linkend="guc-default-transaction-deferrable"/>.
Any subsequent attempt to change it is equivalent to a <xref
linkend="sql-set-transaction"/> command.
</para>
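
<para>
A comparable sketch for deferrability is shown below. The transaction
is started as serializable and read-only, since the deferrable property
has no effect otherwise.
</para>
<programlisting>
BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY;
-- Same effect as SET TRANSACTION DEFERRABLE.
SET transaction_deferrable = on;
SHOW transaction_deferrable;
COMMIT;
</programlisting>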
</listitem>
</varlistentry>

<varlistentry id="guc-session-replication-role" xreflabel="session_replication_role">
<term><varname>session_replication_role</varname> (<type>enum</type>)
<indexterm>
<primary><varname>session_replication_role</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Controls firing of replication-related triggers and rules for the
current session. Setting this variable requires
superuser privilege and results in discarding any previously cached
query plans. Possible values are <literal>origin</literal> (the default),
<literal>replica</literal> and <literal>local</literal>.
</para>

<para>
The intended use of this setting is that logical replication systems
set it to <literal>replica</literal> when they are applying replicated
changes. As a result, triggers and rules that have not been altered
from their default configuration will not fire
on the replica. See the <link linkend="sql-altertable"><command>ALTER TABLE</command></link> clauses
<literal>ENABLE TRIGGER</literal> and <literal>ENABLE RULE</literal>
for more information.
</para>

<para>
<productname>PostgreSQL</productname> treats the settings <literal>origin</literal> and
<literal>local</literal> the same internally. Third-party replication
systems may use these two values for their internal purposes, for
example using <literal>local</literal> to designate a session whose
changes should not be replicated.
</para>

<para>
Since foreign keys are implemented as triggers, setting this parameter
to <literal>replica</literal> also disables all foreign key checks,
which can leave data in an inconsistent state if improperly used.
</para>
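<para>
A rough sketch of how an apply session might use this setting is
shown below. The table and trigger names are hypothetical and serve
only as illustration:
</para>
<programlisting>
-- Hypothetical table and trigger names.
-- Fires regardless of the current replication role:
ALTER TABLE orders ENABLE ALWAYS TRIGGER orders_audit;
-- Fires only while the session is in replica mode:
ALTER TABLE orders ENABLE REPLICA TRIGGER orders_apply;

SET session_replication_role = replica;
-- Apply replicated changes here. Triggers left in their default
-- configuration, including foreign key checks, do not fire.
RESET session_replication_role;
</programlisting>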
</listitem>
</varlistentry>

<varlistentry id="guc-statement-timeout" xreflabel="statement_timeout">
<term><varname>statement_timeout</varname> (<type>integer</type>)
<indexterm>
<primary><varname>statement_timeout</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Abort any statement that takes more than the specified amount of time.
If <varname>log_min_error_statement</varname> is set
to <literal>ERROR</literal> or lower, the statement that timed out
will also be logged.
If this value is specified without units, it is taken as milliseconds.
A value of zero (the default) disables the timeout.
</para>

<para>
The timeout is measured from the time a command arrives at the
server until it is completed by the server. If multiple SQL
statements appear in a single simple-Query message, the timeout
is applied to each statement separately.
(<productname>PostgreSQL</productname> versions before 13 usually
treated the timeout as applying to the whole query string.)
In extended query protocol, the timeout starts running when any
query-related message (Parse, Bind, Execute, Describe) arrives, and
it is canceled by completion of an Execute or Sync message.
</para>

<para>
Setting <varname>statement_timeout</varname> in
<filename>postgresql.conf</filename> is not recommended because it would
affect all sessions.
</para>
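<para>
As a small sketch of narrower scopes, the timeout can be set for the
current session or attached to a role. The role name
<literal>reporting</literal> is hypothetical:
</para>
<programlisting>
-- Limit only the current session:
SET statement_timeout = '5s';
SELECT pg_sleep(10);
-- The statement is expected to be canceled with an error such as:
-- ERROR:  canceling statement due to statement timeout

-- Limit new sessions of one (hypothetical) role instead of all sessions:
ALTER ROLE reporting SET statement_timeout = '30s';
</programlisting>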
</listitem>
</varlistentry>

<varlistentry id="guc-lock-timeout" xreflabel="lock_timeout">
<term><varname>lock_timeout</varname> (<type>integer</type>)
<indexterm>
<primary><varname>lock_timeout</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Abort any statement that waits longer than the specified amount of
time while attempting to acquire a lock on a table, index,
row, or other database object. The time limit applies separately to
each lock acquisition attempt. The limit applies both to explicit
locking requests (such as <command>LOCK TABLE</command>, or <command>SELECT
FOR UPDATE</command> without <literal>NOWAIT</literal>) and to implicitly acquired
locks.
If this value is specified without units, it is taken as milliseconds.
A value of zero (the default) disables the timeout.
</para>

<para>
Unlike <varname>statement_timeout</varname>, this timeout can only occur
while waiting for locks. Note that if <varname>statement_timeout</varname>
is nonzero, it is rather pointless to set <varname>lock_timeout</varname> to
the same or larger value, since the statement timeout would always
trigger first. If <varname>log_min_error_statement</varname> is set to
<literal>ERROR</literal> or lower, the statement that timed out will be
logged.
</para>

<para>
Setting <varname>lock_timeout</varname> in
<filename>postgresql.conf</filename> is not recommended because it would
affect all sessions.
</para>
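<para>
One way to confine the timeout to a single transaction is
<command>SET LOCAL</command>, sketched below. The table name
<literal>accounts</literal> is hypothetical:
</para>
<programlisting>
BEGIN;
-- The setting reverts when the transaction ends.
SET LOCAL lock_timeout = '2s';
LOCK TABLE accounts IN ACCESS EXCLUSIVE MODE;
-- If another session holds a conflicting lock for longer than two
-- seconds, the statement is expected to fail with an error such as:
-- ERROR:  canceling statement due to lock timeout
ROLLBACK;
</programlisting>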
</listitem>
</varlistentry>

<varlistentry id="guc-idle-in-transaction-session-timeout" xreflabel="idle_in_transaction_session_timeout">
<term><varname>idle_in_transaction_session_timeout</varname> (<type>integer</type>)
<indexterm>
<primary><varname>idle_in_transaction_session_timeout</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Terminate any session that has been idle (that is, waiting for a
client query) within an open transaction for longer than the
specified amount of time.
If this value is specified without units, it is taken as milliseconds.
A value of zero (the default) disables the timeout.
</para>

<para>
This option can be used to ensure that idle sessions do not hold
locks for an unreasonable amount of time. Even when no significant
locks are held, an open transaction prevents vacuuming away
recently-dead tuples that may be visible only to this transaction;
so remaining idle for a long time can contribute to table bloat.
See <xref linkend="routine-vacuuming"/> for more details.
</para>
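<para>
As a sketch, the timeout could be applied to one database, and
<structname>pg_stat_activity</structname> inspected to see which
sessions it would affect. The database name <literal>appdb</literal>
is hypothetical:
</para>
<programlisting>
-- Hypothetical database name; takes effect for new sessions.
ALTER DATABASE appdb SET idle_in_transaction_session_timeout = '10min';

-- Sessions currently idle inside a transaction, oldest first:
SELECT pid, usename, xact_start, state_change
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;
</programlisting>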
</listitem>
</varlistentry>

<varlistentry id="guc-idle-session-timeout" xreflabel="idle_session_timeout">
<term><varname>idle_session_timeout</varname> (<type>integer</type>)
<indexterm>
<primary><varname>idle_session_timeout</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Terminate any session that has been idle (that is, waiting for a
client query), but not within an open transaction, for longer than
the specified amount of time.
If this value is specified without units, it is taken as milliseconds.
A value of zero (the default) disables the timeout.
</para>

<para>
Unlike the case with an open transaction, an idle session without a
transaction imposes no large costs on the server, so there is less
need to enable this timeout
than <varname>idle_in_transaction_session_timeout</varname>.
</para>

<para>
Be wary of enforcing this timeout on connections made through
connection-pooling software or other middleware, as such a layer
may not react well to unexpected connection closure. It may be
helpful to enable this timeout only for interactive sessions,
perhaps by applying it only to particular users.
</para>
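<para>
Applying the timeout only to particular users might look like the
following sketch. Both role names are hypothetical, and the settings
take effect for sessions started after the change:
</para>
<programlisting>
-- Hypothetical role names.
-- An interactive user whose forgotten sessions should be closed:
ALTER ROLE alice SET idle_session_timeout = '30min';
-- The login role used by a connection pooler, left without a timeout:
ALTER ROLE pooler_login SET idle_session_timeout = 0;
</programlisting>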
</listitem>
</varlistentry>

<varlistentry id="guc-vacuum-freeze-table-age" xreflabel="vacuum_freeze_table_age">
<term><varname>vacuum_freeze_table_age</varname> (<type>integer</type>)
<indexterm>
<primary><varname>vacuum_freeze_table_age</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
<command>VACUUM</command> performs an aggressive scan if the table's
<structname>pg_class</structname>.<structfield>relfrozenxid</structfield> field has reached
the age specified by this setting. An aggressive scan differs from
a regular <command>VACUUM</command> in that it visits every page that might
contain unfrozen XIDs or MXIDs, not just those that might contain dead
tuples. The default is 150 million transactions. Although users can
set this value anywhere from zero to two billion, <command>VACUUM</command>
will silently limit the effective value to 95% of
<xref linkend="guc-autovacuum-freeze-max-age"/>, so that a
periodic manual <command>VACUUM</command> has a chance to run before an
anti-wraparound autovacuum is launched for the table. For more
information see
<xref linkend="vacuum-for-wraparound"/>.
</para>
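<para>
To make the limit concrete: with
<varname>autovacuum_freeze_max_age</varname> at its default of
200 million, the cap is 0.95 * 200 million = 190 million, so the
default of 150 million is used unchanged. The query below is only a
sketch that reproduces this arithmetic next to each table's current
age; the column aliases are illustrative:
</para>
<programlisting>
SELECT c.oid::regclass AS table_name,
       age(c.relfrozenxid) AS xid_age,
       least(current_setting('vacuum_freeze_table_age')::bigint,
             (0.95 * current_setting('autovacuum_freeze_max_age')::bigint)::bigint)
         AS effective_table_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY xid_age DESC
LIMIT 10;
</programlisting>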
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios. We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases. Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId. Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done. Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database. initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 23:42:10 +01:00
|
|
|
<varlistentry id="guc-vacuum-freeze-min-age" xreflabel="vacuum_freeze_min_age">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>vacuum_freeze_min_age</varname> (<type>integer</type>)
|
Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios. We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases. Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId. Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done. Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database. initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 23:42:10 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>vacuum_freeze_min_age</varname> configuration parameter</primary>
|
Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios. We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases. Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId. Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done. Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database. initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 23:42:10 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios. We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases. Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId. Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done. Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database. initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 23:42:10 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Specifies the cutoff age (in transactions) that <command>VACUUM</command>
|
2013-12-24 02:32:29 +01:00
|
|
|
should use to decide whether to freeze row versions
|
|
|
|
while scanning a table.
|
2009-01-16 14:27:24 +01:00
|
|
|
The default is 50 million transactions. Although
|
2007-01-20 22:30:26 +01:00
|
|
|
users can set this value anywhere from zero to one billion,
|
2017-10-09 03:44:17 +02:00
|
|
|
<command>VACUUM</command> will silently limit the effective value to half
|
2017-11-23 15:39:47 +01:00
|
|
|
the value of <xref linkend="guc-autovacuum-freeze-max-age"/>, so
|
2007-01-20 22:30:26 +01:00
|
|
|
that there is not an unreasonably short time between forced
|
|
|
|
autovacuums. For more information see <xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="vacuum-for-wraparound"/>.
|
2006-11-05 23:42:10 +01:00
|
|
|
</para>
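<para>
As a rough illustration of the limit described above, the following
sketch assumes that <varname>autovacuum_freeze_max_age</varname> is
left at 200 million (its stock default); the table name
<literal>mytable</literal> is hypothetical:
</para>
<programlisting>
SET vacuum_freeze_min_age = 150000000;
-- The requested value exceeds half of autovacuum_freeze_max_age, so
-- VACUUM would use 200000000 / 2 = 100000000 as the effective cutoff.
VACUUM mytable;
</programlisting>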
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
Add wraparound failsafe to VACUUM.
Add a failsafe mechanism that is triggered by VACUUM when it notices
that the table's relfrozenxid and/or relminmxid are dangerously far in
the past. VACUUM checks the age of the table dynamically, at regular
intervals.
When the failsafe triggers, VACUUM takes extraordinary measures to
finish as quickly as possible so that relfrozenxid and/or relminmxid can
be advanced. VACUUM will stop applying any cost-based delay that may be
in effect. VACUUM will also bypass any further index vacuuming and heap
vacuuming -- it only completes whatever remaining pruning and freezing
is required. Bypassing index/heap vacuuming is enabled by commit
8523492d, which made it possible to dynamically trigger the mechanism
already used within VACUUM when it is run with INDEX_CLEANUP off.
It is expected that the failsafe will almost always trigger within an
autovacuum to prevent wraparound, long after the autovacuum began.
However, the failsafe mechanism can trigger in any VACUUM operation.
Even in a non-aggressive VACUUM, where we're likely to not advance
relfrozenxid, it still seems like a good idea to finish off remaining
pruning and freezing. An aggressive/anti-wraparound VACUUM will be
launched immediately afterwards. Note that the anti-wraparound VACUUM
that follows will itself trigger the failsafe, usually before it even
begins its first (and only) pass over the heap.
The failsafe is controlled by two new GUCs: vacuum_failsafe_age, and
vacuum_multixact_failsafe_age. There are no equivalent reloptions,
since that isn't expected to be useful. The GUCs have rather high
defaults (both default to 1.6 billion), and are expected to generally
only be used to make the failsafe trigger sooner/more frequently.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzmgH3ySGYeC-m-eOBsa2=sDwa292-CFghV4rESYo39FsQ@mail.gmail.com
2021-04-07 21:37:45 +02:00
|
|
|
<varlistentry id="guc-vacuum-failsafe-age" xreflabel="vacuum_failsafe_age">
|
|
|
|
<term><varname>vacuum_failsafe_age</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>vacuum_failsafe_age</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Specifies the maximum age (in transactions) that a table's
|
|
|
|
<structname>pg_class</structname>.<structfield>relfrozenxid</structfield>
|
|
|
|
field can attain before <command>VACUUM</command> takes
|
|
|
|
extraordinary measures to avoid system-wide transaction ID
|
|
|
|
wraparound failure. This is <command>VACUUM</command>'s
|
|
|
|
strategy of last resort. The failsafe typically triggers
|
|
|
|
when an autovacuum to prevent transaction ID wraparound has
|
|
|
|
already been running for some time, though it's possible for
|
|
|
|
the failsafe to trigger during any <command>VACUUM</command>.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
When the failsafe is triggered, any cost-based delay that is
|
|
|
|
in effect will no longer be applied, and further non-essential
|
|
|
|
maintenance tasks (such as index vacuuming) are bypassed.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The default is 1.6 billion transactions. Although users can
|
|
|
|
set this value anywhere from zero to 2.1 billion,
|
|
|
|
<command>VACUUM</command> will silently adjust the effective
|
|
|
|
value to no less than 105% of <xref
|
|
|
|
linkend="guc-autovacuum-freeze-max-age"/>.
|
|
|
|
</para>
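<para>
A worked example of the lower bound, assuming that
<varname>autovacuum_freeze_max_age</varname> is left at 200 million
(its stock default):
</para>
<programlisting>
SET vacuum_failsafe_age = 100000000;
-- 105% of autovacuum_freeze_max_age is 200000000 * 1.05 = 210000000,
-- so the effective failsafe age would be 210000000, not 100000000.
</programlisting>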
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
Separate multixact freezing parameters from xid's
Previously we were piggybacking on transaction ID parameters to freeze
multixacts; but since there isn't necessarily any relationship between
rates of Xid and multixact consumption, this turns out not to be a good
idea.
Therefore, we now have multixact-specific freezing parameters:
vacuum_multixact_freeze_min_age: when to remove multis as we come across
them in vacuum (default to 5 million, i.e. early in comparison to Xid's
default of 50 million)
vacuum_multixact_freeze_table_age: when to force whole-table scans
instead of scanning only the pages marked as not all visible in
visibility map (default to 150 million, same as for Xids). Whichever of
both which reaches the 150 million mark earlier will cause a whole-table
scan.
autovacuum_multixact_freeze_max_age: when for cause emergency,
uninterruptible whole-table scans (default to 400 million, double as
that for Xids). This means there shouldn't be more frequent emergency
vacuuming than previously, unless multixacts are being used very
rapidly.
Backpatch to 9.3 where multixacts were made to persist enough to require
freezing. To avoid an ABI break in 9.3, VacuumStmt has a couple of
fields in an unnatural place, and StdRdOptions is split in two so that
the newly added fields can go at the end.
Patch by me, reviewed by Robert Haas, with additional input from Andres
Freund and Tom Lane.
2014-02-13 23:30:30 +01:00
|
|
|
<varlistentry id="guc-vacuum-multixact-freeze-table-age" xreflabel="vacuum_multixact_freeze_table_age">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>vacuum_multixact_freeze_table_age</varname> (<type>integer</type>)
|
2014-02-13 23:30:30 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>vacuum_multixact_freeze_table_age</varname> configuration parameter</primary>
|
2014-02-13 23:30:30 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2014-02-13 23:30:30 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
<command>VACUUM</command> performs an aggressive scan if the table's
|
|
|
|
<structname>pg_class</structname>.<structfield>relminmxid</structfield> field has reached
|
2016-03-10 22:12:10 +01:00
|
|
|
the age specified by this setting. An aggressive scan differs from
|
2017-10-09 03:44:17 +02:00
|
|
|
a regular <command>VACUUM</command> in that it visits every page that might
|
2016-03-10 22:12:10 +01:00
|
|
|
contain unfrozen XIDs or MXIDs, not just those that might contain dead
|
|
|
|
tuples. The default is 150 million multixacts.
|
2020-09-21 18:43:42 +02:00
|
|
|
Although users can set this value anywhere from zero to two billion,
|
2017-10-09 03:44:17 +02:00
|
|
|
<command>VACUUM</command> will silently limit the effective value to 95% of
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="guc-autovacuum-multixact-freeze-max-age"/>, so that a
|
2020-09-21 18:43:42 +02:00
|
|
|
periodic manual <command>VACUUM</command> has a chance to run before an
|
2014-02-13 23:30:30 +01:00
|
|
|
anti-wraparound autovacuum is launched for the table.
|
2017-11-23 15:39:47 +01:00
|
|
|
For more information see <xref linkend="vacuum-for-multixact-wraparound"/>.
|
2014-02-13 23:30:30 +01:00
|
|
|
</para>
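<para>
To make the 95% limit concrete, the following sketch assumes that
<varname>autovacuum_multixact_freeze_max_age</varname> is left at
400 million (its stock default):
</para>
<programlisting>
SET vacuum_multixact_freeze_table_age = 390000000;
-- 95% of autovacuum_multixact_freeze_max_age is
-- 400000000 * 0.95 = 380000000, so VACUUM would treat the setting
-- as 380000000 when deciding whether to perform an aggressive scan.
</programlisting>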
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-vacuum-multixact-freeze-min-age" xreflabel="vacuum_multixact_freeze_min_age">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>vacuum_multixact_freeze_min_age</varname> (<type>integer</type>)
|
2014-02-13 23:30:30 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>vacuum_multixact_freeze_min_age</varname> configuration parameter</primary>
|
2014-02-13 23:30:30 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2014-02-13 23:30:30 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Specifies the cutoff age (in multixacts) that <command>VACUUM</command>
|
2014-02-13 23:30:30 +01:00
|
|
|
should use to decide whether to replace multixact IDs with a newer
|
|
|
|
transaction ID or multixact ID while scanning a table. The default
|
|
|
|
is 5 million multixacts.
|
|
|
|
Although users can set this value anywhere from zero to one billion,
|
2017-10-09 03:44:17 +02:00
|
|
|
<command>VACUUM</command> will silently limit the effective value to half
|
2017-11-23 15:39:47 +01:00
|
|
|
the value of <xref linkend="guc-autovacuum-multixact-freeze-max-age"/>,
|
2014-02-13 23:30:30 +01:00
|
|
|
so that there is not an unreasonably short time between forced
|
|
|
|
autovacuums.
|
2017-11-23 15:39:47 +01:00
|
|
|
For more information see <xref linkend="vacuum-for-multixact-wraparound"/>.
|
2014-02-13 23:30:30 +01:00
|
|
|
</para>
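<para>
For instance, assuming that
<varname>autovacuum_multixact_freeze_max_age</varname> is left at
400 million (its stock default), an oversized setting would be capped
roughly as follows:
</para>
<programlisting>
SET vacuum_multixact_freeze_min_age = 300000000;
-- Half of autovacuum_multixact_freeze_max_age is
-- 400000000 / 2 = 200000000, so VACUUM would use 200000000
-- as the effective cutoff age.
</programlisting>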
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2021-10-12 19:59:24 +02:00
|
|
|
<varlistentry id="guc-vacuum-multixact-failsafe-age" xreflabel="vacuum_multixact_failsafe_age">
|
2021-04-07 21:37:45 +02:00
|
|
|
<term><varname>vacuum_multixact_failsafe_age</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>vacuum_multixact_failsafe_age</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2021-07-20 02:20:25 +02:00
|
|
|
Specifies the maximum age (in multixacts) that a table's
|
2021-04-07 21:37:45 +02:00
|
|
|
<structname>pg_class</structname>.<structfield>relminmxid</structfield>
|
|
|
|
field can attain before <command>VACUUM</command> takes
|
|
|
|
extraordinary measures to avoid system-wide multixact ID
|
|
|
|
wraparound failure. This is <command>VACUUM</command>'s
|
|
|
|
strategy of last resort. The failsafe typically triggers when
|
|
|
|
an autovacuum to prevent transaction ID wraparound has already
|
|
|
|
been running for some time, though it's possible for the
|
|
|
|
failsafe to trigger during any <command>VACUUM</command>.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
When the failsafe is triggered, any cost-based delay that is
|
|
|
|
in effect will no longer be applied, and further non-essential
|
|
|
|
maintenance tasks (such as index vacuuming) are bypassed.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The default is 1.6 billion multixacts. Although users can set
|
|
|
|
this value anywhere from zero to 2.1 billion,
|
|
|
|
<command>VACUUM</command> will silently adjust the effective
|
|
|
|
value to no less than 105% of <xref
|
|
|
|
linkend="guc-autovacuum-multixact-freeze-max-age"/>.
|
|
|
|
</para>
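<para>
A worked example of the lower bound, assuming that
<varname>autovacuum_multixact_freeze_max_age</varname> is left at
400 million (its stock default):
</para>
<programlisting>
SET vacuum_multixact_failsafe_age = 100000000;
-- 105% of autovacuum_multixact_freeze_max_age is
-- 400000000 * 1.05 = 420000000, so the effective failsafe age
-- would be 420000000, not 100000000.
</programlisting>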
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2009-08-04 18:08:37 +02:00
|
|
|
<varlistentry id="guc-bytea-output" xreflabel="bytea_output">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>bytea_output</varname> (<type>enum</type>)
|
2009-08-04 18:08:37 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>bytea_output</varname> configuration parameter</primary>
|
2009-08-04 18:08:37 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2009-08-04 18:08:37 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the output format for values of type <type>bytea</type>.
|
|
|
|
Valid values are <literal>hex</literal> (the default)
|
|
|
|
and <literal>escape</literal> (the traditional PostgreSQL
|
2017-11-23 15:39:47 +01:00
|
|
|
format). See <xref linkend="datatype-binary"/> for more
|
2009-08-04 18:08:37 +02:00
|
|
|
information. The <type>bytea</type> type always
|
|
|
|
accepts both formats on input, regardless of this setting.
|
|
|
|
</para>
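<para>
A minimal session sketch of the two output formats, assuming
<varname>standard_conforming_strings</varname> is on so that the
backslash in the last literal is taken literally:
</para>
<programlisting>
SET bytea_output = 'hex';
SELECT 'abc'::bytea;        -- \x616263

SET bytea_output = 'escape';
SELECT 'abc'::bytea;        -- abc

-- Input is unaffected by the setting: hex input is still accepted
SELECT '\x616263'::bytea;   -- abc
</programlisting>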
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2007-01-19 17:58:46 +01:00
|
|
|
<varlistentry id="guc-xmlbinary" xreflabel="xmlbinary">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>xmlbinary</varname> (<type>enum</type>)
|
2007-01-19 17:58:46 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>xmlbinary</varname> configuration parameter</primary>
|
2007-01-19 17:58:46 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-01-19 17:58:46 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets how binary values are to be encoded in XML. This applies,
|
|
|
|
for example, when <type>bytea</type> values are converted to
|
|
|
|
XML by the functions <function>xmlelement</function> or
|
|
|
|
<function>xmlforest</function>. Possible values are
|
|
|
|
<literal>base64</literal> and <literal>hex</literal>, which
|
|
|
|
are both defined in the XML Schema standard. The default is
|
|
|
|
<literal>base64</literal>. For further information about
|
2017-11-23 15:39:47 +01:00
|
|
|
XML-related functions, see <xref linkend="functions-xml"/>.
|
2007-01-19 17:58:46 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The actual choice here is mostly a matter of taste,
|
|
|
|
constrained only by possible restrictions in client
|
|
|
|
applications. Both methods support all possible values,
|
|
|
|
although hex encoding, at two characters per byte, is about 50% larger
|
|
|
|
than base64 encoding, which uses four characters per three bytes.
|
|
|
|
</para>
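<para>
The following sketch shows the same three-byte value encoded both ways.
It assumes a server built with XML support; the element name
<literal>payload</literal> is arbitrary.
</para>
<programlisting><![CDATA[
SET xmlbinary = 'base64';
SELECT xmlelement(name payload, 'abc'::bytea);   -- <payload>YWJj</payload>

SET xmlbinary = 'hex';
SELECT xmlelement(name payload, 'abc'::bytea);   -- <payload>616263</payload>
]]></programlisting>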
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
2007-01-25 12:53:52 +01:00
|
|
|
<varlistentry id="guc-xmloption" xreflabel="xmloption">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>xmloption</varname> (<type>enum</type>)
|
2007-01-25 12:53:52 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>xmloption</varname> configuration parameter</primary>
|
2007-01-25 12:53:52 +01:00
|
|
|
</indexterm>
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>SET XML OPTION</varname></primary>
|
2007-01-25 12:53:52 +01:00
|
|
|
</indexterm>
|
|
|
|
<indexterm>
|
2007-04-02 17:27:02 +02:00
|
|
|
<primary>XML option</primary>
|
2007-01-25 12:53:52 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-01-25 12:53:52 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets whether <literal>DOCUMENT</literal> or
|
|
|
|
<literal>CONTENT</literal> is implicit when converting between
|
|
|
|
XML and character string values. See <xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="datatype-xml"/> for a description of these two modes. Valid
|
2007-01-25 12:53:52 +01:00
|
|
|
values are <literal>DOCUMENT</literal> and
|
|
|
|
<literal>CONTENT</literal>. The default is
|
|
|
|
<literal>CONTENT</literal>.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
According to the SQL standard, the command to set this option is
|
|
|
|
<synopsis>
|
|
|
|
SET XML OPTION { DOCUMENT | CONTENT };
|
|
|
|
</synopsis>
|
|
|
|
This syntax is also available in PostgreSQL.
|
|
|
|
</para>
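<para>
As an illustration of the difference, the sketch below casts the same
strings to <type>xml</type> under each setting. It assumes a server
built with XML support; the sample strings are arbitrary.
</para>
<programlisting><![CDATA[
SET xmloption = CONTENT;
SELECT 'abc<foo>bar</foo>'::xml;   -- accepted as a content fragment

SET XML OPTION DOCUMENT;
SELECT 'abc<foo>bar</foo>'::xml;   -- rejected: not a well-formed document
SELECT '<foo>bar</foo>'::xml;      -- accepted: a single root element
]]></programlisting>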
|
|
|
|
</listitem>
|
2014-11-11 13:08:21 +01:00
|
|
|
</varlistentry>
|
|
|
|
|
2014-11-13 04:14:48 +01:00
|
|
|
<varlistentry id="guc-gin-pending-list-limit" xreflabel="gin_pending_list_limit">
|
|
|
|
<term><varname>gin_pending_list_limit</varname> (<type>integer</type>)
|
2014-11-11 13:08:21 +01:00
|
|
|
<indexterm>
|
2019-04-16 16:16:20 +02:00
|
|
|
<primary><varname>gin_pending_list_limit</varname></primary>
|
|
|
|
<secondary>configuration parameter</secondary>
|
2014-11-11 13:08:21 +01:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
Doc: improve documentation of configuration settings that have units.
When we added the GUC units feature, we didn't make any great effort
to adjust the documentation of individual GUCs; they tended to still
say things like "this is the number of milliseconds that ...", even
though users might prefer to write some other units, and SHOW might
even show the value in other units. Commit 6c9fb69f2 made an effort
to improve this situation, but I thought it made things less readable
by injecting units information in mid-sentence. It also wasn't very
consistent, and did not touch all the GUCs that have units.
To improve matters, standardize on the phrasing "If this value is
specified without units, it is taken as <units>". Also, try to
standardize where this is mentioned, right before the specification
of the default. (In a couple of places, doing that would've required
more rewriting than seemed justified, so I wasn't 100% consistent
about that.) I also tried to use the phrases "amount of time",
"amount of memory", etc rather than describing the contents of GUCs
in other ways, as those were the majority usage in places that weren't
overcommitting to a particular unit. (I left "length of time" alone
in a couple of places, though.)
I failed to resist the temptation to copy-edit some awkward text, too.
Backpatch to v12, like 6c9fb69f2, mainly because v12 hasn't diverged
much from HEAD yet.
Discussion: https://postgr.es/m/15882.1571942223@sss.pgh.pa.us
2019-10-26 18:30:41 +02:00
|
|
|
Sets the maximum size of a GIN index's pending list, which is used
|
2017-10-09 03:44:17 +02:00
|
|
|
when <literal>fastupdate</literal> is enabled. If the list grows
|
2014-11-11 13:08:21 +01:00
|
|
|
larger than this maximum size, it is cleaned up by moving
|
|
|
|
the entries in it to the index's main GIN data structure in bulk.
|
|
|
|
If this value is specified without units, it is taken as kilobytes.
|
2017-10-09 03:44:17 +02:00
|
|
|
The default is four megabytes (<literal>4MB</literal>). This setting
|
2014-11-11 13:08:21 +01:00
|
|
|
can be overridden for individual GIN indexes by changing
|
2015-11-11 23:13:38 +01:00
|
|
|
index storage parameters.
|
2017-11-23 15:39:47 +01:00
|
|
|
See <xref linkend="gin-fast-update"/> and <xref linkend="gin-tips"/>
|
2014-11-11 13:08:21 +01:00
|
|
|
for more information.
|
|
|
|
</para>
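<para>
A brief sketch of both ways to set the limit. The table
<literal>docs</literal>, its column <literal>body</literal>, and the
index name <literal>docs_body_gin</literal> are hypothetical and used
only for illustration.
</para>
<programlisting>
-- Session-level setting, with and without units (both mean 8MB)
SET gin_pending_list_limit = '8MB';
SET gin_pending_list_limit = 8192;      -- no units: taken as kilobytes

-- Per-index override through a storage parameter (value in kilobytes)
CREATE INDEX docs_body_gin ON docs
    USING gin (to_tsvector('english', body))
    WITH (fastupdate = on, gin_pending_list_limit = 2048);

ALTER INDEX docs_body_gin SET (gin_pending_list_limit = 4096);
</programlisting>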
|
|
|
|
</listitem>
|
2007-01-25 12:53:52 +01:00
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="runtime-config-client-format">
|
|
|
|
<title>Locale and Formatting</title>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
|
|
|
|
<varlistentry id="guc-datestyle" xreflabel="DateStyle">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>DateStyle</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>DateStyle</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the display format for date and time values, as well as the
|
|
|
|
rules for interpreting ambiguous date input values. For
|
|
|
|
historical reasons, this variable contains two independent
|
2017-10-09 03:44:17 +02:00
|
|
|
components: the output format specification (<literal>ISO</literal>,
|
|
|
|
<literal>Postgres</literal>, <literal>SQL</literal>, or <literal>German</literal>)
|
2005-09-13 00:11:38 +02:00
|
|
|
and the input/output specification for year/month/day ordering
|
2017-10-09 03:44:17 +02:00
|
|
|
(<literal>DMY</literal>, <literal>MDY</literal>, or <literal>YMD</literal>). These
|
|
|
|
can be set separately or together. The keywords <literal>Euro</literal>
|
|
|
|
and <literal>European</literal> are synonyms for <literal>DMY</literal>; the
|
|
|
|
keywords <literal>US</literal>, <literal>NonEuro</literal>, and
|
|
|
|
<literal>NonEuropean</literal> are synonyms for <literal>MDY</literal>. See
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="datatype-datetime"/> for more information. The
|
2017-10-09 03:44:17 +02:00
|
|
|
built-in default is <literal>ISO, MDY</literal>, but
|
2005-12-09 16:51:14 +01:00
|
|
|
<application>initdb</application> will initialize the
|
|
|
|
configuration file with a setting that corresponds to the
|
|
|
|
behavior of the chosen <varname>lc_time</varname> locale.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
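<para>
The two components can be seen at work in the following sketch, which
reads the same ambiguous input under different field orderings and then
changes only the output format:
</para>
<programlisting>
SET DateStyle = 'ISO, DMY';
SELECT '01/02/2024'::date;      -- 2024-02-01 (day first)

SET DateStyle = 'ISO, MDY';
SELECT '01/02/2024'::date;      -- 2024-01-02 (month first)

SET DateStyle = 'SQL, DMY';
SELECT DATE '2024-02-01';       -- 01/02/2024
</programlisting>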
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2008-11-09 01:28:35 +01:00
|
|
|
<varlistentry id="guc-intervalstyle" xreflabel="IntervalStyle">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>IntervalStyle</varname> (<type>enum</type>)
|
2008-11-09 01:28:35 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>IntervalStyle</varname> configuration parameter</primary>
|
2008-11-09 01:28:35 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-11-09 01:28:35 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the display format for interval values.
|
2017-10-09 03:44:17 +02:00
|
|
|
The value <literal>sql_standard</literal> will produce
|
2008-11-09 01:28:35 +01:00
|
|
|
output matching <acronym>SQL</acronym> standard interval literals.
|
2017-10-09 03:44:17 +02:00
|
|
|
The value <literal>postgres</literal> (which is the default) will produce
|
|
|
|
output matching <productname>PostgreSQL</productname> releases prior to 8.4
|
2017-11-23 15:39:47 +01:00
|
|
|
when the <xref linkend="guc-datestyle"/>
|
2017-10-09 03:44:17 +02:00
|
|
|
parameter was set to <literal>ISO</literal>.
|
|
|
|
The value <literal>postgres_verbose</literal> will produce output
|
|
|
|
matching <productname>PostgreSQL</productname> releases prior to 8.4
|
|
|
|
when the <varname>DateStyle</varname>
|
|
|
|
parameter was set to non-<literal>ISO</literal> output.
|
|
|
|
The value <literal>iso_8601</literal> will produce output matching the time
|
|
|
|
interval <quote>format with designators</quote> defined in section
|
2008-11-11 03:42:33 +01:00
|
|
|
4.4.3.2 of ISO 8601.
|
2008-11-09 01:28:35 +01:00
|
|
|
</para>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The <varname>IntervalStyle</varname> parameter also affects the
|
2008-11-09 01:28:35 +01:00
|
|
|
interpretation of ambiguous interval input. See
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="datatype-interval-input"/> for more information.
|
2008-11-09 01:28:35 +01:00
|
|
|
</para>
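<para>
For comparison, this sketch prints one year-month interval in each of
the four styles:
</para>
<programlisting>
SET IntervalStyle = 'sql_standard';
SELECT INTERVAL '1 year 2 months';   -- 1-2

SET IntervalStyle = 'postgres';
SELECT INTERVAL '1 year 2 months';   -- 1 year 2 mons

SET IntervalStyle = 'postgres_verbose';
SELECT INTERVAL '1 year 2 months';   -- @ 1 year 2 mons

SET IntervalStyle = 'iso_8601';
SELECT INTERVAL '1 year 2 months';   -- P1Y2M
</programlisting>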
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2012-05-10 19:55:49 +02:00
|
|
|
<varlistentry id="guc-timezone" xreflabel="TimeZone">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>TimeZone</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>TimeZone</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<indexterm><primary>time zone</primary></indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2007-01-20 22:30:26 +01:00
|
|
|
Sets the time zone for displaying and interpreting time stamps.
|
2017-10-09 03:44:17 +02:00
|
|
|
The built-in default is <literal>GMT</literal>, but that is typically
|
|
|
|
overridden in <filename>postgresql.conf</filename>; <application>initdb</application>
|
2011-09-09 23:59:11 +02:00
|
|
|
will install a setting there corresponding to its system environment.
|
2017-11-23 15:39:47 +01:00
|
|
|
See <xref linkend="datatype-timezones"/> for more information.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
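<para>
The sketch below displays one instant under two settings. The outputs
shown assume an <literal>ISO</literal> output format in
<varname>DateStyle</varname>; the zone <literal>Europe/Berlin</literal>
is chosen only as an example.
</para>
<programlisting>
SET TimeZone = 'UTC';
SELECT TIMESTAMP WITH TIME ZONE '2024-01-15 12:00:00+00';
-- 2024-01-15 12:00:00+00

SET TimeZone = 'Europe/Berlin';
SELECT TIMESTAMP WITH TIME ZONE '2024-01-15 12:00:00+00';
-- 2024-01-15 13:00:00+01
</programlisting>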
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2006-07-25 05:51:23 +02:00
|
|
|
<varlistentry id="guc-timezone-abbreviations" xreflabel="timezone_abbreviations">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>timezone_abbreviations</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>timezone_abbreviations</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<indexterm><primary>time zone names</primary></indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2006-07-25 05:51:23 +02:00
|
|
|
Sets the collection of time zone abbreviations that will be accepted
|
2017-10-09 03:44:17 +02:00
|
|
|
by the server for datetime input. The default is <literal>'Default'</literal>,
|
2006-07-25 05:51:23 +02:00
|
|
|
which is a collection that works in most of the world; there are
|
Support timezone abbreviations that sometimes change.
Up to now, PG has assumed that any given timezone abbreviation (such as
"EDT") represents a constant GMT offset in the usage of any particular
region; we had a way to configure what that offset was, but not for it
to be changeable over time. But, as with most things horological, this
view of the world is too simplistic: there are numerous regions that have
at one time or another switched to a different GMT offset but kept using
the same timezone abbreviation. Almost the entire Russian Federation did
that a few years ago, and later this month they're going to do it again.
And there are similar examples all over the world.
To cope with this, invent the notion of a "dynamic timezone abbreviation",
which is one that is referenced to a particular underlying timezone
(as defined in the IANA timezone database) and means whatever it currently
means in that zone. For zones that use or have used daylight-savings time,
the standard and DST abbreviations continue to have the property that you
can specify standard or DST time and get that time offset whether or not
DST was theoretically in effect at the time. However, the abbreviations
mean what they meant at the time in question (or most recently before that
time) rather than being absolutely fixed.
The standard abbreviation-list files have been changed to use this behavior
for abbreviations that have actually varied in meaning since 1970. The
old simple-numeric definitions are kept for abbreviations that have not
changed, since they are a bit faster to resolve.
While this is clearly a new feature, it seems necessary to back-patch it
into all active branches, because otherwise use of Russian zone
abbreviations is going to become even more problematic than it already was.
This change supersedes the changes in commit 513d06ded et al to modify the
fixed meanings of the Russian abbreviations; since we've not shipped that
yet, this will avoid an undesirably incompatible (not to mention incorrect)
change in behavior for timestamps between 2011 and 2014.
This patch makes some cosmetic changes in ecpglib to keep its usage of
datetime lookup tables as similar as possible to the backend code, but
doesn't do anything about the increasingly obsolete set of timezone
abbreviation definitions that are hard-wired into ecpglib. Whatever we
do about that will likely not be appropriate material for back-patching.
Also, a potential free() of a garbage pointer after an out-of-memory
failure in ecpglib has been fixed.
This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that
caused it to produce unexpected results near a timezone transition, if
both the "before" and "after" states are marked as standard time. We'd
only ever thought about or tested transitions between standard and DST
time, but that's not what's happening when a zone simply redefines their
base GMT offset.
In passing, update the SGML documentation to refer to the Olson/zoneinfo/
zic timezone database as the "IANA" database, since it's now being
maintained under the auspices of IANA.
2014-10-16 21:22:10 +02:00
|
|
|
also <literal>'Australia'</literal> and <literal>'India'</literal>,
|
|
|
|
and other collections can be defined for a particular installation.
|
2017-11-23 15:39:47 +01:00
|
|
|
See <xref linkend="datetime-config-files"/> for more information.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
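<para>
One way to see the effect of switching collections is to look up the
same abbreviation in the <structname>pg_timezone_abbrevs</structname>
view before and after the change. The abbreviation
<literal>EST</literal> is used here only as an example of one whose
meaning may differ between collections.
</para>
<programlisting>
SET timezone_abbreviations = 'Default';
SELECT abbrev, utc_offset, is_dst
FROM pg_timezone_abbrevs
WHERE abbrev = 'EST';

SET timezone_abbreviations = 'Australia';
SELECT abbrev, utc_offset, is_dst
FROM pg_timezone_abbrevs
WHERE abbrev = 'EST';      -- compare utc_offset with the previous result
</programlisting>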
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-extra-float-digits" xreflabel="extra_float_digits">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>extra_float_digits</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
|
|
|
<primary>significant digits</primary>
|
|
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
|
|
<primary>floating-point</primary>
|
|
|
|
<secondary>display</secondary>
|
|
|
|
</indexterm>
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>extra_float_digits</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
Change floating-point output format for improved performance.
Previously, floating-point output was done by rounding to a specific
decimal precision; by default, to 6 or 15 decimal digits (losing
information) or as requested using extra_float_digits. Drivers that
wanted exact float values, and applications like pg_dump that must
preserve values exactly, set extra_float_digits=3 (or sometimes 2 for
historical reasons, though this isn't enough for float4).
Unfortunately, decimal rounded output is slow enough to become a
noticeable bottleneck when dealing with large result sets or COPY of
large tables when many floating-point values are involved.
Floating-point output can be done much faster when the output is not
rounded to a specific decimal length, but rather is chosen as the
shortest decimal representation that is closer to the original float
value than to any other value representable in the same precision. The
recently published Ryu algorithm by Ulf Adams is both relatively
simple and remarkably fast.
Accordingly, change float4out/float8out to output shortest decimal
representations if extra_float_digits is greater than 0, and make that
the new default. Applications that need rounded output can set
extra_float_digits back to 0 or below, and take the resulting
performance hit.
We make one concession to portability for systems with buggy
floating-point input: we do not output decimal values that fall
exactly halfway between adjacent representable binary values (which
would rely on the reader doing round-to-nearest-even correctly). This
is known to be a problem at least for VS2013 on Windows.
Our version of the Ryu code originates from
https://github.com/ulfjack/ryu/ at commit c9c3fb1979, but with the
following (significant) modifications:
- Output format is changed to use fixed-point notation for small
exponents, as printf would, and also to use lowercase 'e', a
minimum of 2 exponent digits, and a mandatory sign on the exponent,
to keep the formatting as close as possible to previous output.
- The output of exact midpoint values is disabled as noted above.
- The integer fast-path code is changed somewhat (since we have
fixed-point output and the upstream did not).
- Our project style has been largely applied to the code with the
exception of C99 declaration-after-statement, which has been
retained as an exception to our present policy.
- Most of upstream's debugging and conditionals are removed, and we
use our own configure tests to determine things like uint128
availability.
Changing the float output format obviously affects a number of
regression tests. This patch uses an explicit setting of
extra_float_digits=0 for test output that is not expected to be
exactly reproducible (e.g. due to numerical instability or differing
algorithms for transcendental functions).
Conversions from floats to numeric are unchanged by this patch. These
may appear in index expressions and it is not yet clear whether any
change should be made, so that can be left for another day.
This patch assumes that the only supported floating point format is
now IEEE format, and the documentation is updated to reflect that.
Code by me, adapting the work of Ulf Adams and other contributors.
References:
https://dl.acm.org/citation.cfm?id=3192369
Reviewed-by: Tom Lane, Andres Freund, Donald Dong
Discussion: https://postgr.es/m/87r2el1bx6.fsf@news-spur.riddles.org.uk
2019-02-13 16:20:33 +01:00
|
|
|
This parameter adjusts the number of digits used for textual output of
|
2017-10-09 03:44:17 +02:00
|
|
|
floating-point values, including <type>float4</type>, <type>float8</type>,
|
|
|
|
and geometric data types.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
If the value is 1 (the default) or above, float values are output in
|
|
|
|
shortest-precise format; see <xref linkend="datatype-float"/>. The
|
|
|
|
actual number of digits generated depends only on the value being
|
|
|
|
output, not on the value of this parameter. At most 17 digits are
|
|
|
|
required for <type>float8</type> values, and 9 for <type>float4</type>
|
|
|
|
values. This format is both fast and precise, preserving the original
|
|
|
|
binary float value exactly when correctly read. For historical
|
|
|
|
compatibility, values up to 3 are permitted.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
If the value is zero or negative, then the output is rounded to a
|
|
|
|
given decimal precision. The precision used is the standard number of
|
|
|
|
digits for the type (<literal>FLT_DIG</literal>
|
|
|
|
or <literal>DBL_DIG</literal> as appropriate) reduced according to the
|
2019-08-20 05:36:31 +02:00
|
|
|
value of this parameter. (For example, specifying -1 will cause
|
|
|
|
<type>float4</type> values to be output rounded to 5 significant
|
|
|
|
digits, and <type>float8</type> values
|
|
|
|
rounded to 14 digits.) This format is slower and does not preserve all
|
|
|
|
the bits of the binary float value, but may be more human-readable.
|
|
|
|
</para>
|
|
|
|
<note>
|
|
|
|
<para>
|
|
|
|
The meaning of this parameter, and its default value, changed
|
|
|
|
in <productname>PostgreSQL</productname> 12;
|
|
|
|
see <xref linkend="datatype-float"/> for further discussion.
|
|
|
|
</para>
|
|
|
|
</note>
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-client-encoding" xreflabel="client_encoding">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>client_encoding</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>client_encoding</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<indexterm><primary>character set</primary></indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the client-side encoding (character set).
|
|
|
|
The default is to use the database encoding.
|
2011-02-01 06:26:17 +01:00
|
|
|
The character sets supported by the <productname>PostgreSQL</productname>
|
2017-11-23 15:39:47 +01:00
|
|
|
server are described in <xref linkend="multibyte-charset-supported"/>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
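<para>
As a minimal sketch, a session could override the default and then check
the result as follows. The encoding name <literal>UTF8</literal> is only
an illustration; any supported character set can be used instead.
<programlisting>
SET client_encoding TO 'UTF8';
SHOW client_encoding;
</programlisting>
</para>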
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-lc-messages" xreflabel="lc_messages">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>lc_messages</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>lc_messages</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the language in which messages are displayed. Acceptable
|
2017-11-23 15:39:47 +01:00
|
|
|
values are system-dependent; see <xref linkend="locale"/> for
|
2005-09-13 00:11:38 +02:00
|
|
|
more information. If this variable is set to the empty string
|
|
|
|
(which is the default) then the value is inherited from the
|
|
|
|
execution environment of the server in a system-dependent way.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
On some systems, this locale category does not exist. Setting
|
|
|
|
this variable will still work, but there will be no effect.
|
|
|
|
Also, there is a chance that no translated messages for the
|
|
|
|
desired language exist. In that case you will continue to see
|
|
|
|
the English messages.
|
|
|
|
</para>
|
2006-01-23 19:16:41 +01:00
|
|
|
|
|
|
|
<para>
|
|
|
|
Only superusers can change this setting, because it affects the
|
2010-02-03 18:25:06 +01:00
|
|
|
messages sent to the server log as well as to the client, and
|
|
|
|
an improper value might obscure the readability of the server
|
|
|
|
logs.
|
2006-01-23 19:16:41 +01:00
|
|
|
</para>
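<para>
As a minimal sketch, an entry in <filename>postgresql.conf</filename>
that keeps messages untranslated might look like this. The locale name
<literal>C</literal> is an assumption that holds on most platforms;
acceptable values remain system-dependent.
<programlisting>
lc_messages = 'C'
</programlisting>
</para>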
|
2005-09-13 00:11:38 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-lc-monetary" xreflabel="lc_monetary">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>lc_monetary</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>lc_monetary</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the locale to use for formatting monetary amounts, for
|
|
|
|
example with the <function>to_char</function> family of
|
|
|
|
functions. Acceptable values are system-dependent; see <xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="locale"/> for more information. If this variable is
|
2005-09-13 00:11:38 +02:00
|
|
|
set to the empty string (which is the default) then the value
|
|
|
|
is inherited from the execution environment of the server in a
|
|
|
|
system-dependent way.
|
|
|
|
</para>
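<para>
As a minimal sketch, the effect could be observed with the
<literal>L</literal> (currency symbol) pattern of
<function>to_char</function>. The locale name
<literal>en_GB.UTF-8</literal> is an assumption and might not exist on
a given system.
<programlisting>
SET lc_monetary TO 'en_GB.UTF-8';
SELECT to_char(1234.50, 'L9999D99');
</programlisting>
</para>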
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-lc-numeric" xreflabel="lc_numeric">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>lc_numeric</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>lc_numeric</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Sets the locale to use for formatting numbers, for example
|
|
|
|
with the <function>to_char</function> family of
|
|
|
|
functions. Acceptable values are system-dependent; see <xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="locale"/> for more information. If this variable is
|
2005-09-13 00:11:38 +02:00
|
|
|
set to the empty string (which is the default) then the value
|
|
|
|
is inherited from the execution environment of the server in a
|
|
|
|
system-dependent way.
|
|
|
|
</para>
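<para>
As a minimal sketch, the <literal>G</literal> (group separator) and
<literal>D</literal> (decimal point) patterns of
<function>to_char</function> follow this setting. The locale name
<literal>de_DE.UTF-8</literal> is an assumption and might not exist on
a given system.
<programlisting>
SET lc_numeric TO 'de_DE.UTF-8';
SELECT to_char(1234567.89, '9G999G999D99');
</programlisting>
</para>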
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-lc-time" xreflabel="lc_time">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>lc_time</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>lc_time</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2008-05-19 20:08:16 +02:00
|
|
|
Sets the locale to use for formatting dates and times, for example
|
|
|
|
with the <function>to_char</function> family of
|
|
|
|
functions. Acceptable values are system-dependent; see <xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="locale"/> for more information. If this variable is
|
2005-09-13 00:11:38 +02:00
|
|
|
set to the empty string (which is the default) then the value
|
|
|
|
is inherited from the execution environment of the server in a
|
|
|
|
system-dependent way.
|
|
|
|
</para>
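<para>
As a minimal sketch, the <literal>TM</literal> prefix in a
<function>to_char</function> pattern requests translated day and month
names according to this setting. The locale name
<literal>fr_FR.UTF-8</literal> is an assumption and might not exist on
a given system.
<programlisting>
SET lc_time TO 'fr_FR.UTF-8';
SELECT to_char(current_date, 'TMDay DD TMMonth YYYY');
</programlisting>
</para>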
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2007-08-22 06:45:20 +02:00
|
|
|
<varlistentry id="guc-default-text-search-config" xreflabel="default_text_search_config">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>default_text_search_config</varname> (<type>string</type>)
|
2007-08-22 06:45:20 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>default_text_search_config</varname> configuration parameter</primary>
|
2007-08-22 06:45:20 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2007-08-22 06:45:20 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Selects the text search configuration that is used by those variants
|
|
|
|
of the text search functions that do not have an explicit argument
|
|
|
|
specifying the configuration.
|
2017-11-23 15:39:47 +01:00
|
|
|
See <xref linkend="textsearch"/> for further information.
|
2017-10-09 03:44:17 +02:00
|
|
|
The built-in default is <literal>pg_catalog.simple</literal>, but
|
2007-08-22 06:45:20 +02:00
|
|
|
<application>initdb</application> will initialize the
|
|
|
|
configuration file with a setting that corresponds to the
|
|
|
|
chosen <varname>lc_ctype</varname> locale, if a configuration
|
|
|
|
matching that locale can be identified.
|
|
|
|
</para>
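<para>
As a minimal sketch, the following contrasts a call that relies on this
parameter with one that names a configuration explicitly. The sample
text is arbitrary, and <literal>pg_catalog.english</literal> is used
only as an illustration.
<programlisting>
SET default_text_search_config = 'pg_catalog.english';

-- uses default_text_search_config
SELECT to_tsvector('The quick brown foxes jumped');

-- ignores the parameter because the configuration is given explicitly
SELECT to_tsvector('simple', 'The quick brown foxes jumped');
</programlisting>
</para>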
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
|
|
|
|
</sect2>
|
2013-06-13 04:28:24 +02:00
|
|
|
|
|
|
|
<sect2 id="runtime-config-client-preload">
|
|
|
|
<title>Shared Library Preloading</title>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Several settings are available for preloading shared libraries into the
|
|
|
|
server, in order to load additional functionality or achieve performance
|
|
|
|
benefits. For example, a setting of
|
|
|
|
<literal>'$libdir/mylib'</literal> would cause
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>mylib.so</literal> (or on some platforms,
|
|
|
|
<literal>mylib.sl</literal>) to be preloaded from the installation's standard
|
2013-06-13 04:28:24 +02:00
|
|
|
library directory. The differences between the settings are when they
|
|
|
|
take effect and what privileges are required to change them.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
<productname>PostgreSQL</productname> procedural language libraries can
|
|
|
|
be preloaded in this way, typically by using the
|
|
|
|
syntax <literal>'$libdir/plXXX'</literal> where
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>XXX</literal> is <literal>pgsql</literal>, <literal>perl</literal>,
|
|
|
|
<literal>tcl</literal>, or <literal>python</literal>.
|
2013-06-13 04:28:24 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Only shared libraries specifically intended to be used with PostgreSQL
|
|
|
|
can be loaded this way. Every PostgreSQL-supported library has
|
2017-10-09 03:44:17 +02:00
|
|
|
a <quote>magic block</quote> that is checked to guarantee compatibility. For
|
2013-06-13 04:28:24 +02:00
|
|
|
this reason, non-PostgreSQL libraries cannot be loaded in this way. You
|
|
|
|
might be able to use operating-system facilities such
|
|
|
|
as <envar>LD_PRELOAD</envar> for that.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
In general, refer to the documentation of a specific module for the
|
|
|
|
recommended way to load that module.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry id="guc-local-preload-libraries" xreflabel="local_preload_libraries">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>local_preload_libraries</varname> (<type>string</type>)
|
2013-06-13 04:28:24 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>local_preload_libraries</varname> configuration parameter</primary>
|
2013-06-13 04:28:24 +02:00
|
|
|
</indexterm>
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><filename>$libdir/plugins</filename></primary>
|
2013-06-13 04:28:24 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2013-06-13 04:28:24 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This variable specifies one or more shared libraries that are to be
|
2014-12-23 05:05:46 +01:00
|
|
|
preloaded at connection start.
|
2017-06-20 19:39:57 +02:00
|
|
|
It contains a comma-separated list of library names, where each name
|
|
|
|
is interpreted as for the <link linkend="sql-load"><command>LOAD</command></link> command.
|
2017-06-20 19:39:57 +02:00
|
|
|
Whitespace between entries is ignored; surround a library name with
|
|
|
|
double quotes if you need to include whitespace or commas in the name.
|
2014-12-23 05:05:46 +01:00
|
|
|
The parameter value only takes effect at the start of the connection.
|
|
|
|
Subsequent changes have no effect. If a specified library is not
|
2013-06-13 04:28:24 +02:00
|
|
|
found, the connection attempt will fail.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
This option can be set by any user. Because of that, the libraries
|
|
|
|
that can be loaded are restricted to those appearing in the
|
2017-10-09 03:44:17 +02:00
|
|
|
<filename>plugins</filename> subdirectory of the installation's
|
2013-06-13 04:28:24 +02:00
|
|
|
standard library directory. (It is the database administrator's
|
2017-10-09 03:44:17 +02:00
|
|
|
responsibility to ensure that only <quote>safe</quote> libraries
|
|
|
|
are installed there.) Entries in <varname>local_preload_libraries</varname>
|
2013-06-13 04:28:24 +02:00
|
|
|
can specify this directory explicitly, for example
|
|
|
|
<literal>$libdir/plugins/mylib</literal>, or just specify
|
|
|
|
the library name — <literal>mylib</literal> would have
|
|
|
|
the same effect as <literal>$libdir/plugins/mylib</literal>.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2014-12-23 05:05:46 +01:00
|
|
|
The intent of this feature is to allow unprivileged users to load
|
|
|
|
debugging or performance-measurement libraries into specific sessions
|
2017-10-09 03:44:17 +02:00
|
|
|
without requiring an explicit <command>LOAD</command> command. To that end,
|
2014-12-23 05:05:46 +01:00
|
|
|
it would be typical to set this parameter using
|
|
|
|
the <envar>PGOPTIONS</envar> environment variable on the client or by
|
|
|
|
using
|
2017-10-09 03:44:17 +02:00
|
|
|
<command>ALTER ROLE SET</command>.
|
2014-12-23 05:05:46 +01:00
|
|
|
</para>
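<para>
As a minimal sketch, a role could arrange for a library to be loaded in
all of its future sessions as shown below. The role name
<literal>dev_user</literal> is hypothetical, and
<literal>mylib</literal> is assumed to be installed in the
<filename>plugins</filename> subdirectory described above.
<programlisting>
ALTER ROLE dev_user SET local_preload_libraries = 'mylib';
</programlisting>
The setting is picked up the next time that role connects.
</para>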
|
|
|
|
|
|
|
|
<para>
|
|
|
|
However, unless a module is specifically designed to be used in this way by
|
2013-06-13 04:28:24 +02:00
|
|
|
non-superusers, this is usually not the right setting to use. Look
|
2017-11-23 15:39:47 +01:00
|
|
|
at <xref linkend="guc-session-preload-libraries"/> instead.
|
2013-06-13 04:28:24 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
|
|
|
|
<varlistentry id="guc-session-preload-libraries" xreflabel="session_preload_libraries">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>session_preload_libraries</varname> (<type>string</type>)
|
2013-06-13 04:28:24 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>session_preload_libraries</varname> configuration parameter</primary>
|
2013-06-13 04:28:24 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2013-06-13 04:28:24 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This variable specifies one or more shared libraries that are to be
|
2017-06-20 19:39:57 +02:00
|
|
|
preloaded at connection start.
|
|
|
|
It contains a comma-separated list of library names, where each name
|
|
|
|
is interpreted as for the <link linkend="sql-load"><command>LOAD</command></link> command.
|
2017-06-20 19:39:57 +02:00
|
|
|
Whitespace between entries is ignored; surround a library name with
|
|
|
|
double quotes if you need to include whitespace or commas in the name.
|
2013-06-13 04:28:24 +02:00
|
|
|
The parameter value only takes effect at the start of the connection.
|
|
|
|
Subsequent changes have no effect. If a specified library is not
|
|
|
|
found, the connection attempt will fail.
|
2017-06-20 19:39:57 +02:00
|
|
|
Only superusers can change this setting.
|
2013-06-13 04:28:24 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The intent of this feature is to allow debugging or
|
|
|
|
performance-measurement libraries to be loaded into specific sessions
|
|
|
|
without an explicit
|
2017-10-09 03:44:17 +02:00
|
|
|
<command>LOAD</command> command being given. For
|
2017-11-23 15:39:47 +01:00
|
|
|
example, <xref linkend="auto-explain"/> could be enabled for all
|
2013-06-13 04:28:24 +02:00
|
|
|
sessions under a given user name by setting this parameter
|
2017-10-09 03:44:17 +02:00
|
|
|
with <command>ALTER ROLE SET</command>. Also, this parameter can be changed
|
2013-06-13 04:28:24 +02:00
|
|
|
without restarting the server (but changes only take effect when a new
|
|
|
|
session is started), so it is easier to add new modules this way, even
|
|
|
|
if they should apply to all sessions.
|
|
|
|
</para>
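<para>
As a minimal sketch of the per-user case described above, a superuser
might run the following. The role name <literal>app_user</literal> is
hypothetical.
<programlisting>
ALTER ROLE app_user SET session_preload_libraries = 'auto_explain';
</programlisting>
Sessions that are already open are not affected; the library is loaded
only when <literal>app_user</literal> starts a new session.
</para>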
|
|
|
|
|
|
|
|
<para>
|
2017-11-23 15:39:47 +01:00
|
|
|
Unlike <xref linkend="guc-shared-preload-libraries"/>, there is no large
|
2013-06-13 04:28:24 +02:00
|
|
|
performance advantage to loading a library at session start rather than
|
|
|
|
when it is first used. There is some advantage, however, when
|
|
|
|
connection pooling is used.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-shared-preload-libraries" xreflabel="shared_preload_libraries">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>shared_preload_libraries</varname> (<type>string</type>)
|
2013-06-13 04:28:24 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>shared_preload_libraries</varname> configuration parameter</primary>
|
2013-06-13 04:28:24 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2013-06-13 04:28:24 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This variable specifies one or more shared libraries to be preloaded at
|
2017-06-20 19:39:57 +02:00
|
|
|
server start.
|
|
|
|
It contains a comma-separated list of library names, where each name
|
|
|
|
is interpreted as for the <link linkend="sql-load"><command>LOAD</command></link> command.
|
2017-06-20 19:39:57 +02:00
|
|
|
Whitespace between entries is ignored; surround a library name with
|
|
|
|
double quotes if you need to include whitespace or commas in the name.
|
|
|
|
This parameter can only be set at server start. If a specified
|
|
|
|
library is not found, the server will fail to start.
|
2013-06-13 04:28:24 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Some libraries need to perform certain operations that can only take
|
|
|
|
place at postmaster start, such as allocating shared memory, reserving
|
|
|
|
light-weight locks, or starting background workers. Those libraries
|
|
|
|
must be loaded at server start through this parameter. See the
|
|
|
|
documentation of each library for details.
|
|
|
|
</para>
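<para>
As a minimal sketch, an entry in <filename>postgresql.conf</filename>
that preloads two contrib modules might look like this. The choice of
modules is only an illustration, and it assumes both are installed; a
server restart is needed for the change to take effect.
<programlisting>
shared_preload_libraries = 'pg_stat_statements, auto_explain'
</programlisting>
</para>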
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Other libraries can also be preloaded. By preloading a shared library,
|
|
|
|
the library startup time is avoided when the library is first used.
|
|
|
|
However, the time to start each new server process might increase
|
|
|
|
slightly, even if that process never uses the library. So this
|
|
|
|
parameter is recommended only for libraries that will be used in most
|
|
|
|
sessions. Also, changing this parameter requires a server restart, so
|
|
|
|
this is not the right setting to use for tasks such as short-term debugging.
Use <xref linkend="guc-session-preload-libraries"/> for that
|
2013-06-13 04:28:24 +02:00
|
|
|
instead.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<note>
|
|
|
|
<para>
|
|
|
|
On Windows hosts, preloading a library at server start will not reduce
|
|
|
|
the time required to start each new server process; each server process
|
|
|
|
will re-load all preload libraries. However, <varname>shared_preload_libraries</varname>
is still useful on Windows hosts for libraries that need to
|
2013-09-15 17:01:14 +02:00
|
|
|
perform operations at postmaster start time.
|
2013-06-13 04:28:24 +02:00
|
|
|
</para>
|
|
|
|
</note>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2018-03-28 23:22:42 +02:00
|
|
|
|
|
|
|
<varlistentry id="guc-jit-provider" xreflabel="jit_provider">
|
|
|
|
<term><varname>jit_provider</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>jit_provider</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-09-15 23:24:35 +02:00
|
|
|
This variable is the name of the JIT provider library to be used
|
|
|
|
(see <xref linkend="jit-pluggable"/>).
|
|
|
|
The default is <literal>llvmjit</literal>.
|
|
|
|
This parameter can only be set at server start.
|
2018-03-28 23:22:42 +02:00
|
|
|
</para>
|
2018-09-15 23:24:35 +02:00
|
|
|
|
2018-03-28 23:22:42 +02:00
|
|
|
<para>
|
2018-09-15 23:24:35 +02:00
|
|
|
If set to a non-existent library, <acronym>JIT</acronym> will not be
|
2018-03-28 23:22:42 +02:00
|
|
|
available, but no error will be raised. This allows <acronym>JIT</acronym> support to be
|
|
|
|
installed separately from the main
|
|
|
|
<productname>PostgreSQL</productname> package.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2013-06-13 04:28:24 +02:00
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<sect2 id="runtime-config-client-other">
|
|
|
|
<title>Other Defaults</title>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
|
|
|
|
<varlistentry id="guc-dynamic-library-path" xreflabel="dynamic_library_path">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>dynamic_library_path</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>dynamic_library_path</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<indexterm><primary>dynamic loading</primary></indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
If a dynamically loadable module needs to be opened and the
|
|
|
|
file name specified in the <command>CREATE FUNCTION</command> or
|
|
|
|
<command>LOAD</command> command
|
2009-04-27 18:27:36 +02:00
|
|
|
does not have a directory component (i.e., the
|
2005-09-13 00:11:38 +02:00
|
|
|
name does not contain a slash), the system will search this
|
|
|
|
path for the required file.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2010-02-03 18:25:06 +01:00
|
|
|
The value for <varname>dynamic_library_path</varname> must be a
|
2005-09-13 00:11:38 +02:00
|
|
|
list of absolute directory paths separated by colons (or semicolons
|
|
|
|
on Windows). If a list element starts
|
|
|
|
with the special string <literal>$libdir</literal>, the
|
|
|
|
compiled-in <productname>PostgreSQL</productname> package
|
2010-02-03 18:25:06 +01:00
|
|
|
library directory is substituted for <literal>$libdir</literal>; this
|
2005-09-13 00:11:38 +02:00
|
|
|
is where the modules provided by the standard
|
|
|
|
<productname>PostgreSQL</productname> distribution are installed.
|
|
|
|
(Use <literal>pg_config --pkglibdir</literal> to find out the name of
|
|
|
|
this directory.) For example:
|
|
|
|
<programlisting>
|
|
|
|
dynamic_library_path = '/usr/local/lib/postgresql:/home/my_project/lib:$libdir'
|
|
|
|
</programlisting>
|
|
|
|
or, in a Windows environment:
|
|
|
|
<programlisting>
|
|
|
|
dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
|
|
|
|
</programlisting>
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The default value for this parameter is
|
|
|
|
<literal>'$libdir'</literal>. If the value is set to an empty
|
|
|
|
string, the automatic path search is turned off.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
This parameter can be changed at run time by superusers, but a
|
|
|
|
setting made that way will only persist until the end of the
|
|
|
|
client connection, so this method should be reserved for
|
|
|
|
development purposes. The recommended way to set this parameter
|
|
|
|
is in the <filename>postgresql.conf</filename> configuration
|
|
|
|
file.
|
|
|
|
</para>
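<para>
As a minimal sketch of the run-time method, a superuser could extend
the search path for the current session and then load a module by its
bare name. The directory is taken from the example above, and
<literal>mylib</literal> is a hypothetical module assumed to be present
in it.
<programlisting>
SET dynamic_library_path TO '/home/my_project/lib:$libdir';
LOAD 'mylib';
</programlisting>
</para>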
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2006-10-07 21:25:29 +02:00
|
|
|
|
|
|
|
<varlistentry id="guc-gin-fuzzy-search-limit" xreflabel="gin_fuzzy_search_limit">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>gin_fuzzy_search_limit</varname> (<type>integer</type>)
|
2006-10-07 21:25:29 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>gin_fuzzy_search_limit</varname> configuration parameter</primary>
|
2006-10-07 21:25:29 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2006-10-07 21:25:29 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2010-02-03 18:25:06 +01:00
|
|
|
Sets a soft upper limit on the size of the set returned by GIN index scans. For more
|
2017-11-23 15:39:47 +01:00
|
|
|
information see <xref linkend="gin-tips"/>.
|
2006-10-07 21:25:29 +02:00
|
|
|
</para>
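<para>
As a minimal sketch, a session could cap the result size before running
a search on a very common term. The value <literal>10000</literal> is
only an illustration, and the table <literal>documents</literal> and
its <type>tsvector</type> column <literal>body_tsv</literal> are
hypothetical.
<programlisting>
SET gin_fuzzy_search_limit = 10000;
SELECT count(*) FROM documents WHERE body_tsv @@ to_tsquery('common');
</programlisting>
</para>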
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2007-06-03 19:08:34 +02:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="runtime-config-locks">
|
|
|
|
<title>Lock Management</title>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
|
|
|
|
<varlistentry id="guc-deadlock-timeout" xreflabel="deadlock_timeout">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>deadlock_timeout</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
|
|
|
<primary>deadlock</primary>
|
|
|
|
<secondary>timeout during</secondary>
|
|
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
|
|
<primary>timeout</primary>
|
|
|
|
<secondary>deadlock</secondary>
|
|
|
|
</indexterm>
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>deadlock_timeout</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This is the amount of time to wait on a lock
|
2005-09-13 00:11:38 +02:00
|
|
|
before checking to see if there is a deadlock condition. The
|
2010-02-03 18:25:06 +01:00
|
|
|
check for deadlock is relatively expensive, so the server doesn't run
|
2007-03-03 19:46:40 +01:00
|
|
|
it every time it waits for a lock. We optimistically assume
|
2005-09-13 00:11:38 +02:00
|
|
|
that deadlocks are not common in production applications and
|
2010-02-03 18:25:06 +01:00
|
|
|
just wait on the lock for a while before checking for a
|
2005-09-13 00:11:38 +02:00
|
|
|
deadlock. Increasing this value reduces the amount of time
|
|
|
|
wasted in needless deadlock checks, but slows down reporting of
|
Doc: improve documentation of configuration settings that have units.
When we added the GUC units feature, we didn't make any great effort
to adjust the documentation of individual GUCs; they tended to still
say things like "this is the number of milliseconds that ...", even
though users might prefer to write some other units, and SHOW might
even show the value in other units. Commit 6c9fb69f2 made an effort
to improve this situation, but I thought it made things less readable
by injecting units information in mid-sentence. It also wasn't very
consistent, and did not touch all the GUCs that have units.
To improve matters, standardize on the phrasing "If this value is
specified without units, it is taken as <units>". Also, try to
standardize where this is mentioned, right before the specification
of the default. (In a couple of places, doing that would've required
more rewriting than seemed justified, so I wasn't 100% consistent
about that.) I also tried to use the phrases "amount of time",
"amount of memory", etc rather than describing the contents of GUCs
in other ways, as those were the majority usage in places that weren't
overcommitting to a particular unit. (I left "length of time" alone
in a couple of places, though.)
I failed to resist the temptation to copy-edit some awkward text, too.
Backpatch to v12, like 6c9fb69f2, mainly because v12 hasn't diverged
much from HEAD yet.
Discussion: https://postgr.es/m/15882.1571942223@sss.pgh.pa.us
2019-10-26 18:30:41 +02:00
|
|
|
real deadlock errors.
|
|
|
|
If this value is specified without units, it is taken as milliseconds.
|
|
|
|
The default is one second (<literal>1s</literal>),
|
2005-09-13 00:11:38 +02:00
|
|
|
which is probably about the smallest value you would want in
|
2007-06-19 22:13:22 +02:00
|
|
|
practice. On a heavily loaded server you might want to raise it.
|
2005-09-13 00:11:38 +02:00
|
|
|
Ideally the setting should exceed your typical transaction time,
|
2007-06-19 22:13:22 +02:00
|
|
|
so as to improve the odds that a lock will be released before
|
2011-06-22 04:32:30 +02:00
|
|
|
the waiter decides to check for deadlock. Only superusers can change
|
|
|
|
this setting.
|
2007-06-19 22:13:22 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2017-11-23 15:39:47 +01:00
|
|
|
When <xref linkend="guc-log-lock-waits"/> is set,
|
Doc: improve documentation of configuration settings that have units.
When we added the GUC units feature, we didn't make any great effort
to adjust the documentation of individual GUCs; they tended to still
say things like "this is the number of milliseconds that ...", even
though users might prefer to write some other units, and SHOW might
even show the value in other units. Commit 6c9fb69f2 made an effort
to improve this situation, but I thought it made things less readable
by injecting units information in mid-sentence. It also wasn't very
consistent, and did not touch all the GUCs that have units.
To improve matters, standardize on the phrasing "If this value is
specified without units, it is taken as <units>". Also, try to
standardize where this is mentioned, right before the specification
of the default. (In a couple of places, doing that would've required
more rewriting than seemed justified, so I wasn't 100% consistent
about that.) I also tried to use the phrases "amount of time",
"amount of memory", etc rather than describing the contents of GUCs
in other ways, as those were the majority usage in places that weren't
overcommitting to a particular unit. (I left "length of time" alone
in a couple of places, though.)
I failed to resist the temptation to copy-edit some awkward text, too.
Backpatch to v12, like 6c9fb69f2, mainly because v12 hasn't diverged
much from HEAD yet.
Discussion: https://postgr.es/m/15882.1571942223@sss.pgh.pa.us
2019-10-26 18:30:41 +02:00
|
|
|
this parameter also determines the amount of time to wait before
|
2007-06-19 22:13:22 +02:00
|
|
|
a log message is issued about the lock wait. If you are trying
|
|
|
|
to investigate locking delays you might want to set a shorter than
|
|
|
|
normal <varname>deadlock_timeout</varname>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
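<para>
As an illustrative sketch only (the values shown are arbitrary examples,
not recommendations), a superuser session might inspect and adjust the
setting as follows. A bare number is taken as milliseconds, so the two
<command>SET</command> commands below request the same timeout:
</para>
<programlisting>
SHOW deadlock_timeout;

-- Explicit units:
SET deadlock_timeout = '500ms';

-- No units, so the value is taken as milliseconds:
SET deadlock_timeout = 500;

-- Also log lock waits that last longer than deadlock_timeout:
SET log_lock_waits = on;
</programlisting>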
</listitem>
</varlistentry>

<varlistentry id="guc-max-locks-per-transaction" xreflabel="max_locks_per_transaction">
<term><varname>max_locks_per_transaction</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_locks_per_transaction</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The shared lock table can track locks on
<varname>max_locks_per_transaction</varname> * (<xref
linkend="guc-max-connections"/> + <xref
linkend="guc-max-prepared-transactions"/>) objects (e.g., tables);
hence, no more than this many distinct objects can be locked at
any one time. This parameter controls the average number of object
locks allocated for each transaction; individual transactions
can lock more objects as long as the locks of all transactions
fit in the lock table. This is <emphasis>not</emphasis> the number of
rows that can be locked; that value is unlimited. The default,
64, has historically proven sufficient, but you might need to
raise this value if you have queries that touch many different
tables in a single transaction, e.g., a query of a parent table with
many children. This parameter can only be set at server start.
</para>
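<para>
The capacity given by the formula above can be estimated for a running
server with a query such as the following. This is a sketch that simply
evaluates the documented formula; the column alias
<literal>lock_table_slots</literal> is made up for this example. With the
stock defaults of 64, 100 and 0 for the three settings, it evaluates to
6400.
</para>
<programlisting>
SELECT current_setting('max_locks_per_transaction')::int
       * (current_setting('max_connections')::int
          + current_setting('max_prepared_transactions')::int)
       AS lock_table_slots;
</programlisting>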
<para>
When running a standby server, you must set this parameter to a value
equal to or higher than that on the primary server. Otherwise, queries
will not be allowed on the standby server.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-max-pred-locks-per-transaction" xreflabel="max_pred_locks_per_transaction">
<term><varname>max_pred_locks_per_transaction</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_pred_locks_per_transaction</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
The shared predicate lock table can track locks on
<varname>max_pred_locks_per_transaction</varname> * (<xref
linkend="guc-max-connections"/> + <xref
linkend="guc-max-prepared-transactions"/>) objects (e.g., tables);
hence, no more than this many distinct objects can be locked at
any one time. This parameter controls the average number of object
locks allocated for each transaction; individual transactions
can lock more objects as long as the locks of all transactions
fit in the lock table. This is <emphasis>not</emphasis> the number of
rows that can be locked; that value is unlimited. The default,
64, has generally been sufficient in testing, but you might need to
raise this value if you have clients that touch many different
tables in a single serializable transaction. This parameter can
only be set at server start.
</para>
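<para>
As a rough way to judge whether the table is close to full, the documented
capacity can be compared with the number of predicate locks currently held.
The sketch below assumes that predicate locks are reported in the
<structname>pg_locks</structname> view with mode
<literal>SIReadLock</literal>; the column aliases are made up for this
example.
</para>
<programlisting>
SELECT current_setting('max_pred_locks_per_transaction')::int
       * (current_setting('max_connections')::int
          + current_setting('max_prepared_transactions')::int)
       AS pred_lock_table_slots,
       (SELECT count(*) FROM pg_locks WHERE mode = 'SIReadLock')
       AS pred_locks_held;
</programlisting>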
</listitem>
</varlistentry>

<varlistentry id="guc-max-pred-locks-per-relation" xreflabel="max_pred_locks_per_relation">
<term><varname>max_pred_locks_per_relation</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_pred_locks_per_relation</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This controls how many pages or tuples of a single relation can be
predicate-locked before the lock is promoted to covering the whole
relation. Values greater than or equal to zero mean an absolute
limit, while negative values
mean <xref linkend="guc-max-pred-locks-per-transaction"/> divided by
the absolute value of this setting. The default is -2, which keeps
the behavior from previous versions of <productname>PostgreSQL</productname>.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
</para>
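<para>
The rule for negative values can be worked through with a query like the
one below. It is a sketch that follows the description above literally, so
the threshold the server actually applies may differ slightly; the alias
<literal>approx_relation_threshold</literal> is made up for this example.
With the defaults of -2 and 64, integer division gives 32.
</para>
<programlisting>
SELECT CASE
         WHEN r.v &gt;= 0 THEN r.v
         ELSE current_setting('max_pred_locks_per_transaction')::int / abs(r.v)
       END AS approx_relation_threshold
FROM (SELECT current_setting('max_pred_locks_per_relation')::int AS v) AS r;
</programlisting>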
</listitem>
</varlistentry>

<varlistentry id="guc-max-pred-locks-per-page" xreflabel="max_pred_locks_per_page">
<term><varname>max_pred_locks_per_page</varname> (<type>integer</type>)
<indexterm>
<primary><varname>max_pred_locks_per_page</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This controls how many rows on a single page can be predicate-locked
before the lock is promoted to covering the whole page. The default
is 2. This parameter can only be set in
the <filename>postgresql.conf</filename> file or on the server command line.
</para>
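<para>
For illustration, a <filename>postgresql.conf</filename> entry that lets
more rows on a page be locked individually before promotion might look
like this. The value 4 is an arbitrary example, not a recommendation, and
the configuration file must be reloaded (for instance with
<function>pg_reload_conf()</function>) before the change takes effect.
</para>
<programlisting>
max_pred_locks_per_page = 4    # example value; the default is 2
</programlisting>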
</listitem>
</varlistentry>

</variablelist>
</sect1>

<sect1 id="runtime-config-compatible">
<title>Version and Platform Compatibility</title>

<sect2 id="runtime-config-compatible-version">
<title>Previous PostgreSQL Versions</title>

<variablelist>

<varlistentry id="guc-array-nulls" xreflabel="array_nulls">
<term><varname>array_nulls</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>array_nulls</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
|
2005-11-17 23:14:56 +01:00
|
|
|
This controls whether the array input parser recognizes
|
2017-10-09 03:44:17 +02:00
|
|
|
unquoted <literal>NULL</literal> as specifying a null array element.
|
|
|
|
By default, this is <literal>on</literal>, allowing array values containing
|
|
|
|
null values to be entered. However, <productname>PostgreSQL</productname> versions
|
2006-10-23 20:10:32 +02:00
|
|
|
before 8.2 did not support null values in arrays, and therefore would
|
2017-10-09 03:44:17 +02:00
|
|
|
treat <literal>NULL</literal> as specifying a normal array element with
|
|
|
|
the string value <quote>NULL</quote>. For backward compatibility with
|
2005-11-17 23:14:56 +01:00
|
|
|
applications that require the old behavior, this variable can be
|
2017-10-09 03:44:17 +02:00
|
|
|
turned <literal>off</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2006-10-23 20:10:32 +02:00
|
|
|
Note that it is possible to create array values containing null values
|
2017-10-09 03:44:17 +02:00
|
|
|
even when this variable is <literal>off</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
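<para>
As an illustrative sketch (the array literals are arbitrary, and the
comments state the results implied by the description above), the
difference can be observed in a session:
</para>
<programlisting>
SET array_nulls = off;
SELECT ('{a,NULL}'::text[])[2] IS NULL;   -- false: the element is the string 'NULL'
SELECT (ARRAY['a', NULL])[2] IS NULL;     -- true: ARRAY[] can still create nulls

SET array_nulls = on;
SELECT ('{a,NULL}'::text[])[2] IS NULL;   -- true: unquoted NULL is a null element
</programlisting>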
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2006-05-21 22:10:42 +02:00
|
|
|
<varlistentry id="guc-backslash-quote" xreflabel="backslash_quote">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>backslash_quote</varname> (<type>enum</type>)
|
2017-10-09 03:44:17 +02:00
|
|
|
<indexterm><primary>strings</primary><secondary>backslash quotes</secondary></indexterm>
|
2006-05-21 22:10:42 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>backslash_quote</varname> configuration parameter</primary>
|
2006-05-21 22:10:42 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2006-05-21 22:10:42 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This controls whether a quote mark can be represented by
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>\'</literal> in a string literal. The preferred, SQL-standard way
|
|
|
|
to represent a quote mark is by doubling it (<literal>''</literal>) but
|
|
|
|
<productname>PostgreSQL</productname> has historically also accepted
|
|
|
|
<literal>\'</literal>. However, use of <literal>\'</literal> creates security risks
|
2006-05-21 22:10:42 +02:00
|
|
|
because in some client character set encodings, there are multibyte
|
|
|
|
characters in which the last byte is numerically equivalent to ASCII
|
2021-06-11 03:38:04 +02:00
|
|
|
<literal>\</literal>. If client-side code does escaping incorrectly then an
|
2006-05-21 22:10:42 +02:00
|
|
|
SQL-injection attack is possible. This risk can be prevented by
|
|
|
|
making the server reject queries in which a quote mark appears to be
|
|
|
|
escaped by a backslash.
|
2017-10-09 03:44:17 +02:00
|
|
|
The allowed values of <varname>backslash_quote</varname> are
|
|
|
|
<literal>on</literal> (allow <literal>\'</literal> always),
|
|
|
|
<literal>off</literal> (reject always), and
|
|
|
|
<literal>safe_encoding</literal> (allow only if client encoding does not
|
|
|
|
allow ASCII <literal>\</literal> within a multibyte character).
|
|
|
|
<literal>safe_encoding</literal> is the default setting.
|
2006-05-21 22:10:42 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Note that in a standard-conforming string literal, <literal>\</literal> just
|
|
|
|
means <literal>\</literal> anyway. This parameter only affects the handling of
|
2006-05-21 22:10:42 +02:00
|
|
|
non-standard-conforming literals, including
|
2017-10-09 03:44:17 +02:00
|
|
|
escape string syntax (<literal>E'...'</literal>).
|
2006-05-21 22:10:42 +02:00
|
|
|
</para>
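<para>
A minimal sketch of the effect, using escape string syntax so that the
result does not depend on <varname>standard_conforming_strings</varname>.
The comments describe the expected outcome, not exact server output:
</para>
<programlisting>
SET backslash_quote = off;
SELECT E'it\'s';    -- rejected with an error, because \' is not allowed
SELECT E'it''s';    -- accepted: the quote mark is doubled
SELECT 'it''s';     -- accepted: SQL-standard form
</programlisting>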
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-escape-string-warning" xreflabel="escape_string_warning">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>escape_string_warning</varname> (<type>boolean</type>)
|
2017-10-09 03:44:17 +02:00
|
|
|
<indexterm><primary>strings</primary><secondary>escape warning</secondary></indexterm>
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>escape_string_warning</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
When on, a warning is issued if a backslash (<literal>\</literal>)
|
|
|
|
appears in an ordinary string literal (<literal>'...'</literal>
|
2006-05-11 21:15:36 +02:00
|
|
|
syntax) and <varname>standard_conforming_strings</varname> is off.
|
2017-10-09 03:44:17 +02:00
|
|
|
The default is <literal>on</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
<para>
|
2006-05-11 21:15:36 +02:00
|
|
|
Applications that wish to use backslash as an escape character should be
|
2017-10-09 03:44:17 +02:00
|
|
|
modified to use escape string syntax (<literal>E'...'</literal>),
|
2011-04-27 22:51:46 +02:00
|
|
|
because the default behavior of ordinary strings is now to treat
|
|
|
|
backslash as an ordinary character, per SQL standard. This variable
|
|
|
|
can be enabled to help locate code that needs to be changed.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
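<para>
For example, a session along the following lines could be used to find
affected literals. This is only a sketch, and the comments summarize the
behavior described above rather than quote server messages:
</para>
<programlisting>
SET standard_conforming_strings = off;
SET escape_string_warning = on;
SELECT 'a\nb';      -- draws a warning; the backslash is treated as an escape
SELECT E'a\nb';     -- no warning: escape string syntax states the intent
</programlisting>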
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2009-12-11 04:34:57 +01:00
|
|
|
<varlistentry id="guc-lo-compat-privileges" xreflabel="lo_compat_privileges">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>lo_compat_privileges</varname> (<type>boolean</type>)
|
2009-12-11 04:34:57 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>lo_compat_privileges</varname> configuration parameter</primary>
|
2009-12-11 04:34:57 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2009-12-11 04:34:57 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
In <productname>PostgreSQL</productname> releases prior to 9.0, large objects
|
2012-10-08 01:16:28 +02:00
|
|
|
did not have access privileges and were, therefore, always readable
|
2017-10-09 03:44:17 +02:00
|
|
|
and writable by all users. Setting this variable to <literal>on</literal>
|
2009-12-17 15:36:16 +01:00
|
|
|
disables the new privilege checks, for compatibility with prior
|
2017-10-09 03:44:17 +02:00
|
|
|
releases. The default is <literal>off</literal>.
|
2012-10-08 01:16:28 +02:00
|
|
|
Only superusers can change this setting.
|
2009-12-11 04:34:57 +01:00
|
|
|
</para>
|
|
|
|
<para>
|
2010-05-20 22:32:27 +02:00
|
|
|
Setting this variable does not disable all security checks related to
|
2010-02-17 05:19:41 +01:00
|
|
|
large objects — only those for which the default behavior has
|
2017-10-09 03:44:17 +02:00
|
|
|
changed in <productname>PostgreSQL</productname> 9.0.
|
2009-12-11 04:34:57 +01:00
|
|
|
</para>
|
|
|
|
</listitem>
|
2010-07-22 03:22:35 +02:00
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-quote-all-identifiers" xreflabel="quote-all-identifiers">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>quote_all_identifiers</varname> (<type>boolean</type>)
|
2010-07-22 03:22:35 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>quote_all_identifiers</varname> configuration parameter</primary>
|
2010-07-22 03:22:35 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2010-07-22 03:22:35 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
When on, forces all identifiers in SQL generated by the database to be quoted,
|
|
|
|
even if they are not (currently) keywords. This will affect the
|
2017-10-09 03:44:17 +02:00
|
|
|
output of <command>EXPLAIN</command> as well as the results of functions
|
|
|
|
like <function>pg_get_viewdef</function>. See also the
|
2010-08-03 21:02:21 +02:00
|
|
|
<option>--quote-all-identifiers</option> option of
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="app-pgdump"/> and <xref linkend="app-pg-dumpall"/>.
|
2010-07-22 03:22:35 +02:00
|
|
|
</para>
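<para>
A small sketch of the effect on <function>pg_get_viewdef</function>. The
view name <literal>demo_view</literal> is hypothetical and exists only for
this illustration:
</para>
<programlisting>
CREATE VIEW demo_view AS SELECT 1 AS one;

SET quote_all_identifiers = off;
SELECT pg_get_viewdef('demo_view'::regclass);   -- identifiers quoted only where needed

SET quote_all_identifiers = on;
SELECT pg_get_viewdef('demo_view'::regclass);   -- identifiers such as "one" are double-quoted
</programlisting>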
|
|
|
|
</listitem>
|
2009-12-11 04:34:57 +01:00
|
|
|
</varlistentry>
|
|
|
|
|
2006-05-11 21:15:36 +02:00
|
|
|
<varlistentry id="guc-standard-conforming-strings" xreflabel="standard_conforming_strings">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>standard_conforming_strings</varname> (<type>boolean</type>)
|
2017-10-09 03:44:17 +02:00
|
|
|
<indexterm><primary>strings</primary><secondary>standard conforming</secondary></indexterm>
|
2006-05-11 21:15:36 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>standard_conforming_strings</varname> configuration parameter</primary>
|
2006-05-11 21:15:36 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2006-05-11 21:15:36 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This controls whether ordinary string literals
|
2017-10-09 03:44:17 +02:00
|
|
|
(<literal>'...'</literal>) treat backslashes literally, as specified in
|
2006-05-11 21:15:36 +02:00
|
|
|
the SQL standard.
|
2010-07-20 02:34:44 +02:00
|
|
|
Beginning in <productname>PostgreSQL</productname> 9.1, the default is
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>on</literal> (prior releases defaulted to <literal>off</literal>).
|
Update documentation on may/can/might:
Standard English uses "may", "can", and "might" in different ways:
may - permission, "You may borrow my rake."
can - ability, "I can lift that log."
might - possibility, "It might rain today."
Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice. Similarly, "It may crash" is better stated, "It might crash".
Also update two error messages mentioned in the documentation to match.
2007-01-31 21:56:20 +01:00
|
|
|
Applications can check this
|
2006-05-11 21:15:36 +02:00
|
|
|
parameter to determine how string literals will be processed.
|
|
|
|
The presence of this parameter can also be taken as an indication
|
2017-10-09 03:44:17 +02:00
|
|
|
that the escape string syntax (<literal>E'...'</literal>) is supported.
|
2017-11-23 15:39:47 +01:00
|
|
|
Escape string syntax (<xref linkend="sql-syntax-strings-escape"/>)
|
2010-02-03 18:25:06 +01:00
|
|
|
should be used if an application desires
|
2006-05-11 21:15:36 +02:00
|
|
|
backslashes to be treated as escape characters.
|
|
|
|
</para>
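<para>
The following sketch shows one way an application might check the setting
and observe the difference between ordinary and escape string literals:
</para>
<programlisting>
SHOW standard_conforming_strings;

SET standard_conforming_strings = on;
SELECT length('a\nb');     -- 4: the backslash is an ordinary character
SELECT length(E'a\nb');    -- 3: \n is a single newline character
</programlisting>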
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2008-01-30 19:35:55 +01:00
|
|
|
<varlistentry id="guc-synchronize-seqscans" xreflabel="synchronize_seqscans">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>synchronize_seqscans</varname> (<type>boolean</type>)
|
2008-01-30 19:35:55 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>synchronize_seqscans</varname> configuration parameter</primary>
|
2008-01-30 19:35:55 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-01-30 19:35:55 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This allows sequential scans of large tables to synchronize with each
|
|
|
|
other, so that concurrent scans read the same block at about the
|
|
|
|
same time and hence share the I/O workload. When this is enabled,
|
|
|
|
a scan might start in the middle of the table and then <quote>wrap
|
2017-10-09 03:44:17 +02:00
|
|
|
around</quote> the end to cover all rows, so as to synchronize with the
|
2008-01-30 19:35:55 +01:00
|
|
|
activity of scans already in progress. This can result in
|
|
|
|
unpredictable changes in the row ordering returned by queries that
|
2017-10-09 03:44:17 +02:00
|
|
|
have no <literal>ORDER BY</literal> clause. Setting this parameter to
|
|
|
|
<literal>off</literal> ensures the pre-8.3 behavior in which a sequential
|
2008-01-30 19:35:55 +01:00
|
|
|
scan always starts from the beginning of the table. The default
|
2017-10-09 03:44:17 +02:00
|
|
|
is <literal>on</literal>.
|
2008-01-30 19:35:55 +01:00
|
|
|
</para>
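<para>
As a rough illustration, assuming a large hypothetical table
<literal>big_table</literal> with a column <literal>id</literal>, a query
that depends on row order can be handled in either of two ways:
</para>
<programlisting>
-- Row order is not guaranteed without ORDER BY, and synchronized scans
-- can make that visible on large tables.
SELECT * FROM big_table LIMIT 10;

-- Either request an explicit order ...
SELECT * FROM big_table ORDER BY id LIMIT 10;

-- ... or restore the pre-8.3 scan start position for this session.
SET synchronize_seqscans = off;
</programlisting>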
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
2005-11-17 23:14:56 +01:00
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<sect2 id="runtime-config-compatible-clients">
|
|
|
|
<title>Platform and Client Compatibility</title>
|
|
|
|
<variablelist>
|
|
|
|
|
|
|
|
<varlistentry id="guc-transform-null-equals" xreflabel="transform_null_equals">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>transform_null_equals</varname> (<type>boolean</type>)
|
2017-10-09 03:44:17 +02:00
|
|
|
<indexterm><primary>IS NULL</primary></indexterm>
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>transform_null_equals</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
When on, expressions of the form <literal><replaceable>expr</replaceable> =
|
2005-09-13 00:11:38 +02:00
|
|
|
NULL</literal> (or <literal>NULL =
|
2017-10-09 03:44:17 +02:00
|
|
|
<replaceable>expr</replaceable></literal>) are treated as
|
|
|
|
<literal><replaceable>expr</replaceable> IS NULL</literal>, that is, they
|
|
|
|
return true if <replaceable>expr</replaceable> evaluates to the null value,
|
2005-09-13 00:11:38 +02:00
|
|
|
and false otherwise. The correct SQL-spec-compliant behavior of
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal><replaceable>expr</replaceable> = NULL</literal> is to always
|
2006-01-23 19:16:41 +01:00
|
|
|
return null (unknown). Therefore this parameter defaults to
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>off</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
However, filtered forms in <productname>Microsoft
|
|
|
|
Access</productname> generate queries that appear to use
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal><replaceable>expr</replaceable> = NULL</literal> to test for
|
2005-09-13 00:11:38 +02:00
|
|
|
null values, so if you use that interface to access the database you
|
|
|
|
might want to turn this option on. Since expressions of the
|
2017-10-09 03:44:17 +02:00
|
|
|
form <literal><replaceable>expr</replaceable> = NULL</literal> always
|
2010-02-03 18:25:06 +01:00
|
|
|
return the null value (using the SQL standard interpretation), they are not
|
|
|
|
very useful and do not appear often in normal applications, so
|
2005-09-13 00:11:38 +02:00
|
|
|
this option does little harm in practice. But new users are
|
|
|
|
frequently confused about the semantics of expressions
|
2010-02-03 18:25:06 +01:00
|
|
|
involving null values, so this option is off by default.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Note that this option only affects the exact form <literal>= NULL</literal>,
|
2005-09-13 00:11:38 +02:00
|
|
|
not other comparison operators or other expressions
|
|
|
|
that are computationally equivalent to some expression
|
|
|
|
involving the equals operator (such as <literal>IN</literal>).
|
|
|
|
Thus, this option is not a general fix for bad programming.
|
|
|
|
</para>
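<para>
The effect can be sketched with a two-row <literal>VALUES</literal> list.
The alias names are arbitrary, and the comments give the results implied
by the description above:
</para>
<programlisting>
SET transform_null_equals = off;
SELECT x, x = NULL AS eq_null FROM (VALUES (1), (NULL)) AS t(x);
-- eq_null is null in both rows

SET transform_null_equals = on;
SELECT x, x = NULL AS eq_null FROM (VALUES (1), (NULL)) AS t(x);
-- eq_null is false for 1 and true for the null row
</programlisting>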
|
|
|
|
|
|
|
|
<para>
|
2017-11-23 15:39:47 +01:00
|
|
|
Refer to <xref linkend="functions-comparison"/> for related information.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
</variablelist>
|
|
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
|
2010-07-20 02:47:53 +02:00
|
|
|
<sect1 id="runtime-config-error-handling">
|
|
|
|
<title>Error Handling</title>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
|
|
|
|
<varlistentry id="guc-exit-on-error" xreflabel="exit_on_error">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>exit_on_error</varname> (<type>boolean</type>)
|
2010-07-20 02:47:53 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>exit_on_error</varname> configuration parameter</primary>
|
2010-07-20 02:47:53 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2010-07-20 02:47:53 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-12-06 04:15:15 +01:00
|
|
|
If on, any error will terminate the current session. By default,
|
|
|
|
this is set to off, so that only FATAL errors will terminate the
|
2010-07-20 02:47:53 +02:00
|
|
|
session.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-restart-after-crash" xreflabel="restart_after_crash">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>restart_after_crash</varname> (<type>boolean</type>)
|
2010-07-20 02:47:53 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>restart_after_crash</varname> configuration parameter</primary>
|
2010-07-20 02:47:53 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2010-07-20 02:47:53 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-12-06 04:15:15 +01:00
|
|
|
When set to on, which is the default, <productname>PostgreSQL</productname>
|
2010-07-20 02:47:53 +02:00
|
|
|
will automatically reinitialize after a backend crash. Leaving this
|
2018-12-06 04:15:15 +01:00
|
|
|
value set to on is normally the best way to maximize the availability
|
2010-07-20 02:47:53 +02:00
|
|
|
of the database. However, in some circumstances, such as when
|
2017-10-09 03:44:17 +02:00
|
|
|
<productname>PostgreSQL</productname> is being invoked by clusterware, it might be
|
2011-03-15 17:43:39 +01:00
|
|
|
useful to disable the restart so that the clusterware can gain
|
2010-07-20 02:47:53 +02:00
|
|
|
control and take any actions it deems appropriate.
|
|
|
|
</para>
|
doc: Fix description of some GUCs in docs and postgresql.conf.sample
The following parameters have been imprecise, or incorrect, about their
description (PGC_POSTMASTER or PGC_SIGHUP):
- autovacuum_work_mem (docs, as of 9.6~)
- huge_page_size (docs, as of 14~)
- max_logical_replication_workers (docs, as of 10~)
- max_sync_workers_per_subscription (docs, as of 10~)
- min_dynamic_shared_memory (docs, as of 14~)
- recovery_init_sync_method (postgresql.conf.sample, as of 14~)
- remove_temp_files_after_crash (docs, as of 14~)
- restart_after_crash (docs, as of 9.6~)
- ssl_min_protocol_version (docs, as of 12~)
- ssl_max_protocol_version (docs, as of 12~)
This commit adjusts the description of all these parameters to be more
consistent with the practice used for the others.
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/YK2ltuLpe+FbRXzA@paquier.xyz
Backpatch-through: 9.6
2021-05-27 07:57:28 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
|
|
|
file or on the server command line.
|
|
|
|
</para>
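<para>
As a sketch of one alternative to editing the file by hand, assuming a
superuser session on a release that provides
<command>ALTER SYSTEM</command>, the setting can be written to
<filename>postgresql.auto.conf</filename> and applied with a configuration
reload:
</para>
<programlisting>
ALTER SYSTEM SET restart_after_crash = off;
SELECT pg_reload_conf();
</programlisting>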
|
2010-07-20 02:47:53 +02:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
PANIC on fsync() failure.
On some operating systems, it doesn't make sense to retry fsync(),
because dirty data cached by the kernel may have been dropped on
write-back failure. In that case the only remaining copy of the
data is in the WAL. A subsequent fsync() could appear to succeed,
but not have flushed the data. That means that a future checkpoint
could apparently complete successfully but have lost data.
Therefore, violently prevent any future checkpoint attempts by
panicking on the first fsync() failure. Note that we already
did the same for WAL data; this change extends that behavior to
non-temporary data files.
Provide a GUC data_sync_retry to control this new behavior, for
users of operating systems that don't eject dirty data, and possibly
forensic/testing uses. If it is set to on and the write-back error
was transient, a later checkpoint might genuinely succeed (on a
system that does not throw away buffers on failure); if the error is
permanent, later checkpoints will continue to fail. The GUC defaults
to off, meaning that we panic.
Back-patch to all supported releases.
There is still a narrow window for error-loss on some operating
systems: if the file is closed and later reopened and a write-back
error occurs in the intervening time, but the inode has the bad
luck to be evicted due to memory pressure before we reopen, we could
miss the error. A later patch will address that with a scheme
for keeping files with dirty data open at all times, but we judge
that to be too complicated to back-patch.
Author: Craig Ringer, with some adjustments by Thomas Munro
Reported-by: Craig Ringer
Reviewed-by: Robert Haas, Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de
2018-11-19 01:31:10 +01:00
|
|
|
<varlistentry id="guc-data-sync-retry" xreflabel="data_sync_retry">
|
|
|
|
<term><varname>data_sync_retry</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>data_sync_retry</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-12-06 04:15:15 +01:00
|
|
|
When set to off, which is the default, <productname>PostgreSQL</productname>
|
2018-11-19 01:31:10 +01:00
|
|
|
will raise a PANIC-level error on failure to flush modified data files
|
2019-07-05 08:33:51 +02:00
|
|
|
to the file system. This causes the database server to crash. This
|
2019-02-28 03:02:11 +01:00
|
|
|
parameter can only be set at server start.
|
2018-11-19 01:31:10 +01:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
On some operating systems, the status of data in the kernel's page
|
|
|
|
cache is unknown after a write-back failure. In some cases it might
|
|
|
|
have been entirely forgotten, making it unsafe to retry; the second
|
|
|
|
attempt might be reported as successful, when in fact the data has been
|
|
|
|
lost. In these circumstances, the only way to avoid data loss is to
|
|
|
|
recover from the WAL after any failure is reported, preferably
|
|
|
|
after investigating the root cause of the failure and replacing any
|
|
|
|
faulty hardware.
|
|
|
|
</para>
|
|
|
|
<para>
|
2018-12-06 04:15:15 +01:00
|
|
|
If set to on, <productname>PostgreSQL</productname> will instead
|
2018-11-19 01:31:10 +01:00
|
|
|
report an error but continue to run so that the data flushing
|
2018-12-06 04:15:15 +01:00
|
|
|
operation can be retried in a later checkpoint. Only set it to on
|
2018-11-19 01:31:10 +01:00
|
|
|
after investigating the operating system's treatment of buffered data
|
|
|
|
in case of write-back failure.
|
|
|
|
</para>
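<para>
To inspect the current value, and to confirm that it can only be changed
at server start, a query such as the following sketch can be used. For
parameters of this kind the <literal>context</literal> column is expected
to read <literal>postmaster</literal>:
</para>
<programlisting>
SELECT name, setting, context
FROM pg_settings
WHERE name = 'data_sync_retry';
</programlisting>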
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2021-03-19 23:46:32 +01:00
|
|
|
|
|
|
|
<varlistentry id="guc-recovery-init-sync-method" xreflabel="recovery_init_sync_method">
|
|
|
|
<term><varname>recovery_init_sync_method</varname> (<type>enum</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>recovery_init_sync_method</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
When set to <literal>fsync</literal>, which is the default,
|
|
|
|
<productname>PostgreSQL</productname> will recursively open and
|
|
|
|
synchronize all files in the data directory before crash recovery
|
|
|
|
begins. The search for files will follow symbolic links for the WAL
|
|
|
|
directory and each configured tablespace (but not any other symbolic
|
|
|
|
links). This is intended to make sure that all WAL and data files are
|
|
|
|
durably stored on disk before replaying changes. This applies whenever
|
|
|
|
starting a database cluster that did not shut down cleanly, including
|
|
|
|
copies created with <application>pg_basebackup</application>.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
On Linux, <literal>syncfs</literal> can be used instead, to ask the
|
|
|
|
operating system to synchronize the whole file systems that contain the
|
|
|
|
data directory, the WAL files and each tablespace (but not any other
|
|
|
|
file systems that might be reachable through symbolic links). This might
|
|
|
|
be a lot faster than the <literal>fsync</literal> setting, because it
|
|
|
|
doesn't need to open each file one by one. On the other hand, it might
|
|
|
|
be slower if a file system is shared by other applications that
|
|
|
|
modify a lot of files, since those files will also be written to disk.
|
|
|
|
Furthermore, on versions of Linux before 5.8, I/O errors encountered
|
|
|
|
while writing data to disk might not be reported to
|
|
|
|
<productname>PostgreSQL</productname>, and relevant error messages might
|
|
|
|
appear only in kernel logs.
|
|
|
|
</para>
|
doc: Fix description of some GUCs in docs and postgresql.conf.sample
The following parameters have been imprecise, or incorrect, about their
description (PGC_POSTMASTER or PGC_SIGHUP):
- autovacuum_work_mem (docs, as of 9.6~)
- huge_page_size (docs, as of 14~)
- max_logical_replication_workers (docs, as of 10~)
- max_sync_workers_per_subscription (docs, as of 10~)
- min_dynamic_shared_memory (docs, as of 14~)
- recovery_init_sync_method (postgresql.conf.sample, as of 14~)
- remove_temp_files_after_crash (docs, as of 14~)
- restart_after_crash (docs, as of 9.6~)
- ssl_min_protocol_version (docs, as of 12~)
- ssl_max_protocol_version (docs, as of 12~)
This commit adjusts the description of all these parameters to be more
consistent with the practice used for the others.
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/YK2ltuLpe+FbRXzA@paquier.xyz
Backpatch-through: 9.6
2021-05-27 07:57:28 +02:00
|
|
|
<para>
|
2021-06-28 05:17:43 +02:00
|
|
|
This parameter can only be set in the
|
|
|
|
<filename>postgresql.conf</filename> file or on the server command line.
|
2021-05-27 07:57:28 +02:00
|
|
|
</para>
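<para>
As a minimal sketch, assuming superuser access and that
<command>ALTER SYSTEM</command> is an acceptable way to manage the
configuration on the installation in question, the
<literal>syncfs</literal> method could be selected and the configuration
reloaded as follows. The choice of <literal>syncfs</literal> here is
only illustrative.
</para>
<programlisting>
-- Assumes a Linux server; syncfs is described above as Linux-specific.
ALTER SYSTEM SET recovery_init_sync_method = 'syncfs';
SELECT pg_reload_conf();
</programlisting>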
|
2021-03-19 23:46:32 +01:00
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
PANIC on fsync() failure.
On some operating systems, it doesn't make sense to retry fsync(),
because dirty data cached by the kernel may have been dropped on
write-back failure. In that case the only remaining copy of the
data is in the WAL. A subsequent fsync() could appear to succeed,
but not have flushed the data. That means that a future checkpoint
could apparently complete successfully but have lost data.
Therefore, violently prevent any future checkpoint attempts by
panicking on the first fsync() failure. Note that we already
did the same for WAL data; this change extends that behavior to
non-temporary data files.
Provide a GUC data_sync_retry to control this new behavior, for
users of operating systems that don't eject dirty data, and possibly
forensic/testing uses. If it is set to on and the write-back error
was transient, a later checkpoint might genuinely succeed (on a
system that does not throw away buffers on failure); if the error is
permanent, later checkpoints will continue to fail. The GUC defaults
to off, meaning that we panic.
Back-patch to all supported releases.
There is still a narrow window for error-loss on some operating
systems: if the file is closed and later reopened and a write-back
error occurs in the intervening time, but the inode has the bad
luck to be evicted due to memory pressure before we reopen, we could
miss the error. A later patch will address that with a scheme
for keeping files with dirty data open at all times, but we judge
that to be too complicated to back-patch.
Author: Craig Ringer, with some adjustments by Thomas Munro
Reported-by: Craig Ringer
Reviewed-by: Robert Haas, Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de
2018-11-19 01:31:10 +01:00
|
|
|
|
2010-07-20 02:47:53 +02:00
|
|
|
</variablelist>
|
|
|
|
|
|
|
|
</sect1>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<sect1 id="runtime-config-preset">
|
|
|
|
<title>Preset Options</title>
|
|
|
|
|
|
|
|
<para>
|
2021-01-05 22:18:01 +01:00
|
|
|
The following <quote>parameters</quote> are read-only.
|
|
|
|
As such, they have been excluded from the sample
|
2017-10-09 03:44:17 +02:00
|
|
|
<filename>postgresql.conf</filename> file. These options report
|
2005-09-13 00:11:38 +02:00
|
|
|
various aspects of <productname>PostgreSQL</productname> behavior
|
Update documentation on may/can/might:
Standard English uses "may", "can", and "might" in different ways:
may - permission, "You may borrow my rake."
can - ability, "I can lift that log."
might - possibility, "It might rain today."
Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice. Similarly, "It may crash" is better stated, "It might crash".
Also update two error messages mentioned in the documentation to match.
2007-01-31 21:56:20 +01:00
|
|
|
that might be of interest to certain applications, particularly
|
2005-09-13 00:11:38 +02:00
|
|
|
administrative front-ends.
|
2021-01-05 22:18:01 +01:00
|
|
|
Most of them are determined when <productname>PostgreSQL</productname>
|
|
|
|
is compiled or when it is installed.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
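<para>
For illustration, the read-only values described in this section can be
listed from a running server. This sketch assumes that the
<structname>pg_settings</structname> view labels them with a category
matching this section's title.
</para>
<programlisting>
-- List the preset (read-only) parameters and their current values.
SELECT name, setting, unit
  FROM pg_settings
 WHERE category = 'Preset Options'
 ORDER BY name;
</programlisting>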
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
|
|
|
|
<varlistentry id="guc-block-size" xreflabel="block_size">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>block_size</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>block_size</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the size of a disk block. It is determined by the value
|
2017-10-09 03:44:17 +02:00
|
|
|
of <literal>BLCKSZ</literal> when building the server. The default
|
2005-09-13 00:11:38 +02:00
|
|
|
value is 8192 bytes. The meaning of some configuration
|
2017-11-23 15:39:47 +01:00
|
|
|
variables (such as <xref linkend="guc-shared-buffers"/>) is
|
2005-09-13 00:11:38 +02:00
|
|
|
influenced by <varname>block_size</varname>. See <xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="runtime-config-resource"/> for more information.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2013-09-16 13:36:01 +02:00
|
|
|
<varlistentry id="guc-data-checksums" xreflabel="data_checksums">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>data_checksums</varname> (<type>boolean</type>)
|
2013-09-16 13:36:01 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>data_checksums</varname> configuration parameter</primary>
|
2013-09-16 13:36:01 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2013-09-16 13:36:01 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports whether data checksums are enabled for this cluster.
|
2017-11-23 15:39:47 +01:00
|
|
|
See <xref linkend="app-initdb-data-checksums"/> for more information.
|
2013-09-16 13:36:01 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2018-04-07 23:45:39 +02:00
|
|
|
<varlistentry id="guc-data-directory-mode" xreflabel="data_directory_mode">
|
|
|
|
<term><varname>data_directory_mode</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>data_directory_mode</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2021-01-05 22:18:01 +01:00
|
|
|
On Unix systems this parameter reports the permissions the data
|
|
|
|
directory (defined by <xref linkend="guc-data-directory"/>)
|
|
|
|
had at server startup.
|
2018-04-07 23:45:39 +02:00
|
|
|
(On Microsoft Windows this parameter will always display
|
2021-01-05 22:18:01 +01:00
|
|
|
<literal>0700</literal>.) See
|
2018-04-07 23:45:39 +02:00
|
|
|
<xref linkend="app-initdb-allow-group-access"/> for more information.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2014-06-20 11:06:42 +02:00
|
|
|
<varlistentry id="guc-debug-assertions" xreflabel="debug_assertions">
|
|
|
|
<term><varname>debug_assertions</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>debug_assertions</varname> configuration parameter</primary>
|
2014-06-20 11:06:42 +02:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports whether <productname>PostgreSQL</productname> has been built
|
|
|
|
with assertions enabled. That is the case if the
|
|
|
|
macro <symbol>USE_ASSERT_CHECKING</symbol> is defined
|
|
|
|
when <productname>PostgreSQL</productname> is built (accomplished
|
2020-09-01 00:33:37 +02:00
|
|
|
e.g., by the <command>configure</command> option
|
2014-06-20 11:06:42 +02:00
|
|
|
<option>--enable-cassert</option>). By
|
|
|
|
default <productname>PostgreSQL</productname> is built without
|
|
|
|
assertions.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-integer-datetimes" xreflabel="integer_datetimes">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>integer_datetimes</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>integer_datetimes</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Reports whether <productname>PostgreSQL</productname> was built with support for
|
|
|
|
64-bit-integer dates and times. As of <productname>PostgreSQL</productname> 10,
|
2017-02-23 17:40:12 +01:00
|
|
|
this is always <literal>on</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2021-01-05 22:18:01 +01:00
|
|
|
<varlistentry id="guc-in-hot-standby" xreflabel="in_hot_standby">
|
|
|
|
<term><varname>in_hot_standby</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>in_hot_standby</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports whether the server is currently in hot standby mode. When
|
|
|
|
this is <literal>on</literal>, all transactions are forced to be
|
|
|
|
read-only. Within a session, this can change only if the server is
|
|
|
|
promoted to be primary. See <xref linkend="hot-standby"/> for more
|
|
|
|
information.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-lc-collate" xreflabel="lc_collate">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>lc_collate</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>lc_collate</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the locale in which sorting of textual data is done.
|
2017-11-23 15:39:47 +01:00
|
|
|
See <xref linkend="locale"/> for more information.
|
2009-03-26 21:55:49 +01:00
|
|
|
This value is determined when a database is created.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-lc-ctype" xreflabel="lc_ctype">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>lc_ctype</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>lc_ctype</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the locale that determines character classifications.
|
2017-11-23 15:39:47 +01:00
|
|
|
See <xref linkend="locale"/> for more information.
|
2009-03-26 21:55:49 +01:00
|
|
|
This value is determined when a database is created.
|
2005-09-13 00:11:38 +02:00
|
|
|
Ordinarily this will be the same as <varname>lc_collate</varname>,
|
|
|
|
but for special applications it might be set differently.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-max-function-args" xreflabel="max_function_args">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>max_function_args</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>max_function_args</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the maximum number of function arguments. It is determined by
|
2017-10-09 03:44:17 +02:00
|
|
|
the value of <literal>FUNC_MAX_ARGS</literal> when building the server. The
|
2007-01-20 22:30:26 +01:00
|
|
|
default value is 100 arguments.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-max-identifier-length" xreflabel="max_identifier_length">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>max_identifier_length</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>max_identifier_length</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the maximum identifier length. It is determined as one
|
2017-10-09 03:44:17 +02:00
|
|
|
less than the value of <literal>NAMEDATALEN</literal> when building
|
|
|
|
the server. The default value of <literal>NAMEDATALEN</literal> is
|
2005-09-13 00:11:38 +02:00
|
|
|
64; therefore the default
|
2010-02-03 18:25:06 +01:00
|
|
|
<varname>max_identifier_length</varname> is 63 bytes, which
|
2010-08-17 06:37:21 +02:00
|
|
|
can be less than 63 characters when using multibyte encodings.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
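<para>
The difference between bytes and characters can be seen with a query
such as the following. It is only a sketch and assumes a database that
uses the <literal>UTF8</literal> encoding, in which a character such as
<literal>é</literal> occupies more than one byte.
</para>
<programlisting>
-- Compare the character count with the byte count of the same string.
SELECT char_length('é') AS characters, octet_length('é') AS bytes;
</programlisting>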
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-max-index-keys" xreflabel="max_index_keys">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>max_index_keys</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>max_index_keys</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the maximum number of index keys. It is determined by
|
2017-10-09 03:44:17 +02:00
|
|
|
the value of <literal>INDEX_MAX_KEYS</literal> when building the server. The
|
2007-01-20 22:30:26 +01:00
|
|
|
default value is 32 keys.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2008-07-11 00:08:17 +02:00
|
|
|
<varlistentry id="guc-segment-size" xreflabel="segment_size">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>segment_size</varname> (<type>integer</type>)
|
2008-07-11 00:08:17 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>segment_size</varname> configuration parameter</primary>
|
2008-07-11 00:08:17 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-07-11 00:08:17 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the number of blocks (pages) that can be stored within a file
|
2017-10-09 03:44:17 +02:00
|
|
|
segment. It is determined by the value of <literal>RELSEG_SIZE</literal>
|
2008-07-11 00:08:17 +02:00
|
|
|
when building the server. The maximum size of a segment file in bytes
|
2017-10-09 03:44:17 +02:00
|
|
|
is equal to <varname>segment_size</varname> multiplied by
|
|
|
|
<varname>block_size</varname>; by default this is 1GB.
|
2008-07-11 00:08:17 +02:00
|
|
|
</para>
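<para>
The multiplication described above can be sketched as a query. It
assumes that <structname>pg_settings</structname> reports
<varname>segment_size</varname> as a plain block count and
<varname>block_size</varname> in bytes. With 131072 blocks of 8192 bytes
each, the product is 1073741824 bytes, that is, 1GB.
</para>
<programlisting>
-- Maximum size of one segment file, in bytes.
SELECT seg.setting::bigint * blk.setting::bigint AS max_segment_bytes
  FROM pg_settings AS seg, pg_settings AS blk
 WHERE seg.name = 'segment_size'
   AND blk.name = 'block_size';
</programlisting>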
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-server-encoding" xreflabel="server_encoding">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>server_encoding</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>server_encoding</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<indexterm><primary>character set</primary></indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the database encoding (character set).
|
|
|
|
It is determined when the database is created. Ordinarily,
|
|
|
|
clients need only be concerned with the value of <xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="guc-client-encoding"/>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-server-version" xreflabel="server_version">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>server_version</varname> (<type>string</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>server_version</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the version number of the server. It is determined by the
|
2017-10-09 03:44:17 +02:00
|
|
|
value of <literal>PG_VERSION</literal> when building the server.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2006-09-02 15:12:50 +02:00
|
|
|
<varlistentry id="guc-server-version-num" xreflabel="server_version_num">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>server_version_num</varname> (<type>integer</type>)
|
2006-09-02 15:12:50 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>server_version_num</varname> configuration parameter</primary>
|
2006-09-02 15:12:50 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2006-09-02 15:12:50 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2008-07-11 00:08:17 +02:00
|
|
|
Reports the version number of the server as an integer. It is determined
|
2017-10-09 03:44:17 +02:00
|
|
|
by the value of <literal>PG_VERSION_NUM</literal> when building the server.
|
2006-09-02 15:12:50 +02:00
|
|
|
</para>
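<para>
Because the value is an integer, it is convenient for version checks in
scripts. As a small sketch, the following tests whether the server is at
least version 14; the threshold <literal>140000</literal> is chosen here
only as an example.
</para>
<programlisting>
-- Returns true on servers of version 14 or later.
SELECT current_setting('server_version_num')::integer &gt;= 140000 AS at_least_v14;
</programlisting>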
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2021-09-08 05:02:30 +02:00
|
|
|
<varlistentry id="guc-shared-memory-size" xreflabel="shared_memory_size">
|
|
|
|
<term><varname>shared_memory_size</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>shared_memory_size</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the size of the main shared memory area, rounded up to the
|
|
|
|
nearest megabyte.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2021-09-21 03:31:58 +02:00
|
|
|
<varlistentry id="guc-shared-memory-size-in-huge-pages" xreflabel="shared_memory_size_in_huge_pages">
|
|
|
|
<term><varname>shared_memory_size_in_huge_pages</varname> (<type>integer</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>shared_memory_size_in_huge_pages</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the number of huge pages that are needed for the main shared
|
|
|
|
memory area based on the specified <xref linkend="guc-huge-page-size"/>.
|
|
|
|
If huge pages are not supported, this will be <literal>-1</literal>.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This setting is supported only on <productname>Linux</productname>. It
|
|
|
|
is always set to <literal>-1</literal> on other platforms. For more
|
|
|
|
details about using huge pages on <productname>Linux</productname>, see
|
|
|
|
<xref linkend="linux-huge-pages"/>.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2018-06-26 10:19:35 +02:00
|
|
|
<varlistentry id="guc-ssl-library" xreflabel="ssl_library">
|
|
|
|
<term><varname>ssl_library</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>ssl_library</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2019-08-20 05:36:31 +02:00
|
|
|
Reports the name of the SSL library that this
|
|
|
|
<productname>PostgreSQL</productname> server was built with (even if
|
|
|
|
SSL is not currently configured or in use on this instance), for
|
|
|
|
example <literal>OpenSSL</literal>, or an empty string if none.
|
2018-06-26 10:19:35 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2008-07-11 00:08:17 +02:00
|
|
|
<varlistentry id="guc-wal-block-size" xreflabel="wal_block_size">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>wal_block_size</varname> (<type>integer</type>)
|
2008-07-11 00:08:17 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>wal_block_size</varname> configuration parameter</primary>
|
2008-07-11 00:08:17 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-07-11 00:08:17 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Reports the size of a WAL disk block. It is determined by the value
|
2017-10-09 03:44:17 +02:00
|
|
|
of <literal>XLOG_BLCKSZ</literal> when building the server. The default value
|
2008-07-11 00:08:17 +02:00
|
|
|
is 8192 bytes.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-wal-segment-size" xreflabel="wal_segment_size">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>wal_segment_size</varname> (<type>integer</type>)
|
2008-07-11 00:08:17 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>wal_segment_size</varname> configuration parameter</primary>
|
2008-07-11 00:08:17 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-07-11 00:08:17 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-11-10 04:24:05 +01:00
|
|
|
Reports the size of write-ahead log segments. The default value is
|
|
|
|
16MB. See <xref linkend="wal-configuration"/> for more information.
|
2008-07-11 00:08:17 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
</variablelist>
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="runtime-config-custom">
|
|
|
|
<title>Customized Options</title>
|
|
|
|
|
|
|
|
<para>
|
2006-01-23 19:16:41 +01:00
|
|
|
This feature was designed to allow parameters not normally known to
|
2005-09-13 00:11:38 +02:00
|
|
|
<productname>PostgreSQL</productname> to be added by add-on modules
|
2011-10-04 18:36:18 +02:00
|
|
|
(such as procedural languages). This allows extension modules to be
|
2005-09-13 00:11:38 +02:00
|
|
|
configured in the standard ways.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2011-10-04 18:36:18 +02:00
|
|
|
Custom options have two-part names: an extension name, then a dot, then
|
|
|
|
the parameter name proper, much like qualified names in SQL. An example
|
2017-10-09 03:44:17 +02:00
|
|
|
is <literal>plpgsql.variable_conflict</literal>.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2011-10-04 18:36:18 +02:00
|
|
|
Because custom options may need to be set in processes that have not
|
2017-10-09 03:44:17 +02:00
|
|
|
loaded the relevant extension module, <productname>PostgreSQL</productname>
|
2011-10-04 18:36:18 +02:00
|
|
|
will accept a setting for any two-part parameter name. Such variables
|
|
|
|
are treated as placeholders and have no function until the module that
|
|
|
|
defines them is loaded. When an extension module is loaded, it will add
|
|
|
|
its variable definitions, convert any placeholder values according to
|
|
|
|
those definitions, and issue warnings for any unrecognized placeholders
|
|
|
|
that begin with its extension name.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
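<para>
The placeholder behavior can be sketched as follows. The name
<literal>my_extension.sample_setting</literal> is hypothetical and does
not belong to any real module; it is accepted only because it is a
two-part name.
</para>
<programlisting>
-- Hypothetical two-part name; no module defines it, so it is a placeholder.
SET my_extension.sample_setting = 'demo';
SHOW my_extension.sample_setting;
</programlisting>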
|
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="runtime-config-developer">
|
|
|
|
<title>Developer Options</title>
|
|
|
|
|
|
|
|
<para>
|
2021-04-14 08:55:55 +02:00
|
|
|
The following parameters are intended for developer testing, and
|
|
|
|
should never be used on a production database. However, some of
|
|
|
|
them can be used to assist with the recovery of severely damaged
|
|
|
|
databases. As such, they have been excluded from the sample
|
2017-10-09 03:44:17 +02:00
|
|
|
<filename>postgresql.conf</filename> file. Note that many of these
|
2006-01-23 19:16:41 +01:00
|
|
|
parameters require special source compilation flags to work at all.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<variablelist>
|
2006-01-05 11:07:46 +01:00
|
|
|
<varlistentry id="guc-allow-system-table-mods" xreflabel="allow_system_table_mods">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>allow_system_table_mods</varname> (<type>boolean</type>)
|
2006-01-05 11:07:46 +01:00
|
|
|
<indexterm>
|
|
|
|
<primary><varname>allow_system_table_mods</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2006-01-05 11:07:46 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2019-11-29 10:04:45 +01:00
|
|
|
Allows modification of the structure of system tables as well as
|
|
|
|
certain other risky actions on system tables. This is otherwise not
|
|
|
|
allowed even for superusers. Ill-advised use of this setting can
|
|
|
|
cause irretrievable data loss or seriously corrupt the database
|
|
|
|
system. Only superusers can change this setting.
|
2006-01-05 11:07:46 +01:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2019-11-08 19:44:20 +01:00
|
|
|
<varlistentry id="guc-backtrace-functions" xreflabel="backtrace_functions">
|
|
|
|
<term><varname>backtrace_functions</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>backtrace_functions</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This parameter contains a comma-separated list of C function names.
|
|
|
|
If an error is raised and the name of the internal C function where
|
|
|
|
the error happens matches a value in the list, then a backtrace is
|
|
|
|
written to the server log together with the error message. This can
|
|
|
|
be used to debug specific areas of the source code.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Backtrace support is not available on all platforms, and the quality
|
|
|
|
of the backtraces depends on compilation options.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
This parameter can only be set by superusers.
|
|
|
|
</para>
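<para>
As a rough sketch, a backtrace for one specific error could be requested
as shown below. This assumes a superuser session, a platform with
backtrace support, and that the integer division-by-zero error is raised
from a C function named <literal>int4div</literal>; that function name
is an assumption about the source code and should be checked against the
version in use.
</para>
<programlisting>
-- Assumed function name; adjust to the function that raises the error.
SET backtrace_functions = 'int4div';
SELECT 1 / 0;
RESET backtrace_functions;
</programlisting>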
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2021-07-13 21:01:01 +02:00
|
|
|
<varlistentry id="guc-debug-discard-caches" xreflabel="debug_discard_caches">
|
|
|
|
<term><varname>debug_discard_caches</varname> (<type>integer</type>)
|
2021-01-06 10:15:19 +01:00
|
|
|
<indexterm>
|
2021-07-13 21:01:01 +02:00
|
|
|
<primary><varname>debug_discard_caches</varname> configuration parameter</primary>
|
2021-01-06 10:15:19 +01:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2021-05-08 17:33:13 +02:00
|
|
|
When set to <literal>1</literal>, each system catalog cache entry is
|
|
|
|
invalidated at the first possible opportunity, whether or not
|
2021-01-06 10:15:19 +01:00
|
|
|
anything that would render it invalid really occurred. Caching of
|
|
|
|
system catalogs is effectively disabled as a result, so the server
|
|
|
|
will run extremely slowly. Higher values run the cache invalidation
|
2021-05-08 17:33:13 +02:00
|
|
|
recursively, which is even slower and only useful for testing
|
|
|
|
the caching logic itself. The default value of <literal>0</literal>
|
|
|
|
selects normal catalog caching behavior.
|
2021-01-06 10:15:19 +01:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2021-05-08 17:33:13 +02:00
|
|
|
This parameter can be very helpful when trying to trigger
|
|
|
|
hard-to-reproduce bugs involving concurrent catalog changes, but it
|
2021-01-06 10:15:19 +01:00
|
|
|
is otherwise rarely needed. See the source code files
|
|
|
|
<filename>inval.c</filename> and
|
|
|
|
<filename>pg_config_manual.h</filename> for details.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2021-05-08 17:33:13 +02:00
|
|
|
This parameter is supported only when
|
2021-07-13 21:01:01 +02:00
|
|
|
<symbol>DISCARD_CACHES_ENABLED</symbol> was defined at compile time
|
2021-01-06 10:15:19 +01:00
|
|
|
(which happens automatically when using the
|
2021-05-08 17:33:13 +02:00
|
|
|
<application>configure</application> option
|
2021-01-06 10:15:19 +01:00
|
|
|
<option>--enable-cassert</option>). In production builds, its value
|
|
|
|
will always be <literal>0</literal> and attempts to set it to another
|
|
|
|
value will raise an error.
|
|
|
|
</para>
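<para>
A typical test session might look like the following sketch. It assumes
a build in which this parameter is supported, as described above, and a
role with sufficient privileges to change it.
</para>
<programlisting>
-- Discard catalog cache entries at every opportunity for this session.
SET debug_discard_caches = 1;
-- Run the statements under test here; expect them to be very slow.
RESET debug_discard_caches;
</programlisting>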
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2021-04-14 08:55:55 +02:00
|
|
|
<varlistentry id="guc-force-parallel-mode" xreflabel="force_parallel_mode">
|
|
|
|
<term><varname>force_parallel_mode</varname> (<type>enum</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>force_parallel_mode</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Allows the use of parallel queries for testing purposes even in cases
|
|
|
|
where no performance benefit is expected.
|
|
|
|
The allowed values of <varname>force_parallel_mode</varname> are
|
|
|
|
<literal>off</literal> (use parallel mode only when it is expected to improve
|
|
|
|
performance), <literal>on</literal> (force parallel query for all queries
|
|
|
|
for which it is thought to be safe), and <literal>regress</literal> (like
|
|
|
|
<literal>on</literal>, but with additional behavior changes as explained
|
|
|
|
below).
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
More specifically, setting this value to <literal>on</literal> will add
|
|
|
|
a <literal>Gather</literal> node to the top of any query plan for which this
|
|
|
|
appears to be safe, so that the query runs inside of a parallel worker.
|
|
|
|
Even when a parallel worker is not available or cannot be used,
|
|
|
|
operations such as starting a subtransaction that would be prohibited
|
|
|
|
in a parallel query context will be prohibited unless the planner
|
|
|
|
believes that this will cause the query to fail. If failures or
|
|
|
|
unexpected results occur when this option is set, some functions used
|
|
|
|
by the query may need to be marked <literal>PARALLEL UNSAFE</literal>
|
|
|
|
(or, possibly, <literal>PARALLEL RESTRICTED</literal>).
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Setting this value to <literal>regress</literal> has all of the same effects
|
|
|
|
as setting it to <literal>on</literal> plus some additional effects that are
|
|
|
|
intended to facilitate automated regression testing. Normally,
|
|
|
|
messages from a parallel worker include a context line indicating that,
|
|
|
|
but a setting of <literal>regress</literal> suppresses this line so that the
|
|
|
|
output is the same as in non-parallel execution. Also,
|
|
|
|
the <literal>Gather</literal> nodes added to plans by this setting are hidden
|
|
|
|
in <literal>EXPLAIN</literal> output so that the output matches what
|
|
|
|
would be obtained if this setting were turned <literal>off</literal>.
|
|
|
|
</para>
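<para>
As an illustrative sketch, the effect of <literal>on</literal> can be
inspected with <command>EXPLAIN</command>. The query against
<structname>pg_class</structname> is an arbitrary choice; whether a
<literal>Gather</literal> node appears depends on the planner judging
the query to be safe for parallel execution.
</para>
<programlisting>
SET force_parallel_mode = on;
-- Look for a Gather node at the top of the plan.
EXPLAIN (COSTS OFF) SELECT count(*) FROM pg_class;
RESET force_parallel_mode;
</programlisting>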
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2006-01-05 11:07:46 +01:00
|
|
|
<varlistentry id="guc-ignore-system-indexes" xreflabel="ignore_system_indexes">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>ignore_system_indexes</varname> (<type>boolean</type>)
|
2006-01-05 11:07:46 +01:00
|
|
|
<indexterm>
|
|
|
|
<primary><varname>ignore_system_indexes</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2006-01-05 11:07:46 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Ignore system indexes when reading system tables (but still
update the indexes when modifying the tables). This is useful
when recovering from damaged system indexes.
This parameter cannot be changed after session start.
|
2006-01-05 11:07:46 +01:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-post-auth-delay" xreflabel="post_auth_delay">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>post_auth_delay</varname> (<type>integer</type>)
|
2006-01-05 11:07:46 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>post_auth_delay</varname> configuration parameter</primary>
|
2006-01-05 11:07:46 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2006-01-05 11:07:46 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
The amount of time to delay when a new
|
2006-01-05 11:07:46 +01:00
|
|
|
server process is started, after it conducts the
|
2010-02-03 18:25:06 +01:00
|
|
|
authentication procedure. This is intended to give developers an
|
2006-01-05 11:07:46 +01:00
|
|
|
opportunity to attach to the server process with a debugger.
|
|
|
|
If this value is specified without units, it is taken as seconds.
|
|
|
|
A value of zero (the default) disables the delay.
|
|
|
|
This parameter cannot be changed after session start.
|
2006-01-05 11:07:46 +01:00
|
|
|
</para>
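<para>
As a minimal sketch, assuming a developer wants time to attach a debugger
to each newly authenticated server process, a
<filename>postgresql.conf</filename> fragment along these lines could be
used. The 30-second value is an arbitrary illustration, not a
recommendation:
<programlisting>
# Pause each new server process after authentication (illustrative value)
post_auth_delay = 30s
</programlisting>
Because the parameter cannot be changed after session start, a setting
like this affects only sessions started after it takes effect.
</para>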
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-pre-auth-delay" xreflabel="pre_auth_delay">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>pre_auth_delay</varname> (<type>integer</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>pre_auth_delay</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
The amount of time to delay just after a
|
2006-01-05 11:07:46 +01:00
|
|
|
new server process is forked, before it conducts the
|
2010-02-03 18:25:06 +01:00
|
|
|
authentication procedure. This is intended to give developers an
|
2006-01-05 11:07:46 +01:00
|
|
|
opportunity to attach to the server process with a debugger to
|
|
|
|
trace down misbehavior in authentication.
|
|
|
|
If this value is specified without units, it is taken as seconds.
|
|
|
|
A value of zero (the default) disables the delay.
|
|
|
|
This parameter can only be set in the <filename>postgresql.conf</filename>
|
|
|
|
file or on the server command line.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
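<para>
A comparable sketch for debugging the authentication step itself is
shown here. The value is hypothetical and is chosen only to leave enough
time to find the new process and attach a debugger:
<programlisting>
# Pause each new server process before authentication (illustrative value)
pre_auth_delay = 20s
</programlisting>
Since the delay applies to every new server process, such a setting
would normally be removed again once the investigation is finished.
</para>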
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-trace-notify" xreflabel="trace_notify">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>trace_notify</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>trace_notify</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Generates a large amount of debugging output for the
<command>LISTEN</command> and <command>NOTIFY</command>
commands. <xref linkend="guc-client-min-messages"/> or
<xref linkend="guc-log-min-messages"/> must be
<literal>DEBUG1</literal> or lower to send this output to the
client or server logs, respectively.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
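<para>
The combination described above might be sketched as follows, assuming
the output is wanted in the server log rather than on the client:
<programlisting>
# Emit LISTEN/NOTIFY debugging output ...
trace_notify = on
# ... and lower the server log threshold so that it is actually logged
log_min_messages = debug1
</programlisting>
Note that lowering <varname>log_min_messages</varname> in this way also
logs other messages of <literal>DEBUG1</literal> priority, not only
those produced by <varname>trace_notify</varname>.
</para>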
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2010-08-20 00:55:01 +02:00
|
|
|
<varlistentry id="guc-trace-recovery-messages" xreflabel="trace_recovery_messages">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>trace_recovery_messages</varname> (<type>enum</type>)
|
2010-08-20 00:55:01 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>trace_recovery_messages</varname> configuration parameter</primary>
|
2010-08-20 00:55:01 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2010-08-20 00:55:01 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables logging of recovery-related debugging output that otherwise
would not be logged. This parameter allows the user to override the
normal setting of <xref linkend="guc-log-min-messages"/>, but only for
specific messages. This is intended for use in debugging Hot Standby.
Valid values are <literal>DEBUG5</literal>, <literal>DEBUG4</literal>,
<literal>DEBUG3</literal>, <literal>DEBUG2</literal>, <literal>DEBUG1</literal>, and
<literal>LOG</literal>. The default, <literal>LOG</literal>, does not affect
logging decisions at all. The other values cause recovery-related
debug messages of that priority or higher to be logged as though they
had <literal>LOG</literal> priority; for common settings of
<varname>log_min_messages</varname> this results in unconditionally sending
them to the server log.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
|
|
|
|
</para>
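<para>
For illustration, a standby being debugged might use a fragment such as
the one below. The choice of <literal>DEBUG2</literal> is an assumption
made for the example; any of the valid values listed above could be
substituted:
<programlisting>
# Log recovery-related debug messages of priority DEBUG2 or higher
# as though they had LOG priority
trace_recovery_messages = debug2
</programlisting>
</para>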
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-10-04 00:55:56 +02:00
|
|
|
<varlistentry id="guc-trace-sort" xreflabel="trace_sort">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>trace_sort</varname> (<type>boolean</type>)
|
2005-10-04 00:55:56 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>trace_sort</varname> configuration parameter</primary>
|
2005-10-04 00:55:56 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-10-04 00:55:56 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
If on, emit information about resource usage during sort operations.
This parameter is only available if the <symbol>TRACE_SORT</symbol> macro
was defined when <productname>PostgreSQL</productname> was compiled.
(However, <symbol>TRACE_SORT</symbol> is currently defined by default.)
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry>
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>trace_locks</varname> (<type>boolean</type>)
|
2008-07-01 23:49:04 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>trace_locks</varname> configuration parameter</primary>
|
2008-07-01 23:49:04 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-07-01 23:49:04 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
If on, emit information about lock usage. The information dumped
includes the type of lock operation, the type of lock, and the unique
identifier of the object being locked or unlocked. Also included
are bit masks for the lock types already granted on this object and
for the lock types awaited on this object. For each lock type, a count
of the granted locks and of the waiting locks is dumped, along with
the totals. An example of the log file output is shown here:
|
2010-07-29 21:34:41 +02:00
|
|
|
<screen>
|
|
|
|
LOG: LockAcquire: new: lock(0xb7acd844) id(24688,24696,0,0,0,1)
      grantMask(0) req(0,0,0,0,0,0,0)=0 grant(0,0,0,0,0,0,0)=0
      wait(0) type(AccessShareLock)
LOG: GrantLock: lock(0xb7acd844) id(24688,24696,0,0,0,1)
      grantMask(2) req(1,0,0,0,0,0,0)=1 grant(1,0,0,0,0,0,0)=1
      wait(0) type(AccessShareLock)
LOG: UnGrantLock: updated: lock(0xb7acd844) id(24688,24696,0,0,0,1)
      grantMask(0) req(0,0,0,0,0,0,0)=0 grant(0,0,0,0,0,0,0)=0
      wait(0) type(AccessShareLock)
LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1)
      grantMask(0) req(0,0,0,0,0,0,0)=0 grant(0,0,0,0,0,0,0)=0
      wait(0) type(INVALID)
|
|
|
|
</screen>
|
2008-07-01 23:49:04 +02:00
|
|
|
Details of the structure being dumped can be found in
<filename>src/include/storage/lock.h</filename>.
|
2008-07-01 23:49:04 +02:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This parameter is only available if the <symbol>LOCK_DEBUG</symbol>
|
|
|
|
macro was defined when <productname>PostgreSQL</productname> was
|
|
|
|
compiled.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry>
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>trace_lwlocks</varname> (<type>boolean</type>)
|
2008-07-01 23:49:04 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>trace_lwlocks</varname> configuration parameter</primary>
|
2008-07-01 23:49:04 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-07-01 23:49:04 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
If on, emit information about lightweight lock usage. Lightweight
|
|
|
|
locks are intended primarily to provide mutual exclusion of access
|
|
|
|
to shared-memory data structures.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This parameter is only available if the <symbol>LOCK_DEBUG</symbol>
|
|
|
|
macro was defined when <productname>PostgreSQL</productname> was
|
|
|
|
compiled.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2011-11-10 23:54:27 +01:00
|
|
|
<varlistentry>
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>trace_userlocks</varname> (<type>boolean</type>)
|
2011-11-10 23:54:27 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>trace_userlocks</varname> configuration parameter</primary>
|
2011-11-10 23:54:27 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2011-11-10 23:54:27 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
If on, emit information about user lock usage. The output has the same
format as for <varname>trace_locks</varname>, but covers only advisory locks.
|
2011-11-10 23:54:27 +01:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This parameter is only available if the <symbol>LOCK_DEBUG</symbol>
|
|
|
|
macro was defined when <productname>PostgreSQL</productname> was
|
|
|
|
compiled.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2008-07-01 23:49:04 +02:00
|
|
|
<varlistentry>
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>trace_lock_oidmin</varname> (<type>integer</type>)
|
2008-07-01 23:49:04 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>trace_lock_oidmin</varname> configuration parameter</primary>
|
2008-07-01 23:49:04 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-07-01 23:49:04 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2020-10-19 18:28:54 +02:00
|
|
|
If set, do not trace locks for tables whose OID is below this value
(used to avoid output on system tables).
|
2008-07-01 23:49:04 +02:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This parameter is only available if the <symbol>LOCK_DEBUG</symbol>
|
|
|
|
macro was defined when <productname>PostgreSQL</productname> was
|
|
|
|
compiled.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry>
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>trace_lock_table</varname> (<type>integer</type>)
|
2008-07-01 23:49:04 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>trace_lock_table</varname> configuration parameter</primary>
|
2008-07-01 23:49:04 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-07-01 23:49:04 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Unconditionally trace locks on this table (OID).
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This parameter is only available if the <symbol>LOCK_DEBUG</symbol>
|
|
|
|
macro was defined when <productname>PostgreSQL</productname> was
|
|
|
|
compiled.
|
|
|
|
</para>
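<para>
As a sketch only, and assuming a server built with
<symbol>LOCK_DEBUG</symbol> defined, tracing could be limited to a
single table as shown here. The OID is a hypothetical placeholder; the
actual OID of a table can be looked up in
<structname>pg_class</structname>:
<programlisting>
# Trace locks on one table only (24696 is a hypothetical table OID)
trace_lock_table = 24696
</programlisting>
</para>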
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry>
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>debug_deadlocks</varname> (<type>boolean</type>)
|
2008-07-01 23:49:04 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>debug_deadlocks</varname> configuration parameter</primary>
|
2008-07-01 23:49:04 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2008-07-01 23:49:04 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
If set, dumps information about all current locks when a
|
2010-08-17 06:37:21 +02:00
|
|
|
deadlock timeout occurs.
|
2008-07-01 23:49:04 +02:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This parameter is only available if the <symbol>LOCK_DEBUG</symbol>
|
|
|
|
macro was defined when <productname>PostgreSQL</productname> was
|
|
|
|
compiled.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry>
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>log_btree_build_stats</varname> (<type>boolean</type>)
|
2008-07-01 23:49:04 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>log_btree_build_stats</varname> configuration parameter</primary>
|
2008-07-01 23:49:04 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2008-07-01 23:49:04 +02:00
|
|
|
If set, logs system resource usage statistics (memory and CPU) on
|
2010-08-17 06:37:21 +02:00
|
|
|
various B-tree operations.
|
2008-07-01 23:49:04 +02:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
This parameter is only available if the <symbol>BTREE_BUILD_STATS</symbol>
|
|
|
|
macro was defined when <productname>PostgreSQL</productname> was
|
|
|
|
compiled.
|
2005-09-13 00:11:38 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2017-02-08 21:45:30 +01:00
|
|
|
<varlistentry id="guc-wal-consistency-checking" xreflabel="wal_consistency_checking">
|
|
|
|
<term><varname>wal_consistency_checking</varname> (<type>string</type>)
|
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>wal_consistency_checking</varname> configuration parameter</primary>
|
2017-02-08 21:45:30 +01:00
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
This parameter is intended to be used to check for bugs in the WAL
redo routines. When enabled, full-page images of any buffers modified
in conjunction with the WAL record are added to the record.
If the record is subsequently replayed, the system will first apply
the record and then test whether the buffers modified by the record
match the stored images. In certain cases (such as hint bits), minor
variations are acceptable, and will be ignored. Any unexpected
differences will result in a fatal error, terminating recovery.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The default value of this setting is the empty string, which disables
the feature. It can be set to <literal>all</literal> to check all
records, or to a comma-separated list of resource managers to check
only records originating from those resource managers. Currently,
the supported resource managers are <literal>heap</literal>,
<literal>heap2</literal>, <literal>btree</literal>, <literal>hash</literal>,
<literal>gin</literal>, <literal>gist</literal>, <literal>sequence</literal>,
<literal>spgist</literal>, <literal>brin</literal>, and <literal>generic</literal>. Only
superusers can change this setting.
|
|
|
|
</para>
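<para>
The list form might look like the following sketch. The choice of
<literal>heap</literal> and <literal>btree</literal> is arbitrary and is
made only for illustration:
<programlisting>
# Check only WAL records from the heap and btree resource managers
wal_consistency_checking = 'heap,btree'

# Alternatively, check records from all supported resource managers
#wal_consistency_checking = 'all'
</programlisting>
Because full-page images are added to the checked records, a narrower
list such as the first one can be expected to add less WAL volume than
<literal>all</literal>.
</para>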
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-wal-debug" xreflabel="wal_debug">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>wal_debug</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>wal_debug</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2006-01-23 19:16:41 +01:00
|
|
|
If on, emit WAL-related debugging output. This parameter is
|
2005-09-13 00:11:38 +02:00
|
|
|
only available if the <symbol>WAL_DEBUG</symbol> macro was
|
|
|
|
defined when <productname>PostgreSQL</productname> was
|
|
|
|
compiled.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2013-03-22 14:54:07 +01:00
|
|
|
<varlistentry id="guc-ignore-checksum-failure" xreflabel="ignore_checksum_failure">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>ignore_checksum_failure</varname> (<type>boolean</type>)
|
2013-03-22 14:54:07 +01:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>ignore_checksum_failure</varname> configuration parameter</primary>
|
2013-03-22 14:54:07 +01:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2013-03-22 14:54:07 +01:00
|
|
|
<listitem>
|
|
|
|
<para>
|
2017-11-23 15:39:47 +01:00
|
|
|
This parameter has an effect only if <xref linkend="app-initdb-data-checksums"/> are enabled.
|
2013-03-22 14:54:07 +01:00
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
Detection of a checksum failure during a read normally causes
<productname>PostgreSQL</productname> to report an error, aborting the current
transaction. Setting <varname>ignore_checksum_failure</varname> to on causes
the system to ignore the failure (but still report a warning), and
continue processing. This behavior might <emphasis>cause crashes, propagate
or hide corruption, or lead to other serious problems</emphasis>. However, it might allow
you to get past the error and retrieve undamaged tuples that might still be
present in the table if the block header is still sane. If the header is
corrupt, an error will be reported even if this option is enabled. The
default setting is <literal>off</literal>, and it can only be changed by a superuser.
|
2013-03-22 14:54:07 +01:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2005-09-13 00:11:38 +02:00
|
|
|
<varlistentry id="guc-zero-damaged-pages" xreflabel="zero_damaged_pages">
|
2014-05-07 03:28:58 +02:00
|
|
|
<term><varname>zero_damaged_pages</varname> (<type>boolean</type>)
|
2005-09-13 00:11:38 +02:00
|
|
|
<indexterm>
|
2017-10-09 03:44:17 +02:00
|
|
|
<primary><varname>zero_damaged_pages</varname> configuration parameter</primary>
|
2005-09-13 00:11:38 +02:00
|
|
|
</indexterm>
|
2014-05-07 03:28:58 +02:00
|
|
|
</term>
|
2005-09-13 00:11:38 +02:00
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Detection of a damaged page header normally causes
|
2017-10-09 03:44:17 +02:00
|
|
|
<productname>PostgreSQL</productname> to report an error, aborting the current
|
|
|
|
transaction. Setting <varname>zero_damaged_pages</varname> to on causes
|
2011-02-01 22:43:51 +01:00
|
|
|
the system to instead report a warning, zero out the damaged
|
2017-10-09 03:44:17 +02:00
|
|
|
page in memory, and continue processing. This behavior <emphasis>will destroy data</emphasis>,
|
2011-02-01 22:43:51 +01:00
|
|
|
namely all the rows on the damaged page. However, it does allow you to get
|
|
|
|
past the error and retrieve rows from any undamaged pages that might
|
2011-02-01 22:43:51 +01:00
|
|
|
be present in the table. It is useful for recovering data if
|
2010-02-03 18:25:06 +01:00
|
|
|
corruption has occurred due to a hardware or software error. You should
|
2005-09-13 00:11:38 +02:00
|
|
|
generally not set this on until you have given up hope of recovering
|
2011-05-19 00:14:45 +02:00
|
|
|
data from the damaged pages of a table. Zeroed-out pages are not
forced to disk, so it is recommended to recreate the table or
the index before turning this parameter off again. The
default setting is <literal>off</literal>, and it can only be changed
by a superuser.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
2018-03-28 23:22:42 +02:00
|
|
|
|
2020-01-22 03:56:34 +01:00
|
|
|
<varlistentry id="guc-ignore-invalid-pages" xreflabel="ignore_invalid_pages">
|
|
|
|
<term><varname>ignore_invalid_pages</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>ignore_invalid_pages</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
If set to <literal>off</literal> (the default), detection of
|
|
|
|
WAL records having references to invalid pages during
|
|
|
|
recovery causes <productname>PostgreSQL</productname> to
|
|
|
|
raise a PANIC-level error, aborting the recovery. Setting
|
|
|
|
<varname>ignore_invalid_pages</varname> to <literal>on</literal>
|
|
|
|
causes the system to ignore invalid page references in WAL records
|
|
|
|
(but still report a warning), and continue the recovery.
|
|
|
|
This behavior might <emphasis>cause crashes or data loss,
propagate or hide corruption, or lead to other serious problems</emphasis>.
However, it might allow you to get past the PANIC-level error,
to finish the recovery, and to cause the server to start up.
This parameter can only be set at server start. It has an effect only
during recovery or in standby mode.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
2018-03-28 23:22:42 +02:00
|
|
|
<varlistentry id="guc-jit-debugging-support" xreflabel="jit_debugging_support">
|
|
|
|
<term><varname>jit_debugging_support</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>jit_debugging_support</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
If LLVM has the required functionality, register generated functions
|
|
|
|
with <productname>GDB</productname>. This makes debugging easier.
|
2018-09-15 23:24:35 +02:00
|
|
|
The default setting is <literal>off</literal>.
|
|
|
|
This parameter can only be set at server start.
|
2018-03-28 23:22:42 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-jit-dump-bitcode" xreflabel="jit_dump_bitcode">
|
|
|
|
<term><varname>jit_dump_bitcode</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>jit_dump_bitcode</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Writes the generated <productname>LLVM</productname> IR out to the
|
2018-06-29 21:26:41 +02:00
|
|
|
file system, inside <xref linkend="guc-data-directory"/>. This is only
|
2018-03-28 23:22:42 +02:00
|
|
|
useful for working on the internals of the JIT implementation.
|
2018-09-15 23:24:35 +02:00
|
|
|
The default setting is <literal>off</literal>.
|
|
|
|
This parameter can only be changed by a superuser.
|
2018-03-28 23:22:42 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-jit-expressions" xreflabel="jit_expressions">
|
|
|
|
<term><varname>jit_expressions</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>jit_expressions</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-09-15 23:24:35 +02:00
|
|
|
Determines whether expressions are JIT compiled, when JIT compilation
|
|
|
|
is activated (see <xref linkend="jit-decision"/>). The default is
|
2018-03-28 23:22:42 +02:00
|
|
|
<literal>on</literal>.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-jit-profiling-support" xreflabel="jit_profiling_support">
|
|
|
|
<term><varname>jit_profiling_support</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>jit_profiling_support</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-09-15 23:24:35 +02:00
|
|
|
If LLVM has the required functionality, emit the data needed to allow
|
2018-03-28 23:22:42 +02:00
|
|
|
<productname>perf</productname> to profile functions generated by JIT.
|
2021-05-29 20:27:37 +02:00
|
|
|
This writes out files to <filename>~/.debug/jit/</filename>; the
|
2018-03-28 23:22:42 +02:00
|
|
|
user is responsible for performing cleanup when desired.
|
2018-09-15 23:24:35 +02:00
|
|
|
The default setting is <literal>off</literal>.
|
|
|
|
This parameter can only be set at server start.
|
2018-03-28 23:22:42 +02:00
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
<varlistentry id="guc-jit-tuple-deforming" xreflabel="jit_tuple_deforming">
|
|
|
|
<term><varname>jit_tuple_deforming</varname> (<type>boolean</type>)
|
|
|
|
<indexterm>
|
|
|
|
<primary><varname>jit_tuple_deforming</varname> configuration parameter</primary>
|
|
|
|
</indexterm>
|
|
|
|
</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
2018-09-15 23:24:35 +02:00
|
|
|
Determines whether tuple deforming is JIT compiled, when JIT
|
|
|
|
compilation is activated (see <xref linkend="jit-decision"/>).
|
|
|
|
The default is <literal>on</literal>.
|
2018-03-28 23:22:42 +02:00
|
|
|
</para>
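<para>
One possible use, sketched here under the assumption that a problem is
suspected in JIT-compiled tuple deforming, is to disable only that part
while leaving expression compilation enabled:
<programlisting>
# Keep JIT compilation of expressions, but fall back to the regular
# code path for tuple deforming (illustrative debugging setup)
jit_expressions = on
jit_tuple_deforming = off
</programlisting>
</para>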
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>

<varlistentry id="guc-remove-temp-files-after-crash" xreflabel="remove_temp_files_after_crash">
<term><varname>remove_temp_files_after_crash</varname> (<type>boolean</type>)
<indexterm>
<primary><varname>remove_temp_files_after_crash</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When set to <literal>on</literal>, which is the default,
<productname>PostgreSQL</productname> will automatically remove
temporary files after a backend crash. If disabled, the files are
retained and can be used, for example, for debugging. Repeated crashes
may, however, lead to an accumulation of useless files. This parameter
can only be set in the <filename>postgresql.conf</filename> file or on
the server command line.
</para>
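<para>
For example, to keep temporary files for inspection while investigating
a crash, the parameter might be turned off on the server command line.
The data directory path is only an assumed example:
<programlisting>
postgres -D /usr/local/pgsql/data -c remove_temp_files_after_crash=off
</programlisting>
</para>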
</listitem>
</varlistentry>

</variablelist>
</sect1>
<sect1 id="runtime-config-short">
<title>Short Options</title>

<para>
For convenience, single-letter command-line option switches are also
available for some parameters. They are described in
<xref linkend="runtime-config-short-table"/>. Some of these
options exist for historical reasons, and their presence as a
single-letter option does not necessarily indicate an endorsement
to use the option heavily.
</para>

<table id="runtime-config-short-table">
<title>Short Option Key</title>
<tgroup cols="2">
<colspec colname="col1" colwidth="1*"/>
<colspec colname="col2" colwidth="2*"/>
<thead>
<row>
<entry>Short Option</entry>
<entry>Equivalent</entry>
</row>
</thead>

<tbody>
<row>
<entry><option>-B <replaceable>x</replaceable></option></entry>
<entry><literal>shared_buffers = <replaceable>x</replaceable></literal></entry>
</row>
<row>
<entry><option>-d <replaceable>x</replaceable></option></entry>
<entry><literal>log_min_messages = DEBUG<replaceable>x</replaceable></literal></entry>
</row>
<row>
<entry><option>-e</option></entry>
<entry><literal>datestyle = euro</literal></entry>
</row>
<row>
<entry>
<option>-fb</option>, <option>-fh</option>, <option>-fi</option>,
<option>-fm</option>, <option>-fn</option>, <option>-fo</option>,
<option>-fs</option>, <option>-ft</option>
</entry>
<entry>
<literal>enable_bitmapscan = off</literal>,
<literal>enable_hashjoin = off</literal>,
<literal>enable_indexscan = off</literal>,
<literal>enable_mergejoin = off</literal>,
<literal>enable_nestloop = off</literal>,
<literal>enable_indexonlyscan = off</literal>,
<literal>enable_seqscan = off</literal>,
<literal>enable_tidscan = off</literal>
</entry>
</row>
<row>
<entry><option>-F</option></entry>
<entry><literal>fsync = off</literal></entry>
</row>
<row>
<entry><option>-h <replaceable>x</replaceable></option></entry>
<entry><literal>listen_addresses = <replaceable>x</replaceable></literal></entry>
</row>
<row>
<entry><option>-i</option></entry>
<entry><literal>listen_addresses = '*'</literal></entry>
</row>
<row>
<entry><option>-k <replaceable>x</replaceable></option></entry>
<entry><literal>unix_socket_directories = <replaceable>x</replaceable></literal></entry>
</row>
<row>
<entry><option>-l</option></entry>
<entry><literal>ssl = on</literal></entry>
</row>
<row>
<entry><option>-N <replaceable>x</replaceable></option></entry>
<entry><literal>max_connections = <replaceable>x</replaceable></literal></entry>
</row>
<row>
<entry><option>-O</option></entry>
<entry><literal>allow_system_table_mods = on</literal></entry>
</row>
<row>
<entry><option>-p <replaceable>x</replaceable></option></entry>
<entry><literal>port = <replaceable>x</replaceable></literal></entry>
</row>
<row>
<entry><option>-P</option></entry>
<entry><literal>ignore_system_indexes = on</literal></entry>
</row>
<row>
<entry><option>-s</option></entry>
<entry><literal>log_statement_stats = on</literal></entry>
</row>
<row>
<entry><option>-S <replaceable>x</replaceable></option></entry>
<entry><literal>work_mem = <replaceable>x</replaceable></literal></entry>
</row>
<row>
<entry><option>-tpa</option>, <option>-tpl</option>, <option>-te</option></entry>
<entry><literal>log_parser_stats = on</literal>,
<literal>log_planner_stats = on</literal>,
<literal>log_executor_stats = on</literal></entry>
</row>
<row>
<entry><option>-W <replaceable>x</replaceable></option></entry>
<entry><literal>post_auth_delay = <replaceable>x</replaceable></literal></entry>
</row>
</tbody>
</tgroup>
</table>
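
<para>
As an illustration of how to read the table, the two invocations below
should configure the server identically. The numeric values and the data
directory path are assumed examples, not recommendations:
<programlisting>
# short options
postgres -D /usr/local/pgsql/data -B 1024 -N 50 -p 5433

# the same settings written as named parameters
postgres -D /usr/local/pgsql/data -c shared_buffers=1024 -c max_connections=50 -c port=5433
</programlisting>
The named form is usually easier to read in scripts, since each setting
is spelled out rather than encoded in a single letter.
</para>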

</sect1>
</chapter>