Move max_wal_senders out of max_connections for connection slot handling

Since its introduction, max_wal_senders is counted as part of
max_connections when it comes to define how many connection slots can be
used for replication connections with a WAL sender context.  This can
lead to confusion for some users, as it could be possible to block a
base backup or replication from happening because other backend sessions
are already taken for other purposes by an application, and
superuser-only connection slots are not a correct solution to handle
that case.

This commit makes max_wal_senders independent of max_connections for its
handling of PGPROC entries in ProcGlobal, meaning that connection slots
for WAL senders are handled using their own free queue, like autovacuum
workers and bgworkers.

One compatibility issue that this change creates is that a standby now
requires to have a value of max_wal_senders at least equal to its
primary.  So, if a standby created enforces the value of
max_wal_senders to be lower than that, then this could break failovers.
Normally this should not be an issue though, as any settings of a
standby are inherited from its primary as postgresql.conf gets normally
copied as part of a base backup, so parameters would be consistent.

Author: Alexander Kukushkin
Reviewed-by: Kyotaro Horiguchi, Petr Jelínek, Masahiko Sawada, Oleksii
Kliukin
Discussion: https://postgr.es/m/CAFh8B=nBzHQeYAu0b8fjK-AF1X4+_p6GRtwG+cCgs6Vci2uRuQ@mail.gmail.com
This commit is contained in:
Michael Paquier 2019-02-12 10:07:56 +09:00
parent 1d92a0c9f7
commit ea92368cd1
15 changed files with 100 additions and 55 deletions

View File

@ -697,8 +697,7 @@ include_dir 'conf.d'
<para>
The default value is three connections. The value must be less
than <varname>max_connections</varname> minus
<xref linkend="guc-max-wal-senders"/>.
than <varname>max_connections</varname>.
This parameter can only be set at server start.
</para>
</listitem>
@ -3495,24 +3494,25 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</term>
<listitem>
<para>
Specifies the maximum number of concurrent connections from
standby servers or streaming base backup clients (i.e., the
maximum number of simultaneously running WAL sender
processes). The default is 10. The value 0 means replication is
disabled. WAL sender processes count towards the total number
of connections, so this parameter's value must be less than
<xref linkend="guc-max-connections"/> minus
<xref linkend="guc-superuser-reserved-connections"/>.
Abrupt streaming client disconnection might leave an orphaned
connection slot behind until
a timeout is reached, so this parameter should be set slightly
higher than the maximum number of expected clients so disconnected
clients can immediately reconnect. This parameter can only
be set at server start.
Also, <varname>wal_level</varname> must be set to
Specifies the maximum number of concurrent connections from standby
servers or streaming base backup clients (i.e., the maximum number of
simultaneously running WAL sender processes). The default is
<literal>10</literal>. The value <literal>0</literal> means
replication is disabled. Abrupt streaming client disconnection might
leave an orphaned connection slot behind until a timeout is reached,
so this parameter should be set slightly higher than the maximum
number of expected clients so disconnected clients can immediately
reconnect. This parameter can only be set at server start. Also,
<varname>wal_level</varname> must be set to
<literal>replica</literal> or higher to allow connections from standby
servers.
</para>
<para>
When running a standby server, you must set this parameter to the
same or higher value than on the master server. Otherwise, queries
will not be allowed in the standby server.
</para>
</listitem>
</varlistentry>

View File

@ -2177,6 +2177,11 @@ LOG: database system is ready to accept read only connections
<varname>max_locks_per_transaction</varname>
</para>
</listitem>
<listitem>
<para>
<varname>max_wal_senders</varname>
</para>
</listitem>
<listitem>
<para>
<varname>max_worker_processes</varname>

View File

@ -720,13 +720,13 @@ psql: could not connect to server: No such file or directory
<row>
<entry><varname>SEMMNI</varname></entry>
<entry>Maximum number of semaphore identifiers (i.e., sets)</entry>
<entry>at least <literal>ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16)</literal> plus room for other applications</entry>
<entry>at least <literal>ceil((max_connections + autovacuum_max_workers + wax_wal_senders + max_worker_processes + 5) / 16)</literal> plus room for other applications</entry>
</row>
<row>
<entry><varname>SEMMNS</varname></entry>
<entry>Maximum number of semaphores system-wide</entry>
<entry><literal>ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16) * 17</literal> plus room for other applications</entry>
<entry><literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) * 17</literal> plus room for other applications</entry>
</row>
<row>
@ -785,13 +785,13 @@ psql: could not connect to server: No such file or directory
other applications. The maximum number of semaphores in the system
is set by <varname>SEMMNS</varname>, which consequently must be at least
as high as <varname>max_connections</varname> plus
<varname>autovacuum_max_workers</varname> plus <varname>max_worker_processes</varname>,
plus one extra for each 16
<varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>,
plus <varname>max_worker_processes</varname>, plus one extra for each 16
allowed connections plus workers (see the formula in <xref
linkend="sysvipc-parameters"/>). The parameter <varname>SEMMNI</varname>
determines the limit on the number of semaphore sets that can
exist on the system at one time. Hence this parameter must be at
least <literal>ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16)</literal>.
least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</literal>.
Lowering the number
of allowed connections is a temporary workaround for failures,
which are usually confusingly worded <quote>No space

View File

@ -110,10 +110,11 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
}
appendStringInfo(buf, "max_connections=%d max_worker_processes=%d "
"max_prepared_xacts=%d max_locks_per_xact=%d "
"wal_level=%s wal_log_hints=%s "
"track_commit_timestamp=%s",
"max_wal_senders=%d max_prepared_xacts=%d "
"max_locks_per_xact=%d wal_level=%s "
"wal_log_hints=%s track_commit_timestamp=%s",
xlrec.MaxConnections,
xlrec.max_wal_senders,
xlrec.max_worker_processes,
xlrec.max_prepared_xacts,
xlrec.max_locks_per_xact,

View File

@ -5257,6 +5257,7 @@ BootStrapXLOG(void)
/* Set important parameter values for use when replaying WAL */
ControlFile->MaxConnections = MaxConnections;
ControlFile->max_worker_processes = max_worker_processes;
ControlFile->max_wal_senders = max_wal_senders;
ControlFile->max_prepared_xacts = max_prepared_xacts;
ControlFile->max_locks_per_xact = max_locks_per_xact;
ControlFile->wal_level = wal_level;
@ -6170,6 +6171,9 @@ CheckRequiredParameterValues(void)
RecoveryRequiresIntParameter("max_worker_processes",
max_worker_processes,
ControlFile->max_worker_processes);
RecoveryRequiresIntParameter("max_wal_senders",
max_wal_senders,
ControlFile->max_wal_senders);
RecoveryRequiresIntParameter("max_prepared_transactions",
max_prepared_xacts,
ControlFile->max_prepared_xacts);
@ -9460,6 +9464,7 @@ XLogReportParameters(void)
wal_log_hints != ControlFile->wal_log_hints ||
MaxConnections != ControlFile->MaxConnections ||
max_worker_processes != ControlFile->max_worker_processes ||
max_wal_senders != ControlFile->max_wal_senders ||
max_prepared_xacts != ControlFile->max_prepared_xacts ||
max_locks_per_xact != ControlFile->max_locks_per_xact ||
track_commit_timestamp != ControlFile->track_commit_timestamp)
@ -9478,6 +9483,7 @@ XLogReportParameters(void)
xlrec.MaxConnections = MaxConnections;
xlrec.max_worker_processes = max_worker_processes;
xlrec.max_wal_senders = max_wal_senders;
xlrec.max_prepared_xacts = max_prepared_xacts;
xlrec.max_locks_per_xact = max_locks_per_xact;
xlrec.wal_level = wal_level;
@ -9493,6 +9499,7 @@ XLogReportParameters(void)
ControlFile->MaxConnections = MaxConnections;
ControlFile->max_worker_processes = max_worker_processes;
ControlFile->max_wal_senders = max_wal_senders;
ControlFile->max_prepared_xacts = max_prepared_xacts;
ControlFile->max_locks_per_xact = max_locks_per_xact;
ControlFile->wal_level = wal_level;
@ -9896,6 +9903,7 @@ xlog_redo(XLogReaderState *record)
LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
ControlFile->MaxConnections = xlrec.MaxConnections;
ControlFile->max_worker_processes = xlrec.max_worker_processes;
ControlFile->max_wal_senders = xlrec.max_wal_senders;
ControlFile->max_prepared_xacts = xlrec.max_prepared_xacts;
ControlFile->max_locks_per_xact = xlrec.max_locks_per_xact;
ControlFile->wal_level = xlrec.wal_level;
@ -9927,7 +9935,7 @@ xlog_redo(XLogReaderState *record)
UpdateControlFile();
LWLockRelease(ControlFileLock);
/* Check to see if any changes to max_connections give problems */
/* Check to see if any parameter change gives a problem on recovery */
CheckRequiredParameterValues();
}
else if (info == XLOG_FPW_CHANGE)

View File

@ -885,11 +885,11 @@ PostmasterMain(int argc, char *argv[])
/*
* Check for invalid combinations of GUC settings.
*/
if (ReservedBackends + max_wal_senders >= MaxConnections)
if (ReservedBackends >= MaxConnections)
{
write_stderr("%s: superuser_reserved_connections (%d) plus max_wal_senders (%d) must be less than max_connections (%d)\n",
write_stderr("%s: superuser_reserved_connections (%d) must be less than max_connections (%d)\n",
progname,
ReservedBackends, max_wal_senders, MaxConnections);
ReservedBackends, MaxConnections);
ExitPostmaster(1);
}
if (XLogArchiveMode > ARCHIVE_MODE_OFF && wal_level == WAL_LEVEL_MINIMAL)
@ -5532,7 +5532,7 @@ int
MaxLivePostmasterChildren(void)
{
return 2 * (MaxConnections + autovacuum_max_workers + 1 +
max_worker_processes);
max_wal_senders + max_worker_processes);
}
/*

View File

@ -2273,8 +2273,8 @@ InitWalSenderSlot(void)
Assert(MyWalSnd == NULL);
/*
* Find a free walsender slot and reserve it. If this fails, we must be
* out of WalSnd structures.
* Find a free walsender slot and reserve it. This must not fail due to
* the prior check for free WAL senders in InitProcess().
*/
for (i = 0; i < max_wal_senders; i++)
{
@ -2310,12 +2310,8 @@ InitWalSenderSlot(void)
break;
}
}
if (MyWalSnd == NULL)
ereport(FATAL,
(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
errmsg("number of requested standby connections "
"exceeds max_wal_senders (currently %d)",
max_wal_senders)));
Assert(MyWalSnd != NULL);
/* Arrange to clean up at walsender exit */
on_shmem_exit(WalSndKill, 0);

View File

@ -43,6 +43,7 @@
#include "postmaster/autovacuum.h"
#include "replication/slot.h"
#include "replication/syncrep.h"
#include "replication/walsender.h"
#include "storage/condition_variable.h"
#include "storage/standby.h"
#include "storage/ipc.h"
@ -147,8 +148,9 @@ ProcGlobalSemas(void)
* running out when trying to start another backend is a common failure.
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxConnections, max_worker_processes, or autovacuum_max_workers higher
* than his kernel will support, he'll find out sooner rather than later.
* MaxConnections, max_worker_processes, max_wal_senders, or
* autovacuum_max_workers higher than his kernel will support, he'll
* find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
@ -180,6 +182,7 @@ InitProcGlobal(void)
ProcGlobal->freeProcs = NULL;
ProcGlobal->autovacFreeProcs = NULL;
ProcGlobal->bgworkerFreeProcs = NULL;
ProcGlobal->walsenderFreeProcs = NULL;
ProcGlobal->startupProc = NULL;
ProcGlobal->startupProcPid = 0;
ProcGlobal->startupBufferPinWaitBufId = -1;
@ -253,13 +256,20 @@ InitProcGlobal(void)
ProcGlobal->autovacFreeProcs = &procs[i];
procs[i].procgloballist = &ProcGlobal->autovacFreeProcs;
}
else if (i < MaxBackends)
else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
procs[i].links.next = (SHM_QUEUE *) ProcGlobal->bgworkerFreeProcs;
ProcGlobal->bgworkerFreeProcs = &procs[i];
procs[i].procgloballist = &ProcGlobal->bgworkerFreeProcs;
}
else if (i < MaxBackends)
{
/* PGPROC for walsender, add to walsenderFreeProcs list */
procs[i].links.next = (SHM_QUEUE *) ProcGlobal->walsenderFreeProcs;
ProcGlobal->walsenderFreeProcs = &procs[i];
procs[i].procgloballist = &ProcGlobal->walsenderFreeProcs;
}
/* Initialize myProcLocks[] shared memory queues. */
for (j = 0; j < NUM_LOCK_PARTITIONS; j++)
@ -311,6 +321,8 @@ InitProcess(void)
procgloballist = &ProcGlobal->autovacFreeProcs;
else if (IsBackgroundWorker)
procgloballist = &ProcGlobal->bgworkerFreeProcs;
else if (am_walsender)
procgloballist = &ProcGlobal->walsenderFreeProcs;
else
procgloballist = &ProcGlobal->freeProcs;
@ -341,6 +353,11 @@ InitProcess(void)
* in the autovacuum case?
*/
SpinLockRelease(ProcStructLock);
if (am_walsender)
ereport(FATAL,
(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
errmsg("number of requested standby connections exceeds max_wal_senders (currently %d)",
max_wal_senders)));
ereport(FATAL,
(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
errmsg("sorry, too many clients already")));

View File

@ -527,7 +527,7 @@ InitializeMaxBackends(void)
/* the extra unit accounts for the autovacuum launcher */
MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
max_worker_processes;
max_worker_processes + max_wal_senders;
/* internal error because the values were all checked previously */
if (MaxBackends > MAX_BACKENDS)
@ -811,12 +811,11 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
}
/*
* The last few connection slots are reserved for superusers. Although
* replication connections currently require superuser privileges, we
* don't allow them to consume the reserved slots, which are intended for
* interactive use.
* The last few connection slots are reserved for superusers. Replication
* connections are drawn from slots reserved with max_wal_senders and not
* limited by max_connections or superuser_reserved_connections.
*/
if ((!am_superuser || am_walsender) &&
if (!am_superuser && !am_walsender &&
ReservedBackends > 0 &&
!HaveNFreeProcs(ReservedBackends))
ereport(FATAL,

View File

@ -187,6 +187,7 @@ static const char *show_tcp_keepalives_count(void);
static bool check_maxconnections(int *newval, void **extra, GucSource source);
static bool check_max_worker_processes(int *newval, void **extra, GucSource source);
static bool check_autovacuum_max_workers(int *newval, void **extra, GucSource source);
static bool check_max_wal_senders(int *newval, void **extra, GucSource source);
static bool check_autovacuum_work_mem(int *newval, void **extra, GucSource source);
static bool check_effective_io_concurrency(int *newval, void **extra, GucSource source);
static void assign_effective_io_concurrency(int newval, void *extra);
@ -2090,7 +2091,7 @@ static struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections and max_wal_senders */
/* see max_connections */
{"superuser_reserved_connections", PGC_POSTMASTER, CONN_AUTH_SETTINGS,
gettext_noop("Sets the number of connection slots reserved for superusers."),
NULL
@ -2608,14 +2609,13 @@ static struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections and superuser_reserved_connections */
{"max_wal_senders", PGC_POSTMASTER, REPLICATION_SENDING,
gettext_noop("Sets the maximum number of simultaneously running WAL sender processes."),
NULL
},
&max_wal_senders,
10, 0, MAX_BACKENDS,
NULL, NULL, NULL
check_max_wal_senders, NULL, NULL
},
{
@ -10911,7 +10911,7 @@ static bool
check_maxconnections(int *newval, void **extra, GucSource source)
{
if (*newval + autovacuum_max_workers + 1 +
max_worker_processes > MAX_BACKENDS)
max_worker_processes + max_wal_senders > MAX_BACKENDS)
return false;
return true;
}
@ -10919,7 +10919,17 @@ check_maxconnections(int *newval, void **extra, GucSource source)
static bool
check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
{
if (MaxConnections + *newval + 1 + max_worker_processes > MAX_BACKENDS)
if (MaxConnections + *newval + 1 +
max_worker_processes + max_wal_senders > MAX_BACKENDS)
return false;
return true;
}
static bool
check_max_wal_senders(int *newval, void **extra, GucSource source)
{
if (MaxConnections + autovacuum_max_workers + 1 +
max_worker_processes + *newval > MAX_BACKENDS)
return false;
return true;
}
@ -10950,7 +10960,8 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
static bool
check_max_worker_processes(int *newval, void **extra, GucSource source)
{
if (MaxConnections + autovacuum_max_workers + 1 + *newval > MAX_BACKENDS)
if (MaxConnections + autovacuum_max_workers + 1 +
*newval + max_wal_senders > MAX_BACKENDS)
return false;
return true;
}

View File

@ -304,6 +304,8 @@ main(int argc, char *argv[])
ControlFile->MaxConnections);
printf(_("max_worker_processes setting: %d\n"),
ControlFile->max_worker_processes);
printf(_("max_wal_senders setting: %d\n"),
ControlFile->max_wal_senders);
printf(_("max_prepared_xacts setting: %d\n"),
ControlFile->max_prepared_xacts);
printf(_("max_locks_per_xact setting: %d\n"),

View File

@ -728,6 +728,7 @@ GuessControlValues(void)
ControlFile.wal_log_hints = false;
ControlFile.track_commit_timestamp = false;
ControlFile.MaxConnections = 100;
ControlFile.max_wal_senders = 10;
ControlFile.max_worker_processes = 8;
ControlFile.max_prepared_xacts = 0;
ControlFile.max_locks_per_xact = 64;
@ -955,6 +956,7 @@ RewriteControlFile(void)
ControlFile.wal_log_hints = false;
ControlFile.track_commit_timestamp = false;
ControlFile.MaxConnections = 100;
ControlFile.max_wal_senders = 10;
ControlFile.max_worker_processes = 8;
ControlFile.max_prepared_xacts = 0;
ControlFile.max_locks_per_xact = 64;

View File

@ -31,7 +31,7 @@
/*
* Each page of XLOG file has a header like this:
*/
#define XLOG_PAGE_MAGIC 0xD098 /* can be used as WAL version indicator */
#define XLOG_PAGE_MAGIC 0xD099 /* can be used as WAL version indicator */
typedef struct XLogPageHeaderData
{
@ -226,6 +226,7 @@ typedef struct xl_parameter_change
{
int MaxConnections;
int max_worker_processes;
int max_wal_senders;
int max_prepared_xacts;
int max_locks_per_xact;
int wal_level;

View File

@ -21,7 +21,7 @@
/* Version identifier for this pg_control format */
#define PG_CONTROL_VERSION 1100
#define PG_CONTROL_VERSION 1200
/* Nonce key length, see below */
#define MOCK_AUTH_NONCE_LEN 32
@ -177,6 +177,7 @@ typedef struct ControlFileData
bool wal_log_hints;
int MaxConnections;
int max_worker_processes;
int max_wal_senders;
int max_prepared_xacts;
int max_locks_per_xact;
bool track_commit_timestamp;

View File

@ -255,6 +255,8 @@ typedef struct PROC_HDR
PGPROC *autovacFreeProcs;
/* Head of list of bgworker free PGPROC structures */
PGPROC *bgworkerFreeProcs;
/* Head of list of walsender free PGPROC structures */
PGPROC *walsenderFreeProcs;
/* First pgproc waiting for group XID clear */
pg_atomic_uint32 procArrayGroupFirst;
/* First pgproc waiting for group transaction status update */