2011-11-02 15:25:01 +01:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
|
|
|
* startup.c
|
|
|
|
*
|
|
|
|
* The Startup process initialises the server and performs any recovery
|
|
|
|
* actions that have been specified. Notice that there is no "main loop"
|
|
|
|
* since the Startup process ends as soon as initialisation is complete.
|
Allow a streaming replication standby to follow a timeline switch.
Before this patch, streaming replication would refuse to start replicating
if the timeline in the primary doesn't exactly match the standby. The
situation where it doesn't match is when you have a master, and two
standbys, and you promote one of the standbys to become new master.
Promoting bumps up the timeline ID, and after that bump, the other standby
would refuse to continue.
There's significantly more timeline related logic in streaming replication
now. First of all, when a standby connects to primary, it will ask the
primary for any timeline history files that are missing from the standby.
The missing files are sent using a new replication command TIMELINE_HISTORY,
and stored in standby's pg_xlog directory. Using the timeline history files,
the standby can follow the latest timeline present in the primary
(recovery_target_timeline='latest'), just as it can follow new timelines
appearing in an archive directory.
START_REPLICATION now takes a TIMELINE parameter, to specify exactly which
timeline to stream WAL from. This allows the standby to request the primary
to send over WAL that precedes the promotion. The replication protocol is
changed slightly (in a backwards-compatible way although there's little hope
of streaming replication working across major versions anyway), to allow
replication to stop when the end of timeline is reached, putting the walsender
back into accepting a replication command.
Many thanks to Amit Kapila for testing and reviewing various versions of
this patch.
2012-12-13 18:00:00 +01:00
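The two replication commands named in the commit message above can be illustrated with a small sketch. It only formats the command strings a standby might send. The helper names `build_start_replication` and `build_timeline_history` are hypothetical and not part of PostgreSQL, and the sketch assumes the start position is written in the usual two-part hexadecimal WAL location notation.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a START_REPLICATION command asking for WAL
 * beginning at start_lsn on timeline tli.  Returns snprintf()'s result,
 * i.e. the length the full command needs.
 */
int
build_start_replication(char *buf, size_t buflen, uint64_t start_lsn, uint32_t tli)
{
	return snprintf(buf, buflen,
					"START_REPLICATION %" PRIX32 "/%" PRIX32 " TIMELINE %" PRIu32,
					(uint32_t) (start_lsn >> 32),	/* high half of the position */
					(uint32_t) start_lsn,			/* low half of the position */
					tli);
}

/*
 * Hypothetical helper: format a TIMELINE_HISTORY command requesting the
 * history file of timeline tli.
 */
int
build_timeline_history(char *buf, size_t buflen, uint32_t tli)
{
	return snprintf(buf, buflen, "TIMELINE_HISTORY %" PRIu32, tli);
}
```

A standby following a promotion would, under these assumptions, first request the history file of the new timeline and then start streaming from the older timeline at a position that precedes the switch point.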
|
|
|
* (in standby mode, one can think of the replay loop as a main loop,
|
|
|
|
* though.)
|
2011-11-02 15:25:01 +01:00
|
|
|
*
|
|
|
|
*
|
2023-01-02 21:00:37 +01:00
|
|
|
* Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
|
2011-11-02 15:25:01 +01:00
|
|
|
*
|
|
|
|
*
|
|
|
|
* IDENTIFICATION
|
|
|
|
* src/backend/postmaster/startup.c
|
|
|
|
*
|
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
#include "postgres.h"
|
|
|
|
|
|
|
|
#include "access/xlog.h"
|
2022-02-16 08:30:38 +01:00
|
|
|
#include "access/xlogrecovery.h"
|
2021-07-31 08:50:26 +02:00
|
|
|
#include "access/xlogutils.h"
|
2011-11-02 15:25:01 +01:00
|
|
|
#include "libpq/pqsignal.h"
|
|
|
|
#include "miscadmin.h"
|
2017-03-27 04:02:22 +02:00
|
|
|
#include "pgstat.h"
|
2019-12-17 19:14:28 +01:00
|
|
|
#include "postmaster/interrupt.h"
|
2011-11-02 15:25:01 +01:00
|
|
|
#include "postmaster/startup.h"
|
|
|
|
#include "storage/ipc.h"
|
|
|
|
#include "storage/latch.h"
|
|
|
|
#include "storage/pmsignal.h"
|
2019-11-25 22:08:53 +01:00
|
|
|
#include "storage/procsignal.h"
|
Introduce timeout handling framework
Management of timeouts was getting a little cumbersome; what we
originally had was more than enough back when we were only concerned
about deadlocks and query cancel; however, when we added timeouts for
standby processes, the code got considerably messier. Since there are
plans to add more complex timeouts, this seems a good time to introduce
a central timeout handling module.
External modules register their timeout handlers during process
initialization, and later enable and disable them as they see fit using
a simple API; timeout.c is in charge of keeping track of which timeouts
are in effect at any time, installing a common SIGALRM signal handler,
and calling setitimer() as appropriate to ensure timely firing of
external handlers.
timeout.c additionally supports pluggable modules to add their own
timeouts, though this capability isn't exercised anywhere yet.
Additionally, as of this commit, walsender processes are aware of
timeouts; we had a preexisting bug there that made those ignore SIGALRM,
thus being subject to unhandled deadlocks, particularly during the
authentication phase. This has already been fixed in back branches in
commit 0bf8eb2a, which see for more details.
Main author: Zoltán Böszörményi
Some review and cleanup by Álvaro Herrera
Extensive reworking by Tom Lane
2012-07-17 00:43:21 +02:00
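The register/enable/disable life cycle described in the commit message above can be sketched with a toy registry. This is not the timeout.c API: every `toy_` name is hypothetical, time is passed in explicitly instead of being read from a clock, and `toy_run_due_timeouts` merely stands in for the work a common SIGALRM handler would do after `setitimer()` fires.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TIMEOUTS 4

typedef void (*toy_timeout_handler) (void);

typedef struct ToyTimeout
{
	toy_timeout_handler handler;	/* NULL until registered */
	bool		active;				/* currently enabled? */
	long		fire_at_ms;			/* absolute deadline in milliseconds */
} ToyTimeout;

static ToyTimeout toy_timeouts[TOY_MAX_TIMEOUTS];

/* Example handler for illustration: counts how often it ran. */
static int	toy_fired_count = 0;

void
toy_count_handler(void)
{
	toy_fired_count++;
}

/* Called once during process initialization. */
void
toy_register_timeout(int id, toy_timeout_handler handler)
{
	toy_timeouts[id].handler = handler;
	toy_timeouts[id].active = false;
}

/* Arm a registered timeout to fire at an absolute time. */
void
toy_enable_timeout_at(int id, long fire_at_ms)
{
	toy_timeouts[id].fire_at_ms = fire_at_ms;
	toy_timeouts[id].active = true;
}

/* Cancel a timeout; harmless if it is not active. */
void
toy_disable_timeout(int id)
{
	toy_timeouts[id].active = false;
}

/*
 * Stand-in for the shared alarm handler: run and deactivate every timeout
 * whose deadline has passed.  Returns the number of handlers that ran.
 */
int
toy_run_due_timeouts(long now_ms)
{
	int			fired = 0;

	for (int id = 0; id < TOY_MAX_TIMEOUTS; id++)
	{
		ToyTimeout *t = &toy_timeouts[id];

		if (t->active && t->handler != NULL && t->fire_at_ms <= now_ms)
		{
			t->active = false;
			t->handler();
			fired++;
		}
	}
	return fired;
}
```

Keeping the bookkeeping in one place means callers only name a timeout and a deadline. They never touch the signal handler or the timer themselves, which is the point the commit message makes.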
|
|
|
#include "storage/standby.h"
|
2011-11-02 15:25:01 +01:00
|
|
|
#include "utils/guc.h"
|
2022-01-11 15:19:59 +01:00
|
|
|
#include "utils/memutils.h"
|
2012-07-17 00:43:21 +02:00
|
|
|
#include "utils/timeout.h"
|
2011-11-02 15:25:01 +01:00
|
|
|
|
|
|
|
|
2021-03-12 07:08:52 +01:00
|
|
|
#ifndef USE_POSTMASTER_DEATH_SIGNAL
|
|
|
|
/*
|
|
|
|
* On systems that need to make a system call to find out if the postmaster has
|
|
|
|
* gone away, we'll do so only every Nth call to HandleStartupProcInterrupts().
|
|
|
|
* This only affects how long it takes us to detect the condition while we're
|
|
|
|
* busy replaying WAL. Latch waits and similar should react immediately
|
|
|
|
* through the usual techniques.
|
|
|
|
*/
|
|
|
|
#define POSTMASTER_POLL_RATE_LIMIT 1024
|
|
|
|
#endif
|
|
|
|
|
2011-11-02 15:25:01 +01:00
|
|
|
/*
|
|
|
|
* Flags set by interrupt handlers for later service in the redo loop.
|
|
|
|
*/
|
2020-12-17 10:06:51 +01:00
|
|
|
static volatile sig_atomic_t got_SIGHUP = false;
|
2011-11-02 15:25:01 +01:00
|
|
|
static volatile sig_atomic_t shutdown_requested = false;
|
2020-03-24 04:46:48 +01:00
|
|
|
static volatile sig_atomic_t promote_signaled = false;
|
2011-11-02 15:25:01 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Flag set when executing a restore command, to tell SIGTERM signal handler
|
|
|
|
* that it's safe to just proc_exit.
|
|
|
|
*/
|
|
|
|
static volatile sig_atomic_t in_restore_command = false;
|
|
|
|
|
Report progress of startup operations that take a long time.
Users sometimes get concerned when they start the server and it
emits a few messages and then doesn't emit any more messages for
a long time. Generally, what's happening is either that the
system is taking a long time to apply WAL, or it's taking a
long time to reset unlogged relations, or it's taking a long
time to fsync the data directory, but it's not easy to tell
which is the case.
To fix that, add a new 'log_startup_progress_interval' setting,
by default 10s. When an operation that is known to be potentially
long-running takes more than this amount of time, we'll log a
status update each time this interval elapses.
To avoid undesirable log chatter, don't log anything about WAL
replay when in standby mode.
Nitin Jadhav and Robert Haas, reviewed by Amul Sul, Bharath
Rupireddy, Justin Pryzby, Michael Paquier, and Álvaro Herrera.
Discussion: https://postgr.es/m/CA+TgmoaHQrgDFOBwgY16XCoMtXxsrVGFB2jNCvb7-ubuEe1MGg@mail.gmail.com
Discussion: https://postgr.es/m/CAMm1aWaHF7VE69572_OLQ+MgpT5RUiUDgF1x5RrtkJBLdpRj3Q@mail.gmail.com
2021-10-25 17:51:57 +02:00
|
|
|
/*
|
|
|
|
* Time at which the most recent startup operation started.
|
|
|
|
*/
|
|
|
|
static TimestampTz startup_progress_phase_start_time;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Indicates whether the startup progress interval configured by the user has
|
|
|
|
* elapsed or not. TRUE if timeout occurred, FALSE otherwise.
|
|
|
|
*/
|
|
|
|
static volatile sig_atomic_t startup_progress_timer_expired = false;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Time between progress updates for long-running startup operations.
|
|
|
|
*/
|
|
|
|
int log_startup_progress_interval = 10000; /* 10 sec */
|
|
|
|
|
2011-11-02 15:25:01 +01:00
|
|
|
/* Signal handlers */
|
|
|
|
static void StartupProcTriggerHandler(SIGNAL_ARGS);
|
2020-12-17 10:06:51 +01:00
|
|
|
static void StartupProcSigHupHandler(SIGNAL_ARGS);
|
2011-11-02 15:25:01 +01:00
|
|
|
|
2021-04-05 19:25:37 +02:00
|
|
|
/* Callbacks */
|
|
|
|
static void StartupProcExit(int code, Datum arg);
|
|
|
|
|
2011-11-02 15:25:01 +01:00
|
|
|
|
|
|
|
/* --------------------------------
|
|
|
|
* signal handler routines
|
|
|
|
* --------------------------------
|
|
|
|
*/
|
|
|
|
|
|
|
|
/* SIGUSR2: set flag to finish recovery */
|
|
|
|
static void
|
|
|
|
StartupProcTriggerHandler(SIGNAL_ARGS)
|
|
|
|
{
|
|
|
|
int save_errno = errno;
|
|
|
|
|
2020-03-24 04:46:48 +01:00
|
|
|
promote_signaled = true;
|
2020-12-17 10:06:51 +01:00
|
|
|
WakeupRecovery();
|
|
|
|
|
|
|
|
errno = save_errno;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* SIGHUP: set flag to re-read config file at next convenient time */
|
|
|
|
static void
|
|
|
|
StartupProcSigHupHandler(SIGNAL_ARGS)
|
|
|
|
{
|
|
|
|
int save_errno = errno;
|
|
|
|
|
|
|
|
got_SIGHUP = true;
|
|
|
|
WakeupRecovery();
|
2011-11-02 15:25:01 +01:00
|
|
|
|
|
|
|
errno = save_errno;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* SIGTERM: set flag to abort redo and exit */
|
|
|
|
static void
|
|
|
|
StartupProcShutdownHandler(SIGNAL_ARGS)
|
|
|
|
{
|
|
|
|
int save_errno = errno;
|
|
|
|
|
|
|
|
if (in_restore_command)
|
|
|
|
proc_exit(1);
|
|
|
|
else
|
|
|
|
shutdown_requested = true;
|
2020-12-17 10:06:51 +01:00
|
|
|
WakeupRecovery();
|
2011-11-02 15:25:01 +01:00
|
|
|
|
|
|
|
errno = save_errno;
|
|
|
|
}
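The three handlers above share one pattern: save `errno`, set a `volatile sig_atomic_t` flag, restore `errno`, and leave the real work to ordinary code that polls the flag later. A minimal standalone sketch of that pattern follows. The `toy_` names are hypothetical, plain ISO C `signal()` is used instead of PostgreSQL's `pqsignal()`, and the latch wakeup done by `WakeupRecovery()` is omitted.

```c
#include <errno.h>
#include <signal.h>

/* Flag set by the handler and serviced later by normal code. */
static volatile sig_atomic_t toy_shutdown_requested = 0;

/* Handler: do the bare minimum and leave errno as we found it. */
static void
toy_sigterm_handler(int signo)
{
	int			save_errno = errno;

	(void) signo;
	toy_shutdown_requested = 1;

	errno = save_errno;
}

/* Install the handler; call once during startup. */
void
toy_install_handlers(void)
{
	signal(SIGTERM, toy_sigterm_handler);
}

/*
 * Poll the flag from a safe point in the main loop.  Returns 1 if a
 * shutdown was requested since the last call, 0 otherwise.
 */
int
toy_check_interrupts(void)
{
	if (toy_shutdown_requested)
	{
		toy_shutdown_requested = 0;
		return 1;
	}
	return 0;
}
```

Deferring the work this way keeps the handler safe to run at any point, since it touches nothing but a single flag.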
|
|
|
|
|
2020-03-27 23:43:41 +01:00
|
|
|
/*
|
|
|
|
* Re-read the config file.
|
|
|
|
*
|
|
|
|
* If one of the critical walreceiver options has changed, flag xlog.c
|
|
|
|
* to restart it.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
StartupRereadConfig(void)
|
|
|
|
{
|
|
|
|
char *conninfo = pstrdup(PrimaryConnInfo);
|
|
|
|
char *slotname = pstrdup(PrimarySlotName);
|
|
|
|
bool tempSlot = wal_receiver_create_temp_slot;
|
|
|
|
bool conninfoChanged;
|
|
|
|
bool slotnameChanged;
|
|
|
|
bool tempSlotChanged = false;
|
|
|
|
|
|
|
|
ProcessConfigFile(PGC_SIGHUP);
|
|
|
|
|
|
|
|
conninfoChanged = strcmp(conninfo, PrimaryConnInfo) != 0;
|
|
|
|
slotnameChanged = strcmp(slotname, PrimarySlotName) != 0;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* wal_receiver_create_temp_slot is used only when we have no slot
|
|
|
|
* configured. We do not need to track this change if it has no effect.
|
|
|
|
*/
|
|
|
|
if (!slotnameChanged && strcmp(PrimarySlotName, "") == 0)
|
|
|
|
tempSlotChanged = tempSlot != wal_receiver_create_temp_slot;
|
|
|
|
pfree(conninfo);
|
|
|
|
pfree(slotname);
|
|
|
|
|
|
|
|
if (conninfoChanged || slotnameChanged || tempSlotChanged)
|
|
|
|
StartupRequestWalReceiverRestart();
|
|
|
|
}
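The decision made in `StartupRereadConfig()` can be restated as a pure comparison of the settings before and after the reload. The sketch below is only an illustration: `ToyWalRcvSettings` and `toy_walreceiver_restart_needed` are hypothetical names, and the actual reload and restart request are left out.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical snapshot of the settings the walreceiver depends on. */
typedef struct ToyWalRcvSettings
{
	const char *conninfo;
	const char *slotname;
	bool		create_temp_slot;
} ToyWalRcvSettings;

/*
 * Compare the snapshot taken before a reload with the values after it and
 * report whether the walreceiver would have to be restarted.
 */
bool
toy_walreceiver_restart_needed(const ToyWalRcvSettings *before,
							   const ToyWalRcvSettings *after)
{
	bool		conninfo_changed = strcmp(before->conninfo, after->conninfo) != 0;
	bool		slotname_changed = strcmp(before->slotname, after->slotname) != 0;
	bool		temp_slot_changed = false;

	/* The temp-slot flag only matters while no slot name is configured. */
	if (!slotname_changed && strcmp(after->slotname, "") == 0)
		temp_slot_changed = before->create_temp_slot != after->create_temp_slot;

	return conninfo_changed || slotname_changed || temp_slot_changed;
}
```

With a named slot configured, toggling the temp-slot flag alone does not ask for a restart, which matches the comment in the function above.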
|
|
|
|
|
2019-12-19 20:56:20 +01:00
|
|
|
/* Handle various signals that might be sent to the startup process */
|
2011-11-02 15:25:01 +01:00
|
|
|
void
|
|
|
|
HandleStartupProcInterrupts(void)
|
|
|
|
{
|
2021-03-12 07:08:52 +01:00
|
|
|
#ifdef POSTMASTER_POLL_RATE_LIMIT
|
|
|
|
static uint32 postmaster_poll_count = 0;
|
|
|
|
#endif
|
|
|
|
|
2011-11-02 15:25:01 +01:00
|
|
|
/*
|
2020-03-27 23:43:41 +01:00
|
|
|
* Process any requests or signals received recently.
|
2011-11-02 15:25:01 +01:00
|
|
|
*/
|
2020-12-17 10:06:51 +01:00
|
|
|
if (got_SIGHUP)
|
2011-11-02 15:25:01 +01:00
|
|
|
{
|
2020-12-17 10:06:51 +01:00
|
|
|
got_SIGHUP = false;
|
2020-03-27 23:43:41 +01:00
|
|
|
StartupRereadConfig();
|
2011-11-02 15:25:01 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Check if we were requested to exit without finishing recovery.
|
|
|
|
*/
|
|
|
|
if (shutdown_requested)
|
|
|
|
proc_exit(1);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Emergency bailout if postmaster has died. This is to avoid the
|
2021-03-12 07:08:52 +01:00
|
|
|
* necessity for manual cleanup of all postmaster children. Do this less
|
|
|
|
* frequently on systems for which we don't have signals to make that
|
|
|
|
* cheap.
|
2011-11-02 15:25:01 +01:00
|
|
|
*/
|
2021-03-12 07:08:52 +01:00
|
|
|
if (IsUnderPostmaster &&
|
|
|
|
#ifdef POSTMASTER_POLL_RATE_LIMIT
|
|
|
|
postmaster_poll_count++ % POSTMASTER_POLL_RATE_LIMIT == 0 &&
|
|
|
|
#endif
|
|
|
|
!PostmasterIsAlive())
|
2011-11-02 15:25:01 +01:00
|
|
|
exit(1);
|
2019-12-19 20:56:20 +01:00
|
|
|
|
|
|
|
/* Process barrier events */
|
|
|
|
if (ProcSignalBarrierPending)
|
|
|
|
ProcessProcSignalBarrier();
|
2022-01-11 15:19:59 +01:00
|
|
|
|
|
|
|
/* Perform logging of memory contexts of this process */
|
|
|
|
if (LogMemoryContextPending)
|
|
|
|
ProcessLogMemoryContextInterrupt();
|
2011-11-02 15:25:01 +01:00
|
|
|
}
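The rate limiting in `HandleStartupProcInterrupts()` relies on a static counter so that the costly liveness check runs only on every Nth call. A small sketch of that idea follows, with hypothetical `toy_` names and a stub in place of `PostmasterIsAlive()`:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_POLL_RATE_LIMIT 1024

/* Counts how often the expensive probe actually ran. */
static uint32_t toy_expensive_checks = 0;

/* Stand-in for an expensive liveness probe; always reports "alive". */
static bool
toy_parent_is_alive(void)
{
	toy_expensive_checks++;
	return true;
}

/*
 * Called very frequently from a busy loop.  Only every
 * TOY_POLL_RATE_LIMIT-th call pays for the probe, starting with the first
 * call.  Returns false if the parent was found dead.
 */
bool
toy_handle_interrupts(void)
{
	static uint32_t poll_count = 0;

	if (poll_count++ % TOY_POLL_RATE_LIMIT == 0 &&
		!toy_parent_is_alive())
		return false;

	return true;
}
```

Because `&&` short-circuits, the probe is skipped entirely on the calls in between. The cost is that a dead parent may go unnoticed for up to N calls.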
|
|
|
|
|
|
|
|
|
2021-04-05 19:25:37 +02:00
|
|
|
/* --------------------------------
|
|
|
|
* process exit callback
|
|
|
|
* --------------------------------
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
StartupProcExit(int code, Datum arg)
|
|
|
|
{
|
|
|
|
/* Shut down the recovery environment */
|
|
|
|
if (standbyState != STANDBY_DISABLED)
|
|
|
|
ShutdownRecoveryTransactionEnvironment();
|
|
|
|
}
|
|
|
|
|
|
|
|
|
2011-11-02 15:25:01 +01:00
|
|
|
/* ----------------------------------
|
|
|
|
* Startup Process main entry point
|
|
|
|
* ----------------------------------
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
StartupProcessMain(void)
|
|
|
|
{
|
2021-04-05 19:25:37 +02:00
|
|
|
/* Arrange to clean up at startup process exit */
|
|
|
|
on_shmem_exit(StartupProcExit, 0);
|
|
|
|
|
2011-11-02 15:25:01 +01:00
|
|
|
/*
|
|
|
|
* Properly accept or ignore signals the postmaster might send us.
|
|
|
|
*/
|
2020-12-17 10:06:51 +01:00
|
|
|
pqsignal(SIGHUP, StartupProcSigHupHandler); /* reload config file */
|
2011-11-02 15:25:01 +01:00
|
|
|
pqsignal(SIGINT, SIG_IGN); /* ignore query cancel */
|
|
|
|
pqsignal(SIGTERM, StartupProcShutdownHandler); /* request shutdown */
|
Centralize setup of SIGQUIT handling for postmaster child processes.
We decided that the policy established in commit 7634bd4f6 for
the bgwriter, checkpointer, walwriter, and walreceiver processes,
namely that they should accept SIGQUIT at all times, really ought
to apply uniformly to all postmaster children. Therefore, get
rid of the duplicative and inconsistent per-process code for
establishing that signal handler and removing SIGQUIT from BlockSig.
Instead, make InitPostmasterChild do it.
The handler set up by InitPostmasterChild is SignalHandlerForCrashExit,
which just summarily does _exit(2). In interactive backends, we
almost immediately replace that with quickdie, since we would prefer
to try to tell the client that we're dying. However, this patch is
changing the behavior of autovacuum (both launcher and workers), as
well as walsenders. Those processes formerly also used quickdie,
but AFAICS that was just mindless copy-and-paste: they don't have
any interactive client that's likely to benefit from being told this.
The stats collector continues to be an outlier, in that it thinks
SIGQUIT means normal exit. That should probably be changed for
consistency, but there's another patch set where that's being
dealt with, so I didn't do so here.
Discussion: https://postgr.es/m/644875.1599933441@sss.pgh.pa.us
2020-09-16 22:04:36 +02:00
|
|
|
/* SIGQUIT handler was already set up by InitPostmasterChild */
|
2012-07-17 00:43:21 +02:00
|
|
|
InitializeTimeouts(); /* establishes SIGALRM handler */
|
2011-11-02 15:25:01 +01:00
|
|
|
pqsignal(SIGPIPE, SIG_IGN);
|
2019-11-25 22:08:53 +01:00
|
|
|
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
|
2011-11-02 15:25:01 +01:00
|
|
|
pqsignal(SIGUSR2, StartupProcTriggerHandler);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Reset some signals that are accepted by postmaster but not here
|
|
|
|
*/
|
|
|
|
pqsignal(SIGCHLD, SIG_DFL);
|
|
|
|
|
2012-07-17 00:43:21 +02:00
|
|
|
/*
|
|
|
|
* Register timeouts needed for standby mode
|
|
|
|
*/
|
|
|
|
RegisterTimeout(STANDBY_DEADLOCK_TIMEOUT, StandbyDeadLockHandler);
|
|
|
|
RegisterTimeout(STANDBY_TIMEOUT, StandbyTimeoutHandler);
|
2016-03-10 20:26:24 +01:00
|
|
|
RegisterTimeout(STANDBY_LOCK_TIMEOUT, StandbyLockTimeoutHandler);
|
2012-07-17 00:43:21 +02:00
|
|
|
|
2011-11-02 15:25:01 +01:00
|
|
|
/*
|
|
|
|
* Unblock signals (they were blocked when the postmaster forked us)
|
|
|
|
*/
|
2023-02-02 22:34:56 +01:00
|
|
|
sigprocmask(SIG_SETMASK, &UnBlockSig, NULL);
|
2011-11-02 15:25:01 +01:00
|
|
|
|
2012-07-17 00:43:21 +02:00
|
|
|
/*
|
|
|
|
* Do what we came for.
|
|
|
|
*/
|
2011-11-02 15:25:01 +01:00
|
|
|
StartupXLOG();
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Exit normally. Exit code 0 tells postmaster that we completed recovery
|
|
|
|
* successfully.
|
|
|
|
*/
|
|
|
|
proc_exit(0);
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
PreRestoreCommand(void)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Set in_restore_command to tell the signal handler that we should exit
|
|
|
|
* right away on SIGTERM. We know that we're at a safe point to do that.
|
|
|
|
* Check if we had already received the signal, so that we don't miss a
|
|
|
|
* shutdown request received just before this.
|
|
|
|
*/
|
|
|
|
in_restore_command = true;
|
|
|
|
if (shutdown_requested)
|
|
|
|
proc_exit(1);
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
PostRestoreCommand(void)
|
|
|
|
{
|
|
|
|
in_restore_command = false;
|
|
|
|
}
|
|
|
|
|
|
|
|
bool
|
2020-03-24 04:46:48 +01:00
|
|
|
IsPromoteSignaled(void)
|
2011-11-02 15:25:01 +01:00
|
|
|
{
|
2020-03-24 04:46:48 +01:00
|
|
|
return promote_signaled;
|
2011-11-02 15:25:01 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
2020-03-24 04:46:48 +01:00
|
|
|
ResetPromoteSignaled(void)
|
2011-11-02 15:25:01 +01:00
|
|
|
{
|
2020-03-24 04:46:48 +01:00
|
|
|
promote_signaled = false;
|
2011-11-02 15:25:01 +01:00
|
|
|
}
|
2021-10-25 17:51:57 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Set a flag indicating that it's time to log a progress report.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
startup_progress_timeout_handler(void)
|
|
|
|
{
|
|
|
|
startup_progress_timer_expired = true;
|
|
|
|
}
|
|
|
|
|
2023-02-06 16:51:08 +01:00
|
|
|
void
|
|
|
|
disable_startup_progress_timeout(void)
|
|
|
|
{
|
|
|
|
/* Feature is disabled. */
|
|
|
|
if (log_startup_progress_interval == 0)
|
|
|
|
return;
|
|
|
|
|
|
|
|
disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
|
|
|
|
startup_progress_timer_expired = false;
|
|
|
|
}
|
|
|
|
|
2021-10-25 17:51:57 +02:00
|
|
|
/*
|
|
|
|
* Set the start timestamp of the current operation and enable the timeout.
|
|
|
|
*/
|
|
|
|
void
|
2023-02-06 16:51:08 +01:00
|
|
|
enable_startup_progress_timeout(void)
|
2021-10-25 17:51:57 +02:00
|
|
|
{
|
|
|
|
TimestampTz fin_time;
|
|
|
|
|
|
|
|
/* Feature is disabled. */
|
|
|
|
if (log_startup_progress_interval == 0)
|
|
|
|
return;
|
|
|
|
|
|
|
|
startup_progress_phase_start_time = GetCurrentTimestamp();
|
|
|
|
fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time,
|
|
|
|
log_startup_progress_interval);
|
|
|
|
enable_timeout_every(STARTUP_PROGRESS_TIMEOUT, fin_time,
|
|
|
|
log_startup_progress_interval);
|
|
|
|
}
|
|
|
|
|
2023-02-06 16:51:08 +01:00
|
|
|
/*
|
|
|
|
* A thin wrapper to first disable and then enable the startup progress
|
|
|
|
* timeout.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
begin_startup_progress_phase(void)
|
|
|
|
{
|
|
|
|
/* Feature is disabled. */
|
|
|
|
if (log_startup_progress_interval == 0)
|
|
|
|
return;
|
|
|
|
|
|
|
|
disable_startup_progress_timeout();
|
|
|
|
enable_startup_progress_timeout();
|
|
|
|
}
|
|
|
|
|
2021-10-25 17:51:57 +02:00
|
|
|
/*
|
|
|
|
* Report whether startup progress timeout has occurred. Reset the timer flag
|
|
|
|
* if it did, set the elapsed time to the out parameters and return true,
|
|
|
|
* otherwise return false.
|
|
|
|
*/
|
|
|
|
bool
|
|
|
|
has_startup_progress_timeout_expired(long *secs, int *usecs)
|
|
|
|
{
|
|
|
|
long seconds;
|
|
|
|
int useconds;
|
|
|
|
TimestampTz now;
|
|
|
|
|
|
|
|
/* No timeout has occurred. */
|
|
|
|
if (!startup_progress_timer_expired)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
/* Calculate the elapsed time. */
|
|
|
|
now = GetCurrentTimestamp();
|
|
|
|
TimestampDifference(startup_progress_phase_start_time, now, &seconds, &useconds);
|
|
|
|
|
|
|
|
*secs = seconds;
|
|
|
|
*usecs = useconds;
|
|
|
|
startup_progress_timer_expired = false;
|
|
|
|
|
|
|
|
return true;
|
|
|
|
}
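How the phase start time, the expiry flag, and the elapsed-time calculation fit together can be shown with a self-contained sketch. It is only an approximation of the functions above: the `toy_` names are hypothetical, timestamps are plain microsecond counts supplied by the caller, and `toy_timer_fired()` stands in for the timeout handler.

```c
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>

/* Set when the progress interval has elapsed; cleared once reported. */
static volatile sig_atomic_t toy_timer_expired = 0;

/* Start of the current long-running phase, in microseconds. */
static int64_t toy_phase_start_us = 0;

/* Begin a new phase: remember when it started and clear any stale flag. */
void
toy_begin_phase(int64_t now_us)
{
	toy_phase_start_us = now_us;
	toy_timer_expired = 0;
}

/* Stand-in for the timeout handler: only raises the flag. */
void
toy_timer_fired(void)
{
	toy_timer_expired = 1;
}

/*
 * If the interval has elapsed, store the time spent in the current phase
 * in *secs and *usecs, clear the flag, and return true.  Otherwise leave
 * the outputs untouched and return false.
 */
bool
toy_progress_due(int64_t now_us, long *secs, int *usecs)
{
	int64_t		diff;

	if (!toy_timer_expired)
		return false;

	diff = now_us - toy_phase_start_us;
	if (diff < 0)
		diff = 0;				/* assumption: clamp a backwards clock to zero */

	*secs = (long) (diff / 1000000);
	*usecs = (int) (diff % 1000000);
	toy_timer_expired = 0;

	return true;
}
```

A caller in a long loop would check `toy_progress_due()` on each iteration and emit a log line only when it returns true, so most iterations cost no more than reading one flag.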
|