Introduce timeout handling framework

Management of timeouts was getting a little cumbersome; what we
originally had was more than enough back when we were only concerned
about deadlocks and query cancel; however, when we added timeouts for
standby processes, the code got considerably messier. Since there are
plans to add more complex timeouts, this seems a good time to introduce
a central timeout handling module.

External modules register their timeout handlers during process
initialization, and later enable and disable them as they see fit using
a simple API; timeout.c is in charge of keeping track of which timeouts
are in effect at any time, installing a common SIGALRM signal handler,
and calling setitimer() as appropriate to ensure timely firing of
external handlers.

timeout.c additionally supports pluggable modules to add their own
timeouts, though this capability isn't exercised anywhere yet.

Additionally, as of this commit, walsender processes are aware of
timeouts; we had a preexisting bug there that made those ignore SIGALRM,
thus being subject to unhandled deadlocks, particularly during the
authentication phase. This has already been fixed in back branches in
commit 0bf8eb2a, which see for more details.

Main author: Zoltán Böszörményi
Some review and cleanup by Álvaro Herrera
Extensive reworking by Tom Lane
2012-07-17 00:43:21 +02:00
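
The register/enable/disable flow described in the commit message can be pictured with a small stand-alone sketch. It is not the code from timeout.c: every `demo_` name is hypothetical, the clock is replaced by an explicit `now` argument, and there is no signal handling at all. Where a real module would install a SIGALRM handler and pass the earliest due time to setitimer(), the sketch only exposes `demo_next_due()` and `demo_fire_due()`.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_TIMEOUTS 4		/* hypothetical limit, for illustration only */

typedef void (*demo_handler) (void);

typedef struct
{
	demo_handler handler;		/* NULL until registered */
	int			active;			/* nonzero while the timeout is armed */
	long		due;			/* absolute due time, arbitrary units */
} demo_timeout;

static demo_timeout demo_timeouts[DEMO_MAX_TIMEOUTS];

/* Register a handler once, e.g. during process initialization. */
static void
demo_register(int id, demo_handler handler)
{
	assert(id >= 0 && id < DEMO_MAX_TIMEOUTS);
	demo_timeouts[id].handler = handler;
	demo_timeouts[id].active = 0;
}

/* Arm (or re-arm) a registered timeout to fire at now + delay. */
static void
demo_enable(int id, long now, long delay)
{
	assert(demo_timeouts[id].handler != NULL);
	demo_timeouts[id].due = now + delay;
	demo_timeouts[id].active = 1;
}

/* Disarm a timeout; its handler stays registered. */
static void
demo_disable(int id)
{
	demo_timeouts[id].active = 0;
}

/*
 * Earliest due time among armed timeouts, or -1 if none are armed.
 * A real implementation would turn this into a setitimer() call.
 */
static long
demo_next_due(void)
{
	long		best = -1;

	for (int i = 0; i < DEMO_MAX_TIMEOUTS; i++)
	{
		if (demo_timeouts[i].active &&
			(best < 0 || demo_timeouts[i].due < best))
			best = demo_timeouts[i].due;
	}
	return best;
}

/*
 * Stand-in for the shared alarm handler: disarm and call every handler
 * whose due time has passed.  Returns the number of handlers called.
 */
static int
demo_fire_due(long now)
{
	int			fired = 0;

	for (int i = 0; i < DEMO_MAX_TIMEOUTS; i++)
	{
		if (demo_timeouts[i].active && demo_timeouts[i].due <= now)
		{
			demo_timeouts[i].active = 0;
			demo_timeouts[i].handler();
			fired++;
		}
	}
	return fired;
}

/* Two example handlers that only count how often they ran. */
static int	demo_deadlock_hits = 0;
static int	demo_statement_hits = 0;

static void demo_on_deadlock(void) { demo_deadlock_hits++; }
static void demo_on_statement(void) { demo_statement_hits++; }
```

A caller would register both handlers once, arm them with `demo_enable()`, and rely on the shared dispatcher to run whichever is due first. The point of the design is that only one timer has to be programmed no matter how many timeout reasons exist.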
/*-------------------------------------------------------------------------
 *
 * timeout.c
 *	  Routines to multiplex SIGALRM interrupts for multiple timeout reasons.
 *
 * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *	  src/backend/utils/misc/timeout.c
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"

#include <sys/time.h>

Fix assorted race conditions in the new timeout infrastructure.

Prevent handle_sig_alarm from losing control partway through due to a query
cancel (either an asynchronous SIGINT, or a cancel triggered by one of the
timeout handler functions). That would at least result in failure to
schedule any required future interrupt, and might result in actual
corruption of timeout.c's data structures, if the interrupt happened while
we were updating those.

We could still lose control if an asynchronous SIGINT arrives just as the
function is entered. This wouldn't break any data structures, but it would
have the same effect as if the SIGALRM interrupt had been silently lost:
we'd not fire any currently-due handlers, nor schedule any new interrupt.
To forestall that scenario, forcibly reschedule any pending timer interrupt
during AbortTransaction and AbortSubTransaction. We can avoid any extra
kernel call in most cases by not doing that until we've allowed
LockErrorCleanup to kill the DEADLOCK_TIMEOUT and LOCK_TIMEOUT events.

Another hazard is that some platforms (at least Linux and *BSD) block a
signal before calling its handler and then unblock it on return. When we
longjmp out of the handler, the unblock doesn't happen, and the signal is
left blocked indefinitely. Again, we can fix that by forcibly unblocking
signals during AbortTransaction and AbortSubTransaction.

These latter two problems do not manifest when the longjmp reaches
postgres.c, because the error recovery code there kills all pending timeout
events anyway, and it uses sigsetjmp(..., 1) so that the appropriate signal
mask is restored. So errors thrown outside any transaction should be OK
already, and cleaning up in AbortTransaction and AbortSubTransaction should
be enough to fix these issues. (We're assuming that any code that catches
a query cancel error and doesn't re-throw it will do at least a
subtransaction abort to clean up; but that was pretty much required already
by other subsystems.)

Lastly, ProcSleep should not clear the LOCK_TIMEOUT indicator flag when
disabling that event: if a lock timeout interrupt happened after the lock
was granted, the ensuing query cancel is still going to happen at the next
CHECK_FOR_INTERRUPTS, and we want to report it as a lock timeout not a user
cancel.

Per reports from Dan Wood.

Back-patch to 9.3 where the new timeout handling infrastructure was
introduced. We may at some point decide to back-patch the signal
unblocking changes further, but I'll desist from that until we hear
actual field complaints about it.
2013-11-29 22:41:00 +01:00
#include "miscadmin.h"
#include "storage/proc.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"


/* Data about any one timeout reason */
typedef struct timeout_params
{
	TimeoutId	index;			/* identifier of timeout reason */

	/* volatile because it may be changed from the signal handler */
	volatile bool indicator;	/* true if timeout has occurred */

	/* callback function for timeout, or NULL if timeout not registered */
	timeout_handler_proc timeout_handler;

	TimestampTz start_time;		/* time that timeout was last activated */
	TimestampTz fin_time;		/* if active, time it is due to fire */
} timeout_params;

/*
 * List of possible timeout reasons in the order of enum TimeoutId.
 */
static timeout_params all_timeouts[MAX_TIMEOUTS];
static bool all_timeouts_initialized = false;

/*
 * List of active timeouts ordered by their fin_time and priority.
 * This list is subject to change by the interrupt handler, so it's volatile.
 */
static volatile int num_active_timeouts = 0;
static timeout_params *volatile active_timeouts[MAX_TIMEOUTS];

/*
 * Flag controlling whether the signal handler is allowed to do anything.
 * We leave this "false" when we're not expecting interrupts, just in case.
 *
 * Note that we don't bother to reset any pending timer interrupt when we
 * disable the signal handler; it's not really worth the cycles to do so,
 * since the probability of the interrupt actually occurring while we have
 * it disabled is low. See comments in schedule_alarm() about that.
 */
static volatile sig_atomic_t alarm_enabled = false;

#define disable_alarm() (alarm_enabled = false)
#define enable_alarm()	(alarm_enabled = true)


/*****************************************************************************
 * Internal helper functions
 *
 * For all of these, it is caller's responsibility to protect them from
 * interruption by the signal handler. Generally, call disable_alarm()
 * first to prevent interruption, then update state, and last call
 * schedule_alarm(), which will re-enable the signal handler if needed.
 *****************************************************************************/

/*
 * Find the index of a given timeout reason in the active array.
 * If it's not there, return -1.
 */
static int
find_active_timeout(TimeoutId id)
{
	int			i;

	for (i = 0; i < num_active_timeouts; i++)
	{
		if (active_timeouts[i]->index == id)
			return i;
	}

	return -1;
}

/*
 * Insert specified timeout reason into the list of active timeouts
 * at the given index.
 */
static void
insert_timeout(TimeoutId id, int index)
{
	int			i;

	if (index < 0 || index > num_active_timeouts)
		elog(FATAL, "timeout index %d out of range 0..%d", index,
			 num_active_timeouts);

	for (i = num_active_timeouts - 1; i >= index; i--)
		active_timeouts[i + 1] = active_timeouts[i];

	active_timeouts[index] = &all_timeouts[id];

	num_active_timeouts++;
}

/*
 * Remove the index'th element from the timeout list.
 */
static void
remove_timeout_index(int index)
{
	int			i;

	if (index < 0 || index >= num_active_timeouts)
		elog(FATAL, "timeout index %d out of range 0..%d", index,
			 num_active_timeouts - 1);

	for (i = index + 1; i < num_active_timeouts; i++)
		active_timeouts[i - 1] = active_timeouts[i];

	num_active_timeouts--;
}

/*
 * Enable the specified timeout reason
 */
static void
enable_timeout(TimeoutId id, TimestampTz now, TimestampTz fin_time)
{
	int			i;

	/* Assert request is sane */
	Assert(all_timeouts_initialized);
	Assert(all_timeouts[id].timeout_handler != NULL);

	/*
	 * If this timeout was already active, momentarily disable it. We
	 * interpret the call as a directive to reschedule the timeout.
	 */
	i = find_active_timeout(id);
	if (i >= 0)
		remove_timeout_index(i);

	/*
	 * Find out the index where to insert the new timeout. We sort by
	 * fin_time, and for equal fin_time by priority.
	 */
	for (i = 0; i < num_active_timeouts; i++)
	{
		timeout_params *old_timeout = active_timeouts[i];

		if (fin_time < old_timeout->fin_time)
			break;
		if (fin_time == old_timeout->fin_time && id < old_timeout->index)
			break;
	}

	/*
	 * Mark the timeout active, and insert it into the active list.
	 */
	all_timeouts[id].indicator = false;
	all_timeouts[id].start_time = now;
	all_timeouts[id].fin_time = fin_time;

	insert_timeout(id, i);
}

/*
 * Schedule alarm for the next active timeout, if any
 *
 * We assume the caller has obtained the current time, or a close-enough
 * approximation.
 */
static void
schedule_alarm(TimestampTz now)
{
	if (num_active_timeouts > 0)
	{
		struct itimerval timeval;
		long		secs;
		int			usecs;

		MemSet(&timeval, 0, sizeof(struct itimerval));

		/* Get the time remaining till the nearest pending timeout */
		TimestampDifference(now, active_timeouts[0]->fin_time,
							&secs, &usecs);

		/*
		 * It's possible that the difference is less than a microsecond;
		 * ensure we don't cancel, rather than set, the interrupt.
		 */
		if (secs == 0 && usecs == 0)
			usecs = 1;

		timeval.it_value.tv_sec = secs;
		timeval.it_value.tv_usec = usecs;

		/*
		 * We must enable the signal handler before calling setitimer(); if we
		 * did it in the other order, we'd have a race condition wherein the
		 * interrupt could occur before we can set alarm_enabled, so that the
		 * signal handler would fail to do anything.
		 *
		 * Because we didn't bother to reset the timer in disable_alarm(),
		 * it's possible that a previously-set interrupt will fire between
		 * enable_alarm() and setitimer(). This is safe, however. There are
		 * two possible outcomes:
		 *
		 * 1. The signal handler finds nothing to do (because the nearest
		 * timeout event is still in the future). It will re-set the timer
		 * and return. Then we'll overwrite the timer value with a new one.
		 * This will mean that the timer fires a little later than we
		 * intended, but only by the amount of time it takes for the signal
		 * handler to do nothing useful, which shouldn't be much.
		 *
		 * 2. The signal handler executes and removes one or more timeout
		 * events. When it returns, either the queue is now empty or the
		 * frontmost event is later than the one we looked at above. So we'll
		 * overwrite the timer value with one that is too soon (plus or minus
		 * the signal handler's execution time), causing a useless interrupt
		 * to occur. But the handler will then re-set the timer and
		 * everything will still work as expected.
		 *
		 * Since these cases are of very low probability (the window here
		 * being quite narrow), it's not worth adding cycles to the mainline
		 * code to prevent occasional wasted interrupts.
		 */
		enable_alarm();

		/* Set the alarm timer */
		if (setitimer(ITIMER_REAL, &timeval, NULL) != 0)
			elog(FATAL, "could not enable SIGALRM timer: %m");
	}
}


/*****************************************************************************
 * Signal handler
 *****************************************************************************/

/*
|
|
|
|
* Signal handler for SIGALRM
|
|
|
|
*
|
|
|
|
* Process any active timeout reasons and then reschedule the interrupt
|
|
|
|
* as needed.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
handle_sig_alarm(SIGNAL_ARGS)
|
|
|
|
{
|
|
|
|
int save_errno = errno;
|
2013-12-13 17:50:15 +01:00
|
|
|
bool save_ImmediateInterruptOK = ImmediateInterruptOK;
|
Introduce timeout handling framework
Management of timeouts was getting a little cumbersome; what we
originally had was more than enough back when we were only concerned
about deadlocks and query cancel; however, when we added timeouts for
standby processes, the code got considerably messier. Since there are
plans to add more complex timeouts, this seems a good time to introduce
a central timeout handling module.
External modules register their timeout handlers during process
initialization, and later enable and disable them as they see fit using
a simple API; timeout.c is in charge of keeping track of which timeouts
are in effect at any time, installing a common SIGALRM signal handler,
and calling setitimer() as appropriate to ensure timely firing of
external handlers.
timeout.c additionally supports pluggable modules to add their own
timeouts, though this capability isn't exercised anywhere yet.
Additionally, as of this commit, walsender processes are aware of
timeouts; we had a preexisting bug there that made those ignore SIGALRM,
thus being subject to unhandled deadlocks, particularly during the
authentication phase. This has already been fixed in back branches in
commit 0bf8eb2a, which see for more details.
Main author: Zoltán Böszörményi
Some review and cleanup by Álvaro Herrera
Extensive reworking by Tom Lane
2012-07-17 00:43:21 +02:00
|
|
|
|
Fix assorted race conditions in the new timeout infrastructure.
Prevent handle_sig_alarm from losing control partway through due to a query
cancel (either an asynchronous SIGINT, or a cancel triggered by one of the
timeout handler functions). That would at least result in failure to
schedule any required future interrupt, and might result in actual
corruption of timeout.c's data structures, if the interrupt happened while
we were updating those.
We could still lose control if an asynchronous SIGINT arrives just as the
function is entered. This wouldn't break any data structures, but it would
have the same effect as if the SIGALRM interrupt had been silently lost:
we'd not fire any currently-due handlers, nor schedule any new interrupt.
To forestall that scenario, forcibly reschedule any pending timer interrupt
during AbortTransaction and AbortSubTransaction. We can avoid any extra
kernel call in most cases by not doing that until we've allowed
LockErrorCleanup to kill the DEADLOCK_TIMEOUT and LOCK_TIMEOUT events.
Another hazard is that some platforms (at least Linux and *BSD) block a
signal before calling its handler and then unblock it on return. When we
longjmp out of the handler, the unblock doesn't happen, and the signal is
left blocked indefinitely. Again, we can fix that by forcibly unblocking
signals during AbortTransaction and AbortSubTransaction.
These latter two problems do not manifest when the longjmp reaches
postgres.c, because the error recovery code there kills all pending timeout
events anyway, and it uses sigsetjmp(..., 1) so that the appropriate signal
mask is restored. So errors thrown outside any transaction should be OK
already, and cleaning up in AbortTransaction and AbortSubTransaction should
be enough to fix these issues. (We're assuming that any code that catches
a query cancel error and doesn't re-throw it will do at least a
subtransaction abort to clean up; but that was pretty much required already
by other subsystems.)
Lastly, ProcSleep should not clear the LOCK_TIMEOUT indicator flag when
disabling that event: if a lock timeout interrupt happened after the lock
was granted, the ensuing query cancel is still going to happen at the next
CHECK_FOR_INTERRUPTS, and we want to report it as a lock timeout not a user
cancel.
Per reports from Dan Wood.
Back-patch to 9.3 where the new timeout handling infrastructure was
introduced. We may at some point decide to back-patch the signal
unblocking changes further, but I'll desist from that until we hear
actual field complaints about it.
2013-11-29 22:41:00 +01:00
|
|
|
/*
|
|
|
|
* We may be executing while ImmediateInterruptOK is true (e.g., when
|
|
|
|
* mainline is waiting for a lock). If SIGINT or similar arrives while
|
|
|
|
* this code is running, we'd lose control and perhaps leave our data
|
2013-12-13 17:50:15 +01:00
|
|
|
* structures in an inconsistent state. Disable immediate interrupts, and
|
2014-05-06 18:12:18 +02:00
|
|
|
* just to be real sure, bump the holdoff counter as well. (The reason
|
2013-12-13 17:50:15 +01:00
|
|
|
* for this belt-and-suspenders-too approach is to make sure that nothing
|
|
|
|
* bad happens if a timeout handler calls code that manipulates
|
|
|
|
* ImmediateInterruptOK.)
|
Fix assorted race conditions in the new timeout infrastructure.
Prevent handle_sig_alarm from losing control partway through due to a query
cancel (either an asynchronous SIGINT, or a cancel triggered by one of the
timeout handler functions). That would at least result in failure to
schedule any required future interrupt, and might result in actual
corruption of timeout.c's data structures, if the interrupt happened while
we were updating those.
We could still lose control if an asynchronous SIGINT arrives just as the
function is entered. This wouldn't break any data structures, but it would
have the same effect as if the SIGALRM interrupt had been silently lost:
we'd not fire any currently-due handlers, nor schedule any new interrupt.
To forestall that scenario, forcibly reschedule any pending timer interrupt
during AbortTransaction and AbortSubTransaction. We can avoid any extra
kernel call in most cases by not doing that until we've allowed
LockErrorCleanup to kill the DEADLOCK_TIMEOUT and LOCK_TIMEOUT events.
Another hazard is that some platforms (at least Linux and *BSD) block a
signal before calling its handler and then unblock it on return. When we
longjmp out of the handler, the unblock doesn't happen, and the signal is
left blocked indefinitely. Again, we can fix that by forcibly unblocking
signals during AbortTransaction and AbortSubTransaction.
These latter two problems do not manifest when the longjmp reaches
postgres.c, because the error recovery code there kills all pending timeout
events anyway, and it uses sigsetjmp(..., 1) so that the appropriate signal
mask is restored. So errors thrown outside any transaction should be OK
already, and cleaning up in AbortTransaction and AbortSubTransaction should
be enough to fix these issues. (We're assuming that any code that catches
a query cancel error and doesn't re-throw it will do at least a
subtransaction abort to clean up; but that was pretty much required already
by other subsystems.)
Lastly, ProcSleep should not clear the LOCK_TIMEOUT indicator flag when
disabling that event: if a lock timeout interrupt happened after the lock
was granted, the ensuing query cancel is still going to happen at the next
CHECK_FOR_INTERRUPTS, and we want to report it as a lock timeout not a user
cancel.
Per reports from Dan Wood.
Back-patch to 9.3 where the new timeout handling infrastructure was
introduced. We may at some point decide to back-patch the signal
unblocking changes further, but I'll desist from that until we hear
actual field complaints about it.
2013-11-29 22:41:00 +01:00
|
|
|
*
|
2013-12-13 17:50:15 +01:00
|
|
|
* Note: it's possible for a SIGINT to interrupt handle_sig_alarm before
|
|
|
|
* we manage to do this; the net effect would be as if the SIGALRM event
|
2014-05-06 18:12:18 +02:00
|
|
|
* had been silently lost. Therefore error recovery must include some
|
2013-12-13 17:50:15 +01:00
|
|
|
* action that will allow any lost interrupt to be rescheduled. Disabling
|
|
|
|
* some or all timeouts is sufficient, or if that's not appropriate,
|
|
|
|
* reschedule_timeouts() can be called. Also, the signal blocking hazard
|
|
|
|
* described below applies here too.
|
	 */
	ImmediateInterruptOK = false;
	HOLD_INTERRUPTS();

	/*
	 * SIGALRM is always cause for waking anything waiting on the process
	 * latch. Cope with MyProc not being there, as the startup process also
	 * uses this signal handler.
	 */
	if (MyProc)
		SetLatch(&MyProc->procLatch);

	/*
	 * Fire any pending timeouts, but only if we're enabled to do so.
	 */
	if (alarm_enabled)
	{
		/*
		 * Disable alarms, just in case this platform allows signal handlers
		 * to interrupt themselves. schedule_alarm() will re-enable if
		 * appropriate.
		 */
		disable_alarm();

		if (num_active_timeouts > 0)
		{
			TimestampTz now = GetCurrentTimestamp();

			/* While the first pending timeout has been reached ... */
			while (num_active_timeouts > 0 &&
				   now >= active_timeouts[0]->fin_time)
			{
				timeout_params *this_timeout = active_timeouts[0];

				/* Remove it from the active list */
				remove_timeout_index(0);

				/* Mark it as fired */
				this_timeout->indicator = true;

				/* And call its handler function */
				(*this_timeout->timeout_handler) ();

				/*
				 * The handler might not take negligible time (CheckDeadLock
				 * for instance isn't too cheap), so let's update our idea of
				 * "now" after each one.
				 */
				now = GetCurrentTimestamp();
			}

			/* Done firing timeouts, so reschedule next interrupt if any */
			schedule_alarm(now);
		}
	}

	/*
	 * Re-allow query cancel, and then try to service any cancel request that
	 * arrived meanwhile (this might in particular include a cancel request
	 * fired by one of the timeout handlers). Since we are in a signal
	 * handler, we mustn't call ProcessInterrupts unless ImmediateInterruptOK
	 * is set; if it isn't, the cancel will happen at the next mainline
	 * CHECK_FOR_INTERRUPTS.
	 *
	 * Note: a longjmp from here is safe so far as our own data structures are
	 * concerned; but on platforms that block a signal before calling the
	 * handler and then un-block it on return, longjmping out of the signal
	 * handler leaves SIGALRM still blocked. Error cleanup is responsible for
	 * unblocking any blocked signals.
	 */
	RESUME_INTERRUPTS();
	ImmediateInterruptOK = save_ImmediateInterruptOK;
	if (save_ImmediateInterruptOK)
		CHECK_FOR_INTERRUPTS();

	errno = save_errno;
}
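
The signal blocking hazard mentioned in the comments of handle_sig_alarm can be reproduced outside PostgreSQL. What follows is a minimal standalone sketch, not code from timeout.c: the names escape_point, jump_out_of_handler and sigalrm_blocked_after_escape are made up for illustration, and it assumes a POSIX platform that blocks a signal while its handler runs (the default for handlers installed with sigaction() without SA_NODEFER). It raises SIGALRM, leaves the handler via siglongjmp(), and then reports whether SIGALRM is still blocked.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical jump target, standing in for an error recovery point. */
static sigjmp_buf escape_point;

/* Illustrative handler: never returns normally, it jumps out instead. */
static void
jump_out_of_handler(int signo)
{
	(void) signo;
	siglongjmp(escape_point, 1);
}

/*
 * Deliver SIGALRM, escape from its handler with siglongjmp(), and report
 * the resulting state of the signal mask.
 *
 * savemask is passed straight to sigsetjmp(): nonzero asks it to save the
 * signal mask so that siglongjmp() restores it, zero asks it not to.
 *
 * Returns 1 if SIGALRM is blocked afterwards, 0 if it is not, and -1 if
 * any of the setup calls failed.
 */
int
sigalrm_blocked_after_escape(int savemask)
{
	struct sigaction sa;
	sigset_t	alarm_only;
	sigset_t	current;

	/* Start from a known state: SIGALRM unblocked. */
	sigemptyset(&alarm_only);
	sigaddset(&alarm_only, SIGALRM);
	if (sigprocmask(SIG_UNBLOCK, &alarm_only, NULL) != 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = jump_out_of_handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;			/* no SA_NODEFER: signal is blocked in handler */
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return -1;

	/* First pass raises the signal; the handler jumps back here. */
	if (sigsetjmp(escape_point, savemask) == 0)
		raise(SIGALRM);

	/* A NULL new-mask argument just fetches the current mask. */
	if (sigprocmask(SIG_BLOCK, NULL, &current) != 0)
		return -1;

	return sigismember(&current, SIGALRM);
}
```

On such a platform, calling sigalrm_blocked_after_escape(0) should report the signal as still blocked, because nothing undoes the block that was applied when the handler was entered. Passing 1 instead, the sigsetjmp(..., 1) form, should report it as unblocked, since the saved mask is restored by the jump. Code that jumps to a target set up without a saved mask has to unblock the signal explicitly, which is what the comment means by making error cleanup responsible for it.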

/*****************************************************************************
 * Public API
 *****************************************************************************/

/*
 * Initialize timeout module.
 *
 * This must be called in every process that wants to use timeouts.
 *
 * If the process was forked from another one that was also using this
 * module, be sure to call this before re-enabling signals; else handlers
 * meant to run in the parent process might get invoked in this one.
 */
void
InitializeTimeouts(void)
{
	int			i;

	/* Initialize, or re-initialize, all local state */
	disable_alarm();

Introduce timeout handling framework
Management of timeouts was getting a little cumbersome; what we
originally had was more than enough back when we were only concerned
about deadlocks and query cancel; however, when we added timeouts for
standby processes, the code got considerably messier. Since there are
plans to add more complex timeouts, this seems a good time to introduce
a central timeout handling module.
External modules register their timeout handlers during process
initialization, and later enable and disable them as they see fit using
a simple API; timeout.c is in charge of keeping track of which timeouts
are in effect at any time, installing a common SIGALRM signal handler,
and calling setitimer() as appropriate to ensure timely firing of
external handlers.
timeout.c additionally supports pluggable modules to add their own
timeouts, though this capability isn't exercised anywhere yet.
Additionally, as of this commit, walsender processes are aware of
timeouts; we had a preexisting bug there that made those ignore SIGALRM,
thus being subject to unhandled deadlocks, particularly during the
authentication phase. This has already been fixed in back branches in
commit 0bf8eb2a, which see for more details.
Main author: Zoltán Böszörményi
Some review and cleanup by Álvaro Herrera
Extensive reworking by Tom Lane
2012-07-17 00:43:21 +02:00
|
|
|
num_active_timeouts = 0;
|
|
|
|
|
|
|
|
for (i = 0; i < MAX_TIMEOUTS; i++)
|
|
|
|
{
|
|
|
|
all_timeouts[i].index = i;
|
|
|
|
all_timeouts[i].indicator = false;
|
|
|
|
all_timeouts[i].timeout_handler = NULL;
|
|
|
|
all_timeouts[i].start_time = 0;
|
|
|
|
all_timeouts[i].fin_time = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
all_timeouts_initialized = true;
|
|
|
|
|
|
|
|
/* Now establish the signal handler */
|
|
|
|
pqsignal(SIGALRM, handle_sig_alarm);
|
|
|
|
}

/*
 * Register a timeout reason
 *
 * For predefined timeouts, this just registers the callback function.
 *
 * For user-defined timeouts, pass id == USER_TIMEOUT; we then allocate and
 * return a timeout ID.
 */
TimeoutId
RegisterTimeout(TimeoutId id, timeout_handler_proc handler)
{
	Assert(all_timeouts_initialized);

	/* There's no need to disable the signal handler here. */

	if (id >= USER_TIMEOUT)
	{
		/* Allocate a user-defined timeout reason */
		for (id = USER_TIMEOUT; id < MAX_TIMEOUTS; id++)
			if (all_timeouts[id].timeout_handler == NULL)
				break;
		if (id >= MAX_TIMEOUTS)
			ereport(FATAL,
					(errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
					 errmsg("cannot add more timeout reasons")));
	}

	Assert(all_timeouts[id].timeout_handler == NULL);

	all_timeouts[id].timeout_handler = handler;

	return id;
}
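
The slot allocation done by RegisterTimeout() can be tried outside the server with a small stand-alone sketch. Everything in it is hypothetical scaffolding rather than PostgreSQL code: `demo_register`, `demo_noop_handler`, the `DEMO_*` constants and the handler table are invented for illustration, and the sketch returns -1 at the point where the real function reports FATAL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for TimeoutId values; not the real enum. */
enum
{
	DEMO_DEADLOCK_TIMEOUT = 0,	/* a "predefined" reason */
	DEMO_USER_TIMEOUT = 2,		/* first slot open to user-defined reasons */
	DEMO_MAX_TIMEOUTS = 4
};

typedef void (*demo_handler_proc) (void);

static demo_handler_proc demo_handlers[DEMO_MAX_TIMEOUTS];

/*
 * Register a handler.  Ids below DEMO_USER_TIMEOUT are used as given;
 * DEMO_USER_TIMEOUT (or anything above it) asks for the first free user
 * slot.  Returns the id that was assigned, or -1 when every user slot is
 * taken (the real code raises FATAL at that point instead).
 */
static int
demo_register(int id, demo_handler_proc handler)
{
	if (id >= DEMO_USER_TIMEOUT)
	{
		for (id = DEMO_USER_TIMEOUT; id < DEMO_MAX_TIMEOUTS; id++)
			if (demo_handlers[id] == NULL)
				break;
		if (id >= DEMO_MAX_TIMEOUTS)
			return -1;
	}

	assert(demo_handlers[id] == NULL);	/* each slot is registered once */
	demo_handlers[id] = handler;
	return id;
}

/* A do-nothing handler, only there so the table has something to store. */
static void
demo_noop_handler(void)
{
}
```

With two user slots available, asking for `DEMO_USER_TIMEOUT` three times hands out ids 2 and 3 and then fails, which mirrors the "cannot add more timeout reasons" path.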

/*
 * Reschedule any pending SIGALRM interrupt.
 *
 * This can be used during error recovery in case query cancel resulted in loss
 * of a SIGALRM event (due to longjmp'ing out of handle_sig_alarm before it
 * could do anything). But note it's not necessary if any of the public
 * enable_ or disable_timeout functions are called in the same area, since
 * those all do schedule_alarm() internally if needed.
 */
void
reschedule_timeouts(void)
{
	/* For flexibility, allow this to be called before we're initialized. */
	if (!all_timeouts_initialized)
		return;

	/* Disable timeout interrupts for safety. */
	disable_alarm();

	/* Reschedule the interrupt, if any timeouts remain active. */
	if (num_active_timeouts > 0)
		schedule_alarm(GetCurrentTimestamp());
}

/*
 * Enable the specified timeout to fire after the specified delay.
 *
 * Delay is given in milliseconds.
 */
void
enable_timeout_after(TimeoutId id, int delay_ms)
{
	TimestampTz now;
	TimestampTz fin_time;

	/* Disable timeout interrupts for safety. */
	disable_alarm();

	/* Queue the timeout at the appropriate time. */
	now = GetCurrentTimestamp();
	fin_time = TimestampTzPlusMilliseconds(now, delay_ms);
	enable_timeout(id, now, fin_time);

	/* Set the timer interrupt. */
	schedule_alarm(now);
}
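
What "fire after a delay" comes down to at the operating-system level is a one-shot setitimer(ITIMER_REAL) request answered by SIGALRM. The following is a minimal POSIX sketch of that mechanism, not the body of schedule_alarm() (which is not shown in this part of the file); `demo_wait_for_alarm`, `demo_on_alarm` and `demo_alarm_fired` are hypothetical names.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700		/* for sigaction(), sigsuspend(), setitimer() */
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t demo_alarm_fired = 0;

static void
demo_on_alarm(int signo)
{
	(void) signo;
	demo_alarm_fired = 1;		/* only set a flag inside the handler */
}

/*
 * Arm a one-shot ITIMER_REAL timer for delay_ms milliseconds and block
 * until SIGALRM has been delivered.  Returns 1 on success, 0 on failure.
 */
static int
demo_wait_for_alarm(int delay_ms)
{
	struct sigaction sa;
	struct itimerval tv;
	sigset_t	block_alarm;
	sigset_t	old_mask;
	sigset_t	wait_mask;

	assert(delay_ms > 0);		/* a zero it_value would disarm the timer */

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = demo_on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return 0;

	/* Keep SIGALRM blocked except while inside sigsuspend(). */
	sigemptyset(&block_alarm);
	sigaddset(&block_alarm, SIGALRM);
	if (sigprocmask(SIG_BLOCK, &block_alarm, &old_mask) != 0)
		return 0;
	wait_mask = old_mask;
	sigdelset(&wait_mask, SIGALRM);

	demo_alarm_fired = 0;
	memset(&tv, 0, sizeof(tv));	/* it_interval stays zero: fire once */
	tv.it_value.tv_sec = delay_ms / 1000;
	tv.it_value.tv_usec = (delay_ms % 1000) * 1000;
	if (setitimer(ITIMER_REAL, &tv, NULL) != 0)
	{
		sigprocmask(SIG_SETMASK, &old_mask, NULL);
		return 0;
	}

	while (!demo_alarm_fired)
		sigsuspend(&wait_mask);	/* atomically unblock SIGALRM and wait */

	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	return 1;
}
```

Blocking SIGALRM around the flag test and waiting in sigsuspend() avoids the window in which the signal could arrive between the test and the wait; it plays a role loosely comparable to the "disable timeout interrupts for safety" step above.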

/*
 * Enable the specified timeout to fire at the specified time.
 *
 * This is provided to support cases where there's a reason to calculate
 * the timeout by reference to some point other than "now". If there isn't,
 * use enable_timeout_after(), to avoid calling GetCurrentTimestamp() twice.
 */
void
enable_timeout_at(TimeoutId id, TimestampTz fin_time)
{
	TimestampTz now;

	/* Disable timeout interrupts for safety. */
	disable_alarm();

	/* Queue the timeout at the appropriate time. */
	now = GetCurrentTimestamp();
	enable_timeout(id, now, fin_time);

	/* Set the timer interrupt. */
	schedule_alarm(now);
}

/*
 * Enable multiple timeouts at once.
 *
 * This works like calling enable_timeout_after() and/or enable_timeout_at()
 * multiple times. Use this to reduce the number of GetCurrentTimestamp()
 * and setitimer() calls needed to establish multiple timeouts.
 */
void
enable_timeouts(const EnableTimeoutParams *timeouts, int count)
{
	TimestampTz now;
	int			i;

	/* Disable timeout interrupts for safety. */
	disable_alarm();

	/* Queue the timeout(s) at the appropriate times. */
	now = GetCurrentTimestamp();

	for (i = 0; i < count; i++)
	{
		TimeoutId	id = timeouts[i].id;
		TimestampTz fin_time;

		switch (timeouts[i].type)
		{
			case TMPARAM_AFTER:
				fin_time = TimestampTzPlusMilliseconds(now,
													   timeouts[i].delay_ms);
				enable_timeout(id, now, fin_time);
				break;

			case TMPARAM_AT:
				enable_timeout(id, now, timeouts[i].fin_time);
				break;

			default:
				elog(ERROR, "unrecognized timeout type %d",
					 (int) timeouts[i].type);
				break;
		}
	}

	/* Set the timer interrupt. */
	schedule_alarm(now);
}
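
The saving in enable_timeouts() comes from resolving every request against one clock reading and then arming the timer once. A stand-alone sketch of that arithmetic follows. It assumes that only the nearest deadline has to be programmed into the interval timer, which is what multiplexing a single SIGALRM suggests but is not shown in this part of the file. All `demo_*` names are hypothetical, and plain microsecond counters stand in for TimestampTz.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t demo_time_us;	/* microseconds; stand-in for TimestampTz */

typedef enum
{
	DEMO_AFTER,					/* deadline is "now" plus delay_ms */
	DEMO_AT						/* deadline is the absolute fin_time */
} demo_param_type;

typedef struct
{
	demo_param_type type;
	int			delay_ms;		/* used when type == DEMO_AFTER */
	demo_time_us fin_time;		/* used when type == DEMO_AT */
} demo_enable_param;

/*
 * Resolve every request against the same "now" and return the earliest
 * deadline.  count must be at least 1.
 */
static demo_time_us
demo_earliest_deadline(const demo_enable_param *params, int count,
					   demo_time_us now)
{
	demo_time_us earliest = 0;
	int			i;

	assert(count > 0);
	for (i = 0; i < count; i++)
	{
		demo_time_us fin;

		if (params[i].type == DEMO_AFTER)
			fin = now + (demo_time_us) params[i].delay_ms * 1000;
		else
			fin = params[i].fin_time;

		if (i == 0 || fin < earliest)
			earliest = fin;
	}
	return earliest;
}
```

For example, with `now` at 5000000 a DEMO_AFTER request of 1000 ms resolves to 6000000, so a DEMO_AT request for 5000500 in the same batch is the one that determines when the timer must fire.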

/*
 * Cancel the specified timeout.
 *
 * The timeout's I've-been-fired indicator is reset,
 * unless keep_indicator is true.
 *
 * When a timeout is canceled, any other active timeout remains in force.
 * It's not an error to disable a timeout that is not enabled.
 */
void
disable_timeout(TimeoutId id, bool keep_indicator)
{
	int			i;

	/* Assert request is sane */
	Assert(all_timeouts_initialized);
	Assert(all_timeouts[id].timeout_handler != NULL);

	/* Disable timeout interrupts for safety. */
	disable_alarm();

	/* Find the timeout and remove it from the active list. */
	i = find_active_timeout(id);
	if (i >= 0)
		remove_timeout_index(i);

	/* Mark it inactive, whether it was active or not. */
	if (!keep_indicator)
		all_timeouts[id].indicator = false;

	/* Reschedule the interrupt, if any timeouts remain active. */
	if (num_active_timeouts > 0)
		schedule_alarm(GetCurrentTimestamp());
}
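
The two independent effects of disable_timeout(), removal from the active list and the optional reset of the fired indicator, can be seen in isolation in the sketch below. It is a simplified model, not the real data structure: `demo_active`, `demo_indicator`, `demo_enable` and `demo_disable` are hypothetical, and the active list is a plain unsorted array of ids.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX 4

static bool demo_indicator[DEMO_MAX];	/* "I've-been-fired" flags */
static int	demo_active[DEMO_MAX];		/* ids of currently enabled timeouts */
static int	demo_num_active = 0;

/* Append id to the active list (demo only: no duplicate check). */
static void
demo_enable(int id)
{
	assert(demo_num_active < DEMO_MAX);
	demo_active[demo_num_active++] = id;
}

/* Cancel id if it is active; clear its flag unless keep_indicator is set. */
static void
demo_disable(int id, bool keep_indicator)
{
	int			i;

	for (i = 0; i < demo_num_active; i++)
	{
		if (demo_active[i] == id)
		{
			/* Close the gap left by the removed entry. */
			for (; i < demo_num_active - 1; i++)
				demo_active[i] = demo_active[i + 1];
			demo_num_active--;
			break;
		}
	}

	/* Runs whether or not the id was found, as in the function above. */
	if (!keep_indicator)
		demo_indicator[id] = false;
}
```

Passing `keep_indicator = true` is what lets a caller cancel a timeout and still find out afterwards that it had already fired; disabling an id that is not active is a harmless no-op apart from the flag handling.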

/*
 * Cancel multiple timeouts at once.
 *
 * The timeouts' I've-been-fired indicators are reset,
 * unless timeouts[i].keep_indicator is true.
 *
 * This works like calling disable_timeout() multiple times.
 * Use this to reduce the number of GetCurrentTimestamp()
 * and setitimer() calls needed to cancel multiple timeouts.
 */
void
disable_timeouts(const DisableTimeoutParams *timeouts, int count)
{
	int			i;

	Assert(all_timeouts_initialized);

	/* Disable timeout interrupts for safety. */
	disable_alarm();

	/* Cancel the timeout(s). */
	for (i = 0; i < count; i++)
	{
		TimeoutId	id = timeouts[i].id;
		int			idx;

		Assert(all_timeouts[id].timeout_handler != NULL);

		idx = find_active_timeout(id);
		if (idx >= 0)
			remove_timeout_index(idx);

		if (!timeouts[i].keep_indicator)
			all_timeouts[id].indicator = false;
	}

	/* Reschedule the interrupt, if any timeouts remain active. */
	if (num_active_timeouts > 0)
		schedule_alarm(GetCurrentTimestamp());
}

/*
 * Disable SIGALRM and remove all timeouts from the active list,
 * and optionally reset their timeout indicators.
 */
void
disable_all_timeouts(bool keep_indicators)
{
	disable_alarm();

	/*
	 * Only bother to reset the timer if we think it's active. We could just
	 * let the interrupt happen anyway, but it's probably a bit cheaper to do
	 * setitimer() than to let the useless interrupt happen.
	 */
	if (num_active_timeouts > 0)
	{
		struct itimerval timeval;

		MemSet(&timeval, 0, sizeof(struct itimerval));
		if (setitimer(ITIMER_REAL, &timeval, NULL) != 0)
			elog(FATAL, "could not disable SIGALRM timer: %m");
	}

	num_active_timeouts = 0;

	if (!keep_indicators)
	{
		int			i;

		for (i = 0; i < MAX_TIMEOUTS; i++)
			all_timeouts[i].indicator = false;
	}
}
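
The zero-filled itimerval passed to setitimer() above is the standard POSIX way to disarm a timer. A small sketch, using the hypothetical helper name `demo_arm_then_cancel`, arms ITIMER_REAL, cancels it the same way, and reads the timer back with getitimer() to confirm that nothing is pending.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700		/* for setitimer() and getitimer() */
#endif
#include <assert.h>
#include <string.h>
#include <sys/time.h>

/*
 * Arm ITIMER_REAL far enough in the future that it cannot fire during the
 * demo, cancel it with an all-zero itimerval, and report whether the timer
 * reads back as disarmed.  Returns 1 if disarmed, 0 otherwise.
 */
static int
demo_arm_then_cancel(void)
{
	struct itimerval tv;
	struct itimerval remaining;

	memset(&tv, 0, sizeof(tv));
	tv.it_value.tv_sec = 60;
	if (setitimer(ITIMER_REAL, &tv, NULL) != 0)
		return 0;

	/* Sanity check: the timer really is armed at this point. */
	if (getitimer(ITIMER_REAL, &remaining) != 0)
		return 0;
	assert(remaining.it_value.tv_sec > 0 || remaining.it_value.tv_usec > 0);

	memset(&tv, 0, sizeof(tv));	/* zero it_value means "disarm" */
	if (setitimer(ITIMER_REAL, &tv, NULL) != 0)
		return 0;

	if (getitimer(ITIMER_REAL, &remaining) != 0)
		return 0;
	return remaining.it_value.tv_sec == 0 && remaining.it_value.tv_usec == 0;
}
```

Because the timer is cancelled long before its 60-second deadline, no SIGALRM is delivered and no handler needs to be installed for the demo.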

/*
 * Return the timeout's I've-been-fired indicator
 *
 * If reset_indicator is true, reset the indicator when returning true.
 * To avoid missing timeouts due to race conditions, we are careful not to
 * reset the indicator when returning false.
 */
bool
get_timeout_indicator(TimeoutId id, bool reset_indicator)
|
{
	if (all_timeouts[id].indicator)
	{
		if (reset_indicator)
			all_timeouts[id].indicator = false;
		return true;
	}
	return false;
}

/*
 * Return the time when the timeout was most recently activated
 *
 * Note: will return 0 if timeout has never been activated in this process.
 * However, we do *not* reset the start_time when a timeout occurs, so as
 * not to create a race condition if SIGALRM fires just as some code is
 * about to fetch the value.
 */
TimestampTz
get_timeout_start_time(TimeoutId id)
{
	return all_timeouts[id].start_time;
}
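
The behavior the two accessor comments describe can be illustrated with a small stand-alone sketch. It is not PostgreSQL code: every `demo_`/`Demo` name is hypothetical, the timestamp is a plain integer, and `demo_fire()` only stands in for what a signal handler might do when a timeout expires. Having `demo_enable()` clear the indicator is likewise an assumption of the sketch. It shows that the indicator is cleared only on a call that returns true, and that firing leaves the start time alone.

```c
/* Simplified stand-in for the indicator logic; NOT the PostgreSQL source. */
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical miniature versions of TimeoutId and TimestampTz. */
typedef enum
{
	DEMO_TIMEOUT_A,
	DEMO_TIMEOUT_B,
	DEMO_MAX_TIMEOUTS
} DemoTimeoutId;

typedef int64_t DemoTimestamp;

typedef struct
{
	volatile sig_atomic_t indicator;	/* nonzero once the timeout has fired */
	DemoTimestamp start_time;			/* when it was last enabled; 0 if never */
} DemoTimeout;

/* Static storage is zero-initialized, so a never-enabled slot reports 0. */
static DemoTimeout demo_timeouts[DEMO_MAX_TIMEOUTS];

/* Assumed enable step: record the start time and clear any stale indicator. */
void
demo_enable(DemoTimeoutId id, DemoTimestamp now)
{
	demo_timeouts[id].indicator = 0;
	demo_timeouts[id].start_time = now;
}

/* Assumed expiry step: set the indicator, deliberately leave start_time alone. */
void
demo_fire(DemoTimeoutId id)
{
	demo_timeouts[id].indicator = 1;
}

/* Test the indicator; clear it only on the path that returns true. */
bool
demo_get_indicator(DemoTimeoutId id, bool reset_indicator)
{
	if (demo_timeouts[id].indicator)
	{
		if (reset_indicator)
			demo_timeouts[id].indicator = 0;
		return true;
	}
	return false;
}

/* Report when the timeout was most recently enabled. */
DemoTimestamp
demo_get_start_time(DemoTimeoutId id)
{
	return demo_timeouts[id].start_time;
}
```

In this sketch a caller that polls with `reset_indicator` set to true sees each firing exactly once, because the false path never writes to the indicator. `demo_get_start_time()` keeps returning the last enable time even after the timeout has fired.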