Introduce latches. A latch is a boolean variable, with the capability to
wait until it is set. Latches can be used to reliably wait until a signal
arrives, which is hard otherwise because signals don't interrupt select()
on some platforms, and even when they do, there are race conditions.
On Unix, latches use the so-called self-pipe trick under the covers to
implement the sleep until the latch is set, without race conditions. On
Windows, Windows events are used.
Use the new latch abstraction to sleep in walsender, so that as soon as
a transaction finishes, walsender is woken up to immediately send the WAL
to the standby. This reduces the latency between master and standby.
Preliminary work by Fujii Masao. The latch implementation is by me, with
helpful comments from many people.
2010-09-11 17:48:04 +02:00
/*-------------------------------------------------------------------------
 *
 * latch.h
 *	  Routines for interprocess latches
 *
 * A latch is a boolean variable, with operations that let processes sleep
 * until it is set.  A latch can be set from another process, or a signal
 * handler within the same process.
 *
 * The latch interface is a reliable replacement for the common pattern of
 * using pg_usleep() or select() to wait until a signal arrives, where the
 * signal handler sets a flag variable.  Because on some platforms an
 * incoming signal doesn't interrupt sleep, and even on platforms where it
 * does there is a race condition if the signal arrives just before
 * entering the sleep, the common pattern must periodically wake up and
 * poll the flag variable.  The pselect() system call was invented to solve
 * this problem, but it is not portable enough.  Latches are designed to
 * overcome these limitations, allowing you to sleep without polling and
 * ensuring quick response to signals from other processes.
 *
 * There are two kinds of latches: local and shared.  A local latch is
 * initialized by InitLatch, and can only be set from the same process.
 * A local latch can be used to wait for a signal to arrive, by calling
 * SetLatch in the signal handler.  A shared latch resides in shared memory,
 * and must be initialized at postmaster startup by InitSharedLatch.  Before
 * a shared latch can be waited on, it must be associated with a process
 * with OwnLatch.  Only the process owning the latch can wait on it, but any
 * process can set it.
 *
 * There are three basic operations on a latch:
 *
 * SetLatch		- Sets the latch
 * ResetLatch	- Clears the latch, allowing it to be set again
 * WaitLatch	- Waits for the latch to become set
 *
 * WaitLatch includes a provision for timeouts (which should be avoided
 * when possible, as they incur extra overhead) and a provision for
 * postmaster child processes to wake up immediately on postmaster death.
 * See latch.c for detailed specifications for the exported functions.
 *
 * The correct pattern to wait for event(s) is:
 *
 * for (;;)
 * {
 *	   ResetLatch();
 *	   if (work to do)
 *		   Do Stuff();
 *	   WaitLatch();
 * }
 *
 * It's important to reset the latch *before* checking if there's work to
 * do.  Otherwise, if someone sets the latch between the check and the
 * ResetLatch call, you will miss it and Wait will incorrectly block.
 *
 * Another valid coding pattern looks like:
 *
 * for (;;)
 * {
 *	   if (work to do)
 *		   Do Stuff();		// in particular, exit loop if some condition satisfied
 *	   WaitLatch();
 *	   ResetLatch();
 * }
 *
 * This is useful to reduce latch traffic if it's expected that the loop's
 * termination condition will often be satisfied in the first iteration;
 * the cost is an extra loop iteration before blocking when it is not.
 * What must be avoided is placing any checks for asynchronous events after
 * WaitLatch and before ResetLatch, as that creates a race condition.
 *
 * To wake up the waiter, you must first set a global flag or something
 * else that the wait loop tests in the "if (work to do)" part, and call
 * SetLatch *after* that.  SetLatch is designed to return quickly if the
 * latch is already set.
 *
 * On some platforms, signals will not interrupt the latch wait primitive
 * by themselves.  Therefore, it is critical that any signal handler that
 * is meant to terminate a WaitLatch wait calls SetLatch.
 *
Reduce idle power consumption of walwriter and checkpointer processes.
This patch modifies the walwriter process so that, when it has not found
anything useful to do for many consecutive wakeup cycles, it extends its
sleep time to reduce the server's idle power consumption. It reverts to
normal as soon as it's done any successful flushes. It's still true that
during any async commit, backends check for completed, unflushed pages of
WAL and signal the walwriter if there are any; so that in practice the
walwriter can get awakened and returned to normal operation sooner than the
sleep time might suggest.
Also, improve the checkpointer so that it uses a latch and a computed delay
time to not wake up at all except when it has something to do, replacing a
previous hardcoded 0.5 sec wakeup cycle. This also is primarily useful for
reducing the server's power consumption when idle.
In passing, get rid of the dedicated latch for signaling the walwriter in
favor of using its procLatch, since that comports better with possible
generic signal handlers using that latch. Also, fix a pre-existing bug
with failure to save/restore errno in walwriter's signal handlers.
Peter Geoghegan, somewhat simplified by Tom
2012-05-09 02:03:26 +02:00
 * Note that use of the process latch (PGPROC.procLatch) is generally better
 * than an ad-hoc shared latch for signaling auxiliary processes.  This is
 * because generic signal handlers will call SetLatch on the process latch
 * only, so using any latch other than the process latch effectively precludes
 * use of any generic handler.
*
*
Introduce WaitEventSet API.
Commit ac1d794 ("Make idle backends exit if the postmaster dies.")
introduced a regression on, at least, large Linux systems. Constantly
adding the same postmaster_alive_fds to the OS's internal data structures
for implementing poll/select can cause significant contention, leading
to a performance regression of nearly 3x in one example.
This can be avoided by using e.g. Linux's epoll, which avoids having to
add/remove file descriptors to the wait data structures at a high rate.
Unfortunately the current latch interface makes it hard to allocate any
persistent per-backend resources.
Replace, with a backward-compatibility layer, WaitLatchOrSocket with a
new WaitEventSet API. Users can allocate such a set across multiple
calls, and add more than one file descriptor to wait on. The latter has
been added because there are upcoming Postgres features where that will
be helpful.
In addition to the previously existing poll(2), select(2), and
WaitForMultipleObjects() implementations, also provide an epoll_wait(2)
based implementation to address the aforementioned performance
problem. Epoll is only available on Linux, but that is the most likely
OS for machines large enough (four sockets) to reproduce the problem.
To actually address the aforementioned regression, create and use a
long-lived WaitEventSet for FE/BE communication. There are additional
places that would benefit from a long-lived set, but that's a task for
another day.
Thanks to Amit Kapila, who helped make the Windows code I blindly wrote
actually work.
Reported-By: Dmitry Vasilyev
Discussion:
CAB-SwXZh44_2ybvS5Z67p_CDz=XFn4hNAD=CnMEF+QqkXwFrGg@mail.gmail.com
20160114143931.GG10941@awork2.anarazel.de
2016-03-21 09:56:39 +01:00
 * WaitEventSets allow waiting for latches being set and additional events -
 * postmaster dying and socket readiness of several sockets currently - at the
 * same time.  On many platforms, using a long-lived event set is more
 * efficient than using WaitLatch or WaitLatchOrSocket.
 *
 * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/storage/latch.h
 *
 *-------------------------------------------------------------------------
 */
#ifndef LATCH_H
#define LATCH_H

#include <signal.h>

/*
 * Latch structure should be treated as opaque and only accessed through
 * the public functions.  It is defined here to allow embedding Latches as
 * part of bigger structs.
 */
typedef struct Latch
{
	sig_atomic_t is_set;
	bool		is_shared;
	int			owner_pid;
#ifdef WIN32
	HANDLE		event;
#endif
} Latch;
/*
 * Bitmasks for events that may wake up WaitLatch(), WaitLatchOrSocket(), or
 * WaitEventSetWait().
 */
#define WL_LATCH_SET (1 << 0)
#define WL_SOCKET_READABLE	(1 << 1)
Introduce a pipe between postmaster and each backend, which can be used to
detect postmaster death. Postmaster keeps the write-end of the pipe open,
so when it dies, children get EOF in the read-end. That can conveniently
be waited for in select(), which allows eliminating some of the polling
loops that check for postmaster death. This patch doesn't yet change all
the loops to use the new mechanism, expect a follow-on patch to do that.
This changes the interface to WaitLatch, so that it takes as argument a
bitmask of events that it waits for. Possible events are latch set, timeout,
postmaster death, and socket becoming readable or writeable.
The pipe method behaves slightly differently from the kill() method
previously used in PostmasterIsAlive() in the case that postmaster has died,
but its parent has not yet read its exit code with waitpid(). The pipe
returns EOF as soon as the process dies, but kill() continues to return
true until waitpid() has been called (IOW while the process is a zombie).
Because of that, change PostmasterIsAlive() to use the pipe too, otherwise
WaitLatch() would return immediately with WL_POSTMASTER_DEATH, while
PostmasterIsAlive() would claim it's still alive. That could easily lead to
busy-waiting while postmaster is in zombie state.
Peter Geoghegan with further changes by me, reviewed by Fujii Masao and
Florian Pflug.
2011-07-08 17:27:49 +02:00
#define WL_SOCKET_WRITEABLE	(1 << 2)
#define WL_TIMEOUT (1 << 3) /* not for WaitEventSetWait() */
#define WL_POSTMASTER_DEATH (1 << 4)
typedef struct WaitEvent
{
	int			pos;			/* position in the event data structure */
	uint32		events;			/* triggered events */
	pgsocket	fd;				/* socket fd associated with event */
	void	   *user_data;		/* pointer provided in AddWaitEventToSet */
} WaitEvent;

/* forward declaration to avoid exposing latch.c implementation details */
typedef struct WaitEventSet WaitEventSet;

/*
 * prototypes for functions in latch.c
 */
extern void InitializeLatchSupport(void);
extern void InitLatch(volatile Latch *latch);
extern void InitSharedLatch(volatile Latch *latch);
extern void OwnLatch(volatile Latch *latch);
extern void DisownLatch(volatile Latch *latch);
extern void SetLatch(volatile Latch *latch);
extern void ResetLatch(volatile Latch *latch);

extern WaitEventSet *CreateWaitEventSet(MemoryContext context, int nevents);
extern void FreeWaitEventSet(WaitEventSet *set);
extern int	AddWaitEventToSet(WaitEventSet *set, uint32 events, pgsocket fd,
				  Latch *latch, void *user_data);
extern void ModifyWaitEvent(WaitEventSet *set, int pos, uint32 events, Latch *latch);

extern int	WaitEventSetWait(WaitEventSet *set, long timeout,
				 WaitEvent *occurred_events, int nevents,
				 uint32 wait_event_info);
extern int	WaitLatch(volatile Latch *latch, int wakeEvents, long timeout,
		  uint32 wait_event_info);
extern int WaitLatchOrSocket(volatile Latch *latch, int wakeEvents,
					pgsocket sock, long timeout, uint32 wait_event_info);

/*
 * Unix implementation uses SIGUSR1 for inter-process signaling.
 * Win32 doesn't need this.
*/
#ifndef WIN32
extern void latch_sigusr1_handler(void);
#else
#define latch_sigusr1_handler()  ((void) 0)
#endif

#endif   /* LATCH_H */