/*-------------------------------------------------------------------------
*
* latch.h
* Routines for interprocess latches
*
* A latch is a boolean variable, with operations that let processes sleep
* until it is set. A latch can be set from another process, or a signal
* handler within the same process.
*
* The latch interface is a reliable replacement for the common pattern of
* using pg_usleep() or select() to wait until a signal arrives, where the
* signal handler sets a flag variable. Because on some platforms an
* incoming signal doesn't interrupt sleep, and even on platforms where it
* does there is a race condition if the signal arrives just before
* entering the sleep, the common pattern must periodically wake up and
* poll the flag variable. The pselect() system call was invented to solve
* this problem, but it is not portable enough. Latches are designed to
* overcome these limitations, allowing you to sleep without polling and
* ensuring quick response to signals from other processes.
*
* There are two kinds of latches: local and shared. A local latch is
* initialized by InitLatch, and can only be set from the same process.
* A local latch can be used to wait for a signal to arrive, by calling
* SetLatch in the signal handler. A shared latch resides in shared memory,
* and must be initialized at postmaster startup by InitSharedLatch. Before
* a shared latch can be waited on, it must be associated with a process
* with OwnLatch. Only the process owning the latch can wait on it, but any
* process can set it.
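*
* A sketch of the shared case (illustrative only; "shared->latch" stands for
* a Latch field embedded in some shared-memory struct):
*
*     InitSharedLatch(&shared->latch);    (once, at postmaster startup)
*     ...
*     OwnLatch(&shared->latch);           (in the process that will wait on it)
*     SetLatch(&shared->latch);           (from any process, to wake the owner)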
*
* There are three basic operations on a latch:
*
* SetLatch - Sets the latch
* ResetLatch - Clears the latch, allowing it to be set again
* WaitLatch - Waits for the latch to become set
*
* WaitLatch includes a provision for timeouts (which should be avoided
* when possible, as they incur extra overhead) and a provision for
* postmaster child processes to wake up immediately on postmaster death.
* See latch.c for detailed specifications for the exported functions.
*
* The correct pattern to wait for event(s) is:
*
* for (;;)
* {
*     ResetLatch();
*     if (work to do)
*         Do Stuff();
*     WaitLatch();
* }
*
* It's important to reset the latch *before* checking if there's work to
* do. Otherwise, if someone sets the latch between the check and the
* ResetLatch call, you will miss it and Wait will incorrectly block.
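*
* As a concrete illustration (a sketch only: "latch" is assumed to point to
* this process's own latch, "work_pending" to a flag set by whoever also sets
* the latch, and DoTheWork is a hypothetical work routine):
*
* for (;;)
* {
*     ResetLatch(latch);
*     if (work_pending)
*     {
*         work_pending = false;
*         DoTheWork();
*     }
*     (void) WaitLatch(latch, WL_LATCH_SET, -1L, 0);
* }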
*
* Another valid coding pattern looks like:
*
* for (;;)
* {
*     if (work to do)
*         Do Stuff(); // in particular, exit loop if some condition satisfied
*     WaitLatch();
*     ResetLatch();
* }
*
* This is useful to reduce latch traffic if it's expected that the loop's
* termination condition will often be satisfied in the first iteration;
* the cost is an extra loop iteration before blocking when it is not.
* What must be avoided is placing any checks for asynchronous events after
* WaitLatch and before ResetLatch, as that creates a race condition.
*
* To wake up the waiter, you must first set a global flag or something
* else that the wait loop tests in the "if (work to do)" part, and call
* SetLatch *after* that. SetLatch is designed to return quickly if the
* latch is already set.
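*
* The setting side of the sketch above is then simply (again, the names are
* illustrative):
*
*     work_pending = true;
*     SetLatch(latch);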
*
* On some platforms, signals will not interrupt the latch wait primitive
* by themselves. Therefore, it is critical that any signal handler that
* is meant to terminate a WaitLatch wait calls SetLatch.
*
* Note that use of the process latch (PGPROC.procLatch) is generally better
* than an ad-hoc shared latch for signaling auxiliary processes. This is
* because generic signal handlers will call SetLatch on the process latch
* only, so using any latch other than the process latch effectively precludes
* use of any generic handler.
*
*
* WaitEventSets allow waiting for a latch being set together with additional
* events - currently postmaster death and readiness of one or more sockets -
* at the same time. On many platforms, using a long-lived event set is more
* efficient than using WaitLatch or WaitLatchOrSocket.
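*
* A minimal usage sketch (not a complete recipe; MyLatch and the use of
* CurrentMemoryContext and PGINVALID_SOCKET here are illustrative
* assumptions):
*
*     WaitEventSet *set = CreateWaitEventSet(CurrentMemoryContext, 2);
*     WaitEvent     occurred;
*
*     AddWaitEventToSet(set, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);
*     AddWaitEventToSet(set, WL_POSTMASTER_DEATH, PGINVALID_SOCKET, NULL, NULL);
*
*     for (;;)
*     {
*         ResetLatch(MyLatch);
*         if (work to do)
*             Do Stuff();
*         (void) WaitEventSetWait(set, -1L, &occurred, 1, 0);
*     }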
*
*
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* src/include/storage/latch.h
*
*-------------------------------------------------------------------------
*/
#ifndef LATCH_H
#define LATCH_H

#include <signal.h>

/*
* Latch structure should be treated as opaque and only accessed through
* the public functions. It is defined here to allow embedding Latches as
* part of bigger structs.
*/
typedef struct Latch
{
    sig_atomic_t is_set;
    bool         is_shared;
    int          owner_pid;
#ifdef WIN32
    HANDLE       event;
#endif
} Latch;
/*
* Bitmasks for events that may wake up WaitLatch(), WaitLatchOrSocket(), or
* WaitEventSetWait().
*/
#define WL_LATCH_SET         (1 << 0)
#define WL_SOCKET_READABLE   (1 << 1)
#define WL_SOCKET_WRITEABLE  (1 << 2)
#define WL_TIMEOUT           (1 << 3)   /* not for WaitEventSetWait() */
#define WL_POSTMASTER_DEATH  (1 << 4)
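
/*
 * The wakeEvents/events arguments of the functions declared below take
 * bitwise ORs of the flags above.  An illustrative call (MyLatch is an
 * assumption standing for this process's latch pointer; the 1000L timeout
 * is in milliseconds):
 *
 *     WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT, 1000L, 0);
 */
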
typedef struct WaitEvent
{
    int         pos;            /* position in the event data structure */
    uint32      events;         /* triggered events */
    pgsocket    fd;             /* socket fd associated with event */
    void       *user_data;      /* pointer provided in AddWaitEventToSet */
} WaitEvent;

/* forward declaration to avoid exposing latch.c implementation details */
typedef struct WaitEventSet WaitEventSet;
/*
* prototypes for functions in latch.c
*/
extern void InitializeLatchSupport(void);
extern void InitLatch(volatile Latch *latch);
extern void InitSharedLatch(volatile Latch *latch);
extern void OwnLatch(volatile Latch *latch);
extern void DisownLatch(volatile Latch *latch);
extern void SetLatch(volatile Latch *latch);
extern void ResetLatch(volatile Latch *latch);
extern WaitEventSet *CreateWaitEventSet(MemoryContext context, int nevents);
extern void FreeWaitEventSet(WaitEventSet *set);
extern int  AddWaitEventToSet(WaitEventSet *set, uint32 events, pgsocket fd,
                              Latch *latch, void *user_data);
extern void ModifyWaitEvent(WaitEventSet *set, int pos, uint32 events, Latch *latch);
extern int  WaitEventSetWait(WaitEventSet *set, long timeout,
                             WaitEvent *occurred_events, int nevents,
                             uint32 wait_event_info);
extern int  WaitLatch(volatile Latch *latch, int wakeEvents, long timeout,
                      uint32 wait_event_info);
extern int  WaitLatchOrSocket(volatile Latch *latch, int wakeEvents,
                              pgsocket sock, long timeout, uint32 wait_event_info);

/*
* Unix implementation uses SIGUSR1 for inter-process signaling.
* Win32 doesn't need this.
*/
#ifndef WIN32
extern void latch_sigusr1_handler(void);
#else
#define latch_sigusr1_handler() ((void) 0)
#endif
#endif /* LATCH_H */