/*-------------------------------------------------------------------------
 *
 * postmaster.h
 *	  Exports from postmaster/postmaster.c.
 *
 * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/postmaster/postmaster.h
 *
 *-------------------------------------------------------------------------
 */
#ifndef _POSTMASTER_H
#define _POSTMASTER_H

#include "miscadmin.h"

/* GUC options */
extern PGDLLIMPORT bool EnableSSL;
extern PGDLLIMPORT int SuperuserReservedConnections;
extern PGDLLIMPORT int ReservedConnections;
extern PGDLLIMPORT int PostPortNumber;
extern PGDLLIMPORT int Unix_socket_permissions;
extern PGDLLIMPORT char *Unix_socket_group;
extern PGDLLIMPORT char *Unix_socket_directories;
extern PGDLLIMPORT char *ListenAddresses;
extern PGDLLIMPORT bool ClientAuthInProgress;
extern PGDLLIMPORT int PreAuthDelay;
extern PGDLLIMPORT int AuthenticationTimeout;
extern PGDLLIMPORT bool Log_connections;
extern PGDLLIMPORT bool log_hostname;
extern PGDLLIMPORT bool enable_bonjour;
extern PGDLLIMPORT char *bonjour_name;
extern PGDLLIMPORT bool restart_after_crash;
extern PGDLLIMPORT bool remove_temp_files_after_crash;
extern PGDLLIMPORT bool send_abort_for_crash;
extern PGDLLIMPORT bool send_abort_for_kill;

#ifdef WIN32
extern PGDLLIMPORT HANDLE PostmasterHandle;
#else
extern PGDLLIMPORT int postmaster_alive_fds[2];

/*
 * Constants that represent which of postmaster_alive_fds is held by
 * postmaster, and which is used in children to check for postmaster death.
 */
#define POSTMASTER_FD_WATCH 0	/* used in children to check for
								 * postmaster death */
#define POSTMASTER_FD_OWN 1		/* kept open by postmaster only */
#endif
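The two constants above name the ends of a pipe the postmaster holds open for its whole life: children watch the POSTMASTER_FD_WATCH end, and EOF there means the postmaster is gone. A minimal standalone sketch of that detection pattern, with hypothetical `demo_*` names standing in for the real `postmaster_alive_fds` machinery (none of these names exist in PostgreSQL):

```c
/*
 * Hypothetical sketch of postmaster-death detection via a pipe: nothing
 * is ever written to the pipe, so the read end becoming "ready" can only
 * mean EOF/POLLHUP, i.e. the writer has exited.  demo_alive_fds stands
 * in for postmaster_alive_fds; these names are illustrative only.
 */
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

static int	demo_alive_fds[2];	/* [0] = watch end, [1] = owner's end */

static bool
demo_postmaster_is_alive(void)
{
	struct pollfd pfd;

	pfd.fd = demo_alive_fds[0];	/* the POSTMASTER_FD_WATCH end */
	pfd.events = POLLIN;

	/* zero timeout: a pure status probe, never blocks */
	return poll(&pfd, 1, 0) == 0;
}
```

After `pipe(demo_alive_fds)`, the check returns true while `demo_alive_fds[1]` is open; once the owning process closes it (or dies), `poll()` reports the descriptor ready and the check returns false. This is why EOF-based detection beats `kill(pid, 0)`: the pipe reports death immediately, even while the dead process is still an unreaped zombie.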

extern PGDLLIMPORT const char *progname;

extern void PostmasterMain(int argc, char *argv[]) pg_attribute_noreturn();
extern void ClosePostmasterPorts(bool am_syslogger);
extern void InitProcessGlobals(void);

extern int	MaxLivePostmasterChildren(void);

extern bool PostmasterMarkPIDForWorkerNotify(int);

extern void BackendMain(char *startup_data, size_t startup_data_len) pg_attribute_noreturn();

#ifdef EXEC_BACKEND
extern Size ShmemBackendArraySize(void);
extern void ShmemBackendArrayAllocation(void);

#ifdef WIN32
extern void pgwin32_register_deadchild_callback(HANDLE procHandle, DWORD procId);
#endif
#endif

/* defined in globals.c */
extern struct ClientSocket *MyClientSocket;

/* prototypes for functions in launch_backend.c */
extern pid_t postmaster_child_launch(BackendType child_type, char *startup_data, size_t startup_data_len, struct ClientSocket *sock);
const char *PostmasterChildName(BackendType child_type);
#ifdef EXEC_BACKEND
extern void SubPostmasterMain(int argc, char *argv[]) pg_attribute_noreturn();
#endif

/*
 * Note: MAX_BACKENDS is limited to 2^18-1 because that's the width reserved
 * for buffer references in buf_internals.h.  This limitation could be lifted
 * by using a 64bit state; but it's unlikely to be worthwhile as 2^18-1
 * backends exceed currently realistic configurations.  Even if that
 * limitation were removed, we still could not a) exceed 2^23-1 because
 * inval.c stores the ProcNumber as a 3-byte signed integer, b) INT_MAX/4
 * because some places compute 4*MaxBackends without any overflow check.
 * This is rechecked in the relevant GUC check hooks and in
 * RegisterBackgroundWorker().
 */
#define MAX_BACKENDS 0x3FFFF
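The ceilings listed in the comment above are plain integer facts and can be checked mechanically. A small sketch, where DEMO_MAX_BACKENDS merely mirrors the value of MAX_BACKENDS and the helper name is invented for illustration:

```c
#include <limits.h>
#include <stdbool.h>

#define DEMO_MAX_BACKENDS 0x3FFFF	/* mirrors MAX_BACKENDS above */

/* true iff the value satisfies every limit the comment describes */
static bool
demo_max_backends_ok(void)
{
	return DEMO_MAX_BACKENDS == (1 << 18) - 1	/* buffer-reference width */
		&& DEMO_MAX_BACKENDS <= (1 << 23) - 1	/* 3-byte signed ProcNumber */
		&& DEMO_MAX_BACKENDS <= INT_MAX / 4;	/* 4*MaxBackends, no overflow */
}
```

0x3FFFF is 262143, i.e. exactly 2^18 - 1; both looser bounds (2^23 - 1 = 8388607, and INT_MAX/4) hold with a wide margin, which is why the tight buffer-reference limit is the one that names the constant.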

#endif							/* _POSTMASTER_H */