/*-------------------------------------------------------------------------
 *
 * postmaster.c
 *    This program acts as a clearing house for requests to the
 *    POSTGRES system.  Frontend programs send a startup message
 *    to the Postmaster and the postmaster uses the info in the
 *    message to set up a backend process.
 *
 *    The postmaster also manages system-wide operations such as
 *    startup and shutdown.  The postmaster itself doesn't do those
 *    operations, mind you --- it just forks off a subprocess to do them
 *    at the right times.  It also takes care of resetting the system
 *    if a backend crashes.
 *
 *    The postmaster process creates the shared memory and semaphore
 *    pools during startup, but as a rule does not touch them itself.
 *    In particular, it is not a member of the PGPROC array of backends
 *    and so it cannot participate in lock-manager operations.  Keeping
 *    the postmaster away from shared memory operations makes it simpler
 *    and more reliable.  The postmaster is almost always able to recover
 *    from crashes of individual backends by resetting shared memory;
 *    if it did much with shared memory then it would be prone to crashing
 *    along with the backends.
 *
 *    When a request message is received, we now fork() immediately.
 *    The child process performs authentication of the request, and
 *    then becomes a backend if successful.  This allows the auth code
 *    to be written in a simple single-threaded style (as opposed to the
 *    crufty "poor man's multitasking" code that used to be needed).
 *    More importantly, it ensures that blockages in non-multithreaded
 *    libraries like SSL or PAM cannot cause denial of service to other
 *    clients.
 *
 *
 * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *    src/backend/postmaster/postmaster.c
 *
 * NOTES
 *
 * Initialization:
 *      The Postmaster sets up shared memory data structures
 *      for the backends.
 *
 * Synchronization:
 *      The Postmaster shares memory with the backends but should avoid
 *      touching shared memory, so as not to become stuck if a crashing
 *      backend screws up locks or shared memory.  Likewise, the Postmaster
 *      should never block on messages from frontend clients.
 *
 * Garbage Collection:
 *      The Postmaster cleans up after backends if they have an emergency
 *      exit and/or core dump.
 *
 * Error Reporting:
 *      Use write_stderr() only for reporting "interactive" errors
 *      (essentially, bogus arguments on the command line).  Once the
 *      postmaster is launched, use ereport().
 *
 *-------------------------------------------------------------------------
 */

#include "postgres.h"

#include <unistd.h>
#include <signal.h>
#include <time.h>
#include <sys/wait.h>
#include <ctype.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <sys/param.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <limits.h>

#ifdef HAVE_SYS_SELECT_H
#include <sys/select.h>
#endif

#ifdef USE_BONJOUR
#include <dns_sd.h>
#endif

#ifdef USE_SYSTEMD
#include <systemd/sd-daemon.h>
#endif

#ifdef HAVE_PTHREAD_IS_THREADED_NP
#include <pthread.h>
#endif

#include "access/transam.h"
#include "access/xlog.h"
#include "bootstrap/bootstrap.h"
#include "catalog/pg_control.h"
#include "common/ip.h"
#include "lib/ilist.h"
#include "libpq/auth.h"
#include "libpq/libpq.h"
#include "libpq/pqsignal.h"
#include "miscadmin.h"
#include "pg_getopt.h"
#include "pgstat.h"
#include "postmaster/autovacuum.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/fork_process.h"
#include "postmaster/pgarch.h"
#include "postmaster/postmaster.h"
#include "postmaster/syslogger.h"
#include "replication/logicallauncher.h"
#include "replication/walsender.h"
#include "storage/fd.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "storage/pmsignal.h"
#include "storage/proc.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/datetime.h"
#include "utils/dynamic_loader.h"
#include "utils/memutils.h"
#include "utils/pidfile.h"
#include "utils/ps_status.h"
#include "utils/timeout.h"
#include "utils/varlena.h"

#ifdef EXEC_BACKEND
#include "storage/spin.h"
#endif


/*
 * Possible types of a backend.  Beyond being the possible bkend_type values in
 * struct bkend, these are OR-able request flag bits for SignalSomeChildren()
 * and CountChildren().
 */
#define BACKEND_TYPE_NORMAL     0x0001  /* normal backend */
#define BACKEND_TYPE_AUTOVAC    0x0002  /* autovacuum worker process */
#define BACKEND_TYPE_WALSND     0x0004  /* walsender process */
#define BACKEND_TYPE_BGWORKER   0x0008  /* bgworker process */
#define BACKEND_TYPE_ALL        0x000F  /* OR of all the above */

#define BACKEND_TYPE_WORKER     (BACKEND_TYPE_AUTOVAC | BACKEND_TYPE_BGWORKER)

/*
 * List of active backends (or child processes anyway; we don't actually
 * know whether a given child has become a backend or is still in the
 * authorization phase).  This is used mainly to keep track of how many
 * children we have and send them appropriate signals when necessary.
 *
 * "Special" children such as the startup, bgwriter and autovacuum launcher
 * tasks are not in this list.  Autovacuum worker and walsender are in it.
 * Also, "dead_end" children are in it: these are children launched just for
 * the purpose of sending a friendly rejection message to a would-be client.
 * We must track them because they are attached to shared memory, but we know
 * they will never become live backends.  dead_end children are not assigned a
 * PMChildSlot.
 *
 * Background workers are in this list, too.
 */
typedef struct bkend
{
	pid_t		pid;			/* process id of backend */
	int32		cancel_key;		/* cancel key for cancels for this backend */
Install a "dead man switch" to allow the postmaster to detect cases where
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory. We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead. Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft). If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.
The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway. This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.
This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.
Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
2009-05-05 21:59:00 +02:00
|
|
|
int child_slot; /* PMChildSlot for this backend, if any */
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
|
|
|
|
/*
|
2014-05-06 18:12:18 +02:00
|
|
|
* Flavor of backend or auxiliary process. Note that BACKEND_TYPE_WALSND
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
* backends initially announce themselves as BACKEND_TYPE_NORMAL, so if
|
|
|
|
* bkend_type is normal, you should check for a recent transition.
|
|
|
|
*/
|
|
|
|
int bkend_type;
|
2007-08-09 03:18:43 +02:00
|
|
|
bool dead_end; /* is it going to send an error and quit? */
|
2014-05-06 18:12:18 +02:00
|
|
|
bool bgworker_notify; /* gets bgworker start/stop notifications */
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_node elem; /* list link in BackendList */
|
1997-09-08 22:59:27 +02:00
|
|
|
} Backend;
|
1996-07-09 08:22:35 +02:00
|
|
|
|
2012-10-16 22:36:30 +02:00
|
|
|
static dlist_head BackendList = DLIST_STATIC_INIT(BackendList);
|
1996-07-09 08:22:35 +02:00
|
|
|
|
2004-01-26 23:59:54 +01:00
|
|
|
#ifdef EXEC_BACKEND
|
|
|
|
static Backend *ShmemBackendArray;
|
|
|
|
#endif
|
|
|
|
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
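The `elem` field above is an intrusive link: each Backend embeds its own `dlist_node`, so putting it on BackendList requires no separate list-cell allocation, and the containing struct is recovered by pointer arithmetic. The following is a minimal standalone sketch of that pattern; the `dlist_*` stand-ins here are simplified hypotheticals, not PostgreSQL's actual `ilist.h` implementation:

```c
#include <stddef.h>
#include <sys/types.h>

/* Simplified stand-ins for the dlist types used above */
typedef struct dlist_node
{
	struct dlist_node *prev;
	struct dlist_node *next;
} dlist_node;

typedef struct
{
	dlist_node	head;			/* circular list: head links to itself when empty */
} dlist_head;

typedef struct
{
	pid_t		pid;
	dlist_node	elem;			/* embedded list link, as in struct bkend */
} BackendSketch;

static void
dlist_init(dlist_head *h)
{
	h->head.prev = h->head.next = &h->head;
}

static void
dlist_push_head(dlist_head *h, dlist_node *n)
{
	n->next = h->head.next;
	n->prev = &h->head;
	n->next->prev = n;
	h->head.next = n;
}

/* Recover the containing struct from its embedded node */
#define dlist_container(type, field, node) \
	((type *) ((char *) (node) - offsetof(type, field)))
```

The payoff is that list membership costs no allocation and unlinking a Backend never requires searching the list.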
BackgroundWorker *MyBgworkerEntry = NULL;

/* The socket number we are listening for connections on */
int			PostPortNumber;

/* The directory names for Unix socket(s) */
char	   *Unix_socket_directories;

/* The TCP listen address(es) */
char	   *ListenAddresses;

/*
 * ReservedBackends is the number of backends reserved for superuser use.
 * This number is taken out of the pool size given by MaxBackends so
 * number of backend slots available to non-superusers is
 * (MaxBackends - ReservedBackends).  Note what this really means is
 * "if there are <= ReservedBackends connections available, only superusers
 * can make new connections" --- pre-existing superuser connections don't
 * count against the limit.
 */
int			ReservedBackends;

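The reservation rule in the comment above can be expressed as a small predicate. This is a hypothetical standalone sketch of the stated policy, not the postmaster's actual admission code; the parameters stand in for the real MaxBackends/ReservedBackends globals and the caller's role check:

```c
#include <stdbool.h>

/*
 * Sketch of the superuser-reservation rule: once the number of free slots
 * drops to ReservedBackends or below, only superusers may connect.
 * Pre-existing connections are unaffected; this gates new ones only.
 */
static bool
connection_allowed(int active_backends, int max_backends,
				   int reserved_backends, bool is_superuser)
{
	int			slots_free = max_backends - active_backends;

	if (slots_free <= 0)
		return false;			/* completely full: nobody gets in */
	if (slots_free <= reserved_backends && !is_superuser)
		return false;			/* only the reserved slots remain */
	return true;
}
```

With max_backends = 100 and reserved_backends = 3, a non-superuser is admitted at 96 active connections but refused at 97, while a superuser is still admitted until the pool is fully exhausted.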
/* The socket(s) we're listening to. */
#define MAXLISTEN	64
static pgsocket ListenSocket[MAXLISTEN];

/*
 * Set by the -o option
 */
static char ExtraOptions[MAXPGPATH];

/*
 * These globals control the behavior of the postmaster in case some
 * backend dumps core.  Normally, it kills all peers of the dead backend
 * and reinitializes shared memory.  By specifying -s or -n, we can have
 * the postmaster stop (rather than kill) peers and not reinitialize
 * shared data structures.  (Reinit is currently dead code, though.)
 */
static bool Reinit = true;
static int	SendStop = false;

/* still more option variables */
bool		EnableSSL = false;

int			PreAuthDelay = 0;
int			AuthenticationTimeout = 60;

bool		log_hostname;		/* for ps display and logging */
bool		Log_connections = false;
bool		Db_user_namespace = false;

bool		enable_bonjour = false;
char	   *bonjour_name;
bool		restart_after_crash = true;

/* PIDs of special child processes; 0 when not running */
static pid_t StartupPID = 0,
			BgWriterPID = 0,
			CheckpointerPID = 0,
			WalWriterPID = 0,
			WalReceiverPID = 0,
			AutoVacPID = 0,
			PgArchPID = 0,
			PgStatPID = 0,
			SysLoggerPID = 0;

/* Startup process's status */
typedef enum
{
	STARTUP_NOT_RUNNING,
	STARTUP_RUNNING,
	STARTUP_SIGNALED,			/* we sent it a SIGQUIT or SIGKILL */
	STARTUP_CRASHED
} StartupStatusEnum;

static StartupStatusEnum StartupStatus = STARTUP_NOT_RUNNING;

/* Startup/shutdown state */
#define NoShutdown			0
#define SmartShutdown		1
#define FastShutdown		2
#define ImmediateShutdown	3

static int	Shutdown = NoShutdown;

static bool FatalError = false; /* T if recovering from backend crash */

/*
 * We use a simple state machine to control startup, shutdown, and
 * crash recovery (which is rather like shutdown followed by startup).
 *
 * After doing all the postmaster initialization work, we enter PM_STARTUP
 * state and the startup process is launched.  The startup process begins by
 * reading the control file and other preliminary initialization steps.
 * In a normal startup, or after crash recovery, the startup process exits
 * with exit code 0 and we switch to PM_RUN state.  However, archive recovery
 * is handled specially since it takes much longer and we would like to support
 * hot standby during archive recovery.
 *
 * When the startup process is ready to start archive recovery, it signals the
 * postmaster, and we switch to PM_RECOVERY state.  The background writer and
 * checkpointer are launched, while the startup process continues applying WAL.
 * If Hot Standby is enabled, then, after reaching a consistent point in WAL
 * redo, startup process signals us again, and we switch to PM_HOT_STANDBY
 * state and begin accepting connections to perform read-only queries.  When
 * archive recovery is finished, the startup process exits with exit code 0
 * and we switch to PM_RUN state.
 *
 * Normal child backends can only be launched when we are in PM_RUN or
 * PM_HOT_STANDBY state.  (We also allow launch of normal
 * child backends in PM_WAIT_BACKUP state, but only for superusers.)
 * In other states we handle connection requests by launching "dead_end"
 * child processes, which will simply send the client an error message and
 * quit.  (We track these in the BackendList so that we can know when they
 * are all gone; this is important because they're still connected to shared
 * memory, and would interfere with an attempt to destroy the shmem segment,
 * possibly leading to SHMALL failure when we try to make a new one.)
 * In PM_WAIT_DEAD_END state we are waiting for all the dead_end children
 * to drain out of the system, and therefore stop accepting connection
 * requests at all until the last existing child has quit (which hopefully
 * will not be very long).
 *
 * Notice that this state variable does not distinguish *why* we entered
 * states later than PM_RUN --- Shutdown and FatalError must be consulted
 * to find that out.  FatalError is never true in PM_RECOVERY_* or PM_RUN
 * states, nor in PM_SHUTDOWN states (because we don't enter those states
 * when trying to recover from a crash).  It can be true in PM_STARTUP state,
 * because we don't clear it until we've successfully started WAL redo.
 */
typedef enum
{
	PM_INIT,					/* postmaster starting */
	PM_STARTUP,					/* waiting for startup subprocess */
	PM_RECOVERY,				/* in archive recovery mode */
	PM_HOT_STANDBY,				/* in hot standby mode */
	PM_RUN,						/* normal "database is alive" state */
	PM_WAIT_BACKUP,				/* waiting for online backup mode to end */
	PM_WAIT_READONLY,			/* waiting for read only backends to exit */
	PM_WAIT_BACKENDS,			/* waiting for live backends to exit */
	PM_SHUTDOWN,				/* waiting for checkpointer to do shutdown
								 * ckpt */
	PM_SHUTDOWN_2,				/* waiting for archiver and walsenders to
								 * finish */
	PM_WAIT_DEAD_END,			/* waiting for dead_end children to exit */
	PM_NO_CHILDREN				/* all important children have exited */
} PMState;

static PMState pmState = PM_INIT;

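The "normal backends only in PM_RUN or PM_HOT_STANDBY (plus superusers in PM_WAIT_BACKUP)" rule from the state-machine comment can be sketched as a standalone predicate. This is an illustrative hypothetical, with the enum redeclared locally so the block is self-contained; it is not the postmaster's actual connection-admission function:

```c
#include <stdbool.h>

/* Local redeclaration of the postmaster state enum, for illustration only */
typedef enum
{
	PM_INIT, PM_STARTUP, PM_RECOVERY, PM_HOT_STANDBY, PM_RUN,
	PM_WAIT_BACKUP, PM_WAIT_READONLY, PM_WAIT_BACKENDS,
	PM_SHUTDOWN, PM_SHUTDOWN_2, PM_WAIT_DEAD_END, PM_NO_CHILDREN
} PMState;

/*
 * Sketch of the launch rule described above: normal backends are allowed
 * only in PM_RUN and PM_HOT_STANDBY; PM_WAIT_BACKUP admits superusers only.
 * Any other state gets a dead_end child that sends a rejection and quits.
 */
static bool
can_launch_normal_backend(PMState state, bool is_superuser)
{
	if (state == PM_RUN || state == PM_HOT_STANDBY)
		return true;
	if (state == PM_WAIT_BACKUP)
		return is_superuser;
	return false;
}
```

Keeping the admission decision a pure function of (state, role) is what lets the Shutdown and FatalError flags stay orthogonal: they record *why* the state changed, not *whether* connections are allowed.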
/* Start time of SIGKILL timeout during immediate shutdown or child crash */
|
|
|
|
/* Zero means timeout is not running */
|
|
|
|
static time_t AbortStartTime = 0;
|
2017-06-21 20:39:04 +02:00
|
|
|
|
2015-06-19 20:23:39 +02:00
|
|
|
/* Length of said timeout */
|
Send SIGKILL to children if they don't die quickly in immediate shutdown
On immediate shutdown, or during a restart-after-crash sequence,
postmaster used to send SIGQUIT (and then abandon ship if shutdown); but
this is not a good strategy if backends don't die because of that
signal. (This might happen, for example, if a backend gets tangled
trying to malloc() due to gettext(), as in an example illustrated by
MauMau.) This causes problems when later trying to restart the server,
because some processes are still attached to the shared memory segment.
Instead of just abandoning such backends to their fates, we now have
postmaster hang around for a little while longer, send a SIGKILL after
some reasonable waiting period, and then exit. This makes immediate
shutdown more reliable.
There is disagreement on whether it's best for postmaster to exit after
sending SIGKILL, or to stick around until all children have reported
death. If this controversy is resolved differently than what this patch
implements, it's an easy change to make.
Bug reported by MauMau in message 20DAEA8949EC4E2289C6E8E58560DEC0@maumau
MauMau and Álvaro Herrera
2013-06-28 23:20:53 +02:00
|
|
|
#define SIGKILL_CHILDREN_AFTER_SECS 5
|
|
|
|
|
Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
static bool ReachedNormalRunning = false;	/* T if we've reached PM_RUN */
bool		ClientAuthInProgress = false;	/* T during new-client
											 * authentication */
bool redirection_done = false; /* stderr redirected for syslogger? */
/* received START_AUTOVAC_LAUNCHER signal */
static volatile sig_atomic_t start_autovac_launcher = false;
/* the launcher needs to be signalled to communicate some condition */
static volatile bool avlauncher_needs_signal = false;
Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios. We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases. Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId. Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done. Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database. initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 23:42:10 +01:00
Don't lose walreceiver start requests due to race condition in postmaster.
When a walreceiver dies, the startup process will notice that and send
a PMSIGNAL_START_WALRECEIVER signal to the postmaster, asking for a new
walreceiver to be launched. There's a race condition, which at least
in HEAD is very easy to hit, whereby the postmaster might see that
signal before it processes the SIGCHLD from the walreceiver process.
In that situation, sigusr1_handler() just dropped the start request
on the floor, reasoning that it must be redundant. Eventually, after
10 seconds (WALRCV_STARTUP_TIMEOUT), the startup process would make a
fresh request --- but that's a long time if the connection could have
been re-established almost immediately.
Fix it by setting a state flag inside the postmaster that we won't
clear until we do launch a walreceiver. In cases where that results
in an extra walreceiver launch, it's up to the walreceiver to realize
it's unwanted and go away --- but we have, and need, that logic anyway
for the opposite race case.
I came across this through investigating unexpected delays in the
src/test/recovery TAP tests: it manifests there in test cases where
a master server is stopped and restarted while leaving streaming
slaves active.
This logic has been broken all along, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/21344.1498494720@sss.pgh.pa.us
2017-06-26 23:31:56 +02:00
/* received START_WALRECEIVER signal */
static volatile sig_atomic_t WalReceiverRequested = false;
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
/* set when there's a worker that needs to be started up */
static volatile bool StartWorkerNeeded = true;
static volatile bool HaveCrashedWorker = false;
Replace PostmasterRandom() with a stronger source, second attempt.
This adds a new routine, pg_strong_random() for generating random bytes,
for use in both frontend and backend. At the moment, it's only used in
the backend, but the upcoming SCRAM authentication patches need strong
random numbers in libpq as well.
pg_strong_random() is based on, and replaces, the existing implementation
in pgcrypto. It can acquire strong random numbers from a number of sources,
depending on what's available:
- OpenSSL RAND_bytes(), if built with OpenSSL
- On Windows, the native cryptographic functions are used
- /dev/urandom
Unlike the current pgcrypto function, the source is chosen by configure.
That makes it easier to test different implementations, and ensures that
we don't accidentally fall back to a less secure implementation, if the
primary source fails. All of those methods are quite reliable, it would be
pretty surprising for them to fail, so we'd rather find out by failing
hard.
If no strong random source is available, we fall back to using erand48(),
seeded from current timestamp, like PostmasterRandom() was. That isn't
cryptographically secure, but allows us to still work on platforms that
don't have any of the above stronger sources. Because it's not very secure,
the built-in implementation is only used if explicitly requested with
--disable-strong-random.
This replaces the more complicated Fortuna algorithm we used to have in
pgcrypto, which is unfortunate, but all modern platforms have /dev/urandom,
so it doesn't seem worth the maintenance effort to keep that. pgcrypto
functions that require strong random numbers will be disabled with
--disable-strong-random.
Original patch by Magnus Hagander, tons of further work by Michael Paquier
and me.
Discussion: https://www.postgresql.org/message-id/CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAB7nPqRWkNYRRPJA7-cF+LfroYV10pvjdz6GNvxk-Eee9FypKA@mail.gmail.com
2016-12-05 12:42:59 +01:00
#ifndef HAVE_STRONG_RANDOM
/*
 * State for assigning cancel keys.
 * Also, the global MyCancelKey passes the cancel key assigned to a given
 * backend from the postmaster to that backend (via fork).
 */
static unsigned int random_seed = 0;
static struct timeval random_start_time;
#endif

#ifdef USE_SSL
/* Set when and if SSL has been initialized properly */
static bool LoadedSSL = false;
#endif

#ifdef USE_BONJOUR
static DNSServiceRef bonjour_sdref = NULL;
#endif

/*
 * postmaster.c - function prototypes
 */
static void CloseServerPorts(int status, Datum arg);
static void unlink_external_pid_file(int status, Datum arg);
static void getInstallationPaths(const char *argv0);
static void checkDataDir(void);
static Port *ConnCreate(int serverFd);
static void ConnFree(Port *port);
static void reset_shared(int port);
static void SIGHUP_handler(SIGNAL_ARGS);
static void pmdie(SIGNAL_ARGS);
static void reaper(SIGNAL_ARGS);
static void sigusr1_handler(SIGNAL_ARGS);
static void startup_die(SIGNAL_ARGS);
static void dummy_handler(SIGNAL_ARGS);
Introduce timeout handling framework
Management of timeouts was getting a little cumbersome; what we
originally had was more than enough back when we were only concerned
about deadlocks and query cancel; however, when we added timeouts for
standby processes, the code got considerably messier. Since there are
plans to add more complex timeouts, this seems a good time to introduce
a central timeout handling module.
External modules register their timeout handlers during process
initialization, and later enable and disable them as they see fit using
a simple API; timeout.c is in charge of keeping track of which timeouts
are in effect at any time, installing a common SIGALRM signal handler,
and calling setitimer() as appropriate to ensure timely firing of
external handlers.
timeout.c additionally supports pluggable modules to add their own
timeouts, though this capability isn't exercised anywhere yet.
Additionally, as of this commit, walsender processes are aware of
timeouts; we had a preexisting bug there that made those ignore SIGALRM,
thus being subject to unhandled deadlocks, particularly during the
authentication phase. This has already been fixed in back branches in
commit 0bf8eb2a, which see for more details.
Main author: Zoltán Böszörményi
Some review and cleanup by Álvaro Herrera
Extensive reworking by Tom Lane
2012-07-17 00:43:21 +02:00
static void StartupPacketTimeoutHandler(void);
static void CleanupBackend(int pid, int exitstatus);
static bool CleanupBackgroundWorker(int pid, int exitstatus);
static void HandleChildCrash(int pid, int exitstatus, const char *procname);
static void LogChildExit(int lev, const char *procname,
			 int pid, int exitstatus);
static void PostmasterStateMachine(void);
static void BackendInitialize(Port *port);
static void BackendRun(Port *port) pg_attribute_noreturn();
static void ExitPostmaster(int status) pg_attribute_noreturn();
static int	ServerLoop(void);
static int	BackendStartup(Port *port);
static int	ProcessStartupPacket(Port *port, bool SSLdone);
static void processCancelRequest(Port *port, void *pkt);
static int	initMasks(fd_set *rmask);
static void report_fork_failure_to_client(Port *port, int errnum);
static CAC_state canAcceptConnections(void);
static bool RandomCancelKey(int32 *cancel_key);
static void signal_child(pid_t pid, int signal);
static bool SignalSomeChildren(int signal, int targets);
static void TerminateChildren(int signal);

#define SignalChildren(sig)	SignalSomeChildren(sig, BACKEND_TYPE_ALL)

static int	CountChildren(int target);
static bool assign_backendlist_entry(RegisteredBgWorker *rw);
Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long. However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once. On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives. (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.
As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.
There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).
Backpatch to 9.6 where parallel query was added. The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.
Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 22:17:29 +02:00
static void maybe_start_bgworkers(void);
static bool CreateOptsFile(int argc, char *argv[], char *fullprogname);
static pid_t StartChildProcess(AuxProcType type);
static void StartAutovacuumWorker(void);
static void MaybeStartWalReceiver(void);
Introduce a pipe between postmaster and each backend, which can be used to
detect postmaster death. Postmaster keeps the write-end of the pipe open,
so when it dies, children get EOF in the read-end. That can conveniently
be waited for in select(), which allows eliminating some of the polling
loops that check for postmaster death. This patch doesn't yet change all
the loops to use the new mechanism, expect a follow-on patch to do that.
This changes the interface to WaitLatch, so that it takes as argument a
bitmask of events that it waits for. Possible events are latch set, timeout,
postmaster death, and socket becoming readable or writeable.
The pipe method behaves slightly differently from the kill() method
previously used in PostmasterIsAlive() in the case that postmaster has died,
but its parent has not yet read its exit code with waitpid(). The pipe
returns EOF as soon as the process dies, but kill() continues to return
true until waitpid() has been called (IOW while the process is a zombie).
Because of that, change PostmasterIsAlive() to use the pipe too, otherwise
WaitLatch() would return immediately with WL_POSTMASTER_DEATH, while
PostmasterIsAlive() would claim it's still alive. That could easily lead to
busy-waiting while postmaster is in zombie state.
Peter Geoghegan with further changes by me, reviewed by Fujii Masao and
Florian Pflug.
2011-07-08 17:27:49 +02:00
static void InitPostmasterDeathWatchHandle(void);

/*
 * Archiver is allowed to start up at the current postmaster state?
 *
 * If WAL archiving is enabled always, we are allowed to start archiver
 * even during recovery.
 */
#define PgArchStartupAllowed()	\
	((XLogArchivingActive() && pmState == PM_RUN) || \
	 (XLogArchivingAlways() &&	\
	  (pmState == PM_RECOVERY || pmState == PM_HOT_STANDBY)))

#ifdef EXEC_BACKEND
#ifdef WIN32
#define WNOHANG 0				/* ignored, so any integer value will do */

static pid_t waitpid(pid_t pid, int *exitstatus, int options);
static void WINAPI pgwin32_deadchild_callback(PVOID lpParameter, BOOLEAN TimerOrWaitFired);

static HANDLE win32ChildQueue;

typedef struct
{
	HANDLE		waitHandle;
	HANDLE		procHandle;
	DWORD		procId;
} win32_deadchild_waitinfo;
|
|
|
#endif /* WIN32 */
static pid_t backend_forkexec(Port *port);
static pid_t internal_forkexec(int argc, char *argv[], Port *port);

/* Type for a socket that can be inherited to a client process */
#ifdef WIN32
typedef struct
{
	SOCKET		origsocket;		/* Original socket value, or PGINVALID_SOCKET
								 * if not a socket */
	WSAPROTOCOL_INFO wsainfo;
} InheritableSocket;
#else
typedef int InheritableSocket;
#endif

/*
 * Structure contains all variables passed to exec:ed backends
 */
typedef struct
{
	Port		port;
	InheritableSocket portsocket;
	char		DataDir[MAXPGPATH];
	pgsocket	ListenSocket[MAXLISTEN];
	int32		MyCancelKey;
	int			MyPMChildSlot;
#ifndef WIN32
	unsigned long UsedShmemSegID;
#else
	HANDLE		UsedShmemSegID;
#endif
	void	   *UsedShmemSegAddr;
	slock_t    *ShmemLock;
	VariableCache ShmemVariableCache;
	Backend    *ShmemBackendArray;
#ifndef HAVE_SPINLOCKS
	PGSemaphore *SpinlockSemaArray;
#endif
	int			NamedLWLockTrancheRequests;
	NamedLWLockTranche *NamedLWLockTrancheArray;
	LWLockPadded *MainLWLockArray;
	slock_t    *ProcStructLock;
	PROC_HDR   *ProcGlobal;
	PGPROC	   *AuxiliaryProcs;
	PGPROC	   *PreparedXactProcs;
	PMSignalData *PMSignalState;
	InheritableSocket pgStatSock;
	pid_t		PostmasterPid;
	TimestampTz PgStartTime;
	TimestampTz PgReloadTime;
	pg_time_t	first_syslogger_file_time;
	bool		redirection_done;
	bool		IsBinaryUpgrade;
	int			max_safe_fds;
	int			MaxBackends;
#ifdef WIN32
	HANDLE		PostmasterHandle;
	HANDLE		initial_signal_pipe;
	HANDLE		syslogPipe[2];
#else
	int			postmaster_alive_fds[2];
	int			syslogPipe[2];
#endif
	char		my_exec_path[MAXPGPATH];
	char		pkglib_path[MAXPGPATH];
	char		ExtraOptions[MAXPGPATH];
} BackendParameters;

static void read_backend_variables(char *id, Port *port);
static void restore_backend_variables(BackendParameters *param, Port *port);

#ifndef WIN32
static bool save_backend_variables(BackendParameters *param, Port *port);
#else
static bool save_backend_variables(BackendParameters *param, Port *port,
								   HANDLE childProcess, pid_t childPid);
#endif

static void ShmemBackendArrayAdd(Backend *bn);
static void ShmemBackendArrayRemove(Backend *bn);
#endif							/* EXEC_BACKEND */
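As a minimal sketch (not part of postmaster.c; struct and function names are hypothetical), the reason BackendParameters can be handed to an exec'ed child at all is that it is a flat struct, so it survives a byte-wise copy through a temp file or inherited memory block. The round-trip idea, reduced to memcpy:

```c
#include <string.h>

/* Hypothetical stand-in for a flat, pointer-free parameter struct. */
typedef struct
{
	int			slot;
	char		datadir[64];
} example_exec_params;

static void
example_save_params(const example_exec_params *src, char *image)
{
	/* flatten the struct into the image the child will inherit */
	memcpy(image, src, sizeof(example_exec_params));
}

static void
example_restore_params(example_exec_params *dst, const char *image)
{
	/* the exec'ed child reconstitutes its variables from the image */
	memcpy(dst, image, sizeof(example_exec_params));
}
```

This works only because the struct carries no pointers into the parent's address space; anything address-dependent (like the inherited socket) needs the dedicated InheritableSocket handling above.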

#define StartupDataBase()		StartChildProcess(StartupProcess)
#define StartBackgroundWriter() StartChildProcess(BgWriterProcess)
#define StartCheckpointer()		StartChildProcess(CheckpointerProcess)
#define StartWalWriter()		StartChildProcess(WalWriterProcess)
#define StartWalReceiver()		StartChildProcess(WalReceiverProcess)

/* Macros to check exit status of a child process */
#define EXIT_STATUS_0(st)	((st) == 0)
#define EXIT_STATUS_1(st)	(WIFEXITED(st) && WEXITSTATUS(st) == 1)
#define EXIT_STATUS_3(st)	(WIFEXITED(st) && WEXITSTATUS(st) == 3)
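As a standalone sketch of what these macros decode (the helper name is hypothetical, not part of the original file): the raw status word returned by waitpid(2) is opaque, and WIFEXITED/WEXITSTATUS extract the "exited normally" flag and the exit code from it.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with "code" and return the raw waitpid status. */
static int
example_child_exit_status(int code)
{
	int			st = -1;
	pid_t		pid = fork();

	if (pid == 0)
		_exit(code);			/* child: terminate immediately */
	(void) waitpid(pid, &st, 0);	/* parent: collect raw status */
	return st;
}
```

With that helper, a status from a child that called `_exit(3)` satisfies exactly the EXIT_STATUS_3 condition above.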

#ifndef WIN32
/*
 * File descriptors for pipe used to monitor if postmaster is alive.
 * First is POSTMASTER_FD_WATCH, second is POSTMASTER_FD_OWN.
 */
int			postmaster_alive_fds[2] = {-1, -1};
#else
/* Process handle of postmaster used for the same purpose on Windows */
HANDLE		PostmasterHandle;
#endif
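The alive pipe relies on a standard POSIX convention: a read(2) on a pipe returns 0 (EOF) once every write end is closed, so when the postmaster (sole holder of the write end) dies, children see EOF. A minimal standalone sketch of that convention (hypothetical helper, not the real check, which waits for readiness via select() rather than issuing a potentially blocking read):

```c
#include <stdbool.h>
#include <unistd.h>

/*
 * Returns false once the peer holding the pipe's write end has closed it,
 * i.e. read() hit EOF.  A nonzero read() result means the peer is alive
 * (it wrote a byte, or the descriptor is otherwise still open).
 */
static bool
example_peer_alive(int read_fd)
{
	char		c;

	return read(read_fd, &c, 1) != 0;
}
```

Closing the write end of a test pipe makes the next read return 0, which is exactly the "postmaster is gone" signal described in the comment above.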
|
2003-06-11 08:56:07 +02:00
|
|
|
|
2004-05-28 07:13:32 +02:00
|
|
|
/*
|
|
|
|
* Postmaster main entry point
|
|
|
|
*/
|
2012-06-25 20:25:26 +02:00
|
|
|
void
|
1996-07-09 08:22:35 +02:00
|
|
|
PostmasterMain(int argc, char *argv[])
|
|
|
|
{
|
2003-08-04 02:43:34 +02:00
|
|
|
int opt;
|
|
|
|
int status;
|
2004-10-08 03:36:36 +02:00
|
|
|
char *userDoption = NULL;
|
2011-01-14 01:01:28 +01:00
|
|
|
bool listen_addr_saved = false;
|
2003-08-04 02:43:34 +02:00
|
|
|
int i;
|
2014-02-09 03:21:46 +01:00
|
|
|
char *output_config_variable = NULL;
|
2011-01-14 01:01:28 +01:00
|
|
|
|
2004-05-30 00:48:23 +02:00
|
|
|
MyProcPid = PostmasterPid = getpid();
|
|
|
|
|
2007-08-03 01:39:45 +02:00
|
|
|
MyStartTime = time(NULL);
|
|
|
|
|
2003-05-28 19:25:02 +02:00
|
|
|
IsPostmasterEnvironment = true;
|
|
|
|
|
1997-09-07 07:04:48 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* for security, no dir or file created can be group or other accessible
|
1997-09-07 07:04:48 +02:00
|
|
|
*/
|
2010-12-10 23:35:33 +01:00
|
|
|
umask(S_IRWXG | S_IRWXO);
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2016-09-23 15:54:11 +02:00
|
|
|
/*
|
|
|
|
* Initialize random(3) so we don't get the same values in every run.
|
|
|
|
*
|
|
|
|
* Note: the seed is pretty predictable from externally-visible facts such
|
2016-10-18 15:28:23 +02:00
|
|
|
* as postmaster start time, so avoid using random() for security-critical
|
|
|
|
* random values during postmaster startup. At the time of first
|
|
|
|
* connection, PostmasterRandom will select a hopefully-more-random seed.
|
2016-09-23 15:54:11 +02:00
|
|
|
*/
|
|
|
|
srandom((unsigned int) (MyProcPid ^ MyStartTime));
|
|
|
|
|
2000-06-28 05:33:33 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* By default, palloc() requests in the postmaster will be allocated in
|
|
|
|
* the PostmasterContext, which is space that can be recycled by backends.
|
|
|
|
* Allocated data that needs to be available to backends should be
|
|
|
|
* allocated in TopMemoryContext.
|
2000-06-28 05:33:33 +02:00
|
|
|
*/
|
|
|
|
PostmasterContext = AllocSetContextCreate(TopMemoryContext,
|
|
|
|
"Postmaster",
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
ALLOCSET_DEFAULT_SIZES);
|
2000-06-28 05:33:33 +02:00
|
|
|
MemoryContextSwitchTo(PostmasterContext);
|
|
|
|
|
2009-05-03 00:02:37 +02:00
|
|
|
/* Initialize paths to installation files */
|
|
|
|
getInstallationPaths(argv[0]);
|
2004-05-19 21:11:25 +02:00
|
|
|
|
Block signals earlier during postmaster startup.
Formerly, we set up the postmaster's signal handling only when we were
about to start launching subprocesses. This is a bad idea though, as
it means that for example a SIGINT arriving before that will kill the
postmaster instantly, perhaps leaving lockfiles, socket files, shared
memory, etc laying about. We'd rather that such a signal caused orderly
postmaster termination including releasing of those resources. A simple
fix is to move the PostmasterMain stanza that initializes signal handling
to an earlier point, before we've created any such resources. Then, an
early-arriving signal will be blocked until we're ready to deal with it
in the usual way. (The only part that really needs to be moved up is
blocking of signals, but it seems best to keep the signal handler
installation calls together with that; for one thing this ensures the
kernel won't drop any signals we wished to get. The handlers won't get
invoked in any case until we unblock signals in ServerLoop.)
Per a report from MauMau. He proposed changing the way "pg_ctl stop"
works to deal with this, but that'd just be masking one symptom not
fixing the core issue.
It's been like this since forever, so back-patch to all supported branches.
2014-04-06 00:16:08 +02:00
|
|
|
/*
|
|
|
|
* Set up signal handlers for the postmaster process.
|
|
|
|
*
|
Run the postmaster's signal handlers without SA_RESTART.
The postmaster keeps signals blocked everywhere except while waiting
for something to happen in ServerLoop(). The code expects that the
select(2) will be cancelled with EINTR if an interrupt occurs; without
that, followup actions that should be performed by ServerLoop() itself
will be delayed. However, some platforms interpret the SA_RESTART
signal flag as meaning that they should restart rather than cancel
the select(2). Worse yet, some of them restart it with the original
timeout delay, meaning that a steady stream of signal interrupts can
prevent ServerLoop() from iterating at all if there are no incoming
connection requests.
Observable symptoms of this, on an affected platform such as HPUX 10,
include extremely slow parallel query startup (possibly as much as
30 seconds) and failure to update timestamps on the postmaster's sockets
and lockfiles when no new connections arrive for a long time.
We can fix this by running the postmaster's signal handlers without
SA_RESTART. That would be quite a scary change if the range of code
where signals are accepted weren't so tiny, but as it is, it seems
safe enough. (Note that postmaster children do, and must, reset all
the handlers before unblocking signals; so this change should not
affect any child process.)
There is talk of rewriting the postmaster to use a WaitEventSet and
not do signal response work in signal handlers, at which point it might
be appropriate to revert this patch. But that's not happening before
v11 at the earliest.
Back-patch to 9.6. The problem exists much further back, but the
worst symptom arises only in connection with parallel query, so it
does not seem worth taking any portability risks in older branches.
Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
2017-04-24 19:00:23 +02:00
|
|
|
* In the postmaster, we want to install non-ignored handlers *without*
|
|
|
|
* SA_RESTART. This is because they'll be blocked at all times except
|
|
|
|
* when ServerLoop is waiting for something to happen, and during that
|
2017-04-25 00:29:03 +02:00
|
|
|
* window, we want signals to exit the select(2) wait so that ServerLoop
|
Run the postmaster's signal handlers without SA_RESTART.
The postmaster keeps signals blocked everywhere except while waiting
for something to happen in ServerLoop(). The code expects that the
select(2) will be cancelled with EINTR if an interrupt occurs; without
that, followup actions that should be performed by ServerLoop() itself
will be delayed. However, some platforms interpret the SA_RESTART
signal flag as meaning that they should restart rather than cancel
the select(2). Worse yet, some of them restart it with the original
timeout delay, meaning that a steady stream of signal interrupts can
prevent ServerLoop() from iterating at all if there are no incoming
connection requests.
Observable symptoms of this, on an affected platform such as HPUX 10,
include extremely slow parallel query startup (possibly as much as
30 seconds) and failure to update timestamps on the postmaster's sockets
and lockfiles when no new connections arrive for a long time.
We can fix this by running the postmaster's signal handlers without
SA_RESTART. That would be quite a scary change if the range of code
where signals are accepted weren't so tiny, but as it is, it seems
safe enough. (Note that postmaster children do, and must, reset all
the handlers before unblocking signals; so this change should not
affect any child process.)
There is talk of rewriting the postmaster to use a WaitEventSet and
not do signal response work in signal handlers, at which point it might
be appropriate to revert this patch. But that's not happening before
v11 at the earliest.
Back-patch to 9.6. The problem exists much further back, but the
worst symptom arises only in connection with parallel query, so it
does not seem worth taking any portability risks in older branches.
Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
2017-04-24 19:00:23 +02:00
|
|
|
* can respond if anything interesting happened. On some platforms,
|
2017-04-25 00:29:03 +02:00
|
|
|
* signals marked SA_RESTART would not cause the select() wait to end.
|
2017-04-24 19:00:23 +02:00
|
|
|
* Child processes will generally want SA_RESTART, but we expect them to
|
|
|
|
* set up their own handlers before unblocking signals.
|
|
|
|
*
|
Block signals earlier during postmaster startup.
Formerly, we set up the postmaster's signal handling only when we were
about to start launching subprocesses. This is a bad idea though, as
it means that for example a SIGINT arriving before that will kill the
postmaster instantly, perhaps leaving lockfiles, socket files, shared
memory, etc. lying around. We'd rather that such a signal caused orderly
postmaster termination including releasing of those resources. A simple
fix is to move the PostmasterMain stanza that initializes signal handling
to an earlier point, before we've created any such resources. Then, an
early-arriving signal will be blocked until we're ready to deal with it
in the usual way. (The only part that really needs to be moved up is
blocking of signals, but it seems best to keep the signal handler
installation calls together with that; for one thing this ensures the
kernel won't drop any signals we wished to get. The handlers won't get
invoked in any case until we unblock signals in ServerLoop.)
Per a report from MauMau. He proposed changing the way "pg_ctl stop"
works to deal with this, but that'd just be masking one symptom not
fixing the core issue.
It's been like this since forever, so back-patch to all supported branches.
2014-04-06 00:16:08 +02:00
|
|
|
* CAUTION: when changing this list, check for side-effects on the signal
|
|
|
|
* handling setup of child processes. See tcop/postgres.c,
|
|
|
|
* bootstrap/bootstrap.c, postmaster/bgwriter.c, postmaster/walwriter.c,
|
|
|
|
* postmaster/autovacuum.c, postmaster/pgarch.c, postmaster/pgstat.c,
|
|
|
|
* postmaster/syslogger.c, postmaster/bgworker.c and
|
|
|
|
* postmaster/checkpointer.c.
|
|
|
|
*/
|
|
|
|
pqinitmask();
|
|
|
|
PG_SETMASK(&BlockSig);
|
|
|
|
|
Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
|
|
|
pqsignal_no_restart(SIGHUP, SIGHUP_handler); /* reread config file and
|
|
|
|
* have children do same */
|
2017-04-24 19:00:23 +02:00
|
|
|
pqsignal_no_restart(SIGINT, pmdie); /* send SIGTERM and shut down */
|
2017-06-21 21:18:54 +02:00
|
|
|
pqsignal_no_restart(SIGQUIT, pmdie); /* send SIGQUIT and die */
|
|
|
|
pqsignal_no_restart(SIGTERM, pmdie); /* wait for children and shut down */
|
2014-04-06 00:16:08 +02:00
|
|
|
pqsignal(SIGALRM, SIG_IGN); /* ignored */
|
|
|
|
pqsignal(SIGPIPE, SIG_IGN); /* ignored */
|
2017-06-21 21:18:54 +02:00
|
|
|
pqsignal_no_restart(SIGUSR1, sigusr1_handler); /* message from child
|
|
|
|
* process */
|
|
|
|
pqsignal_no_restart(SIGUSR2, dummy_handler); /* unused, reserve for
|
|
|
|
* children */
|
|
|
|
pqsignal_no_restart(SIGCHLD, reaper); /* handle child termination */
|
2014-04-06 00:16:08 +02:00
|
|
|
pqsignal(SIGTTIN, SIG_IGN); /* ignored */
|
|
|
|
pqsignal(SIGTTOU, SIG_IGN); /* ignored */
|
|
|
|
/* ignore SIGXFSZ, so that ulimit violations work like disk full */
|
|
|
|
#ifdef SIGXFSZ
|
|
|
|
pqsignal(SIGXFSZ, SIG_IGN); /* ignored */
|
|
|
|
#endif
|
|
|
|
|
2000-06-28 05:33:33 +02:00
|
|
|
/*
|
|
|
|
* Options setup
|
|
|
|
*/
|
2002-05-17 03:19:19 +02:00
|
|
|
InitializeGUCOptions();
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2000-05-31 02:28:42 +02:00
|
|
|
opterr = 1;
|
2001-10-19 22:47:09 +02:00
|
|
|
|
2007-01-04 01:57:51 +01:00
|
|
|
/*
|
2014-05-06 18:12:18 +02:00
|
|
|
* Parse command-line options. CAUTION: keep this in sync with
|
2007-11-15 22:14:46 +01:00
|
|
|
* tcop/postgres.c (the option sets should not conflict) and with the
|
|
|
|
* common help() function in main/main.c.
|
2007-01-04 01:57:51 +01:00
|
|
|
*/
|
2014-06-20 11:06:42 +02:00
|
|
|
while ((opt = getopt(argc, argv, "B:bc:C:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:W:-:")) != -1)
|
1997-09-07 07:04:48 +02:00
|
|
|
{
|
|
|
|
switch (opt)
|
|
|
|
{
|
1997-09-08 04:41:22 +02:00
|
|
|
case 'B':
|
2002-02-23 02:31:37 +01:00
|
|
|
SetConfigOption("shared_buffers", optarg, PGC_POSTMASTER, PGC_S_ARGV);
|
1997-09-08 04:41:22 +02:00
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
2011-04-25 18:00:21 +02:00
|
|
|
case 'b':
|
|
|
|
/* Undocumented flag used for binary upgrades */
|
|
|
|
IsBinaryUpgrade = true;
|
|
|
|
break;
|
|
|
|
|
2011-10-06 15:38:39 +02:00
|
|
|
case 'C':
|
2012-10-12 19:35:40 +02:00
|
|
|
output_config_variable = strdup(optarg);
|
2011-10-06 15:38:39 +02:00
|
|
|
break;
|
|
|
|
|
1997-09-08 04:41:22 +02:00
|
|
|
case 'D':
|
2012-10-12 19:35:40 +02:00
|
|
|
userDoption = strdup(optarg);
|
1997-09-08 04:41:22 +02:00
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
1997-09-08 04:41:22 +02:00
|
|
|
case 'd':
|
2004-11-14 20:35:35 +01:00
|
|
|
set_debug_options(atoi(optarg), PGC_POSTMASTER, PGC_S_ARGV);
|
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
|
|
|
case 'E':
|
|
|
|
SetConfigOption("log_statement", "all", PGC_POSTMASTER, PGC_S_ARGV);
|
|
|
|
break;
|
|
|
|
|
|
|
|
case 'e':
|
|
|
|
SetConfigOption("datestyle", "euro", PGC_POSTMASTER, PGC_S_ARGV);
|
|
|
|
break;
|
|
|
|
|
2000-06-02 17:57:44 +02:00
|
|
|
case 'F':
|
2002-02-23 02:31:37 +01:00
|
|
|
SetConfigOption("fsync", "false", PGC_POSTMASTER, PGC_S_ARGV);
|
2000-06-02 17:57:44 +02:00
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
|
|
|
case 'f':
|
|
|
|
if (!set_plan_disabling_options(optarg, PGC_POSTMASTER, PGC_S_ARGV))
|
|
|
|
{
|
|
|
|
write_stderr("%s: invalid argument for option -f: \"%s\"\n",
|
|
|
|
progname, optarg);
|
|
|
|
ExitPostmaster(1);
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
UUNET is looking into offering PostgreSQL as a part of a managed web
hosting product, on both shared and dedicated machines. We currently
offer Oracle and MySQL, and it would be a nice middle-ground.
However, as shipped, PostgreSQL lacks the following features we need
that MySQL has:
1. The ability to listen only on a particular IP address. Each
hosting customer has their own IP address, on which all of their
servers (http, ftp, real media, etc.) run.
2. The ability to place the Unix-domain socket in a mode 700 directory.
This allows us to automatically create an empty database, with an
empty DBA password, for new or upgrading customers without having
to interactively set a DBA password and communicate it to (or from)
the customer. This in turn cuts down our install and upgrade times.
3. The ability to connect to the Unix-domain socket from within a
change-rooted environment. We run CGI programs chrooted to the
user's home directory, which is another reason why we need to be
able to specify where the Unix-domain socket is, instead of /tmp.
4. The ability to, if run as root, open a pid file in /var/run as
root, and then setuid to the desired user. (mysqld -u can almost
do this; I had to patch it, too).
The patch below fixes problem 1-3. I plan to address #4, also, but
haven't done so yet. These diffs are big enough that they should give
the PG development team something to think about in the meantime :-)
Also, I'm about to leave for 2 weeks' vacation, so I thought I'd get
out what I have, which works (for the problems it tackles), now.
With these changes, we can set up and run PostgreSQL with scripts the
same way we can with apache or proftpd or mysql.
In summary, this patch makes the following enhancements:
1. Adds an environment variable PGUNIXSOCKET, analogous to MYSQL_UNIX_PORT,
and command line options -k --unix-socket to the relevant programs.
2. Adds a -h option to postmaster to set the hostname or IP address to
listen on instead of the default INADDR_ANY.
3. Extends some library interfaces to support the above.
4. Fixes a few memory leaks in PQconnectdb().
The default behavior is unchanged from stock 7.0.2; if you don't use
any of these new features, they don't change the operation.
David J. MacKenzie
2000-11-13 16:18:15 +01:00
|
|
|
case 'h':
|
2004-03-23 02:23:48 +01:00
|
|
|
SetConfigOption("listen_addresses", optarg, PGC_POSTMASTER, PGC_S_ARGV);
|
2000-11-13 16:18:15 +01:00
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
1997-11-10 06:10:50 +01:00
|
|
|
case 'i':
|
2004-03-23 02:23:48 +01:00
|
|
|
SetConfigOption("listen_addresses", "*", PGC_POSTMASTER, PGC_S_ARGV);
|
1999-10-08 06:28:57 +02:00
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
|
|
|
case 'j':
|
|
|
|
/* only used by interactive backend */
|
|
|
|
break;
|
|
|
|
|
2000-11-13 16:18:15 +01:00
|
|
|
case 'k':
|
2012-08-10 23:26:44 +02:00
|
|
|
SetConfigOption("unix_socket_directories", optarg, PGC_POSTMASTER, PGC_S_ARGV);
|
2000-11-13 16:18:15 +01:00
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
1999-10-08 06:28:57 +02:00
|
|
|
case 'l':
|
2002-02-23 02:31:37 +01:00
|
|
|
SetConfigOption("ssl", "true", PGC_POSTMASTER, PGC_S_ARGV);
|
1997-11-07 21:52:15 +01:00
|
|
|
break;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
1999-02-19 07:06:39 +01:00
|
|
|
case 'N':
|
2002-02-23 02:31:37 +01:00
|
|
|
SetConfigOption("max_connections", optarg, PGC_POSTMASTER, PGC_S_ARGV);
|
1999-02-19 07:06:39 +01:00
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
1997-09-08 04:41:22 +02:00
|
|
|
case 'n':
|
|
|
|
/* Don't reinit shared mem after abnormal exit */
|
1998-05-27 20:32:05 +02:00
|
|
|
Reinit = false;
|
1997-09-08 04:41:22 +02:00
|
|
|
break;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2006-01-05 11:07:46 +01:00
|
|
|
case 'O':
|
|
|
|
SetConfigOption("allow_system_table_mods", "true", PGC_POSTMASTER, PGC_S_ARGV);
|
|
|
|
break;
|
|
|
|
|
|
|
|
case 'o':
|
|
|
|
/* Other options to pass to the backend on the command line */
|
2004-07-11 01:29:16 +02:00
|
|
|
snprintf(ExtraOptions + strlen(ExtraOptions),
|
|
|
|
sizeof(ExtraOptions) - strlen(ExtraOptions),
|
|
|
|
" %s", optarg);
|
1997-09-08 04:41:22 +02:00
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
|
|
|
case 'P':
|
|
|
|
SetConfigOption("ignore_system_indexes", "true", PGC_POSTMASTER, PGC_S_ARGV);
|
|
|
|
break;
|
|
|
|
|
1997-09-08 04:41:22 +02:00
|
|
|
case 'p':
|
2002-02-23 02:31:37 +01:00
|
|
|
SetConfigOption("port", optarg, PGC_POSTMASTER, PGC_S_ARGV);
|
1997-09-08 04:41:22 +02:00
|
|
|
break;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2006-01-05 11:07:46 +01:00
|
|
|
case 'r':
|
|
|
|
/* only used by single-user backend */
|
|
|
|
break;
|
|
|
|
|
|
|
|
case 'S':
|
|
|
|
SetConfigOption("work_mem", optarg, PGC_POSTMASTER, PGC_S_ARGV);
|
1997-09-08 04:41:22 +02:00
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
1997-09-08 04:41:22 +02:00
|
|
|
case 's':
|
2007-01-04 01:57:51 +01:00
|
|
|
SetConfigOption("log_statement_stats", "true", PGC_POSTMASTER, PGC_S_ARGV);
|
2006-01-05 11:07:46 +01:00
|
|
|
break;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2006-01-05 11:07:46 +01:00
|
|
|
case 'T':
|
2006-10-04 02:30:14 +02:00
|
|
|
|
1997-09-08 04:41:22 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* In the event that some backend dumps core, send SIGSTOP,
|
|
|
|
* rather than SIGQUIT, to all its peers. This lets the wily
|
|
|
|
* post_hacker collect core dumps from everyone.
|
1997-09-08 04:41:22 +02:00
|
|
|
*/
|
1998-05-27 20:32:05 +02:00
|
|
|
SendStop = true;
|
1997-09-08 04:41:22 +02:00
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
|
|
|
|
case 't':
|
|
|
|
{
|
2006-10-04 02:30:14 +02:00
|
|
|
const char *tmp = get_stats_option_name(optarg);
|
|
|
|
|
|
|
|
if (tmp)
|
|
|
|
{
|
|
|
|
SetConfigOption(tmp, "true", PGC_POSTMASTER, PGC_S_ARGV);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
write_stderr("%s: invalid argument for option -t: \"%s\"\n",
|
|
|
|
progname, optarg);
|
|
|
|
ExitPostmaster(1);
|
|
|
|
}
|
|
|
|
break;
|
2006-01-05 11:07:46 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
case 'W':
|
|
|
|
SetConfigOption("post_auth_delay", optarg, PGC_POSTMASTER, PGC_S_ARGV);
|
|
|
|
break;
|
|
|
|
|
2000-11-08 18:57:46 +01:00
|
|
|
case 'c':
|
2000-06-02 17:57:44 +02:00
|
|
|
case '-':
|
2000-11-08 18:57:46 +01:00
|
|
|
{
|
2001-03-22 05:01:46 +01:00
|
|
|
char *name,
|
|
|
|
*value;
|
|
|
|
|
|
|
|
ParseLongOption(optarg, &name, &value);
|
|
|
|
if (!value)
|
|
|
|
{
|
|
|
|
if (opt == '-')
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_SYNTAX_ERROR),
|
|
|
|
errmsg("--%s requires a value",
|
|
|
|
optarg)));
|
2001-03-22 05:01:46 +01:00
|
|
|
else
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_SYNTAX_ERROR),
|
|
|
|
errmsg("-c %s requires a value",
|
|
|
|
optarg)));
|
2001-03-22 05:01:46 +01:00
|
|
|
}
|
|
|
|
|
2002-02-23 02:31:37 +01:00
|
|
|
SetConfigOption(name, value, PGC_POSTMASTER, PGC_S_ARGV);
|
2001-03-22 05:01:46 +01:00
|
|
|
free(name);
|
|
|
|
if (value)
|
|
|
|
free(value);
|
|
|
|
break;
|
2000-11-08 18:57:46 +01:00
|
|
|
}
|
2000-07-03 22:46:10 +02:00
|
|
|
|
1997-09-08 04:41:22 +02:00
|
|
|
default:
|
2004-07-21 22:34:50 +02:00
|
|
|
write_stderr("Try \"%s --help\" for more information.\n",
|
|
|
|
progname);
|
2000-11-29 21:59:54 +01:00
|
|
|
ExitPostmaster(1);
|
1997-09-07 07:04:48 +02:00
|
|
|
}
|
|
|
|
}
|
1999-06-04 23:14:46 +02:00
|
|
|
|
2002-02-23 02:31:37 +01:00
|
|
|
/*
|
|
|
|
* Postmaster accepts no non-option switch arguments.
|
|
|
|
*/
|
|
|
|
if (optind < argc)
|
|
|
|
{
|
2004-07-21 22:34:50 +02:00
|
|
|
write_stderr("%s: invalid argument: \"%s\"\n",
|
|
|
|
progname, argv[optind]);
|
|
|
|
write_stderr("Try \"%s --help\" for more information.\n",
|
|
|
|
progname);
|
2002-02-23 02:31:37 +01:00
|
|
|
ExitPostmaster(1);
|
|
|
|
}
|
|
|
|
|
2004-10-08 03:36:36 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* Locate the proper configuration files and data directory, and read
|
|
|
|
* postgresql.conf for the first time.
|
2004-10-08 03:36:36 +02:00
|
|
|
*/
|
|
|
|
if (!SelectConfigFiles(userDoption, progname))
|
|
|
|
ExitPostmaster(2);
|
2004-05-27 17:07:41 +02:00
|
|
|
|
2011-10-06 15:38:39 +02:00
|
|
|
if (output_config_variable != NULL)
|
|
|
|
{
|
2012-06-10 21:20:04 +02:00
|
|
|
/*
|
2016-06-16 18:17:03 +02:00
|
|
|
* "-C guc" was specified, so print GUC's value and exit. No extra
|
|
|
|
* permission check is needed because the user is reading inside the
|
|
|
|
* data dir.
|
2012-06-10 21:20:04 +02:00
|
|
|
*/
|
2016-06-16 18:17:03 +02:00
|
|
|
const char *config_val = GetConfigOption(output_config_variable,
|
|
|
|
false, false);
|
|
|
|
|
2016-06-22 17:55:18 +02:00
|
|
|
puts(config_val ? config_val : "");
|
2011-10-06 15:38:39 +02:00
|
|
|
ExitPostmaster(0);
|
|
|
|
}
|
2012-06-10 21:20:04 +02:00
|
|
|
|
2004-10-08 03:36:36 +02:00
|
|
|
/* Verify that DataDir looks reasonable */
|
|
|
|
checkDataDir();
|
2004-05-21 07:08:06 +02:00
|
|
|
|
2005-07-04 06:51:52 +02:00
|
|
|
/* And switch working directory into it */
|
|
|
|
ChangeToDataDir();
|
|
|
|
|
2002-11-21 07:36:08 +01:00
|
|
|
/*
|
|
|
|
* Check for invalid combinations of GUC settings.
|
1999-06-04 23:14:46 +02:00
|
|
|
*/
|
2012-08-10 14:49:03 +02:00
|
|
|
if (ReservedBackends >= MaxConnections)
|
2002-11-21 07:36:08 +01:00
|
|
|
{
|
2004-07-21 22:34:50 +02:00
|
|
|
write_stderr("%s: superuser_reserved_connections must be less than max_connections\n", progname);
|
2002-11-21 07:36:08 +01:00
|
|
|
ExitPostmaster(1);
|
|
|
|
}
|
2012-08-10 14:49:03 +02:00
|
|
|
if (max_wal_senders >= MaxConnections)
|
|
|
|
{
|
|
|
|
write_stderr("%s: max_wal_senders must be less than max_connections\n", progname);
|
|
|
|
ExitPostmaster(1);
|
|
|
|
}
|
2015-05-15 17:55:24 +02:00
|
|
|
if (XLogArchiveMode > ARCHIVE_MODE_OFF && wal_level == WAL_LEVEL_MINIMAL)
|
Introduce wal_level GUC to explicitly control if information needed for
archival or hot standby should be WAL-logged, instead of deducing that from
other options like archive_mode. This replaces recovery_connections GUC in
the primary, where it now has no effect, but it's still used in the standby
to enable/disable hot standby.
Remove the WAL-logging of "unlogged operations", like creating an index
without WAL-logging and fsyncing it at the end. Instead, we keep a copy of
the wal_mode setting and the settings that affect how much shared memory a
hot standby server needs to track master transactions (max_connections,
max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings
change, at server restart, write a WAL record noting the new settings and
update pg_control. This allows us to notice the change in those settings in
the standby at the right moment, they used to be included in checkpoint
records, but that meant that a changed value was not reflected in the
standby until the first checkpoint after the change.
Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to
the sequence it used to follow, before hot standby and subsequent patches
changed it to 0x9003.
2010-04-28 18:10:43 +02:00
|
|
|
ereport(ERROR,
|
2015-05-15 17:55:24 +02:00
|
|
|
(errmsg("WAL archival cannot be enabled when wal_level is \"minimal\"")));
|
Introduce wal_level GUC to explicitly control if information needed for
archival or hot standby should be WAL-logged, instead of deducing that from
other options like archive_mode. This replaces recovery_connections GUC in
the primary, where it now has no effect, but it's still used in the standby
to enable/disable hot standby.
Remove the WAL-logging of "unlogged operations", like creating an index
without WAL-logging and fsyncing it at the end. Instead, we keep a copy of
the wal_mode setting and the settings that affect how much shared memory a
hot standby server needs to track master transactions (max_connections,
max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings
change, at server restart, write a WAL record noting the new settings and
update pg_control. This allows us to notice the change in those settings in
the standby at the right moment, they used to be included in checkpoint
records, but that meant that a changed value was not reflected in the
standby until the first checkpoint after the change.
Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to
the sequence it used to follow, before hot standby and subsequent patches
changed it to 0x9003.
2010-04-28 18:10:43 +02:00
|
|
|
if (max_wal_senders > 0 && wal_level == WAL_LEVEL_MINIMAL)
|
|
|
|
ereport(ERROR,
|
2016-03-01 02:01:54 +01:00
|
|
|
(errmsg("WAL streaming (max_wal_senders > 0) requires wal_level \"replica\" or \"logical\"")));
|
2002-08-29 23:02:12 +02:00
|
|
|
|
2003-01-16 01:26:49 +01:00
|
|
|
/*
|
2005-10-20 22:05:45 +02:00
|
|
|
* Other one-time internal sanity checks can go here, if they are fast.
|
|
|
|
* (Put any slow processing further down, after postmaster.pid creation.)
|
2003-01-16 01:26:49 +01:00
|
|
|
*/
|
|
|
|
if (!CheckDateTokenTables())
|
|
|
|
{
|
2004-07-21 22:34:50 +02:00
|
|
|
write_stderr("%s: invalid datetoken tables, please fix\n", progname);
|
2003-01-16 01:26:49 +01:00
|
|
|
ExitPostmaster(1);
|
|
|
|
}
|
|
|
|
|
2001-10-19 22:47:09 +02:00
|
|
|
/*
|
2001-10-25 07:50:21 +02:00
|
|
|
* Now that we are done processing the postmaster arguments, reset
|
|
|
|
* getopt(3) library so that it will work correctly in subprocesses.
|
2001-10-19 22:47:09 +02:00
|
|
|
*/
|
|
|
|
optind = 1;
|
2010-12-16 22:22:05 +01:00
|
|
|
#ifdef HAVE_INT_OPTRESET
|
2001-10-21 05:25:36 +02:00
|
|
|
optreset = 1; /* some systems need this too */
|
2001-10-19 22:47:09 +02:00
|
|
|
#endif
|
|
|
|
|
|
|
|
/* For debugging: display postmaster environment */
|
2000-11-12 21:51:52 +01:00
|
|
|
{
|
|
|
|
extern char **environ;
|
|
|
|
char **p;
|
|
|
|
|
2003-08-12 20:23:21 +02:00
|
|
|
ereport(DEBUG3,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
(errmsg_internal("%s: PostmasterMain: initial environment dump:",
|
|
|
|
progname)));
|
2003-08-12 20:23:21 +02:00
|
|
|
ereport(DEBUG3,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
(errmsg_internal("-----------------------------------------")));
|
2000-11-12 21:51:52 +01:00
|
|
|
for (p = environ; *p; ++p)
|
2003-08-12 20:23:21 +02:00
|
|
|
ereport(DEBUG3,
|
|
|
|
(errmsg_internal("\t%s", *p)));
|
|
|
|
ereport(DEBUG3,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
(errmsg_internal("-----------------------------------------")));
|
2000-11-12 21:51:52 +01:00
|
|
|
}
|
|
|
|
|
2000-11-29 21:59:54 +01:00
|
|
|
/*
|
|
|
|
* Create lockfile for data directory.
|
|
|
|
*
|
2005-10-15 04:49:52 +02:00
|
|
|
* We want to do this before we try to grab the input sockets, because the
|
|
|
|
* data directory interlock is more reliable than the socket-file
|
|
|
|
* interlock (thanks to whoever decided to put socket files in /tmp :-().
|
|
|
|
* For the same reason, it's best to grab the TCP socket(s) before the
|
2012-08-10 23:36:54 +02:00
|
|
|
* Unix socket(s).
|
2015-08-02 20:54:44 +02:00
|
|
|
*
|
|
|
|
* Also note that this internally sets up the on_proc_exit function that
|
|
|
|
* is responsible for removing both data directory and socket lockfiles;
|
|
|
|
* so it must happen before opening sockets so that at exit, the socket
|
|
|
|
* lockfiles go away after CloseServerPorts runs.
|
2000-11-29 21:59:54 +01:00
|
|
|
*/
|
2005-07-04 06:51:52 +02:00
|
|
|
CreateDataDirLockFile(true);
|
2000-11-29 21:59:54 +01:00
|
|
|
|
2017-09-13 11:12:17 +02:00
|
|
|
/* read control file (error checking and contains config) */
|
|
|
|
LocalProcessControlFile();
|
|
|
|
|
2005-10-20 22:05:45 +02:00
|
|
|
/*
|
|
|
|
* Initialize SSL library, if specified.
|
|
|
|
*/
|
|
|
|
#ifdef USE_SSL
|
|
|
|
if (EnableSSL)
|
2017-01-03 03:37:12 +01:00
|
|
|
{
|
|
|
|
(void) secure_initialize(true);
|
|
|
|
LoadedSSL = true;
|
|
|
|
}
|
2005-10-20 22:05:45 +02:00
|
|
|
#endif
|
|
|
|
|
2017-01-19 18:00:00 +01:00
|
|
|
/*
|
|
|
|
* Register the apply launcher. Since it registers a background worker,
|
|
|
|
* it needs to be called before InitializeMaxBackends(), and it's probably
|
|
|
|
* a good idea to call it before any modules had chance to take the
|
|
|
|
* background worker slots.
|
|
|
|
*/
|
|
|
|
ApplyLauncherRegister();
|
|
|
|
|
2005-10-20 22:05:45 +02:00
|
|
|
/*
|
2006-08-08 21:15:09 +02:00
|
|
|
* process any libraries that should be preloaded at postmaster start
|
2005-10-20 22:05:45 +02:00
|
|
|
*/
|
2006-08-15 20:26:59 +02:00
|
|
|
process_shared_preload_libraries();
|
2005-10-20 22:05:45 +02:00
|
|
|
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
/*
|
2013-01-02 16:01:14 +01:00
|
|
|
* Now that loadable modules have had their chance to register background
|
2013-01-02 18:39:11 +01:00
|
|
|
* workers, calculate MaxBackends.
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
*/
|
2013-01-02 18:39:11 +01:00
|
|
|
InitializeMaxBackends();
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
|
2000-11-12 21:51:52 +01:00
|
|
|
/*
|
|
|
|
* Establish input sockets.
|
2015-08-02 20:54:44 +02:00
|
|
|
*
|
|
|
|
* First, mark them all closed, and set up an on_proc_exit function that's
|
|
|
|
* charged with closing the sockets again at postmaster shutdown.
|
2000-11-12 21:51:52 +01:00
|
|
|
*/
|
2003-06-12 09:36:51 +02:00
|
|
|
for (i = 0; i < MAXLISTEN; i++)
|
2010-01-10 15:16:08 +01:00
|
|
|
ListenSocket[i] = PGINVALID_SOCKET;
|
2003-07-24 01:30:41 +02:00
|
|
|
|
2015-08-02 20:54:44 +02:00
|
|
|
on_proc_exit(CloseServerPorts, 0);
|
|
|
|
|
2004-03-23 02:23:48 +01:00
|
|
|
if (ListenAddresses)
|
1997-11-10 06:10:50 +01:00
|
|
|
{
|
2004-08-29 07:07:03 +02:00
|
|
|
char *rawstring;
|
|
|
|
List *elemlist;
|
|
|
|
ListCell *l;
|
2005-06-30 12:02:22 +02:00
|
|
|
int success = 0;
|
2004-03-23 02:23:48 +01:00
|
|
|
|
2004-08-08 22:17:36 +02:00
|
|
|
/* Need a modifiable copy of ListenAddresses */
|
|
|
|
rawstring = pstrdup(ListenAddresses);
|
|
|
|
|
2012-08-10 23:26:44 +02:00
|
|
|
/* Parse string into list of hostnames */
|
2004-08-29 07:07:03 +02:00
|
|
|
if (!SplitIdentifierString(rawstring, ',', &elemlist))
|
2000-04-12 19:17:23 +02:00
|
|
|
{
|
2004-08-08 22:17:36 +02:00
|
|
|
/* syntax error in list */
|
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
|
2013-08-29 18:33:50 +02:00
|
|
|
errmsg("invalid list syntax in parameter \"%s\"",
|
|
|
|
"listen_addresses")));
|
2004-08-08 22:17:36 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
foreach(l, elemlist)
|
|
|
|
{
|
2004-08-29 07:07:03 +02:00
|
|
|
char *curhost = (char *) lfirst(l);
|
2004-08-08 22:17:36 +02:00
|
|
|
|
2004-05-27 17:07:41 +02:00
|
|
|
if (strcmp(curhost, "*") == 0)
|
2004-03-23 02:23:48 +01:00
|
|
|
status = StreamServerPort(AF_UNSPEC, NULL,
|
|
|
|
(unsigned short) PostPortNumber,
|
2012-08-10 23:26:44 +02:00
|
|
|
NULL,
|
2004-03-23 02:23:48 +01:00
|
|
|
ListenSocket, MAXLISTEN);
|
|
|
|
else
|
2003-07-24 01:30:41 +02:00
|
|
|
status = StreamServerPort(AF_UNSPEC, curhost,
|
|
|
|
(unsigned short) PostPortNumber,
|
2012-08-10 23:26:44 +02:00
|
|
|
NULL,
|
2003-07-24 01:30:41 +02:00
|
|
|
ListenSocket, MAXLISTEN);
|
2010-12-31 23:24:26 +01:00
|
|
|
|
2005-06-30 12:02:22 +02:00
|
|
|
if (status == STATUS_OK)
|
2011-01-14 01:01:28 +01:00
|
|
|
{
|
2005-06-30 12:02:22 +02:00
|
|
|
success++;
|
2011-01-14 01:01:28 +01:00
|
|
|
/* record the first successful host addr in lockfile */
|
|
|
|
if (!listen_addr_saved)
|
|
|
|
{
|
|
|
|
AddToDataDirLockFile(LOCK_FILE_LINE_LISTEN_ADDR, curhost);
|
|
|
|
listen_addr_saved = true;
|
|
|
|
}
|
|
|
|
}
|
2005-06-30 12:02:22 +02:00
|
|
|
else
|
2004-03-23 02:23:48 +01:00
|
|
|
ereport(WARNING,
|
2005-10-15 04:49:52 +02:00
|
|
|
(errmsg("could not create listen socket for \"%s\"",
|
|
|
|
curhost)));
|
2000-04-12 19:17:23 +02:00
|
|
|
}
|
2004-08-08 22:17:36 +02:00
|
|
|
|
2012-08-10 23:26:44 +02:00
|
|
|
if (!success && elemlist != NIL)
|
2005-06-30 12:02:22 +02:00
|
|
|
ereport(FATAL,
|
|
|
|
(errmsg("could not create any TCP/IP sockets")));
|
|
|
|
|
2004-08-08 22:17:36 +02:00
|
|
|
list_free(elemlist);
|
|
|
|
pfree(rawstring);
|
2004-03-23 02:23:48 +01:00
|
|
|
}
|
2003-07-24 01:30:41 +02:00
|
|
|
|
2005-05-15 02:26:19 +02:00
|
|
|
#ifdef USE_BONJOUR
|
|
|
|
/* Register for Bonjour only if we opened TCP socket(s) */
|
2010-01-10 15:16:08 +01:00
|
|
|
if (enable_bonjour && ListenSocket[0] != PGINVALID_SOCKET)
|
2004-03-23 02:23:48 +01:00
|
|
|
{
|
2009-09-08 18:08:26 +02:00
|
|
|
DNSServiceErrorType err;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We pass 0 for interface_index, which will result in registering on
|
|
|
|
* all "applicable" interfaces. It's not entirely clear from the
|
|
|
|
* DNS-SD docs whether this would be appropriate if we have bound to
|
|
|
|
* just a subset of the available network interfaces.
|
|
|
|
*/
|
|
|
|
err = DNSServiceRegister(&bonjour_sdref,
|
|
|
|
0,
|
|
|
|
0,
|
|
|
|
bonjour_name,
|
|
|
|
"_postgresql._tcp.",
|
|
|
|
NULL,
|
|
|
|
NULL,
|
|
|
|
htons(PostPortNumber),
|
|
|
|
0,
|
|
|
|
NULL,
|
|
|
|
NULL,
|
|
|
|
NULL);
|
|
|
|
if (err != kDNSServiceErr_NoError)
|
|
|
|
elog(LOG, "DNSServiceRegister() failed: error code %ld",
|
|
|
|
(long) err);
|
2010-02-26 03:01:40 +01:00
|
|
|
|
2009-09-08 18:08:26 +02:00
|
|
|
/*
|
2010-02-26 03:01:40 +01:00
|
|
|
* We don't bother to read the mDNS daemon's reply, and we expect that
|
|
|
|
* it will automatically terminate our registration when the socket is
|
|
|
|
* closed at postmaster termination. So there's nothing more to be
|
|
|
|
* done here. However, the bonjour_sdref is kept around so that
|
|
|
|
* forked children can close their copies of the socket.
|
2009-09-08 18:08:26 +02:00
|
|
|
*/
|
1997-11-10 06:10:50 +01:00
|
|
|
}
|
2004-03-23 02:23:48 +01:00
|
|
|
#endif
|
1999-09-27 05:13:16 +02:00
|
|
|
|
2000-08-20 12:55:35 +02:00
|
|
|
#ifdef HAVE_UNIX_SOCKETS
|
2012-08-10 23:26:44 +02:00
|
|
|
if (Unix_socket_directories)
|
|
|
|
{
|
|
|
|
char *rawstring;
|
|
|
|
List *elemlist;
|
|
|
|
ListCell *l;
|
|
|
|
int success = 0;
|
|
|
|
|
|
|
|
/* Need a modifiable copy of Unix_socket_directories */
|
|
|
|
rawstring = pstrdup(Unix_socket_directories);
|
|
|
|
|
|
|
|
/* Parse string into list of directories */
|
|
|
|
if (!SplitDirectoriesString(rawstring, ',', &elemlist))
|
|
|
|
{
|
|
|
|
/* syntax error in list */
|
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
|
2013-08-29 18:33:50 +02:00
|
|
|
errmsg("invalid list syntax in parameter \"%s\"",
|
|
|
|
"unix_socket_directories")));
|
2012-08-10 23:26:44 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
foreach(l, elemlist)
|
|
|
|
{
|
|
|
|
char *socketdir = (char *) lfirst(l);
|
|
|
|
|
|
|
|
status = StreamServerPort(AF_UNIX, NULL,
|
|
|
|
(unsigned short) PostPortNumber,
|
|
|
|
socketdir,
|
|
|
|
ListenSocket, MAXLISTEN);
|
|
|
|
|
|
|
|
if (status == STATUS_OK)
|
|
|
|
{
|
|
|
|
success++;
|
|
|
|
/* record the first successful Unix socket in lockfile */
|
|
|
|
if (success == 1)
|
|
|
|
AddToDataDirLockFile(LOCK_FILE_LINE_SOCKET_DIR, socketdir);
|
|
|
|
}
|
|
|
|
else
|
|
|
|
ereport(WARNING,
|
|
|
|
(errmsg("could not create Unix-domain socket in directory \"%s\"",
|
|
|
|
socketdir)));
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!success && elemlist != NIL)
|
|
|
|
ereport(FATAL,
|
|
|
|
(errmsg("could not create any Unix-domain sockets")));
|
|
|
|
|
|
|
|
list_free_deep(elemlist);
|
|
|
|
pfree(rawstring);
|
|
|
|
}
|
1999-01-17 07:20:06 +01:00
|
|
|
#endif
|
2000-11-12 21:51:52 +01:00
|
|
|
|
2004-03-23 02:23:48 +01:00
|
|
|
/*
|
|
|
|
* check that we have some socket to listen on
|
|
|
|
*/
|
2010-01-10 15:16:08 +01:00
|
|
|
if (ListenSocket[0] == PGINVALID_SOCKET)
|
2004-03-23 02:23:48 +01:00
|
|
|
ereport(FATAL,
|
2004-03-24 16:20:54 +01:00
|
|
|
(errmsg("no socket created for listening")));
|
2004-03-23 02:23:48 +01:00
|
|
|
|
2011-01-14 01:01:28 +01:00
|
|
|
/*
|
|
|
|
* If no valid TCP ports, write an empty line for listen address,
|
|
|
|
* indicating the Unix socket must be used. Note that this line is not
|
|
|
|
* added to the lock file until there is a socket backing it.
|
|
|
|
*/
|
|
|
|
if (!listen_addr_saved)
|
|
|
|
AddToDataDirLockFile(LOCK_FILE_LINE_LISTEN_ADDR, "");
|
|
|
|
|
2000-11-29 21:59:54 +01:00
|
|
|
/*
|
|
|
|
* Set up shared memory and semaphores.
|
|
|
|
*/
|
2000-11-14 02:15:06 +01:00
|
|
|
reset_shared(PostPortNumber);
|
1997-09-07 07:04:48 +02:00
|
|
|
|
2004-02-23 21:45:59 +01:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* Estimate number of openable files. This must happen after setting up
|
|
|
|
* semaphores, because on some platforms semaphores count as open files.
|
2004-02-23 21:45:59 +01:00
|
|
|
*/
|
|
|
|
set_max_safe_fds();
|
|
|
|
|
Do stack-depth checking in all postmaster children.
We used to only initialize the stack base pointer when starting up a regular
backend, not in other processes. In particular, autovacuum workers can run
arbitrary user code, and without stack-depth checking, infinite recursion
in e.g an index expression will bring down the whole cluster.
The comment about PL/Java using set_stack_base() is not yet true. As the
code stands, PL/java still modifies the stack_base_ptr variable directly.
However, it's been discussed in the PL/Java mailing list that it should be
changed to use the function, because PL/Java is currently oblivious to the
register stack used on Itanium. There's another issues with PL/Java, namely
that the stack base pointer it sets is not really the base of the stack, it
could be something close to the bottom of the stack. That's a separate issue
that might need some further changes to this code, but that's a different
story.
Backpatch to all supported releases.
2012-04-08 17:28:12 +02:00
|
|
|
/*
|
|
|
|
* Set reference point for stack-depth checking.
|
|
|
|
*/
|
|
|
|
set_stack_base();
|
|
|
|
|
Introduce a pipe between postmaster and each backend, which can be used to
detect postmaster death. Postmaster keeps the write-end of the pipe open,
so when it dies, children get EOF in the read-end. That can conveniently
be waited for in select(), which allows eliminating some of the polling
loops that check for postmaster death. This patch doesn't yet change all
the loops to use the new mechanism, expect a follow-on patch to do that.
This changes the interface to WaitLatch, so that it takes as argument a
bitmask of events that it waits for. Possible events are latch set, timeout,
postmaster death, and socket becoming readable or writeable.
The pipe method behaves slightly differently from the kill() method
previously used in PostmasterIsAlive() in the case that postmaster has died,
but its parent has not yet read its exit code with waitpid(). The pipe
returns EOF as soon as the process dies, but kill() continues to return
true until waitpid() has been called (IOW while the process is a zombie).
Because of that, change PostmasterIsAlive() to use the pipe too, otherwise
WaitLatch() would return immediately with WL_POSTMASTER_DEATH, while
PostmasterIsAlive() would claim it's still alive. That could easily lead to
busy-waiting while postmaster is in zombie state.
Peter Geoghegan with further changes by me, reviewed by Fujii Masao and
Florian Pflug.
2011-07-08 17:27:49 +02:00
|
|
|
/*
|
|
|
|
* Initialize pipe (or process handle on Windows) that allows children to
|
|
|
|
* wake up from sleep on postmaster death.
|
|
|
|
*/
|
|
|
|
InitPostmasterDeathWatchHandle();
|
2004-08-29 07:07:03 +02:00
|
|
|
|
Introduce a pipe between postmaster and each backend, which can be used to
detect postmaster death. Postmaster keeps the write-end of the pipe open,
so when it dies, children get EOF in the read-end. That can conveniently
be waited for in select(), which allows eliminating some of the polling
loops that check for postmaster death. This patch doesn't yet change all
the loops to use the new mechanism, expect a follow-on patch to do that.
This changes the interface to WaitLatch, so that it takes as argument a
bitmask of events that it waits for. Possible events are latch set, timeout,
postmaster death, and socket becoming readable or writeable.
The pipe method behaves slightly differently from the kill() method
previously used in PostmasterIsAlive() in the case that postmaster has died,
but its parent has not yet read its exit code with waitpid(). The pipe
returns EOF as soon as the process dies, but kill() continues to return
true until waitpid() has been called (IOW while the process is a zombie).
Because of that, change PostmasterIsAlive() to use the pipe too, otherwise
WaitLatch() would return immediately with WL_POSTMASTER_DEATH, while
PostmasterIsAlive() would claim it's still alive. That could easily lead to
busy-waiting while postmaster is in zombie state.
Peter Geoghegan with further changes by me, reviewed by Fujii Masao and
Florian Pflug.
2011-07-08 17:27:49 +02:00
|
|
|
#ifdef WIN32

	/*
	 * Initialize I/O completion port used to deliver list of dead children.
	 */
	win32ChildQueue = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);
	if (win32ChildQueue == NULL)
		ereport(FATAL,
				(errmsg("could not create I/O completion port for child queue")));
#endif

	/*
	 * Record postmaster options.  We delay this till now to avoid recording
	 * bogus options (eg, NBuffers too high for available memory).
	 */
	if (!CreateOptsFile(argc, argv, my_exec_path))
		ExitPostmaster(1);

#ifdef EXEC_BACKEND
	/* Write out nondefault GUC settings for child processes to use */
	write_nondefault_variables(PGC_POSTMASTER);
#endif

	/*
	 * Write the external PID file if requested
	 */
	if (external_pid_file)
	{
		FILE	   *fpidfile = fopen(external_pid_file, "w");

		if (fpidfile)
		{
			fprintf(fpidfile, "%d\n", MyProcPid);
			fclose(fpidfile);

			/* Make PID file world readable */
			if (chmod(external_pid_file, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH) != 0)
				write_stderr("%s: could not change permissions of external PID file \"%s\": %s\n",
							 progname, external_pid_file, strerror(errno));
		}
		else
			write_stderr("%s: could not write external PID file \"%s\": %s\n",
						 progname, external_pid_file, strerror(errno));

		on_proc_exit(unlink_external_pid_file, 0);
	}

	/*
	 * Remove old temporary files.  At this point there can be no other
	 * Postgres processes running in this directory, so this should be safe.
	 */
	RemovePgTempFiles();

	/*
	 * Forcibly remove the files signaling a standby promotion request.
	 * Otherwise, the existence of those files triggers a promotion too
	 * early, whether a user wants that or not.
	 *
	 * This removal of files is usually unnecessary because they can exist
	 * only for a few moments during a standby promotion.  However, there is
	 * a race condition: if pg_ctl promote is executed and creates the files
	 * during a promotion, the files can stay around even after the server
	 * is brought up to be the new master.  Then, if a new standby starts by
	 * using the backup taken from that master, the files can exist at
	 * server startup and must be removed in order to avoid an unexpected
	 * promotion.
	 *
	 * Note that promotion signal files need to be removed before the
	 * startup process is invoked, because after that they can be used by
	 * postmaster's SIGUSR1 signal handler.
	 */
	RemovePromoteSignalFiles();

	/* Remove any outdated file holding the current log filenames. */
	if (unlink(LOG_METAINFO_DATAFILE) < 0 && errno != ENOENT)
		ereport(LOG,
				(errcode_for_file_access(),
				 errmsg("could not remove file \"%s\": %m",
						LOG_METAINFO_DATAFILE)));

	/*
	 * If enabled, start up syslogger collection subprocess
	 */
	SysLoggerPID = SysLogger_Start();

	/*
	 * Reset whereToSendOutput from DestDebug (its starting state) to
	 * DestNone.  This stops ereport from sending log messages to stderr
	 * unless Log_destination permits.  We don't do this until the postmaster
	 * is fully launched, since startup failures may as well be reported to
	 * stderr.
	 *
	 * If we are in fact disabling logging to stderr, first emit a log
	 * message saying so, to provide a breadcrumb trail for users who may
	 * not remember that their logging is configured to go somewhere else.
	 */
	if (!(Log_destination & LOG_DESTINATION_STDERR))
		ereport(LOG,
				(errmsg("ending log output to stderr"),
				 errhint("Future log output will go to log destination \"%s\".",
						 Log_destination_string)));

	whereToSendOutput = DestNone;

	/*
	 * Initialize stats collection subsystem (this does NOT start the
	 * collector process!)
	 */
	pgstat_init();

	/*
	 * Initialize the autovacuum subsystem (again, no process start yet)
	 */
	autovac_init();

	/*
	 * Load configuration files for client authentication.
	 */
	if (!load_hba())
	{
		/*
		 * It makes no sense to continue if we fail to load the HBA file,
		 * since there is no way to connect to the database in this case.
		 */
		ereport(FATAL,
				(errmsg("could not load pg_hba.conf")));
	}

	if (!load_ident())
	{
		/*
		 * We can start up without the IDENT file, although it means that
		 * you cannot log in using any of the authentication methods that
		 * need a user name mapping.  load_ident() already logged the
		 * details of the error to the log.
		 */
	}

#ifdef HAVE_PTHREAD_IS_THREADED_NP

	/*
	 * On macOS, libintl replaces setlocale() with a version that calls
	 * CFLocaleCopyCurrent() when its second argument is "" and every
	 * relevant environment variable is unset or empty.
	 * CFLocaleCopyCurrent() makes the process multithreaded.  The
	 * postmaster calls sigprocmask() and calls fork() without an immediate
	 * exec(), both of which have undefined behavior in a multithreaded
	 * program.  A multithreaded postmaster is the normal case on Windows,
	 * which offers neither fork() nor sigprocmask().
	 */
	if (pthread_is_threaded_np() != 0)
		ereport(FATAL,
				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
				 errmsg("postmaster became multithreaded during startup"),
				 errhint("Set the LC_ALL environment variable to a valid locale.")));
#endif

	/*
	 * Remember postmaster startup time
	 */
	PgStartTime = GetCurrentTimestamp();

#ifndef HAVE_STRONG_RANDOM
	/* RandomCancelKey wants its own copy */
	gettimeofday(&random_start_time, NULL);
#endif

	/*
	 * Report postmaster status in the postmaster.pid file, to allow pg_ctl
	 * to see what's happening.
	 */
	AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STARTING);

	/*
	 * We're ready to rock and roll...
	 */
	StartupPID = StartupDataBase();
	Assert(StartupPID != 0);
	StartupStatus = STARTUP_RUNNING;
	pmState = PM_STARTUP;

	/* Some workers may be scheduled to start now */
	maybe_start_bgworkers();

	status = ServerLoop();

	/*
	 * ServerLoop probably shouldn't ever return, but if it does, close down.
	 */
	ExitPostmaster(status != STATUS_OK);

	abort();					/* not reached */
}

/*
 * on_proc_exit callback to close server's listen sockets
 */
static void
CloseServerPorts(int status, Datum arg)
{
	int			i;

	/*
	 * First, explicitly close all the socket FDs.  We used to just let this
	 * happen implicitly at postmaster exit, but it's better to close them
	 * before we remove the postmaster.pid lockfile; otherwise there's a
	 * race condition if a new postmaster wants to re-use the TCP port
	 * number.
	 */
	for (i = 0; i < MAXLISTEN; i++)
	{
		if (ListenSocket[i] != PGINVALID_SOCKET)
		{
			StreamClose(ListenSocket[i]);
			ListenSocket[i] = PGINVALID_SOCKET;
		}
	}

	/*
	 * Next, remove any filesystem entries for Unix sockets.  To avoid race
	 * conditions against incoming postmasters, this must happen after
	 * closing the sockets and before removing lock files.
	 */
	RemoveSocketFiles();

	/*
	 * We don't do anything about socket lock files here; those will be
	 * removed in a later on_proc_exit callback.
	 */
}

/*
 * on_proc_exit callback to delete external_pid_file
 */
static void
unlink_external_pid_file(int status, Datum arg)
{
	if (external_pid_file)
		unlink(external_pid_file);
}

/*
 * Compute and check the directory paths to files that are part of the
 * installation (as deduced from the postgres executable's own location)
 */
static void
getInstallationPaths(const char *argv0)
{
	DIR		   *pdir;

	/* Locate the postgres executable itself */
	if (find_my_exec(argv0, my_exec_path) < 0)
		elog(FATAL, "%s: could not locate my own executable path", argv0);

#ifdef EXEC_BACKEND
	/* Locate executable backend before we change working directory */
	if (find_other_exec(argv0, "postgres", PG_BACKEND_VERSIONSTR,
						postgres_exec_path) < 0)
		ereport(FATAL,
				(errmsg("%s: could not locate matching postgres executable",
						argv0)));
#endif

	/*
	 * Locate the pkglib directory --- this has to be set early in case we
	 * try to load any modules from it in response to postgresql.conf
	 * entries.
	 */
	get_pkglib_path(my_exec_path, pkglib_path);

	/*
	 * Verify that there's a readable directory there; otherwise the
	 * Postgres installation is incomplete or corrupt.  (A typical cause of
	 * this failure is that the postgres executable has been moved or
	 * hardlinked to some directory that's not a sibling of the installation
	 * lib/ directory.)
	 */
	pdir = AllocateDir(pkglib_path);
	if (pdir == NULL)
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("could not open directory \"%s\": %m",
						pkglib_path),
				 errhint("This may indicate an incomplete PostgreSQL installation, or that the file \"%s\" has been moved away from its proper location.",
						 my_exec_path)));
	FreeDir(pdir);

	/*
	 * XXX is it worth similarly checking the share/ directory?  If the lib/
	 * directory is there, then share/ probably is too.
	 */
}

/*
 * Validate the proposed data directory
 */
static void
checkDataDir(void)
{
	char		path[MAXPGPATH];
	FILE	   *fp;
	struct stat stat_buf;

	Assert(DataDir);

	if (stat(DataDir, &stat_buf) != 0)
	{
		if (errno == ENOENT)
			ereport(FATAL,
					(errcode_for_file_access(),
					 errmsg("data directory \"%s\" does not exist",
							DataDir)));
		else
			ereport(FATAL,
					(errcode_for_file_access(),
					 errmsg("could not read permissions of directory \"%s\": %m",
							DataDir)));
	}

	/* eventual chdir would fail anyway, but let's test ... */
	if (!S_ISDIR(stat_buf.st_mode))
		ereport(FATAL,
				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
				 errmsg("specified data directory \"%s\" is not a directory",
						DataDir)));

	/*
	 * Check that the directory belongs to my userid; if not, reject.
	 *
	 * This check is an essential part of the interlock that prevents two
	 * postmasters from starting in the same directory (see
	 * CreateLockFile()).  Do not remove or weaken it.
	 *
	 * XXX can we safely enable this check on Windows?
	 */
#if !defined(WIN32) && !defined(__CYGWIN__)
	if (stat_buf.st_uid != geteuid())
		ereport(FATAL,
				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
|
|
|
|
errmsg("data directory \"%s\" has wrong ownership",
|
|
|
|
DataDir),
|
|
|
|
errhint("The server must be started by the user that owns the data directory.")));
|
|
|
|
#endif
|
|
|
|
|
2004-05-28 07:13:32 +02:00
|
|
|
/*
|
|
|
|
* Check if the directory has group or world access. If so, reject.
|
|
|
|
*
|
2005-11-22 19:17:34 +01:00
|
|
|
* It would be possible to allow weaker constraints (for example, allow
|
|
|
|
* group access) but we cannot make a general assumption that that is
|
|
|
|
* okay; for example there are platforms where nearly all users
|
|
|
|
* customarily belong to the same group. Perhaps this test should be
|
|
|
|
* configurable.
|
2005-03-18 04:48:49 +01:00
|
|
|
*
|
2005-11-22 19:17:34 +01:00
|
|
|
* XXX temporarily suppress check when on Windows, because there may not
|
|
|
|
* be proper support for Unix-y file permissions. Need to think of a
|
2004-05-28 07:13:32 +02:00
|
|
|
* reasonable check to apply on Windows.
|
|
|
|
*/
|
2004-09-09 02:59:49 +02:00
|
|
|
#if !defined(WIN32) && !defined(__CYGWIN__)
|
2004-05-28 07:13:32 +02:00
|
|
|
if (stat_buf.st_mode & (S_IRWXG | S_IRWXO))
|
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
|
|
|
|
errmsg("data directory \"%s\" has group or world access",
|
2004-10-08 03:36:36 +02:00
|
|
|
DataDir),
|
2004-05-28 07:13:32 +02:00
|
|
|
errdetail("Permissions should be u=rwx (0700).")));
|
|
|
|
#endif
|
|
|
|
|
|
|
|
/* Look for PG_VERSION before looking for pg_control */
|
2004-10-08 03:36:36 +02:00
|
|
|
ValidatePgVersion(DataDir);
|
2004-05-28 07:13:32 +02:00
|
|
|
|
2004-10-08 03:36:36 +02:00
|
|
|
snprintf(path, sizeof(path), "%s/global/pg_control", DataDir);
|
2004-05-28 07:13:32 +02:00
|
|
|
|
|
|
|
fp = AllocateFile(path, PG_BINARY_R);
|
|
|
|
if (fp == NULL)
|
|
|
|
{
|
2004-07-21 22:34:50 +02:00
|
|
|
write_stderr("%s: could not find the database system\n"
|
|
|
|
"Expected to find it in the directory \"%s\",\n"
|
|
|
|
"but could not open file \"%s\": %s\n",
|
2004-10-08 03:36:36 +02:00
|
|
|
progname, DataDir, path, strerror(errno));
|
2004-05-28 07:13:32 +02:00
|
|
|
ExitPostmaster(2);
|
|
|
|
}
|
|
|
|
FreeFile(fp);
|
|
|
|
}
|
|
|
|
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
/*
|
|
|
|
* Determine how long should we let ServerLoop sleep.
|
|
|
|
*
|
|
|
|
* In normal conditions we wait at most one minute, to ensure that the other
|
|
|
|
* background tasks handled by ServerLoop get done even when no requests are
|
|
|
|
* arriving. However, if there are background workers waiting to be started,
|
2015-06-19 20:23:39 +02:00
|
|
|
* we don't actually sleep so that they are quickly serviced. Other exception
|
|
|
|
* cases are as shown in the code.
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
*/
|
|
|
|
static void
|
2017-06-21 20:39:04 +02:00
|
|
|
DetermineSleepTime(struct timeval *timeout)
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
{
|
|
|
|
TimestampTz next_wakeup = 0;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Normal case: either there are no background workers at all, or we're in
|
|
|
|
* a shutdown sequence (during which we ignore bgworkers altogether).
|
|
|
|
*/
|
|
|
|
if (Shutdown > NoShutdown ||
|
|
|
|
(!StartWorkerNeeded && !HaveCrashedWorker))
|
|
|
|
{
|
2015-06-19 20:23:39 +02:00
|
|
|
if (AbortStartTime != 0)
|
Send SIGKILL to children if they don't die quickly in immediate shutdown
On immediate shutdown, or during a restart-after-crash sequence,
postmaster used to send SIGQUIT (and then abandon ship if shutdown); but
this is not a good strategy if backends don't die because of that
signal. (This might happen, for example, if a backend gets tangled
trying to malloc() due to gettext(), as in an example illustrated by
MauMau.) This causes problems when later trying to restart the server,
because some processes are still attached to the shared memory segment.
Instead of just abandoning such backends to their fates, we now have
postmaster hang around for a little while longer, send a SIGKILL after
some reasonable waiting period, and then exit. This makes immediate
shutdown more reliable.
There is disagreement on whether it's best for postmaster to exit after
sending SIGKILL, or to stick around until all children have reported
death. If this controversy is resolved differently than what this patch
implements, it's an easy change to make.
Bug reported by MauMau in message 20DAEA8949EC4E2289C6E8E58560DEC0@maumau
MauMau and Álvaro Herrera
2013-06-28 23:20:53 +02:00
|
|
|
{
|
2013-10-06 04:24:50 +02:00
|
|
|
/* time left to abort; clamp to 0 in case it already expired */
|
2015-06-19 20:23:39 +02:00
|
|
|
timeout->tv_sec = SIGKILL_CHILDREN_AFTER_SECS -
|
|
|
|
(time(NULL) - AbortStartTime);
|
|
|
|
timeout->tv_sec = Max(timeout->tv_sec, 0);
|
Send SIGKILL to children if they don't die quickly in immediate shutdown
On immediate shutdown, or during a restart-after-crash sequence,
postmaster used to send SIGQUIT (and then abandon ship if shutdown); but
this is not a good strategy if backends don't die because of that
signal. (This might happen, for example, if a backend gets tangled
trying to malloc() due to gettext(), as in an example illustrated by
MauMau.) This causes problems when later trying to restart the server,
because some processes are still attached to the shared memory segment.
Instead of just abandoning such backends to their fates, we now have
postmaster hang around for a little while longer, send a SIGKILL after
some reasonable waiting period, and then exit. This makes immediate
shutdown more reliable.
There is disagreement on whether it's best for postmaster to exit after
sending SIGKILL, or to stick around until all children have reported
death. If this controversy is resolved differently than what this patch
implements, it's an easy change to make.
Bug reported by MauMau in message 20DAEA8949EC4E2289C6E8E58560DEC0@maumau
MauMau and Álvaro Herrera
2013-06-28 23:20:53 +02:00
|
|
|
timeout->tv_usec = 0;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
timeout->tv_sec = 60;
|
|
|
|
timeout->tv_usec = 0;
|
|
|
|
}
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (StartWorkerNeeded)
|
|
|
|
{
|
|
|
|
timeout->tv_sec = 0;
|
|
|
|
timeout->tv_usec = 0;
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (HaveCrashedWorker)
|
|
|
|
{
|
2014-05-06 18:12:18 +02:00
|
|
|
slist_mutable_iter siter;
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* When there are crashed bgworkers, we sleep just long enough that
|
2014-05-06 18:12:18 +02:00
|
|
|
* they are restarted when they request to be. Scan the list to
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
* determine the minimum of all wakeup times according to most recent
|
|
|
|
* crash time and requested restart interval.
|
|
|
|
*/
|
Allow background workers to be started dynamically.
There is a new API, RegisterDynamicBackgroundWorker, which allows
an ordinary user backend to register a new background writer during
normal running. This means that it's no longer necessary for all
background workers to be registered during processing of
shared_preload_libraries, although the option of registering workers
at that time remains available.
When a background worker exits and will not be restarted, the
slot previously used by that background worker is automatically
released and becomes available for reuse. Slots used by background
workers that are configured for automatic restart can't (yet) be
released without shutting down the system.
This commit adds a new source file, bgworker.c, and moves some
of the existing control logic for background workers there.
Previously, there was little enough logic that it made sense to
keep everything in postmaster.c, but not any more.
This commit also makes the worker_spi contrib module into an
extension and adds a new function, worker_spi_launch, which can
be used to demonstrate the new facility.
2013-07-16 19:02:15 +02:00
|
|
|
slist_foreach_modify(siter, &BackgroundWorkerList)
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
{
|
|
|
|
RegisteredBgWorker *rw;
|
|
|
|
TimestampTz this_wakeup;
|
|
|
|
|
|
|
|
rw = slist_container(RegisteredBgWorker, rw_lnode, siter.cur);
|
|
|
|
|
|
|
|
if (rw->rw_crashed_at == 0)
|
|
|
|
continue;
|
|
|
|
|
2013-10-18 16:21:25 +02:00
|
|
|
if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART
|
|
|
|
|| rw->rw_terminate)
|
Allow background workers to be started dynamically.
There is a new API, RegisterDynamicBackgroundWorker, which allows
an ordinary user backend to register a new background writer during
normal running. This means that it's no longer necessary for all
background workers to be registered during processing of
shared_preload_libraries, although the option of registering workers
at that time remains available.
When a background worker exits and will not be restarted, the
slot previously used by that background worker is automatically
released and becomes available for reuse. Slots used by background
workers that are configured for automatic restart can't (yet) be
released without shutting down the system.
This commit adds a new source file, bgworker.c, and moves some
of the existing control logic for background workers there.
Previously, there was little enough logic that it made sense to
keep everything in postmaster.c, but not any more.
This commit also makes the worker_spi contrib module into an
extension and adds a new function, worker_spi_launch, which can
be used to demonstrate the new facility.
2013-07-16 19:02:15 +02:00
|
|
|
{
|
2013-07-24 23:41:55 +02:00
|
|
|
ForgetBackgroundWorker(&siter);
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
continue;
|
Allow background workers to be started dynamically.
There is a new API, RegisterDynamicBackgroundWorker, which allows
an ordinary user backend to register a new background writer during
normal running. This means that it's no longer necessary for all
background workers to be registered during processing of
shared_preload_libraries, although the option of registering workers
at that time remains available.
When a background worker exits and will not be restarted, the
slot previously used by that background worker is automatically
released and becomes available for reuse. Slots used by background
workers that are configured for automatic restart can't (yet) be
released without shutting down the system.
This commit adds a new source file, bgworker.c, and moves some
of the existing control logic for background workers there.
Previously, there was little enough logic that it made sense to
keep everything in postmaster.c, but not any more.
This commit also makes the worker_spi contrib module into an
extension and adds a new function, worker_spi_launch, which can
be used to demonstrate the new facility.
2013-07-16 19:02:15 +02:00
|
|
|
}
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
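The start-time rule above (a worker launched right at postmaster start must not ask for shared memory access) can be captured in a tiny standalone model. The enum values and flag name below are hypothetical stand-ins, not the server's actual BGWORKER_* symbols:

```c
#include <assert.h>

/* Hypothetical stand-ins for the registration data described above. */
enum bgw_start_time
{
	START_POSTMASTER_START,		/* before shared memory is usable */
	START_CONSISTENT_STATE,		/* HOT standby has reached consistency */
	START_RECOVERY_FINISHED		/* normal backends are allowed */
};

#define BGW_FLAG_SHMEM_ACCESS 0x01

/* A registration is rejected if it wants shared memory but also wants
 * to start before shared memory access is possible. */
static int
registration_valid(int flags, enum bgw_start_time start_time)
{
	if ((flags & BGW_FLAG_SHMEM_ACCESS) &&
		start_time == START_POSTMASTER_START)
		return 0;
	return 1;
}
```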
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
			this_wakeup = TimestampTzPlusMilliseconds(rw->rw_crashed_at,
													  1000L * rw->rw_worker.bgw_restart_time);
			if (next_wakeup == 0 || this_wakeup < next_wakeup)
				next_wakeup = this_wakeup;
		}
	}

	if (next_wakeup != 0)
	{
		long		secs;
		int			microsecs;

		TimestampDifference(GetCurrentTimestamp(), next_wakeup,
							&secs, &microsecs);
		timeout->tv_sec = secs;
		timeout->tv_usec = microsecs;

		/* Ensure we don't exceed one minute */
		if (timeout->tv_sec > 60)
		{
			timeout->tv_sec = 60;
			timeout->tv_usec = 0;
		}
	}
	else
	{
		timeout->tv_sec = 60;
		timeout->tv_usec = 0;
	}
}
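The sleep computation above can be exercised in isolation. A minimal standalone sketch (the helper below is hypothetical and does not share the real DetermineSleepTime's signature): fill a select() timeout from the milliseconds until the earliest pending wakeup, clamped to one minute:

```c
#include <sys/time.h>

/* wakeup_in_ms: milliseconds until the earliest crashed-worker restart;
 * have_wakeup: whether any restart is pending at all. */
static void
fill_sleep_timeout(long wakeup_in_ms, int have_wakeup, struct timeval *timeout)
{
	if (have_wakeup)
	{
		timeout->tv_sec = wakeup_in_ms / 1000;
		timeout->tv_usec = (wakeup_in_ms % 1000) * 1000;

		/* Ensure we don't exceed one minute */
		if (timeout->tv_sec > 60)
		{
			timeout->tv_sec = 60;
			timeout->tv_usec = 0;
		}
	}
	else
	{
		/* nothing pending: just wake up once a minute */
		timeout->tv_sec = 60;
		timeout->tv_usec = 0;
	}
}
```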
/*
 * Main idle loop of postmaster
 *
 * NB: Needs to be called with signals blocked
 */
static int
ServerLoop(void)
{
	fd_set		readmask;
	int			nSockets;
	time_t		last_lockfile_recheck_time,
				last_touch_time;
Perform an immediate shutdown if the postmaster.pid file is removed.
The postmaster now checks every minute or so (worst case, at most two
minutes) that postmaster.pid is still there and still contains its own PID.
If not, it performs an immediate shutdown, as though it had received
SIGQUIT.
The original goal behind this change was to ensure that failed buildfarm
runs would get fully cleaned up, even if the test scripts had left a
postmaster running, which is not an infrequent occurrence. When the
buildfarm script removes a test postmaster's $PGDATA directory, its next
check on postmaster.pid will fail and cause it to exit. Previously, manual
intervention was often needed to get rid of such orphaned postmasters,
since they'd block new test postmasters from obtaining the expected socket
address.
However, by checking postmaster.pid and not something else, we can provide
additional robustness: manual removal of postmaster.pid is a frequent DBA
mistake, and now we can at least limit the damage that will ensue if a new
postmaster is started while the old one is still alive.
Back-patch to all supported branches, since we won't get the desired
improvement in buildfarm reliability otherwise.
2015-10-06 23:15:27 +02:00
	last_lockfile_recheck_time = last_touch_time = time(NULL);

	nSockets = initMasks(&readmask);

	for (;;)
	{
		fd_set		rmask;
		int			selres;
		time_t		now;

		/*
		 * Wait for a connection request to arrive.
		 *
		 * We block all signals except while sleeping. That makes it safe for
		 * signal handlers, which again block all signals while executing, to
		 * do nontrivial work.
		 *
		 * If we are in PM_WAIT_DEAD_END state, then we don't want to accept
		 * any new connections, so we don't call select(), and just sleep.
		 */
		memcpy((char *) &rmask, (char *) &readmask, sizeof(fd_set));

		if (pmState == PM_WAIT_DEAD_END)
		{
			PG_SETMASK(&UnBlockSig);

			pg_usleep(100000L);	/* 100 msec seems reasonable */
			selres = 0;

			PG_SETMASK(&BlockSig);
		}
		else
		{
			/* must set timeout each time; some OSes change it! */
			struct timeval timeout;

			/* Needs to run with blocked signals! */
			DetermineSleepTime(&timeout);

			PG_SETMASK(&UnBlockSig);

			selres = select(nSockets, &rmask, NULL, NULL, &timeout);

			PG_SETMASK(&BlockSig);
		}

		/* Now check the select() result */
		if (selres < 0)
		{
			if (errno != EINTR && errno != EWOULDBLOCK)
			{
				ereport(LOG,
						(errcode_for_socket_access(),
						 errmsg("select() failed in postmaster: %m")));
				return STATUS_ERROR;
			}
		}

		/*
		 * New connection pending on any of our sockets? If so, fork a child
		 * process to deal with it.
		 */
		if (selres > 0)
		{
			int			i;

			for (i = 0; i < MAXLISTEN; i++)
			{
				if (ListenSocket[i] == PGINVALID_SOCKET)
					break;
				if (FD_ISSET(ListenSocket[i], &rmask))
				{
					Port	   *port;

					port = ConnCreate(ListenSocket[i]);
					if (port)
					{
						BackendStartup(port);

						/*
						 * We no longer need the open socket or port structure
						 * in this process
						 */
						StreamClose(port->sock);
						ConnFree(port);
					}
				}
			}
		}
		/* If we have lost the log collector, try to start a new one */
		if (SysLoggerPID == 0 && Logging_collector)
			SysLoggerPID = SysLogger_Start();

		/*
		 * If no background writer process is running, and we are not in a
		 * state that prevents it, start one. It doesn't matter if this
		 * fails, we'll just try again later. Likewise for the checkpointer.
		 */
		if (pmState == PM_RUN || pmState == PM_RECOVERY ||
			pmState == PM_HOT_STANDBY)
		{
			if (CheckpointerPID == 0)
				CheckpointerPID = StartCheckpointer();
			if (BgWriterPID == 0)
				BgWriterPID = StartBackgroundWriter();
		}
		/*
		 * Likewise, if we have lost the walwriter process, try to start a new
		 * one. But this is needed only in normal operation (else we cannot
		 * be writing any new WAL).
		 */
		if (WalWriterPID == 0 && pmState == PM_RUN)
			WalWriterPID = StartWalWriter();

		/*
		 * If we have lost the autovacuum launcher, try to start a new one. We
		 * don't want autovacuum to run in binary upgrade mode because
		 * autovacuum might update relfrozenxid for empty tables before the
		 * physical files are put in place.
		 */
Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios. We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases. Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId. Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done. Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database. initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 23:42:10 +01:00
		if (!IsBinaryUpgrade && AutoVacPID == 0 &&
			(AutoVacuumingActive() || start_autovac_launcher) &&
			pmState == PM_RUN)
		{
			AutoVacPID = StartAutoVacLauncher();
			if (AutoVacPID != 0)
				start_autovac_launcher = false; /* signal processed */
		}
|
2005-07-14 07:13:45 +02:00
|
|
|
|
2015-05-18 09:18:46 +02:00
|
|
|
/* If we have lost the stats collector, try to start a new one */
|
2016-10-27 20:27:40 +02:00
|
|
|
if (PgStatPID == 0 &&
|
|
|
|
(pmState == PM_RUN || pmState == PM_HOT_STANDBY))
|
2015-05-18 09:18:46 +02:00
|
|
|
PgStatPID = pgstat_start();
|
|
|
|
|
2015-06-12 16:11:51 +02:00
|
|
|
/* If we have lost the archiver, try to start a new one. */
|
|
|
|
if (PgArchPID == 0 && PgArchStartupAllowed())
|
2015-07-09 19:22:22 +02:00
|
|
|
PgArchPID = pgarch_start();
|
2004-05-30 00:48:23 +02:00
|
|
|
|
2009-08-24 19:23:02 +02:00
|
|
|
/* If we need to signal the autovacuum launcher, do so now */
|
|
|
|
if (avlauncher_needs_signal)
|
|
|
|
{
|
|
|
|
avlauncher_needs_signal = false;
|
|
|
|
if (AutoVacPID != 0)
|
2009-08-31 21:41:00 +02:00
|
|
|
kill(AutoVacPID, SIGUSR2);
|
2009-08-24 19:23:02 +02:00
|
|
|
}
|
|
|
|
|
Don't lose walreceiver start requests due to race condition in postmaster.
When a walreceiver dies, the startup process will notice that and send
a PMSIGNAL_START_WALRECEIVER signal to the postmaster, asking for a new
walreceiver to be launched. There's a race condition, which at least
in HEAD is very easy to hit, whereby the postmaster might see that
signal before it processes the SIGCHLD from the walreceiver process.
In that situation, sigusr1_handler() just dropped the start request
on the floor, reasoning that it must be redundant. Eventually, after
10 seconds (WALRCV_STARTUP_TIMEOUT), the startup process would make a
fresh request --- but that's a long time if the connection could have
been re-established almost immediately.
Fix it by setting a state flag inside the postmaster that we won't
clear until we do launch a walreceiver. In cases where that results
in an extra walreceiver launch, it's up to the walreceiver to realize
it's unwanted and go away --- but we have, and need, that logic anyway
for the opposite race case.
I came across this through investigating unexpected delays in the
src/test/recovery TAP tests: it manifests there in test cases where
a master server is stopped and restarted while leaving streaming
slaves active.
This logic has been broken all along, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/21344.1498494720@sss.pgh.pa.us
2017-06-26 23:31:56 +02:00
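The fix described above is a latch-style flag: once set, the request survives until a walreceiver is actually launched, so a request that races ahead of the SIGCHLD report cannot be dropped. A standalone model (the names below are simplified stand-ins for the postmaster's internals, not its actual code):

```c
/* Set by the startup process's request signal; cleared only on launch. */
static int	wal_receiver_requested = 0;
static int	wal_receiver_running = 0;
static int	launch_count = 0;

static void
request_walreceiver(void)
{
	wal_receiver_requested = 1;
}

/* Called on each postmaster iteration: launch iff a request is pending
 * and no walreceiver is (believed to be) alive. */
static void
maybe_start_walreceiver(void)
{
	if (wal_receiver_requested && !wal_receiver_running)
	{
		wal_receiver_running = 1;	/* pretend the fork succeeded */
		launch_count++;
		wal_receiver_requested = 0; /* clear only after a real launch */
	}
}
```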
		/* If we need to start a WAL receiver, try to do that now */
		if (WalReceiverRequested)
			MaybeStartWalReceiver();
Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long. However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once. On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives. (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.
As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.
There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).
Backpatch to 9.6 where parallel query was added. The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.
Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 22:17:29 +02:00
		/* Get other worker processes running, if needed */
		if (StartWorkerNeeded || HaveCrashedWorker)
			maybe_start_bgworkers();
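The per-call cap described in the commit message above ("up to 100 bgworkers per call") reduces to a simple clamp; a minimal model (hypothetical names, not the server's actual maybe_start_bgworkers logic):

```c
#define MAX_BGWORKERS_TO_LAUNCH 100	/* cap from the commit message above */

/* Given the number of workers waiting to start, return how many a
 * single call would actually launch; the remainder waits for the next
 * iteration. */
static int
bgworkers_launched_this_call(int pending)
{
	return pending < MAX_BGWORKERS_TO_LAUNCH ? pending : MAX_BGWORKERS_TO_LAUNCH;
}
```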
#ifdef HAVE_PTHREAD_IS_THREADED_NP

		/*
		 * With assertions enabled, check regularly for appearance of
		 * additional threads. All builds check at start and exit.
		 */
		Assert(pthread_is_threaded_np() == 0);
#endif

		/*
		 * Lastly, check to see if it's time to do some things that we don't
		 * want to do every single time through the loop, because they're a
		 * bit expensive. Note that there's up to a minute of slop in when
		 * these tasks will be performed, since DetermineSleepTime() will let
		 * us sleep at most that long; except for SIGKILL timeout which has
		 * special-case logic there.
		 */
		now = time(NULL);
Send SIGKILL to children if they don't die quickly in immediate shutdown
On immediate shutdown, or during a restart-after-crash sequence,
postmaster used to send SIGQUIT (and then abandon ship if shutdown); but
this is not a good strategy if backends don't die because of that
signal. (This might happen, for example, if a backend gets tangled
trying to malloc() due to gettext(), as in an example illustrated by
MauMau.) This causes problems when later trying to restart the server,
because some processes are still attached to the shared memory segment.
Instead of just abandoning such backends to their fates, we now have
postmaster hang around for a little while longer, send a SIGKILL after
some reasonable waiting period, and then exit. This makes immediate
shutdown more reliable.
There is disagreement on whether it's best for postmaster to exit after
sending SIGKILL, or to stick around until all children have reported
death. If this controversy is resolved differently than what this patch
implements, it's an easy change to make.
Bug reported by MauMau in message 20DAEA8949EC4E2289C6E8E58560DEC0@maumau
MauMau and Álvaro Herrera
2013-06-28 23:20:53 +02:00
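The escalation the message above describes reduces to a small timeout predicate, checked on each pass through the postmaster's main loop; a minimal standalone sketch (the 5-second constant is an assumption mirroring `SIGKILL_CHILDREN_AFTER_SECS`, and the function name is illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Assumed value mirroring the postmaster's SIGKILL_CHILDREN_AFTER_SECS. */
#define SIGKILL_CHILDREN_AFTER_SECS 5

/*
 * True once SIGQUIT has been sent (abort_start != 0) and the grace
 * period has elapsed, i.e. it is time to fall back to SIGKILL.
 */
static bool
should_escalate_to_sigkill(time_t abort_start, time_t now)
{
    return abort_start != 0 &&
        (now - abort_start) >= SIGKILL_CHILDREN_AFTER_SECS;
}
```

Resetting `abort_start` to 0 after escalating (as the real code resets `AbortStartTime`) makes the predicate false again, so SIGKILL is sent only once.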
|
|
|
/*
|
|
|
|
* If we already sent SIGQUIT to children and they are slow to shut
|
2014-05-06 18:12:18 +02:00
|
|
|
* down, it's time to send them SIGKILL. This doesn't happen
|
|
|
|
* normally, but under certain conditions backends can get stuck while
|
|
|
|
* shutting down. This is a last measure to get them unwedged.
|
2013-06-28 23:20:53 +02:00
|
|
|
*
|
|
|
|
* Note we also do this during recovery from a process crash.
|
|
|
|
*/
|
|
|
|
if ((Shutdown >= ImmediateShutdown || (FatalError && !SendStop)) &&
|
2015-06-19 20:23:39 +02:00
|
|
|
AbortStartTime != 0 &&
|
|
|
|
(now - AbortStartTime) >= SIGKILL_CHILDREN_AFTER_SECS)
|
2013-06-28 23:20:53 +02:00
|
|
|
{
|
|
|
|
/* We were gentle with them before. Not anymore */
|
|
|
|
TerminateChildren(SIGKILL);
|
2013-10-06 04:24:50 +02:00
|
|
|
/* reset flag so we don't SIGKILL again */
|
|
|
|
AbortStartTime = 0;
|
2013-06-28 23:20:53 +02:00
|
|
|
}
|
Perform an immediate shutdown if the postmaster.pid file is removed.
The postmaster now checks every minute or so (worst case, at most two
minutes) that postmaster.pid is still there and still contains its own PID.
If not, it performs an immediate shutdown, as though it had received
SIGQUIT.
The original goal behind this change was to ensure that failed buildfarm
runs would get fully cleaned up, even if the test scripts had left a
postmaster running, which is not an infrequent occurrence. When the
buildfarm script removes a test postmaster's $PGDATA directory, its next
check on postmaster.pid will fail and cause it to exit. Previously, manual
intervention was often needed to get rid of such orphaned postmasters,
since they'd block new test postmasters from obtaining the expected socket
address.
However, by checking postmaster.pid and not something else, we can provide
additional robustness: manual removal of postmaster.pid is a frequent DBA
mistake, and now we can at least limit the damage that will ensue if a new
postmaster is started while the old one is still alive.
Back-patch to all supported branches, since we won't get the desired
improvement in buildfarm reliability otherwise.
2015-10-06 23:15:27 +02:00
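The recheck described above amounts to re-reading the lock file and comparing its first line with our own PID; a minimal sketch (the real `RecheckDataDirLockFile()` is also forgiving about transient read failures, and the function name here is illustrative, not the real one):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Sketch of the check the commit message describes: the data directory
 * lock file must still exist and must still begin with our own PID.
 */
static bool
lockfile_still_ours(const char *path, long my_pid)
{
    FILE   *fp = fopen(path, "r");
    long    file_pid = 0;
    bool    ok;

    if (fp == NULL)
        return false;           /* file removed: caller forces shutdown */
    ok = (fscanf(fp, "%ld", &file_pid) == 1 && file_pid == my_pid);
    fclose(fp);
    return ok;
}
```

On failure the postmaster does not try to recreate the file; it sends itself SIGQUIT, as the code below shows, since the data directory may already belong to someone else.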
|
|
|
|
|
|
|
/*
|
|
|
|
* Once a minute, verify that postmaster.pid hasn't been removed or
|
|
|
|
* overwritten. If it has, we force a shutdown. This avoids having
|
|
|
|
* postmasters and child processes hanging around after their database
|
|
|
|
* is gone, and maybe causing problems if a new database cluster is
|
|
|
|
* created in the same place. It also provides some protection
|
|
|
|
* against a DBA foolishly removing postmaster.pid and manually
|
|
|
|
* starting a new postmaster. Data corruption is likely to ensue from
|
|
|
|
* that anyway, but we can minimize the damage by aborting ASAP.
|
|
|
|
*/
|
|
|
|
if (now - last_lockfile_recheck_time >= 1 * SECS_PER_MINUTE)
|
|
|
|
{
|
|
|
|
if (!RecheckDataDirLockFile())
|
|
|
|
{
|
|
|
|
ereport(LOG,
|
|
|
|
(errmsg("performing immediate shutdown because data directory lock file is invalid")));
|
|
|
|
kill(MyProcPid, SIGQUIT);
|
|
|
|
}
|
|
|
|
last_lockfile_recheck_time = now;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Touch Unix socket and lock files every 58 minutes, to ensure that
|
|
|
|
* they are not removed by overzealous /tmp-cleaning tasks. We assume
|
|
|
|
* no one runs cleaners with cutoff times of less than an hour ...
|
|
|
|
*/
|
|
|
|
if (now - last_touch_time >= 58 * SECS_PER_MINUTE)
|
|
|
|
{
|
|
|
|
TouchSocketFiles();
|
|
|
|
TouchSocketLockFiles();
|
|
|
|
last_touch_time = now;
|
|
|
|
}
|
1997-09-07 07:04:48 +02:00
|
|
|
}
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
|
1996-10-12 09:48:49 +02:00
|
|
|
/*
|
2004-05-30 00:48:23 +02:00
|
|
|
* Initialise the masks for select() for the ports we are listening on.
|
|
|
|
 * Return one more than the highest socket descriptor, as select() expects.
|
1998-01-26 02:42:53 +01:00
|
|
|
*/
|
1998-02-26 05:46:47 +01:00
|
|
|
static int
|
2003-07-24 01:30:41 +02:00
|
|
|
initMasks(fd_set *rmask)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2007-08-09 03:18:43 +02:00
|
|
|
int maxsock = -1;
|
2003-06-12 09:36:51 +02:00
|
|
|
int i;
|
1998-01-26 02:42:53 +01:00
|
|
|
|
|
|
|
FD_ZERO(rmask);
|
|
|
|
|
2003-06-12 09:36:51 +02:00
|
|
|
for (i = 0; i < MAXLISTEN; i++)
|
1997-09-07 07:04:48 +02:00
|
|
|
{
|
2003-08-04 02:43:34 +02:00
|
|
|
int fd = ListenSocket[i];
|
2003-07-24 01:30:41 +02:00
|
|
|
|
2010-01-10 15:16:08 +01:00
|
|
|
if (fd == PGINVALID_SOCKET)
|
2003-07-24 01:30:41 +02:00
|
|
|
break;
|
2010-07-06 21:19:02 +02:00
|
|
|
FD_SET(fd, rmask);
|
2009-06-11 16:49:15 +02:00
|
|
|
|
2007-08-09 03:18:43 +02:00
|
|
|
if (fd > maxsock)
|
|
|
|
maxsock = fd;
|
1997-09-07 07:04:48 +02:00
|
|
|
}
|
1998-01-26 02:42:53 +01:00
|
|
|
|
2007-08-09 03:18:43 +02:00
|
|
|
return maxsock + 1;
|
1996-07-09 08:22:35 +02:00
|
|
|
}
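A standalone analogue of `initMasks()` shows why the function returns `maxsock + 1`: select() expects one more than the highest descriptor in the set. This is an array-based sketch with hypothetical names, not the real listen-socket scan:

```c
#include <assert.h>
#include <sys/select.h>

/*
 * Build a read mask over an explicit fd array and return maxsock + 1,
 * suitable as select()'s first (nfds) argument.
 */
static int
init_masks_sketch(const int *socks, int nsocks, fd_set *rmask)
{
    int     maxsock = -1;
    int     i;

    FD_ZERO(rmask);
    for (i = 0; i < nsocks; i++)
    {
        FD_SET(socks[i], rmask);
        if (socks[i] > maxsock)
            maxsock = socks[i];
    }
    return maxsock + 1;
}
```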
|
|
|
|
|
1996-10-12 09:48:49 +02:00
|
|
|
|
|
|
|
/*
|
2007-08-09 03:18:43 +02:00
|
|
|
 * Read a client's startup packet and act on its contents: set up
 * connection state, or service a cancel or SSL negotiation request.
|
2001-06-20 20:07:56 +02:00
|
|
|
*
|
2003-07-22 21:00:12 +02:00
|
|
|
* Returns STATUS_OK or STATUS_ERROR, or might call ereport(FATAL) and
|
2001-06-20 20:07:56 +02:00
|
|
|
* not return at all.
|
2001-08-30 21:02:42 +02:00
|
|
|
*
|
2003-07-22 21:00:12 +02:00
|
|
|
* (Note that ereport(FATAL) stuff is sent to the client, so only use it
|
2001-10-19 02:44:08 +02:00
|
|
|
* if that's what you want. Return STATUS_ERROR if you don't want to
|
|
|
|
* send anything to the client, which would typically be appropriate
|
|
|
|
* if we detect a communications failure.)
|
1998-01-26 02:42:53 +01:00
|
|
|
*/
|
1998-07-09 05:29:11 +02:00
|
|
|
static int
|
2001-06-21 18:43:24 +02:00
|
|
|
ProcessStartupPacket(Port *port, bool SSLdone)
|
1996-10-12 09:48:49 +02:00
|
|
|
{
|
2001-06-20 20:07:56 +02:00
|
|
|
int32 len;
|
|
|
|
void *buf;
|
2003-04-18 00:26:02 +02:00
|
|
|
ProtocolVersion proto;
|
|
|
|
MemoryContext oldcontext;
|
1997-09-07 07:04:48 +02:00
|
|
|
|
Be more careful to not lose sync in the FE/BE protocol.
If any error occurred while we were in the middle of reading a protocol
message from the client, we could lose sync, and incorrectly try to
interpret a part of another message as a new protocol message. That will
usually lead to an "invalid frontend message" error that terminates the
connection. However, this is a security issue because an attacker might
be able to deliberately cause an error, inject a Query message in what's
supposed to be just user data, and have the server execute it.
We were quite careful to not have CHECK_FOR_INTERRUPTS() calls or other
operations that could ereport(ERROR) in the middle of processing a message,
but a query cancel interrupt or statement timeout could nevertheless cause
it to happen. Also, the V2 fastpath and COPY handling were not so careful.
It's very difficult to recover in the V2 COPY protocol, so we will just
terminate the connection on error. In practice, that's what happened
previously anyway, as we lost protocol sync.
To fix, add a new variable in pqcomm.c, PqCommReadingMsg, that is set
whenever we're in the middle of reading a message. When it's set, we cannot
safely ERROR out and continue running, because we might've read only part
of a message. PqCommReadingMsg acts somewhat similarly to critical sections
in that if an error occurs while it's set, the error handler will force the
connection to be terminated, as if the error was FATAL. It's not
implemented by promoting ERROR to FATAL in elog.c, like ERROR is promoted
to PANIC in critical sections, because we want to be able to use
PG_TRY/CATCH to recover and regain protocol sync. pq_getmessage() takes
advantage of that to prevent an OOM error from terminating the connection.
To prevent unnecessary connection terminations, add a holdoff mechanism
similar to HOLD/RESUME_INTERRUPTS() that can be used to hold off query cancel
interrupts, but still allow die interrupts. The rules on which interrupts
are processed when are now a bit more complicated, so refactor
ProcessInterrupts() and the calls to it in signal handlers so that the
signal handlers always call it if ImmediateInterruptOK is set, and
ProcessInterrupts() can decide to not do anything if the other conditions
are not met.
Reported by Emil Lenngren. Patch reviewed by Noah Misch and Andres Freund.
Backpatch to all supported versions.
Security: CVE-2015-0244
2015-02-02 16:08:45 +01:00
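The holdoff flag the message describes can be sketched in isolation. The names below are stand-ins for the real pqcomm.c symbols (`PqCommReadingMsg`, `pq_startmsgread()`, `pq_endmsgread()`), and this standalone version omits the real interaction with the error handler:

```c
#include <assert.h>
#include <stdbool.h>

/* Set while we are in the middle of reading a protocol message. */
static bool reading_msg = false;

static void
start_msg_read(void)
{
    reading_msg = true;
}

static void
end_msg_read(void)
{
    reading_msg = false;
}

/*
 * If an error fires mid-message we may have lost protocol sync, so the
 * connection must be terminated (as if the error were FATAL) rather
 * than resumed.
 */
static bool
must_terminate_on_error(void)
{
    return reading_msg;
}
```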
|
|
|
pq_startmsgread();
|
2001-10-19 02:44:08 +02:00
|
|
|
if (pq_getbytes((char *) &len, 4) == EOF)
|
|
|
|
{
|
2003-04-22 02:08:07 +02:00
|
|
|
/*
|
|
|
|
* EOF after SSLdone probably means the client didn't like our
|
2014-05-06 18:12:18 +02:00
|
|
|
* response to NEGOTIATE_SSL_CODE. That's not an error condition, so
|
2005-10-15 04:49:52 +02:00
|
|
|
* don't clutter the log with a complaint.
|
2003-04-22 02:08:07 +02:00
|
|
|
*/
|
|
|
|
if (!SSLdone)
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(COMMERROR,
|
|
|
|
(errcode(ERRCODE_PROTOCOL_VIOLATION),
|
|
|
|
errmsg("incomplete startup packet")));
|
2001-10-19 02:44:08 +02:00
|
|
|
return STATUS_ERROR;
|
|
|
|
}
|
|
|
|
|
2001-06-20 20:07:56 +02:00
|
|
|
len = ntohl(len);
|
|
|
|
len -= 4;
|
|
|
|
|
2003-04-18 00:26:02 +02:00
|
|
|
if (len < (int32) sizeof(ProtocolVersion) ||
|
|
|
|
len > MAX_STARTUP_PACKET_LENGTH)
|
2003-04-22 02:08:07 +02:00
|
|
|
{
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(COMMERROR,
|
|
|
|
(errcode(ERRCODE_PROTOCOL_VIOLATION),
|
|
|
|
errmsg("invalid length of startup packet")));
|
2003-04-22 02:08:07 +02:00
|
|
|
return STATUS_ERROR;
|
|
|
|
}
|
2001-06-20 20:07:56 +02:00
|
|
|
|
2003-04-18 00:26:02 +02:00
|
|
|
/*
|
|
|
|
* Allocate at least the size of an old-style startup packet, plus one
|
2005-10-15 04:49:52 +02:00
|
|
|
* extra byte, and make sure all are zeroes. This ensures we will have
|
|
|
|
* null termination of all strings, in both fixed- and variable-length
|
|
|
|
* packet layouts.
|
2003-04-18 00:26:02 +02:00
|
|
|
*/
|
|
|
|
if (len <= (int32) sizeof(StartupPacket))
|
|
|
|
buf = palloc0(sizeof(StartupPacket) + 1);
|
|
|
|
else
|
|
|
|
buf = palloc0(len + 1);
|
2001-10-19 02:44:08 +02:00
|
|
|
|
|
|
|
if (pq_getbytes(buf, len) == EOF)
|
|
|
|
{
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(COMMERROR,
|
|
|
|
(errcode(ERRCODE_PROTOCOL_VIOLATION),
|
|
|
|
errmsg("incomplete startup packet")));
|
2001-10-19 02:44:08 +02:00
|
|
|
return STATUS_ERROR;
|
|
|
|
}
|
2015-02-02 16:08:45 +01:00
|
|
|
pq_endmsgread();
|
2001-07-30 16:50:24 +02:00
|
|
|
|
1998-09-01 06:40:42 +02:00
|
|
|
/*
|
|
|
|
* The first field is either a protocol version number or a special
|
|
|
|
* request code.
|
1998-07-09 05:29:11 +02:00
|
|
|
*/
|
2003-04-18 00:26:02 +02:00
|
|
|
port->proto = proto = ntohl(*((ProtocolVersion *) buf));
|
1998-07-09 05:29:11 +02:00
|
|
|
|
2003-04-18 00:26:02 +02:00
|
|
|
if (proto == CANCEL_REQUEST_CODE)
|
2001-06-20 20:07:56 +02:00
|
|
|
{
|
2003-04-18 00:26:02 +02:00
|
|
|
processCancelRequest(port, buf);
|
2009-08-29 21:26:52 +02:00
|
|
|
/* Not really an error, but we don't want to proceed further */
|
|
|
|
return STATUS_ERROR;
|
2001-06-20 20:07:56 +02:00
|
|
|
}
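The special request codes tested here (`CANCEL_REQUEST_CODE`, and `NEGOTIATE_SSL_CODE` just below) pack a magic major/minor pair into the protocol-version field. A sketch with the documented wire values; the macro names are suffixed to avoid implying these are the real header definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Compose a protocol "version" from 16-bit major and minor halves. */
#define PG_PROTOCOL_SKETCH(m, n)  (((uint32_t) (m) << 16) | (uint32_t) (n))

/* Magic pairs reserved for special requests rather than real versions. */
#define CANCEL_REQUEST_CODE_SKETCH  PG_PROTOCOL_SKETCH(1234, 5678)
#define NEGOTIATE_SSL_CODE_SKETCH   PG_PROTOCOL_SKETCH(1234, 5679)
```

Because major version 1234 can never be a real protocol version, the server can distinguish these requests from ordinary startup packets by inspecting the same first four bytes.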
|
1998-07-09 05:29:11 +02:00
|
|
|
|
2003-04-18 00:26:02 +02:00
|
|
|
if (proto == NEGOTIATE_SSL_CODE && !SSLdone)
|
2000-04-12 19:17:23 +02:00
|
|
|
{
|
|
|
|
char SSLok;
|
|
|
|
|
1999-09-27 05:13:16 +02:00
|
|
|
#ifdef USE_SSL
|
2000-10-26 00:27:25 +02:00
|
|
|
/* No SSL when disabled or on Unix sockets */
|
2017-01-03 03:37:12 +01:00
|
|
|
if (!LoadedSSL || IS_AF_UNIX(port->laddr.addr.ss_family))
|
2000-10-26 00:27:25 +02:00
|
|
|
SSLok = 'N';
|
2000-08-30 16:54:24 +02:00
|
|
|
else
|
2000-10-26 00:27:25 +02:00
|
|
|
SSLok = 'S'; /* Support for SSL */
|
1999-09-27 05:13:16 +02:00
|
|
|
#else
|
2000-04-12 19:17:23 +02:00
|
|
|
SSLok = 'N'; /* No support for SSL */
|
1999-09-27 05:13:16 +02:00
|
|
|
#endif
|
2006-07-16 20:17:14 +02:00
|
|
|
|
|
|
|
retry1:
|
2000-04-12 19:17:23 +02:00
|
|
|
if (send(port->sock, &SSLok, 1, 0) != 1)
|
|
|
|
{
|
2006-07-16 20:17:14 +02:00
|
|
|
if (errno == EINTR)
|
|
|
|
goto retry1; /* if interrupted, just retry */
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(COMMERROR,
|
|
|
|
(errcode_for_socket_access(),
|
2005-10-15 04:49:52 +02:00
|
|
|
errmsg("failed to send SSL negotiation response: %m")));
|
2001-10-28 07:26:15 +01:00
|
|
|
return STATUS_ERROR; /* close the connection */
|
2000-04-12 19:17:23 +02:00
|
|
|
}
|
|
|
|
|
1999-09-27 05:13:16 +02:00
|
|
|
#ifdef USE_SSL
|
UPDATED PATCH:
Attached are a revised set of SSL patches. Many of these patches
are motivated by security concerns, it's not just bug fixes. The key
differences (from stock 7.2.1) are:
*) almost all code that directly uses the OpenSSL library is in two
new files,
src/interfaces/libpq/fe-ssl.c
src/backend/postmaster/be-ssl.c
in the long run, it would be nice to merge these two files.
*) the legacy code to read and write network data have been
encapsulated into read_SSL() and write_SSL(). These functions
should probably be renamed - they handle both SSL and non-SSL
cases.
the remaining code should eliminate the problems identified
earlier, albeit not very cleanly.
*) both front- and back-ends will send a SSL shutdown via the
new close_SSL() function. This is necessary for sessions to
work properly.
(Sessions are not yet fully supported, but by cleanly closing
the SSL connection instead of just sending a TCP FIN packet
other SSL tools will be much happier.)
*) The client certificate and key are now expected in a subdirectory
of the user's home directory. Specifically,
- the directory .postgresql must be owned by the user, and
allow no access by 'group' or 'other.'
- the file .postgresql/postgresql.crt must be a regular file
owned by the user.
- the file .postgresql/postgresql.key must be a regular file
owned by the user, and allow no access by 'group' or 'other'.
At the current time encrypted private keys are not supported.
There should also be a way to support multiple client certs/keys.
*) the front-end performs minimal validation of the back-end cert.
Self-signed certs are permitted, but the common name *must*
match the hostname used by the front-end. (The cert itself
should always use a fully qualified domain name (FQDN) in its
common name field.)
This means that
psql -h eris db
will fail, but
psql -h eris.example.com db
will succeed. At the current time this must be an exact match;
future patches may support any FQDN that resolves to the address
returned by getpeername(2).
Another common "problem" is expiring certs. For now, it may be
a good idea to use a very-long-lived self-signed cert.
As a compile-time option, the front-end can specify a file
containing valid root certificates, but it is not yet required.
*) the back-end performs minimal validation of the client cert.
It allows self-signed certs. It checks for expiration. It
supports a compile-time option specifying a file containing
valid root certificates.
*) both front- and back-ends default to TLSv1, not SSLv3/SSLv2.
*) both front- and back-ends support DSA keys. DSA keys are
moderately more expensive on startup, but many people consider
them preferable to RSA keys. (E.g., SSH2 prefers DSA keys.)
*) if /dev/urandom exists, both client and server will read 16k
of randomization data from it.
*) the server can read ephemeral DH parameters from the files
$DataDir/dh512.pem
$DataDir/dh1024.pem
$DataDir/dh2048.pem
$DataDir/dh4096.pem
if none are provided, the server will default to hardcoded
parameter files provided by the OpenSSL project.
Remaining tasks:
*) the select() clauses need to be revisited - the SSL abstraction
layer may need to absorb more of the current code to avoid rare
deadlock conditions. This also touches on a true solution to
the pg_eof() problem.
*) the SIGPIPE signal handler may need to be revisited.
*) support encrypted private keys.
*) sessions are not yet fully supported. (SSL sessions can span
multiple "connections," and allow the client and server to avoid
costly renegotiations.)
*) makecert - a script that creates back-end certs.
*) pgkeygen - a tool that creates front-end certs.
*) the whole protocol issue, SASL, etc.
*) certs are fully validated - valid root certs must be available.
This is a hassle, but it means that you *can* trust the identity
of the server.
*) the client library can handle hardcoded root certificates, to
avoid the need to copy these files.
*) host name of server cert must resolve to IP address, or be a
recognized alias. This is more liberal than the previous
iteration.
*) the number of bytes transferred is tracked, and the session
key is periodically renegotiated.
*) basic cert generation scripts (mkcert.sh, pgkeygen.sh). The
configuration files have reasonable defaults for each type
of use.
Bear Giles
2002-06-14 06:23:17 +02:00
|
|
|
if (SSLok == 'S' && secure_open_server(port) == -1)
|
2002-09-04 22:31:48 +02:00
|
|
|
return STATUS_ERROR;
|
1999-09-27 05:13:16 +02:00
|
|
|
#endif
|
2001-06-21 18:43:24 +02:00
|
|
|
/* regular startup packet, cancel, etc packet should follow... */
|
|
|
|
/* but not another SSL negotiation request */
|
|
|
|
return ProcessStartupPacket(port, true);
|
2000-04-12 19:17:23 +02:00
|
|
|
}
|
1999-09-27 05:13:16 +02:00
|
|
|
|
1998-07-09 05:29:11 +02:00
|
|
|
/* Could add additional special packet types here */
|
|
|
|
|
2003-04-22 02:08:07 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* Set FrontendProtocol now so that ereport() knows what format to send if
|
|
|
|
* we fail during startup.
|
2003-04-22 02:08:07 +02:00
|
|
|
*/
|
|
|
|
FrontendProtocol = proto;
|
1999-09-27 05:13:16 +02:00
|
|
|
|
1998-07-09 05:29:11 +02:00
|
|
|
/* Check we can handle the protocol the frontend is using. */
|
|
|
|
|
2003-04-18 00:26:02 +02:00
|
|
|
if (PG_PROTOCOL_MAJOR(proto) < PG_PROTOCOL_MAJOR(PG_PROTOCOL_EARLIEST) ||
|
2005-10-15 04:49:52 +02:00
|
|
|
PG_PROTOCOL_MAJOR(proto) > PG_PROTOCOL_MAJOR(PG_PROTOCOL_LATEST) ||
|
|
|
|
(PG_PROTOCOL_MAJOR(proto) == PG_PROTOCOL_MAJOR(PG_PROTOCOL_LATEST) &&
|
|
|
|
PG_PROTOCOL_MINOR(proto) > PG_PROTOCOL_MINOR(PG_PROTOCOL_LATEST)))
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
|
|
|
|
errmsg("unsupported frontend protocol %u.%u: server supports %u.0 to %u.%u",
|
2005-10-15 04:49:52 +02:00
|
|
|
PG_PROTOCOL_MAJOR(proto), PG_PROTOCOL_MINOR(proto),
|
2003-07-22 21:00:12 +02:00
|
|
|
PG_PROTOCOL_MAJOR(PG_PROTOCOL_EARLIEST),
|
|
|
|
PG_PROTOCOL_MAJOR(PG_PROTOCOL_LATEST),
|
|
|
|
PG_PROTOCOL_MINOR(PG_PROTOCOL_LATEST))));
|
1998-07-09 05:29:11 +02:00
|
|
|
|
2001-03-22 05:01:46 +01:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* Now fetch parameters out of startup packet and save them into the Port
|
|
|
|
* structure. All data structures attached to the Port struct must be
|
2010-02-26 03:01:40 +01:00
|
|
|
* allocated in TopMemoryContext so that they will remain available in a
|
|
|
|
* running backend (even after PostmasterContext is destroyed). We need
|
2009-08-29 21:26:52 +02:00
|
|
|
* not worry about leaking this storage on failure, since we aren't in the
|
|
|
|
* postmaster process anymore.
|
2001-03-22 05:01:46 +01:00
|
|
|
*/
|
2003-04-18 00:26:02 +02:00
|
|
|
oldcontext = MemoryContextSwitchTo(TopMemoryContext);
|
|
|
|
|
|
|
|
if (PG_PROTOCOL_MAJOR(proto) >= 3)
|
|
|
|
{
|
2003-08-04 02:43:34 +02:00
|
|
|
int32 offset = sizeof(ProtocolVersion);
|
2003-04-18 00:26:02 +02:00
|
|
|
|
|
|
|
/*
|
2014-05-06 18:12:18 +02:00
|
|
|
* Scan packet body for name/option pairs. We can assume any string
|
2005-10-15 04:49:52 +02:00
|
|
|
* beginning within the packet body is null-terminated, thanks to
|
|
|
|
 * the extra zeroed byte above.
|
2003-04-18 00:26:02 +02:00
|
|
|
*/
|
|
|
|
port->guc_options = NIL;
|
|
|
|
|
|
|
|
while (offset < len)
|
|
|
|
{
|
2003-08-04 02:43:34 +02:00
|
|
|
char *nameptr = ((char *) buf) + offset;
|
|
|
|
int32 valoffset;
|
|
|
|
char *valptr;
|
2003-04-18 00:26:02 +02:00
|
|
|
|
|
|
|
if (*nameptr == '\0')
|
|
|
|
break; /* found packet terminator */
|
|
|
|
valoffset = offset + strlen(nameptr) + 1;
|
|
|
|
if (valoffset >= len)
|
|
|
|
break; /* missing value, will complain below */
|
|
|
|
valptr = ((char *) buf) + valoffset;
|
|
|
|
|
|
|
|
if (strcmp(nameptr, "database") == 0)
|
|
|
|
port->database_name = pstrdup(valptr);
|
|
|
|
else if (strcmp(nameptr, "user") == 0)
|
|
|
|
port->user_name = pstrdup(valptr);
|
|
|
|
else if (strcmp(nameptr, "options") == 0)
|
|
|
|
port->cmdline_options = pstrdup(valptr);
|
2010-01-15 10:19:10 +01:00
|
|
|
else if (strcmp(nameptr, "replication") == 0)
|
|
|
|
{
|
2014-03-10 18:50:28 +01:00
|
|
|
/*
|
|
|
|
* Due to backward compatibility concerns the replication
|
|
|
|
* parameter is a hybrid beast which allows the value to be
|
|
|
|
* either boolean or the string 'database'. The latter
|
|
|
|
* connects to a specific database which is e.g. required for
|
|
|
|
 * logical decoding, while a plain boolean value does not.
|
|
|
|
*/
|
|
|
|
if (strcmp(valptr, "database") == 0)
|
|
|
|
{
|
|
|
|
am_walsender = true;
|
|
|
|
am_db_walsender = true;
|
|
|
|
}
|
|
|
|
else if (!parse_bool(valptr, &am_walsender))
|
2010-01-15 10:19:10 +01:00
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
errmsg("invalid value for parameter \"%s\": \"%s\"",
|
|
|
|
"replication",
|
|
|
|
valptr),
|
2015-11-17 03:16:42 +01:00
|
|
|
errhint("Valid values are: \"false\", 0, \"true\", 1, \"database\".")));
|
2010-01-15 10:19:10 +01:00
|
|
|
}
|
2003-04-18 00:26:02 +02:00
|
|
|
else
|
|
|
|
{
|
|
|
|
/* Assume it's a generic GUC option */
|
|
|
|
port->guc_options = lappend(port->guc_options,
|
|
|
|
pstrdup(nameptr));
|
|
|
|
port->guc_options = lappend(port->guc_options,
|
|
|
|
pstrdup(valptr));
|
|
|
|
}
|
|
|
|
offset = valoffset + strlen(valptr) + 1;
|
|
|
|
}
|
2003-08-04 02:43:34 +02:00
|
|
|
|
2003-04-18 00:26:02 +02:00
|
|
|
/*
|
|
|
|
* If we didn't find a packet terminator exactly at the end of the
|
|
|
|
* given packet length, complain.
|
|
|
|
*/
|
2003-08-04 02:43:34 +02:00
|
|
|
if (offset != len - 1)
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_PROTOCOL_VIOLATION),
|
|
|
|
errmsg("invalid startup packet layout: expected terminator as last byte")));
|
2003-04-18 00:26:02 +02:00
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* Get the parameters from the old-style, fixed-width-fields startup
|
|
|
|
 * packet as C strings. The buffer was zeroed above, so a short
|
|
|
|
 * packet is silently zero-padded. We have to be prepared to
|
|
|
|
* truncate the pstrdup result for oversize fields, though.
|
2003-04-18 00:26:02 +02:00
|
|
|
*/
|
|
|
|
StartupPacket *packet = (StartupPacket *) buf;
|
|
|
|
|
|
|
|
port->database_name = pstrdup(packet->database);
|
|
|
|
if (strlen(port->database_name) > sizeof(packet->database))
|
|
|
|
port->database_name[sizeof(packet->database)] = '\0';
|
|
|
|
port->user_name = pstrdup(packet->user);
|
|
|
|
if (strlen(port->user_name) > sizeof(packet->user))
|
|
|
|
port->user_name[sizeof(packet->user)] = '\0';
|
|
|
|
port->cmdline_options = pstrdup(packet->options);
|
|
|
|
if (strlen(port->cmdline_options) > sizeof(packet->options))
|
|
|
|
port->cmdline_options[sizeof(packet->options)] = '\0';
|
|
|
|
port->guc_options = NIL;
|
|
|
|
}
|
2001-02-20 02:34:40 +01:00
|
|
|
|
1998-01-26 02:42:53 +01:00
|
|
|
/* Check a user name was given. */
|
2003-04-18 00:26:02 +02:00
|
|
|
if (port->user_name == NULL || port->user_name[0] == '\0')
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
|
2017-06-21 21:35:54 +02:00
|
|
|
errmsg("no PostgreSQL user name specified in startup packet")));
|
1998-01-26 02:42:53 +01:00
|
|
|
|
2003-04-18 00:26:02 +02:00
|
|
|
/* The database defaults to the user name. */
|
|
|
|
if (port->database_name == NULL || port->database_name[0] == '\0')
|
|
|
|
port->database_name = pstrdup(port->user_name);
|
|
|
|
|
2002-08-18 05:03:26 +02:00
|
|
|
if (Db_user_namespace)
|
2002-09-04 22:31:48 +02:00
|
|
|
{
|
2002-08-18 05:03:26 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* If user@, it is a global user, remove '@'. We only want to do this
|
|
|
|
* if there is an '@' at the end and no earlier in the user string or
|
|
|
|
* they may fake as a local user of another database attaching to this
|
|
|
|
* database.
|
2002-08-18 05:03:26 +02:00
|
|
|
*/
|
2003-04-18 00:26:02 +02:00
|
|
|
if (strchr(port->user_name, '@') ==
|
|
|
|
port->user_name + strlen(port->user_name) - 1)
|
|
|
|
*strchr(port->user_name, '@') = '\0';
|
2002-08-18 05:03:26 +02:00
|
|
|
else
|
|
|
|
{
|
|
|
|
/* Append '@' and dbname */
|
2013-10-13 06:09:18 +02:00
|
|
|
port->user_name = psprintf("%s@%s", port->user_name, port->database_name);
|
2002-08-18 05:03:26 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2003-04-18 00:26:02 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* Truncate given database and user names to length of a Postgres name.
|
|
|
|
* This avoids lookup failures when overlength names are given.
|
2003-04-18 00:26:02 +02:00
|
|
|
*/
|
|
|
|
if (strlen(port->database_name) >= NAMEDATALEN)
|
|
|
|
port->database_name[NAMEDATALEN - 1] = '\0';
|
|
|
|
if (strlen(port->user_name) >= NAMEDATALEN)
|
|
|
|
port->user_name[NAMEDATALEN - 1] = '\0';
|
|
|
|
|
2014-03-10 18:50:28 +01:00
|
|
|
/*
|
|
|
|
* Normal walsender backends, e.g. for streaming replication, are not
|
|
|
|
* connected to a particular database. But walsenders used for logical
|
|
|
|
* replication need to connect to a specific database. We allow streaming
|
|
|
|
* replication commands to be issued even if connected to a database as it
|
|
|
|
* can make sense to first make a basebackup and then stream changes
|
|
|
|
* starting from that.
|
|
|
|
*/
|
|
|
|
if (am_walsender && !am_db_walsender)
|
2010-01-15 10:19:10 +01:00
|
|
|
port->database_name[0] = '\0';
|
|
|
|
|
2003-04-18 00:26:02 +02:00
|
|
|
/*
|
|
|
|
* Done putting stuff in TopMemoryContext.
|
|
|
|
*/
|
|
|
|
MemoryContextSwitchTo(oldcontext);
|
|
|
|
|
2000-11-29 21:59:54 +01:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* If we're going to reject the connection due to database state, say so
|
|
|
|
* now instead of wasting cycles on an authentication exchange. (This also
|
|
|
|
* allows a pg_ping utility to be written.)
|
2000-11-29 21:59:54 +01:00
|
|
|
*/
|
2003-12-20 18:31:21 +01:00
|
|
|
switch (port->canAcceptConnections)
|
2001-08-30 21:02:42 +02:00
|
|
|
{
|
|
|
|
case CAC_STARTUP:
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_CANNOT_CONNECT_NOW),
|
|
|
|
errmsg("the database system is starting up")));
|
2001-08-30 21:02:42 +02:00
|
|
|
break;
|
|
|
|
case CAC_SHUTDOWN:
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_CANNOT_CONNECT_NOW),
|
|
|
|
errmsg("the database system is shutting down")));
|
2001-08-30 21:02:42 +02:00
|
|
|
break;
|
|
|
|
case CAC_RECOVERY:
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_CANNOT_CONNECT_NOW),
|
|
|
|
errmsg("the database system is in recovery mode")));
|
2001-08-30 21:02:42 +02:00
|
|
|
break;
|
|
|
|
case CAC_TOOMANY:
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(FATAL,
|
|
|
|
(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
|
|
|
|
errmsg("sorry, too many clients already")));
|
2001-08-30 21:02:42 +02:00
|
|
|
break;
|
2008-04-27 00:47:40 +02:00
|
|
|
case CAC_WAITBACKUP:
|
|
|
|
/* OK for now, will check in InitPostgres */
|
|
|
|
break;
|
2001-08-30 21:02:42 +02:00
|
|
|
case CAC_OK:
|
2002-05-29 01:56:51 +02:00
|
|
|
break;
|
2001-08-30 21:02:42 +02:00
|
|
|
}
|
2000-11-29 21:59:54 +01:00
|
|
|
|
2001-06-20 20:07:56 +02:00
|
|
|
return STATUS_OK;
|
1998-07-09 05:29:11 +02:00
|
|
|
}

/*
 * The client has sent a cancel request packet, not a normal
 * start-a-new-connection packet.  Perform the necessary processing.
 * Nothing is sent back to the client.
 */
static void
processCancelRequest(Port *port, void *pkt)
{
	CancelRequestPacket *canc = (CancelRequestPacket *) pkt;
	int			backendPID;
	int32		cancelAuthCode;
	Backend    *bp;

#ifndef EXEC_BACKEND
	dlist_iter	iter;
#else
	int			i;
#endif

	backendPID = (int) ntohl(canc->backendPID);
	cancelAuthCode = (int32) ntohl(canc->cancelAuthCode);

	/*
	 * See if we have a matching backend.  In the EXEC_BACKEND case, we can no
	 * longer access the postmaster's own backend list, and must rely on the
	 * duplicate array in shared memory.
	 */
#ifndef EXEC_BACKEND
	dlist_foreach(iter, &BackendList)
	{
		bp = dlist_container(Backend, elem, iter.cur);
#else
	for (i = MaxLivePostmasterChildren() - 1; i >= 0; i--)
	{
		bp = (Backend *) &ShmemBackendArray[i];
#endif
		if (bp->pid == backendPID)
		{
			if (bp->cancel_key == cancelAuthCode)
			{
				/* Found a match; signal that backend to cancel current op */
				ereport(DEBUG2,
						(errmsg_internal("processing cancel request: sending SIGINT to process %d",
										 backendPID)));
				signal_child(bp->pid, SIGINT);
			}
			else
				/* Right PID, wrong key: no way, Jose */
				ereport(LOG,
						(errmsg("wrong key in cancel request for process %d",
								backendPID)));
			return;
		}
	}

	/* No matching backend */
	ereport(LOG,
			(errmsg("PID %d in cancel request did not match any process",
					backendPID)));
}

/*
 * canAcceptConnections --- check to see if database state allows connections.
 */
static CAC_state
canAcceptConnections(void)
{
	CAC_state	result = CAC_OK;

	/*
	 * Can't start backends when in startup/shutdown/inconsistent recovery
	 * state.
	 *
	 * In state PM_WAIT_BACKUP only superusers can connect (this must be
	 * allowed so that a superuser can end online backup mode); we return
	 * CAC_WAITBACKUP code to indicate that this must be checked later.  Note
	 * that neither CAC_OK nor CAC_WAITBACKUP can safely be returned until we
	 * have checked for too many children.
	 */
	if (pmState != PM_RUN)
	{
		if (pmState == PM_WAIT_BACKUP)
			result = CAC_WAITBACKUP;	/* allow superusers only */
		else if (Shutdown > NoShutdown)
			return CAC_SHUTDOWN;	/* shutdown is pending */
		else if (!FatalError &&
				 (pmState == PM_STARTUP ||
				  pmState == PM_RECOVERY))
			return CAC_STARTUP; /* normal startup */
		else if (!FatalError &&
				 pmState == PM_HOT_STANDBY)
			result = CAC_OK;	/* connection OK during hot standby */
		else
			return CAC_RECOVERY;	/* else must be crash recovery */
	}

	/*
	 * Don't start too many children.
	 *
	 * We allow more connections than we can have backends here because some
	 * might still be authenticating; they might fail auth, or some existing
	 * backend might exit before the auth cycle is completed.  The exact
	 * MaxBackends limit is enforced when a new backend tries to join the
	 * shared-inval backend array.
	 *
	 * The limit here must match the sizes of the per-child-process arrays;
	 * see comments for MaxLivePostmasterChildren().
	 */
	if (CountChildren(BACKEND_TYPE_ALL) >= MaxLivePostmasterChildren())
		result = CAC_TOOMANY;

	return result;
}

/*
 * ConnCreate -- create a local connection data structure
 *
 * Returns NULL on failure, other than out-of-memory which is fatal.
 */
static Port *
ConnCreate(int serverFd)
{
	Port	   *port;

	if (!(port = (Port *) calloc(1, sizeof(Port))))
	{
		ereport(LOG,
				(errcode(ERRCODE_OUT_OF_MEMORY),
				 errmsg("out of memory")));
		ExitPostmaster(1);
	}

	if (StreamConnection(serverFd, port) != STATUS_OK)
	{
		if (port->sock != PGINVALID_SOCKET)
			StreamClose(port->sock);
		ConnFree(port);
		return NULL;
	}

	/*
	 * Allocate GSSAPI specific state struct
	 */
#ifndef EXEC_BACKEND
#if defined(ENABLE_GSS) || defined(ENABLE_SSPI)
	port->gss = (pg_gssinfo *) calloc(1, sizeof(pg_gssinfo));
	if (!port->gss)
	{
		ereport(LOG,
				(errcode(ERRCODE_OUT_OF_MEMORY),
				 errmsg("out of memory")));
		ExitPostmaster(1);
	}
#endif
#endif

	return port;
}

/*
 * ConnFree -- free a local connection data structure
 */
static void
ConnFree(Port *conn)
{
#ifdef USE_SSL
	secure_close(conn);
#endif
	if (conn->gss)
		free(conn->gss);
	free(conn);
}

/*
 * ClosePostmasterPorts -- close all the postmaster's open sockets
 *
 * This is called during child process startup to release file descriptors
 * that are not needed by that child process.  The postmaster still has
 * them open, of course.
 *
 * Note: we pass am_syslogger as a boolean because we don't want to set
 * the global variable yet when this is called.
 */
void
ClosePostmasterPorts(bool am_syslogger)
{
	int			i;

#ifndef WIN32

	/*
	 * Close the write end of postmaster death watch pipe.  It's important to
	 * do this as early as possible, so that if postmaster dies, others won't
	 * think that it's still running because we're holding the pipe open.
	 */
	if (close(postmaster_alive_fds[POSTMASTER_FD_OWN]))
		ereport(FATAL,
				(errcode_for_file_access(),
				 errmsg_internal("could not close postmaster death monitoring pipe in child process: %m")));
	postmaster_alive_fds[POSTMASTER_FD_OWN] = -1;
#endif

	/* Close the listen sockets */
	for (i = 0; i < MAXLISTEN; i++)
	{
		if (ListenSocket[i] != PGINVALID_SOCKET)
		{
			StreamClose(ListenSocket[i]);
			ListenSocket[i] = PGINVALID_SOCKET;
		}
	}

	/* If using syslogger, close the read side of the pipe */
	if (!am_syslogger)
	{
#ifndef WIN32
		if (syslogPipe[0] >= 0)
			close(syslogPipe[0]);
		syslogPipe[0] = -1;
#else
		if (syslogPipe[0])
			CloseHandle(syslogPipe[0]);
		syslogPipe[0] = 0;
#endif
	}

#ifdef USE_BONJOUR
	/* If using Bonjour, close the connection to the mDNS daemon */
	if (bonjour_sdref)
		close(DNSServiceRefSockFD(bonjour_sdref));
#endif
}

/*
 * reset_shared -- reset shared memory and semaphores
 */
static void
reset_shared(int port)
{
	/*
	 * Create or re-create shared memory and semaphores.
	 *
	 * Note: in each "cycle of life" we will normally assign the same IPC keys
	 * (if using SysV shmem and/or semas), since the port number is used to
	 * determine IPC keys.  This helps ensure that we will clean up dead IPC
	 * objects if the postmaster crashes and is restarted.
	 */
	CreateSharedMemoryAndSemaphores(false, port);
}

/*
 * SIGHUP -- reread config files, and tell children to do same
 */
static void
SIGHUP_handler(SIGNAL_ARGS)
{
	int			save_errno = errno;

	PG_SETMASK(&BlockSig);

	if (Shutdown <= SmartShutdown)
	{
		ereport(LOG,
				(errmsg("received SIGHUP, reloading configuration files")));
		ProcessConfigFile(PGC_SIGHUP);
		SignalChildren(SIGHUP);
		if (StartupPID != 0)
			signal_child(StartupPID, SIGHUP);
		if (BgWriterPID != 0)
			signal_child(BgWriterPID, SIGHUP);
		if (CheckpointerPID != 0)
			signal_child(CheckpointerPID, SIGHUP);
		if (WalWriterPID != 0)
			signal_child(WalWriterPID, SIGHUP);
		if (WalReceiverPID != 0)
			signal_child(WalReceiverPID, SIGHUP);
		if (AutoVacPID != 0)
			signal_child(AutoVacPID, SIGHUP);
		if (PgArchPID != 0)
			signal_child(PgArchPID, SIGHUP);
		if (SysLoggerPID != 0)
			signal_child(SysLoggerPID, SIGHUP);
		if (PgStatPID != 0)
			signal_child(PgStatPID, SIGHUP);

		/* Reload authentication config files too */
		if (!load_hba())
			ereport(LOG,
					(errmsg("pg_hba.conf was not reloaded")));

		if (!load_ident())
			ereport(LOG,
					(errmsg("pg_ident.conf was not reloaded")));

#ifdef USE_SSL
		/* Reload SSL configuration as well */
		if (EnableSSL)
		{
			if (secure_initialize(false) == 0)
				LoadedSSL = true;
			else
				ereport(LOG,
						(errmsg("SSL configuration was not reloaded")));
		}
		else
		{
			secure_destroy();
			LoadedSSL = false;
		}
#endif

#ifdef EXEC_BACKEND
		/* Update the starting-point file for future children */
		write_nondefault_variables(PGC_SIGHUP);
#endif
	}

	PG_SETMASK(&UnBlockSig);

	errno = save_errno;
}
|
|
|
|
|
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
/*
|
XLOG (and related) changes:
* Store two past checkpoint locations, not just one, in pg_control.
On startup, we fall back to the older checkpoint if the newer one
is unreadable. Also, a physical copy of the newest checkpoint record
is kept in pg_control for possible use in disaster recovery (ie,
complete loss of pg_xlog). Also add a version number for pg_control
itself. Remove archdir from pg_control; it ought to be a GUC
parameter, not a special case (not that it's implemented yet anyway).
* Suppress successive checkpoint records when nothing has been entered
in the WAL log since the last one. This is not so much to avoid I/O
as to make it actually useful to keep track of the last two
checkpoints. If the things are right next to each other then there's
not a lot of redundancy gained...
* Change CRC scheme to a true 64-bit CRC, not a pair of 32-bit CRCs
on alternate bytes. Polynomial borrowed from ECMA DLT1 standard.
* Fix XLOG record length handling so that it will work at BLCKSZ = 32k.
* Change XID allocation to work more like OID allocation. (This is of
dubious necessity, but I think it's a good idea anyway.)
* Fix a number of minor bugs, such as off-by-one logic for XLOG file
wraparound at the 4 gig mark.
* Add documentation and clean up some coding infelicities; move file
format declarations out to include files where planned contrib
utilities can get at them.
* Checkpoint will now occur every CHECKPOINT_SEGMENTS log segments or
every CHECKPOINT_TIMEOUT seconds, whichever comes first. It is also
possible to force a checkpoint by sending SIGUSR1 to the postmaster
(undocumented feature...)
* Defend against kill -9 postmaster by storing shmem block's key and ID
in postmaster.pid lockfile, and checking at startup to ensure that no
processes are still connected to old shmem block (if it still exists).
* Switch backends to accept SIGQUIT rather than SIGUSR1 for emergency
stop, for symmetry with postmaster and xlog utilities. Clean up signal
handling in bootstrap.c so that xlog utilities launched by postmaster
will react to signals better.
* Standalone bootstrap now grabs lockfile in target directory, as added
insurance against running it in parallel with live postmaster.
2001-03-13 02:17:06 +01:00
|
|
|
* pmdie -- signal handler for processing various postmaster signals.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
|
|
|
static void
|
2000-08-29 11:36:51 +02:00
|
|
|
pmdie(SIGNAL_ARGS)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
2000-12-18 18:33:42 +01:00
|
|
|
int save_errno = errno;
|
|
|
|
|
1999-10-06 23:58:18 +02:00
|
|
|
PG_SETMASK(&BlockSig);
|
2000-04-12 19:17:23 +02:00
|
|
|
|
2003-08-12 20:23:21 +02:00
|
|
|
ereport(DEBUG2,
|
|
|
|
(errmsg_internal("postmaster received signal %d",
|
|
|
|
postgres_signal_arg)));
|
1998-08-25 23:34:10 +02:00
|
|
|
|
2000-08-29 11:36:51 +02:00
|
|
|
switch (postgres_signal_arg)
|
1998-09-01 06:40:42 +02:00
|
|
|
{
|
1998-08-25 23:34:10 +02:00
|
|
|
case SIGTERM:
|
2004-08-29 07:07:03 +02:00
|
|
|
|
1999-10-06 23:58:18 +02:00
|
|
|
/*
|
|
|
|
* Smart Shutdown:
|
|
|
|
*
|
2004-05-30 00:48:23 +02:00
|
|
|
* Wait for children to end their work, then shut down.
|
1999-10-06 23:58:18 +02:00
|
|
|
*/
|
|
|
|
if (Shutdown >= SmartShutdown)
|
2001-11-04 20:55:31 +01:00
|
|
|
break;
|
1999-10-06 23:58:18 +02:00
|
|
|
Shutdown = SmartShutdown;
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(LOG,
|
2016-02-10 22:01:04 +01:00
|
|
|
(errmsg("received smart shutdown request")));
|
Change pg_ctl to detect server-ready by watching status in postmaster.pid.
Traditionally, "pg_ctl start -w" has waited for the server to become
ready to accept connections by attempting a connection once per second.
That has the major problem that connection issues (for instance, a
kernel packet filter blocking traffic) can't be reliably told apart
from server startup issues, and the minor problem that if server startup
isn't quick, we accumulate "the database system is starting up" spam
in the server log. We've hacked around many of the possible connection
issues, but it resulted in ugly and complicated code in pg_ctl.c.
In commit c61559ec3, I changed the probe rate to every tenth of a second.
That prompted Jeff Janes to complain that the log-spam problem had become
much worse. In the ensuing discussion, Andres Freund pointed out that
we could dispense with connection attempts altogether if the postmaster
were changed to report its status in postmaster.pid, which "pg_ctl start"
already relies on being able to read. This patch implements that, teaching
postmaster.c to report a status string into the pidfile at the same
state-change points already identified as being of interest for systemd
status reporting (cf commit 7d17e683f). pg_ctl no longer needs to link
with libpq at all; all its functions now depend on reading server files.
In support of this, teach AddToDataDirLockFile() to allow addition of
postmaster.pid lines in not-necessarily-sequential order. This is needed
on Windows where the SHMEM_KEY line will never be written at all. We still
have the restriction that we don't want to truncate the pidfile; document
the reasons for that a bit better.
Also, fix the pg_ctl TAP tests so they'll notice if "start -w" mode
is broken --- before, they'd just wait out the sixty seconds until
the loop gives up, and then report success anyway. (Yes, I found that
out the hard way.)
While at it, arrange for pg_ctl to not need to #include miscadmin.h;
as a rather low-level backend header, requiring that to be compilable
client-side is pretty dubious. This requires moving the #define's
associated with the pidfile into a new header file, and moving
PG_BACKEND_VERSIONSTR someplace else. For lack of a clearly better
"someplace else", I put it into port.h, beside the declaration of
find_other_exec(), since most users of that macro are passing the value to
find_other_exec(). (initdb still depends on miscadmin.h, but at least
pg_ctl and pg_upgrade no longer do.)
In passing, fix main.c so that PG_BACKEND_VERSIONSTR actually defines the
output of "postgres -V", which remarkably it had never done before.
Discussion: https://postgr.es/m/CAMkU=1xJW8e+CTotojOMBd-yzUvD0e_JZu2xHo=MnuZ4__m7Pg@mail.gmail.com
2017-06-28 23:31:24 +02:00
|
|
|
|
|
|
|
/* Report status */
|
|
|
|
AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING);
|
2015-11-17 12:46:17 +01:00
|
|
|
#ifdef USE_SYSTEMD
|
|
|
|
sd_notify(0, "STOPPING=1");
|
|
|
|
#endif
|
2004-02-11 23:25:02 +01:00
|
|
|
|
2009-02-23 10:28:50 +01:00
|
|
|
if (pmState == PM_RUN || pmState == PM_RECOVERY ||
|
2010-05-26 14:32:41 +02:00
|
|
|
pmState == PM_HOT_STANDBY || pmState == PM_STARTUP)
|
2007-08-09 03:18:43 +02:00
|
|
|
{
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
/* autovac workers are told to shut down immediately */
|
|
|
|
/* and bgworkers too; does this need tweaking? */
|
|
|
|
SignalSomeChildren(SIGTERM,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
BACKEND_TYPE_AUTOVAC | BACKEND_TYPE_BGWORKER);
|
2007-08-09 03:18:43 +02:00
|
|
|
/* and the autovac launcher too */
|
|
|
|
if (AutoVacPID != 0)
|
|
|
|
signal_child(AutoVacPID, SIGTERM);
|
2012-05-11 23:46:08 +02:00
|
|
|
/* and the bgwriter too */
|
|
|
|
if (BgWriterPID != 0)
|
|
|
|
signal_child(BgWriterPID, SIGTERM);
|
2007-08-09 03:18:43 +02:00
|
|
|
/* and the walwriter too */
|
|
|
|
if (WalWriterPID != 0)
|
|
|
|
signal_child(WalWriterPID, SIGTERM);
|
2010-07-06 21:19:02 +02:00
|
|
|
|
2010-04-08 03:39:37 +02:00
|
|
|
/*
|
|
|
|
* If we're in recovery, we can't kill the startup process
|
|
|
|
* right away, because at present doing so does not release
|
|
|
|
* its locks. We might want to change this in a future
|
|
|
|
* release. For the time being, the PM_WAIT_READONLY state
|
|
|
|
* indicates that we're waiting for the regular (read only)
|
|
|
|
* backends to die off; once they do, we'll kill the startup
|
|
|
|
* and walreceiver processes.
|
|
|
|
*/
|
|
|
|
pmState = (pmState == PM_RUN) ?
|
|
|
|
PM_WAIT_BACKUP : PM_WAIT_READONLY;
|
2007-08-09 03:18:43 +02:00
|
|
|
}
|
2000-04-12 19:17:23 +02:00
|
|
|
|
1999-10-06 23:58:18 +02:00
|
|
|
/*
|
2010-02-26 03:01:40 +01:00
|
|
|
* Now wait for online backup mode to end and backends to exit. If
|
|
|
|
* that is already the case, PostmasterStateMachine will take the
|
|
|
|
* next step.
|
1999-10-06 23:58:18 +02:00
|
|
|
*/
|
2007-08-09 03:18:43 +02:00
|
|
|
PostmasterStateMachine();
|
2001-11-04 20:55:31 +01:00
|
|
|
break;
|
1999-10-06 23:58:18 +02:00
|
|
|
|
|
|
|
case SIGINT:
|
2004-08-29 07:07:03 +02:00
|
|
|
|
1999-10-06 23:58:18 +02:00
|
|
|
/*
|
|
|
|
* Fast Shutdown:
|
2000-04-12 19:17:23 +02:00
|
|
|
*
|
2005-11-22 19:17:34 +01:00
|
|
|
* Abort all children with SIGTERM (rollback active transactions
|
|
|
|
* and exit) and shut down when they are gone.
|
1999-10-06 23:58:18 +02:00
|
|
|
*/
|
|
|
|
if (Shutdown >= FastShutdown)
|
2001-11-04 20:55:31 +01:00
|
|
|
break;
|
2004-02-11 23:25:02 +01:00
|
|
|
Shutdown = FastShutdown;
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(LOG,
|
2016-02-10 22:01:04 +01:00
|
|
|
(errmsg("received fast shutdown request")));
|
Change pg_ctl to detect server-ready by watching status in postmaster.pid.
Traditionally, "pg_ctl start -w" has waited for the server to become
ready to accept connections by attempting a connection once per second.
That has the major problem that connection issues (for instance, a
kernel packet filter blocking traffic) can't be reliably told apart
from server startup issues, and the minor problem that if server startup
isn't quick, we accumulate "the database system is starting up" spam
in the server log. We've hacked around many of the possible connection
issues, but it resulted in ugly and complicated code in pg_ctl.c.
In commit c61559ec3, I changed the probe rate to every tenth of a second.
That prompted Jeff Janes to complain that the log-spam problem had become
much worse. In the ensuing discussion, Andres Freund pointed out that
we could dispense with connection attempts altogether if the postmaster
were changed to report its status in postmaster.pid, which "pg_ctl start"
already relies on being able to read. This patch implements that, teaching
postmaster.c to report a status string into the pidfile at the same
state-change points already identified as being of interest for systemd
status reporting (cf commit 7d17e683f). pg_ctl no longer needs to link
with libpq at all; all its functions now depend on reading server files.
In support of this, teach AddToDataDirLockFile() to allow addition of
postmaster.pid lines in not-necessarily-sequential order. This is needed
on Windows where the SHMEM_KEY line will never be written at all. We still
have the restriction that we don't want to truncate the pidfile; document
the reasons for that a bit better.
Also, fix the pg_ctl TAP tests so they'll notice if "start -w" mode
is broken --- before, they'd just wait out the sixty seconds until
the loop gives up, and then report success anyway. (Yes, I found that
out the hard way.)
While at it, arrange for pg_ctl to not need to #include miscadmin.h;
as a rather low-level backend header, requiring that to be compilable
client-side is pretty dubious. This requires moving the #define's
associated with the pidfile into a new header file, and moving
PG_BACKEND_VERSIONSTR someplace else. For lack of a clearly better
"someplace else", I put it into port.h, beside the declaration of
find_other_exec(), since most users of that macro are passing the value to
find_other_exec(). (initdb still depends on miscadmin.h, but at least
pg_ctl and pg_upgrade no longer do.)
In passing, fix main.c so that PG_BACKEND_VERSIONSTR actually defines the
output of "postgres -V", which remarkably it had never done before.
Discussion: https://postgr.es/m/CAMkU=1xJW8e+CTotojOMBd-yzUvD0e_JZu2xHo=MnuZ4__m7Pg@mail.gmail.com
2017-06-28 23:31:24 +02:00
|
|
|
|
|
|
|
/* Report status */
|
|
|
|
AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING);
|
2015-11-17 12:46:17 +01:00
|
|
|
#ifdef USE_SYSTEMD
|
|
|
|
sd_notify(0, "STOPPING=1");
|
|
|
|
#endif
|
2004-02-11 23:25:02 +01:00
|
|
|
|
2007-08-09 03:18:43 +02:00
|
|
|
if (StartupPID != 0)
|
|
|
|
signal_child(StartupPID, SIGTERM);
|
2011-11-01 18:14:47 +01:00
|
|
|
if (BgWriterPID != 0)
|
|
|
|
signal_child(BgWriterPID, SIGTERM);
|
2012-05-11 23:46:08 +02:00
|
|
|
if (WalReceiverPID != 0)
|
|
|
|
signal_child(WalReceiverPID, SIGTERM);
|
Start background writer during archive recovery. Background writer now performs
its usual buffer cleaning duties during archive recovery, and it's responsible
for performing restartpoints.
This requires some changes in postmaster. When the startup process has done
all the initialization and is ready to start WAL redo, it signals the
postmaster to launch the background writer. The postmaster is signaled again
when the point in recovery is reached where we know that the database is in
consistent state. Postmaster isn't interested in that at the moment, but
that's the point where we could let other backends in to perform read-only
queries. The postmaster is signaled third time when the recovery has ended,
so that postmaster knows that it's safe to start accepting connections.
The startup process now traps SIGTERM, and performs a "clean" shutdown. If
you do a fast shutdown during recovery, a shutdown restartpoint is performed,
like a shutdown checkpoint, and postmaster kills the processes cleanly. You
still have to continue the recovery at next startup, though.
Currently, the background writer is only launched during archive recovery.
We could launch it during crash recovery as well, but it seems better to keep
that codepath as simple as possible, for the sake of robustness. And it
couldn't do any restartpoints during crash recovery anyway, so it wouldn't be
that useful.
log_restartpoints is gone. Use log_checkpoints instead. This is yet to be
documented.
This whole operation is a pre-requisite for Hot Standby, but has some value of
its own whether the hot standby patch makes 8.4 or not.
Simon Riggs, with lots of modifications by me.
2009-02-18 16:58:41 +01:00
|
|
|
if (pmState == PM_RECOVERY)
|
|
|
|
{
|
2015-09-01 21:30:19 +02:00
|
|
|
SignalSomeChildren(SIGTERM, BACKEND_TYPE_BGWORKER);
|
2016-06-10 00:02:36 +02:00
|
|
|
|
2012-05-11 23:46:08 +02:00
|
|
|
/*
|
2015-09-01 21:30:19 +02:00
|
|
|
* Only startup, bgwriter, walreceiver, possibly bgworkers,
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
* and/or checkpointer should be active in this state; we just
|
|
|
|
* signaled the first four, and we don't want to kill
|
|
|
|
* checkpointer yet.
|
2012-05-11 23:46:08 +02:00
|
|
|
*/
|
Start background writer during archive recovery. Background writer now performs
its usual buffer cleaning duties during archive recovery, and it's responsible
for performing restartpoints.
This requires some changes in postmaster. When the startup process has done
all the initialization and is ready to start WAL redo, it signals the
postmaster to launch the background writer. The postmaster is signaled again
when the point in recovery is reached where we know that the database is in
consistent state. Postmaster isn't interested in that at the moment, but
that's the point where we could let other backends in to perform read-only
queries. The postmaster is signaled third time when the recovery has ended,
so that postmaster knows that it's safe to start accepting connections.
The startup process now traps SIGTERM, and performs a "clean" shutdown. If
you do a fast shutdown during recovery, a shutdown restartpoint is performed,
like a shutdown checkpoint, and postmaster kills the processes cleanly. You
still have to continue the recovery at next startup, though.
Currently, the background writer is only launched during archive recovery.
We could launch it during crash recovery as well, but it seems better to keep
that codepath as simple as possible, for the sake of robustness. And it
couldn't do any restartpoints during crash recovery anyway, so it wouldn't be
that useful.
log_restartpoints is gone. Use log_checkpoints instead. This is yet to be
documented.
This whole operation is a pre-requisite for Hot Standby, but has some value of
its own whether the hot standby patch makes 8.4 or not.
Simon Riggs, with lots of modifications by me.
2009-02-18 16:58:41 +01:00
|
|
|
pmState = PM_WAIT_BACKENDS;
|
|
|
|
}
|
2010-06-24 18:40:45 +02:00
|
|
|
else if (pmState == PM_RUN ||
|
|
|
|
pmState == PM_WAIT_BACKUP ||
|
|
|
|
pmState == PM_WAIT_READONLY ||
|
|
|
|
pmState == PM_WAIT_BACKENDS ||
|
|
|
|
pmState == PM_HOT_STANDBY)
|
1998-09-01 06:40:42 +02:00
|
|
|
{
|
2007-08-09 03:18:43 +02:00
|
|
|
ereport(LOG,
|
|
|
|
(errmsg("aborting any active transactions")));
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
/* shut down all backends and workers */
|
2010-01-15 10:19:10 +01:00
|
|
|
SignalSomeChildren(SIGTERM,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
BACKEND_TYPE_NORMAL | BACKEND_TYPE_AUTOVAC |
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
												 BACKEND_TYPE_BGWORKER);

				/* and the autovac launcher too */
				if (AutoVacPID != 0)
					signal_child(AutoVacPID, SIGTERM);
				/* and the walwriter too */
				if (WalWriterPID != 0)
					signal_child(WalWriterPID, SIGTERM);
				pmState = PM_WAIT_BACKENDS;
			}

			/*
			 * Now wait for backends to exit.  If there are none,
			 * PostmasterStateMachine will take the next step.
			 */
			PostmasterStateMachine();
			break;

		case SIGQUIT:

			/*
			 * Immediate Shutdown:
			 *
Send SIGKILL to children if they don't die quickly in immediate shutdown
On immediate shutdown, or during a restart-after-crash sequence,
postmaster used to send SIGQUIT (and then abandon ship if shutdown); but
this is not a good strategy if backends don't die because of that
signal. (This might happen, for example, if a backend gets tangled
trying to malloc() due to gettext(), as in an example illustrated by
MauMau.) This causes problems when later trying to restart the server,
because some processes are still attached to the shared memory segment.
Instead of just abandoning such backends to their fates, we now have
postmaster hang around for a little while longer, send a SIGKILL after
some reasonable waiting period, and then exit. This makes immediate
shutdown more reliable.
There is disagreement on whether it's best for postmaster to exit after
sending SIGKILL, or to stick around until all children have reported
death. If this controversy is resolved differently than what this patch
implements, it's an easy change to make.
Bug reported by MauMau in message 20DAEA8949EC4E2289C6E8E58560DEC0@maumau
MauMau and Álvaro Herrera
2013-06-28 23:20:53 +02:00
			 * abort all children with SIGQUIT, wait for them to exit,
			 * terminate remaining ones with SIGKILL, then exit without
			 * attempt to properly shut down the data base system.
			 */
			if (Shutdown >= ImmediateShutdown)
				break;
			Shutdown = ImmediateShutdown;
			ereport(LOG,
					(errmsg("received immediate shutdown request")));
Change pg_ctl to detect server-ready by watching status in postmaster.pid.
Traditionally, "pg_ctl start -w" has waited for the server to become
ready to accept connections by attempting a connection once per second.
That has the major problem that connection issues (for instance, a
kernel packet filter blocking traffic) can't be reliably told apart
from server startup issues, and the minor problem that if server startup
isn't quick, we accumulate "the database system is starting up" spam
in the server log. We've hacked around many of the possible connection
issues, but it resulted in ugly and complicated code in pg_ctl.c.
In commit c61559ec3, I changed the probe rate to every tenth of a second.
That prompted Jeff Janes to complain that the log-spam problem had become
much worse. In the ensuing discussion, Andres Freund pointed out that
we could dispense with connection attempts altogether if the postmaster
were changed to report its status in postmaster.pid, which "pg_ctl start"
already relies on being able to read. This patch implements that, teaching
postmaster.c to report a status string into the pidfile at the same
state-change points already identified as being of interest for systemd
status reporting (cf commit 7d17e683f). pg_ctl no longer needs to link
with libpq at all; all its functions now depend on reading server files.
In support of this, teach AddToDataDirLockFile() to allow addition of
postmaster.pid lines in not-necessarily-sequential order. This is needed
on Windows where the SHMEM_KEY line will never be written at all. We still
have the restriction that we don't want to truncate the pidfile; document
the reasons for that a bit better.
Also, fix the pg_ctl TAP tests so they'll notice if "start -w" mode
is broken --- before, they'd just wait out the sixty seconds until
the loop gives up, and then report success anyway. (Yes, I found that
out the hard way.)
While at it, arrange for pg_ctl to not need to #include miscadmin.h;
as a rather low-level backend header, requiring that to be compilable
client-side is pretty dubious. This requires moving the #define's
associated with the pidfile into a new header file, and moving
PG_BACKEND_VERSIONSTR someplace else. For lack of a clearly better
"someplace else", I put it into port.h, beside the declaration of
find_other_exec(), since most users of that macro are passing the value to
find_other_exec(). (initdb still depends on miscadmin.h, but at least
pg_ctl and pg_upgrade no longer do.)
In passing, fix main.c so that PG_BACKEND_VERSIONSTR actually defines the
output of "postgres -V", which remarkably it had never done before.
Discussion: https://postgr.es/m/CAMkU=1xJW8e+CTotojOMBd-yzUvD0e_JZu2xHo=MnuZ4__m7Pg@mail.gmail.com
2017-06-28 23:31:24 +02:00

			/* Report status */
			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING);
#ifdef USE_SYSTEMD
			sd_notify(0, "STOPPING=1");
#endif

			TerminateChildren(SIGQUIT);
			pmState = PM_WAIT_BACKENDS;

			/* set stopwatch for them to die */
			AbortStartTime = time(NULL);

			/*
			 * Now wait for backends to exit.  If there are none,
			 * PostmasterStateMachine will take the next step.
			 */
			PostmasterStateMachine();
			break;
	}

	PG_SETMASK(&UnBlockSig);

	errno = save_errno;
}


/*
 * Reaper -- signal handler to cleanup after a child process dies.
 */
static void
reaper(SIGNAL_ARGS)
{
	int			save_errno = errno;
	int			pid;			/* process id of dead child process */
	int			exitstatus;		/* its exit status */

	PG_SETMASK(&BlockSig);

	ereport(DEBUG4,
			(errmsg_internal("reaping dead processes")));

	while ((pid = waitpid(-1, &exitstatus, WNOHANG)) > 0)
	{
		/*
		 * Check if this child was a startup process.
		 */
		if (pid == StartupPID)
		{
			StartupPID = 0;
Don't launch new child processes after we've been told to shut down.
Once we've received a shutdown signal (SIGINT or SIGTERM), we should not
launch any more child processes, even if we get signals requesting such.
The normal code path for spawning backends has always understood that,
but the postmaster's infrastructure for hot standby and autovacuum didn't
get the memo. As reported by Hari Babu in bug #7643, this could lead to
failure to shut down at all in some cases, such as when SIGINT is received
just before the startup process sends PMSIGNAL_RECOVERY_STARTED: we'd
launch a bgwriter and checkpointer, and then those processes would have no
idea that they ought to quit. Similarly, launching a new autovacuum worker
would result in waiting till it finished before shutting down.
Also, switch the order of the code blocks in reaper() that detect startup
process crash versus shutdown termination. Once we've sent it a signal,
we should not consider that exit(1) is surprising. This is just a cosmetic
fix since shutdown occurs correctly anyway, but better not to log a phony
complaint about startup process crash.
Back-patch to 9.0. Some parts of this might be applicable before that,
but given the lack of prior complaints I'm not going to worry too much
about older branches.
2012-11-21 21:18:38 +01:00

			/*
			 * Startup process exited in response to a shutdown request (or it
			 * completed normally regardless of the shutdown request).
			 */
			if (Shutdown > NoShutdown &&
				(EXIT_STATUS_0(exitstatus) || EXIT_STATUS_1(exitstatus)))
			{
				StartupStatus = STARTUP_NOT_RUNNING;
				pmState = PM_WAIT_BACKENDS;
				/* PostmasterStateMachine logic does the rest */
				continue;
			}

			if (EXIT_STATUS_3(exitstatus))
			{
				ereport(LOG,
						(errmsg("shutdown at recovery target")));
				StartupStatus = STARTUP_NOT_RUNNING;
				Shutdown = SmartShutdown;
				TerminateChildren(SIGTERM);
				pmState = PM_WAIT_BACKENDS;
				/* PostmasterStateMachine logic does the rest */
				continue;
			}

Start background writer during archive recovery. Background writer now performs
its usual buffer cleaning duties during archive recovery, and it's responsible
for performing restartpoints.
This requires some changes in postmaster. When the startup process has done
all the initialization and is ready to start WAL redo, it signals the
postmaster to launch the background writer. The postmaster is signaled again
when the point in recovery is reached where we know that the database is in
consistent state. Postmaster isn't interested in that at the moment, but
that's the point where we could let other backends in to perform read-only
queries. The postmaster is signaled third time when the recovery has ended,
so that postmaster knows that it's safe to start accepting connections.
The startup process now traps SIGTERM, and performs a "clean" shutdown. If
you do a fast shutdown during recovery, a shutdown restartpoint is performed,
like a shutdown checkpoint, and postmaster kills the processes cleanly. You
still have to continue the recovery at next startup, though.
Currently, the background writer is only launched during archive recovery.
We could launch it during crash recovery as well, but it seems better to keep
that codepath as simple as possible, for the sake of robustness. And it
couldn't do any restartpoints during crash recovery anyway, so it wouldn't be
that useful.
log_restartpoints is gone. Use log_checkpoints instead. This is yet to be
documented.
This whole operation is a pre-requisite for Hot Standby, but has some value of
its own whether the hot standby patch makes 8.4 or not.
Simon Riggs, with lots of modifications by me.
2009-02-18 16:58:41 +01:00
			/*
			 * Unexpected exit of startup process (including FATAL exit)
			 * during PM_STARTUP is treated as catastrophic. There are no
			 * other processes running yet, so we can just exit.
			 */
			if (pmState == PM_STARTUP && !EXIT_STATUS_0(exitstatus))
			{
				LogChildExit(LOG, _("startup process"),
							 pid, exitstatus);
				ereport(LOG,
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
						(errmsg("aborting startup due to startup process failure")));
				ExitPostmaster(1);
			}

			/*
			 * After PM_STARTUP, any unexpected exit (including FATAL exit) of
			 * the startup process is catastrophic, so kill other children,
			 * and set StartupStatus so we don't try to reinitialize after
			 * they're gone.  Exception: if StartupStatus is STARTUP_SIGNALED,
			 * then we previously sent the startup process a SIGQUIT; so
			 * that's probably the reason it died, and we do want to try to
			 * restart in that case.
			 */
			if (!EXIT_STATUS_0(exitstatus))
			{
				if (StartupStatus == STARTUP_SIGNALED)
					StartupStatus = STARTUP_NOT_RUNNING;
				else
					StartupStatus = STARTUP_CRASHED;
				HandleChildCrash(pid, exitstatus,
								 _("startup process"));
				continue;
			}

			/*
			 * Startup succeeded, commence normal operations
			 */
			StartupStatus = STARTUP_NOT_RUNNING;
			FatalError = false;
			Assert(AbortStartTime == 0);
			ReachedNormalRunning = true;
			pmState = PM_RUN;

			/*
			 * Crank up the background tasks, if we didn't do that already
			 * when we entered consistent recovery state.  It doesn't matter
			 * if this fails, we'll just try again later.
			 */
			if (CheckpointerPID == 0)
				CheckpointerPID = StartCheckpointer();
			if (BgWriterPID == 0)
				BgWriterPID = StartBackgroundWriter();
			if (WalWriterPID == 0)
				WalWriterPID = StartWalWriter();

			/*
			 * Likewise, start other special children as needed.  In a restart
			 * situation, some of them may be alive already.
			 */
			if (!IsBinaryUpgrade && AutoVacuumingActive() && AutoVacPID == 0)
				AutoVacPID = StartAutoVacLauncher();
			if (PgArchStartupAllowed() && PgArchPID == 0)
				PgArchPID = pgarch_start();
			if (PgStatPID == 0)
				PgStatPID = pgstat_start();

			/* workers may be scheduled to start now */
Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long. However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once. On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives. (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.
As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.
There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).
Backpatch to 9.6 where parallel query was added. The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.
Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 22:17:29 +02:00
			maybe_start_bgworkers();

			/* at this point we are really open for business */
			ereport(LOG,
					(errmsg("database system is ready to accept connections")));
2017-06-28 23:31:24 +02:00
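The mechanism described above can be sketched in miniature. This is a hypothetical, self-contained model of the idea — postmaster.pid treated as a fixed set of numbered lines, where a line can be written even when an earlier line (e.g. SHMEM_KEY on Windows) was never written — not the real AddToDataDirLockFile(); all names and sizes here are illustrative.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LOCKFILE_LINES 8
#define LINE_LEN 64

/* In-memory stand-in for postmaster.pid: zero-initialized, so any
 * never-written line is simply empty. */
static char lockfile[MAX_LOCKFILE_LINES][LINE_LEN];

/* Set 1-based line `lineno`; lines need not be written in order. */
static void
set_lockfile_line(int lineno, const char *value)
{
	assert(lineno >= 1 && lineno <= MAX_LOCKFILE_LINES);
	snprintf(lockfile[lineno - 1], LINE_LEN, "%s", value);
}

/* pg_ctl-side check: read the status line instead of probing with libpq.
 * Line number 7 and the "ready" string are assumptions of this sketch. */
static int
server_ready(void)
{
	return strcmp(lockfile[7 - 1], "ready") == 0;
}
```

In this model the postmaster would call set_lockfile_line() at the same state-change points already used for systemd status reporting, and "pg_ctl start -w" would poll server_ready() rather than attempting connections, so connection failures can no longer masquerade as slow startup.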
|
|
|
/* Report status */
|
|
|
|
AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_READY);
|
2015-11-17 12:46:17 +01:00
|
|
|
#ifdef USE_SYSTEMD
|
|
|
|
sd_notify(0, "READY=1");
|
|
|
|
#endif
|
|
|
|
|
2009-02-25 12:07:43 +01:00
|
|
|
continue;
|
2004-05-30 00:48:23 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2012-06-10 21:20:04 +02:00
|
|
|
* Was it the bgwriter? Normal exit can be ignored; we'll start a new
|
|
|
|
* one at the next iteration of the postmaster's main loop, if
|
2012-05-11 23:46:08 +02:00
|
|
|
* necessary. Any other exit condition is treated as a crash.
|
2004-05-30 00:48:23 +02:00
|
|
|
*/
|
2007-08-09 03:18:43 +02:00
|
|
|
if (pid == BgWriterPID)
|
2004-05-30 00:48:23 +02:00
|
|
|
{
|
2004-06-14 20:08:19 +02:00
|
|
|
BgWriterPID = 0;
|
2011-11-01 18:14:47 +01:00
|
|
|
if (!EXIT_STATUS_0(exitstatus))
|
|
|
|
HandleChildCrash(pid, exitstatus,
|
|
|
|
_("background writer process"));
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Was it the checkpointer?
|
|
|
|
*/
|
|
|
|
if (pid == CheckpointerPID)
|
|
|
|
{
|
|
|
|
CheckpointerPID = 0;
|
2007-08-09 03:18:43 +02:00
|
|
|
if (EXIT_STATUS_0(exitstatus) && pmState == PM_SHUTDOWN)
|
1999-10-06 23:58:18 +02:00
|
|
|
{
|
2004-05-30 00:48:23 +02:00
|
|
|
/*
|
2012-06-10 21:20:04 +02:00
|
|
|
* OK, we saw normal exit of the checkpointer after it's been
|
|
|
|
* told to shut down. We expect that it wrote a shutdown
|
2014-05-06 18:12:18 +02:00
|
|
|
* checkpoint. (If for some reason it didn't, recovery will
|
2007-08-09 03:18:43 +02:00
|
|
|
* occur on next postmaster start.)
|
2004-07-27 03:46:03 +02:00
|
|
|
*
|
2008-01-11 01:54:09 +01:00
|
|
|
* At this point we should have no normal backend children
|
|
|
|
* left (else we'd not be in PM_SHUTDOWN state) but we might
|
|
|
|
* have dead_end children to wait for.
|
|
|
|
*
|
|
|
|
* If we have an archiver subprocess, tell it to do a last
|
2010-01-15 10:19:10 +01:00
|
|
|
* archive cycle and quit. Likewise, if we have walsender
|
|
|
|
* processes, tell them to send any remaining WAL and quit.
|
2004-05-30 00:48:23 +02:00
|
|
|
*/
|
2007-08-09 03:18:43 +02:00
|
|
|
Assert(Shutdown > NoShutdown);
|
2008-01-11 01:54:09 +01:00
|
|
|
|
2010-01-15 10:19:10 +01:00
|
|
|
/* Waken archiver for the last time */
|
2008-01-11 01:54:09 +01:00
|
|
|
if (PgArchPID != 0)
|
|
|
|
signal_child(PgArchPID, SIGUSR2);
|
2010-01-15 10:19:10 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Waken walsenders for the last time. No regular backends
|
|
|
|
* should be around anymore.
|
|
|
|
*/
|
2017-06-06 03:53:41 +02:00
|
|
|
SignalChildren(SIGUSR2);
|
2010-01-15 10:19:10 +01:00
|
|
|
|
|
|
|
pmState = PM_SHUTDOWN_2;
|
2008-01-11 01:54:09 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* We can also shut down the stats collector now; there's
|
|
|
|
* nothing left for it to do.
|
|
|
|
*/
|
|
|
|
if (PgStatPID != 0)
|
|
|
|
signal_child(PgStatPID, SIGQUIT);
|
1999-10-06 23:58:18 +02:00
|
|
|
}
|
2007-08-09 03:18:43 +02:00
|
|
|
else
|
2006-11-30 19:29:12 +01:00
|
|
|
{
|
2007-08-09 03:18:43 +02:00
|
|
|
/*
|
2012-06-10 21:20:04 +02:00
|
|
|
* Any unexpected exit of the checkpointer (including FATAL
|
|
|
|
* exit) is treated as a crash.
|
2007-08-09 03:18:43 +02:00
|
|
|
*/
|
|
|
|
HandleChildCrash(pid, exitstatus,
|
2011-11-01 18:14:47 +01:00
|
|
|
_("checkpointer process"));
|
2006-11-30 19:29:12 +01:00
|
|
|
}
|
|
|
|
|
2004-05-30 00:48:23 +02:00
|
|
|
continue;
|
1999-10-06 23:58:18 +02:00
|
|
|
}
|
2001-06-22 21:16:24 +02:00
|
|
|
|
2007-07-24 06:54:09 +02:00
|
|
|
/*
|
2007-11-15 22:14:46 +01:00
|
|
|
* Was it the wal writer? Normal exit can be ignored; we'll start a
|
|
|
|
* new one at the next iteration of the postmaster's main loop, if
|
|
|
|
* necessary. Any other exit condition is treated as a crash.
|
2007-07-24 06:54:09 +02:00
|
|
|
*/
|
2007-08-09 03:18:43 +02:00
|
|
|
if (pid == WalWriterPID)
|
2007-07-24 06:54:09 +02:00
|
|
|
{
|
|
|
|
WalWriterPID = 0;
|
|
|
|
if (!EXIT_STATUS_0(exitstatus))
|
|
|
|
HandleChildCrash(pid, exitstatus,
|
2007-11-08 15:47:51 +01:00
|
|
|
_("WAL writer process"));
|
2007-07-24 06:54:09 +02:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2010-01-15 10:19:10 +01:00
|
|
|
/*
|
|
|
|
* Was it the wal receiver? If exit status is zero (normal) or one
|
|
|
|
* (FATAL exit), we assume everything is all right just like normal
|
Don't lose walreceiver start requests due to race condition in postmaster.
When a walreceiver dies, the startup process will notice that and send
a PMSIGNAL_START_WALRECEIVER signal to the postmaster, asking for a new
walreceiver to be launched. There's a race condition, which at least
in HEAD is very easy to hit, whereby the postmaster might see that
signal before it processes the SIGCHLD from the walreceiver process.
In that situation, sigusr1_handler() just dropped the start request
on the floor, reasoning that it must be redundant. Eventually, after
10 seconds (WALRCV_STARTUP_TIMEOUT), the startup process would make a
fresh request --- but that's a long time if the connection could have
been re-established almost immediately.
Fix it by setting a state flag inside the postmaster that we won't
clear until we do launch a walreceiver. In cases where that results
in an extra walreceiver launch, it's up to the walreceiver to realize
it's unwanted and go away --- but we have, and need, that logic anyway
for the opposite race case.
I came across this through investigating unexpected delays in the
src/test/recovery TAP tests: it manifests there in test cases where
a master server is stopped and restarted while leaving streaming
slaves active.
This logic has been broken all along, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/21344.1498494720@sss.pgh.pa.us
2017-06-26 23:31:56 +02:00
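The fix described in this commit can be modeled in a few lines. This is a hypothetical single-process sketch: a start request sets a flag that is cleared only when a walreceiver is actually launched, so a request arriving before the old walreceiver's SIGCHLD has been reaped is not dropped. fork() and signals are stubbed out with plain ints; the variable and function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

static bool WalReceiverRequested = false;
static int	WalReceiverPID = 0;		/* 0 means "no walreceiver running" */
static int	fake_next_pid = 100;	/* stand-in for fork() */

static void
request_walreceiver(void)			/* PMSIGNAL_START_WALRECEIVER arrives */
{
	WalReceiverRequested = true;
}

static void
reap_walreceiver(void)				/* SIGCHLD for the old walreceiver */
{
	WalReceiverPID = 0;
}

static void
maybe_start_walreceiver(void)		/* called from the postmaster main loop */
{
	if (WalReceiverRequested && WalReceiverPID == 0)
	{
		WalReceiverPID = fake_next_pid++;
		WalReceiverRequested = false;	/* clear only on actual launch */
	}
}
```

Because the flag persists across main-loop iterations, the launch simply happens one iteration later instead of waiting out the 10-second WALRCV_STARTUP_TIMEOUT.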
|
|
|
* backends. (If we need a new wal receiver, we'll start one at the
|
|
|
|
* next iteration of the postmaster's main loop.)
|
2010-01-15 10:19:10 +01:00
|
|
|
*/
|
|
|
|
if (pid == WalReceiverPID)
|
|
|
|
{
|
|
|
|
WalReceiverPID = 0;
|
|
|
|
if (!EXIT_STATUS_0(exitstatus) && !EXIT_STATUS_1(exitstatus))
|
|
|
|
HandleChildCrash(pid, exitstatus,
|
|
|
|
_("WAL receiver process"));
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2005-07-14 07:13:45 +02:00
|
|
|
/*
|
2007-11-15 22:14:46 +01:00
|
|
|
* Was it the autovacuum launcher? Normal exit can be ignored; we'll
|
|
|
|
* start a new one at the next iteration of the postmaster's main
|
2014-05-06 18:12:18 +02:00
|
|
|
* loop, if necessary. Any other exit condition is treated as a
|
2007-11-15 22:14:46 +01:00
|
|
|
* crash.
|
2005-07-14 07:13:45 +02:00
|
|
|
*/
|
2007-08-09 03:18:43 +02:00
|
|
|
if (pid == AutoVacPID)
|
2005-07-14 07:13:45 +02:00
|
|
|
{
|
|
|
|
AutoVacPID = 0;
|
2007-02-16 00:23:23 +01:00
|
|
|
if (!EXIT_STATUS_0(exitstatus))
|
2005-07-14 07:13:45 +02:00
|
|
|
HandleChildCrash(pid, exitstatus,
|
2007-02-16 00:23:23 +01:00
|
|
|
_("autovacuum launcher process"));
|
2005-07-14 07:13:45 +02:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2004-07-19 04:47:16 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* Was it the archiver? If so, just try to start a new one; no need
|
|
|
|
* to force reset of the rest of the system. (If that fails, we'll try
|
2010-02-26 03:01:40 +01:00
|
|
|
* again in future cycles of the main loop.) Unless we were waiting
|
2012-04-22 18:23:47 +02:00
|
|
|
* for it to shut down; don't restart it in that case, and
|
2010-02-26 03:01:40 +01:00
|
|
|
* PostmasterStateMachine() will advance to the next shutdown step.
|
2004-07-19 04:47:16 +02:00
|
|
|
*/
|
2007-08-09 03:18:43 +02:00
|
|
|
if (pid == PgArchPID)
|
2004-07-19 04:47:16 +02:00
|
|
|
{
|
|
|
|
PgArchPID = 0;
|
2006-11-21 01:49:55 +01:00
|
|
|
if (!EXIT_STATUS_0(exitstatus))
|
2005-02-22 05:43:23 +01:00
|
|
|
LogChildExit(LOG, _("archiver process"),
|
2004-07-19 04:47:16 +02:00
|
|
|
pid, exitstatus);
|
2015-06-12 16:11:51 +02:00
|
|
|
if (PgArchStartupAllowed())
|
2004-07-19 04:47:16 +02:00
|
|
|
PgArchPID = pgarch_start();
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2004-06-14 20:08:19 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* Was it the statistics collector? If so, just try to start a new
|
|
|
|
* one; no need to force reset of the rest of the system. (If that fails,
|
|
|
|
* we'll try again in future cycles of the main loop.)
|
2004-06-14 20:08:19 +02:00
|
|
|
*/
|
2007-08-09 03:18:43 +02:00
|
|
|
if (pid == PgStatPID)
|
2004-06-14 20:08:19 +02:00
|
|
|
{
|
|
|
|
PgStatPID = 0;
|
2006-11-21 01:49:55 +01:00
|
|
|
if (!EXIT_STATUS_0(exitstatus))
|
2005-02-22 05:43:23 +01:00
|
|
|
LogChildExit(LOG, _("statistics collector process"),
|
2004-06-14 20:08:19 +02:00
|
|
|
pid, exitstatus);
|
2016-10-27 20:27:40 +02:00
|
|
|
if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
|
2004-06-14 20:08:19 +02:00
|
|
|
PgStatPID = pgstat_start();
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2007-08-09 03:18:43 +02:00
|
|
|
/* Was it the system logger? If so, try to start a new one */
|
|
|
|
if (pid == SysLoggerPID)
|
2004-08-06 01:32:13 +02:00
|
|
|
{
|
|
|
|
SysLoggerPID = 0;
|
|
|
|
/* for safety's sake, launch new logger *first* */
|
|
|
|
SysLoggerPID = SysLogger_Start();
|
2006-11-21 01:49:55 +01:00
|
|
|
if (!EXIT_STATUS_0(exitstatus))
|
2005-02-22 05:43:23 +01:00
|
|
|
LogChildExit(LOG, _("system logger process"),
|
2004-08-06 01:32:13 +02:00
|
|
|
pid, exitstatus);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up while the postmaster remains able to service external
connection requests quickly. Also, the shutdown sequence should not be
impacted by a reasonably well-behaved worker process (i.e. one that
promptly responds to termination signals).
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, the actions to take depend on registration
data: if shared memory was requested, then all other connections (as
well as other bgworkers) are taken down, just as if a regular backend
had crashed. The bgworker itself is restarted, too, within a
configurable timeframe (which can be set to never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
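The crash/restart bookkeeping this commit introduces can be sketched as follows. This is a simplified model, not the real RegisteredBgWorker logic: rw_crashed_at records when a worker died, and the worker may be relaunched once its restart interval has elapsed, or never if it was registered with BGW_NEVER_RESTART. Field names mirror the real structure only loosely, and timestamps are plain seconds here.

```c
#include <assert.h>
#include <stdbool.h>

#define BGW_NEVER_RESTART (-1)

typedef struct
{
	long		rw_crashed_at;		/* 0 = running or cleanly terminated */
	int			restart_interval;	/* seconds, or BGW_NEVER_RESTART */
} WorkerSlot;

/* May this crashed worker be relaunched at time `now`? */
static bool
may_restart_worker(const WorkerSlot *w, long now)
{
	if (w->rw_crashed_at == 0)
		return false;			/* nothing to restart */
	if (w->restart_interval == BGW_NEVER_RESTART)
		return false;			/* registered as never-restart */
	return now - w->rw_crashed_at >= w->restart_interval;
}
```

This matches the code below: CleanupBackgroundWorker() stamps rw_crashed_at on abnormal exit (and clears it, setting rw_terminate, on exit status 0), and later main-loop iterations consult the timestamp to decide when a relaunch is due.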
|
|
|
/* Was it one of our background workers? */
|
|
|
|
if (CleanupBackgroundWorker(pid, exitstatus))
|
|
|
|
{
|
|
|
|
/* have it be restarted */
|
|
|
|
HaveCrashedWorker = true;
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2003-05-10 20:15:42 +02:00
|
|
|
/*
|
2004-05-30 00:48:23 +02:00
|
|
|
* Else do standard backend child cleanup.
|
2003-05-10 20:15:42 +02:00
|
|
|
*/
|
2004-08-04 22:09:47 +02:00
|
|
|
CleanupBackend(pid, exitstatus);
|
2003-08-04 02:43:34 +02:00
|
|
|
} /* loop over pending child-death reports */
|
1999-10-06 23:58:18 +02:00
|
|
|
|
2007-08-09 03:18:43 +02:00
|
|
|
/*
|
|
|
|
* After cleaning out the SIGCHLD queue, see if we have any state changes
|
|
|
|
* or actions to make.
|
|
|
|
*/
|
|
|
|
PostmasterStateMachine();
|
1999-10-06 23:58:18 +02:00
|
|
|
|
2007-08-09 03:18:43 +02:00
|
|
|
/* Done with signal handler */
|
2001-11-04 20:55:31 +01:00
|
|
|
PG_SETMASK(&UnBlockSig);
|
|
|
|
|
2000-12-18 18:33:42 +01:00
|
|
|
errno = save_errno;
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
|
2012-12-06 18:57:52 +01:00
|
|
|
/*
|
|
|
|
* Scan the bgworkers list and see if the given PID (which has just stopped
|
|
|
|
* or crashed) is in it. Handle its shutdown if so, and return true. If not a
|
|
|
|
* bgworker, return false.
|
|
|
|
*
|
|
|
|
* This is heavily based on CleanupBackend. One important difference is that
|
|
|
|
* we don't know yet that the dying process is a bgworker, so we must be silent
|
|
|
|
* until we're sure it is.
|
|
|
|
*/
|
|
|
|
static bool
|
|
|
|
CleanupBackgroundWorker(int pid,
|
|
|
|
int exitstatus) /* child's exit status */
|
|
|
|
{
|
|
|
|
char namebuf[MAXPGPATH];
|
2017-05-17 22:31:56 +02:00
|
|
|
slist_mutable_iter iter;
|
2012-12-06 18:57:52 +01:00
|
|
|
|
2017-03-03 04:44:49 +01:00
|
|
|
slist_foreach_modify(iter, &BackgroundWorkerList)
|
2012-12-06 18:57:52 +01:00
|
|
|
{
|
|
|
|
RegisteredBgWorker *rw;
|
|
|
|
|
|
|
|
rw = slist_container(RegisteredBgWorker, rw_lnode, iter.cur);
|
|
|
|
|
|
|
|
if (rw->rw_pid != pid)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
#ifdef WIN32
|
|
|
|
/* see CleanupBackend */
|
|
|
|
if (exitstatus == ERROR_WAIT_NO_CHILDREN)
|
|
|
|
exitstatus = 0;
|
|
|
|
#endif
|
|
|
|
|
|
|
|
snprintf(namebuf, MAXPGPATH, "%s: %s", _("worker process"),
|
|
|
|
rw->rw_worker.bgw_name);
|
|
|
|
|
|
|
|
if (!EXIT_STATUS_0(exitstatus))
|
2014-05-07 23:43:39 +02:00
|
|
|
{
|
|
|
|
/* Record timestamp, so we know when to restart the worker. */
|
2012-12-06 18:57:52 +01:00
|
|
|
rw->rw_crashed_at = GetCurrentTimestamp();
|
2014-05-07 23:43:39 +02:00
|
|
|
}
|
2012-12-06 18:57:52 +01:00
|
|
|
else
|
2014-05-07 23:43:39 +02:00
|
|
|
{
|
|
|
|
/* Zero exit status means terminate */
|
2012-12-06 18:57:52 +01:00
|
|
|
rw->rw_crashed_at = 0;
|
2014-05-07 23:43:39 +02:00
|
|
|
rw->rw_terminate = true;
|
|
|
|
}
|
2012-12-06 18:57:52 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Additionally, for shared-memory-connected workers, just like a
|
|
|
|
* backend, any exit status other than 0 or 1 is considered a crash
|
|
|
|
* and causes a system-wide restart.
|
|
|
|
*/
|
2014-05-07 22:30:23 +02:00
|
|
|
if ((rw->rw_worker.bgw_flags & BGWORKER_SHMEM_ACCESS) != 0)
|
		{
			if (!EXIT_STATUS_0(exitstatus) && !EXIT_STATUS_1(exitstatus))
			{
				HandleChildCrash(pid, exitstatus, namebuf);
				return true;
			}
		}
		/*
		 * We must release the postmaster child slot whether this worker is
		 * connected to shared memory or not, but we only treat it as a crash
		 * if it is in fact connected.
		 */
		if (!ReleasePostmasterChildSlot(rw->rw_child_slot) &&
			(rw->rw_worker.bgw_flags & BGWORKER_SHMEM_ACCESS) != 0)
		{
			HandleChildCrash(pid, exitstatus, namebuf);
			return true;
		}

		/* Get it out of the BackendList and clear out remaining data */
		dlist_delete(&rw->rw_backend->elem);
#ifdef EXEC_BACKEND
		ShmemBackendArrayRemove(rw->rw_backend);
#endif

		/*
		 * It's possible that this background worker started some OTHER
		 * background worker and asked to be notified when that worker started
		 * or stopped.  If so, cancel any notifications destined for the
		 * now-dead backend.
		 */
		if (rw->rw_backend->bgworker_notify)
			BackgroundWorkerStopNotifications(rw->rw_pid);
		free(rw->rw_backend);
		rw->rw_backend = NULL;
		rw->rw_pid = 0;
		rw->rw_child_slot = 0;
Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
		ReportBackgroundWorkerExit(&iter);	/* report child death */
		LogChildExit(EXIT_STATUS_0(exitstatus) ? DEBUG1 : LOG,
					 namebuf, pid, exitstatus);
		return true;
	}

	return false;
}

/*
 * CleanupBackend -- cleanup after terminated backend.
 *
 * Remove all local state associated with backend.
 *
 * If you change this, see also CleanupBackgroundWorker.
 */
static void
CleanupBackend(int pid,
			   int exitstatus)	/* child's exit status. */
{
	dlist_mutable_iter iter;

	LogChildExit(DEBUG2, _("server process"), pid, exitstatus);

	/*
	 * If a backend dies in an ugly way then we must signal all other backends
	 * to quickdie.  If exit status is zero (normal) or one (FATAL exit), we
Install a "dead man switch" to allow the postmaster to detect cases where
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory. We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead. Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft). If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.
The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway. This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.
This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.
Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
2009-05-05 21:59:00 +02:00
	 * assume everything is all right and proceed to remove the backend from
	 * the active backend list.
	 */

#ifdef WIN32

	/*
	 * On win32, also treat ERROR_WAIT_NO_CHILDREN (128) as nonfatal case,
	 * since that sometimes happens under load when the process fails to start
	 * properly (long before it starts using shared memory). Microsoft reports
	 * it is related to mutex failure:
	 * http://archives.postgresql.org/pgsql-hackers/2010-09/msg00790.php
	 */
	if (exitstatus == ERROR_WAIT_NO_CHILDREN)
	{
		LogChildExit(LOG, _("server process"), pid, exitstatus);
		exitstatus = 0;
	}
#endif

	if (!EXIT_STATUS_0(exitstatus) && !EXIT_STATUS_1(exitstatus))
	{
		HandleChildCrash(pid, exitstatus, _("server process"));
		return;
	}

	dlist_foreach_modify(iter, &BackendList)
	{
		Backend    *bp = dlist_container(Backend, elem, iter.cur);

		if (bp->pid == pid)
		{
			if (!bp->dead_end)
			{
				if (!ReleasePostmasterChildSlot(bp->child_slot))
				{
					/*
					 * Uh-oh, the child failed to clean itself up.  Treat as a
					 * crash after all.
					 */
					HandleChildCrash(pid, exitstatus, _("server process"));
					return;
				}
#ifdef EXEC_BACKEND
				ShmemBackendArrayRemove(bp);
#endif
			}

			if (bp->bgworker_notify)
			{
				/*
				 * This backend may have been slated to receive SIGUSR1 when
				 * some background worker started or stopped.  Cancel those
				 * notifications, as we don't want to signal PIDs that are not
				 * PostgreSQL backends.  This gets skipped in the (probably
				 * very common) case where the backend has never requested any
				 * such notifications.
				 */
				BackgroundWorkerStopNotifications(bp->pid);
			}
			dlist_delete(iter.cur);
			free(bp);
			break;
		}
	}
}

/*
 * HandleChildCrash -- cleanup after failed backend, bgwriter, checkpointer,
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, the shutdown sequence should not be impacted
by a reasonably well-behaved worker process (i.e. one that promptly
responds to termination signals).
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, the actions to take depend on registration
data: if shared memory was requested, then all other connections (as
well as the other bgworkers) are taken down, just as if it were a
regular backend crash. The bgworker itself is also restarted, after a
configurable interval (which can be set to never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
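The start-time gating this commit message describes (postmaster start, consistent state, recovery finished) can be sketched as a small policy check. The enum values and function below are made-up illustrations of the policy, not the server's actual API or state names.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative worker start times, mirroring the three points named in
 * the commit message above (names are hypothetical). */
typedef enum
{
    START_POSTMASTER,    /* right after postmaster start; no shmem access */
    START_CONSISTENT,    /* once recovery reaches a consistent state */
    START_RECOVERY_DONE  /* only when normal backends are allowed */
} WorkerStartTime;

/* Illustrative postmaster lifecycle states. */
typedef enum
{
    PM_STARTUP,          /* crash/archive recovery still running */
    PM_HOT_STANDBY,      /* consistent state reached, still in recovery */
    PM_RUN               /* normal operation; backends allowed */
} PostmasterState;

/* May a worker with the given start time be launched in this state? */
static bool
worker_may_start(WorkerStartTime st, PostmasterState pm)
{
    switch (st)
    {
        case START_POSTMASTER:
            return true;                 /* always eligible */
        case START_CONSISTENT:
            return pm != PM_STARTUP;     /* hot standby or normal run */
        case START_RECOVERY_DONE:
            return pm == PM_RUN;         /* normal backends allowed */
    }
    return false;
}
```

A gate like this is what lets the ServerLoop start at most one pending worker per iteration without ever launching one before the resources it asked for exist.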
|
|
|
* walwriter, autovacuum, or background worker.
|
2004-05-30 00:48:23 +02:00
|
|
|
*
|
|
|
|
* The objectives here are to clean up our local state about the child
|
|
|
|
* process, and to signal all other remaining children to quickdie.
|
|
|
|
*/
|
|
|
|
static void
|
2004-08-04 22:09:47 +02:00
|
|
|
HandleChildCrash(int pid, int exitstatus, const char *procname)
|
2004-05-30 00:48:23 +02:00
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_mutable_iter iter;
|
2012-12-06 18:57:52 +01:00
|
|
|
slist_iter siter;
|
2004-05-30 00:48:23 +02:00
|
|
|
Backend *bp;
|
Send SIGKILL to children if they don't die quickly in immediate shutdown
On immediate shutdown, or during a restart-after-crash sequence,
postmaster used to send SIGQUIT (and then abandon ship if shutdown); but
this is not a good strategy if backends don't die because of that
signal. (This might happen, for example, if a backend gets tangled
trying to malloc() due to gettext(), as in an example illustrated by
MauMau.) This causes problems when later trying to restart the server,
because some processes are still attached to the shared memory segment.
Instead of just abandoning such backends to their fates, we now have
postmaster hang around for a little while longer, send a SIGKILL after
some reasonable waiting period, and then exit. This makes immediate
shutdown more reliable.
There is disagreement on whether it's best for postmaster to exit after
sending SIGKILL, or to stick around until all children have reported
death. If this controversy is resolved differently than what this patch
implements, it's an easy change to make.
Bug reported by MauMau in message 20DAEA8949EC4E2289C6E8E58560DEC0@maumau
MauMau and Álvaro Herrera
2013-06-28 23:20:53 +02:00
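The SIGQUIT-then-SIGKILL escalation can be sketched as a simple timeout policy: ask politely first, then force the issue once a grace period has elapsed. The grace-period constant here is an arbitrary stand-in, not the server's actual waiting period.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical grace period (seconds) before escalating to SIGKILL. */
#define SIGKILL_GRACE_SECONDS 5

/* Which signal should the postmaster send to a straggler child, given
 * how long ago (in seconds) the immediate shutdown was requested? */
static int
escalation_signal(int seconds_since_shutdown_request)
{
    if (seconds_since_shutdown_request >= SIGKILL_GRACE_SECONDS)
        return SIGKILL;     /* child ignored SIGQUIT; force it out so it
                             * detaches from the shared memory segment */
    return SIGQUIT;         /* ask for immediate exit, skipping proc_exit */
}
```

The design tension the commit message mentions is visible here: once SIGKILL is sent, the postmaster can either exit immediately or linger until every child's death is reported; this sketch only covers the signal choice, not that policy.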
|
|
|
bool take_action;
|
2001-08-30 21:02:42 +02:00
|
|
|
|
2004-05-30 00:48:23 +02:00
|
|
|
/*
|
2014-05-06 18:12:18 +02:00
|
|
|
* We only log messages and send signals if this is the first process
|
|
|
|
* crash and we're not doing an immediate shutdown; otherwise, we're only
|
|
|
|
* here to update postmaster's idea of live processes. If we have already
|
|
|
|
* signalled children, nonzero exit status is to be expected, so don't
|
|
|
|
* clutter log.
|
2004-05-30 00:48:23 +02:00
|
|
|
*/
|
2013-06-28 23:20:53 +02:00
|
|
|
take_action = !FatalError && Shutdown != ImmediateShutdown;
|
|
|
|
|
|
|
|
if (take_action)
|
1999-10-08 04:16:22 +02:00
|
|
|
{
|
2004-08-04 22:09:47 +02:00
|
|
|
LogChildExit(LOG, procname, pid, exitstatus);
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(LOG,
|
2005-10-15 04:49:52 +02:00
|
|
|
(errmsg("terminating any other active server processes")));
|
1999-10-08 04:16:22 +02:00
|
|
|
}
|
2000-12-20 22:51:52 +01:00
|
|
|
|
2012-12-06 18:57:52 +01:00
|
|
|
/* Process background workers. */
|
|
|
|
slist_foreach(siter, &BackgroundWorkerList)
|
|
|
|
{
|
|
|
|
RegisteredBgWorker *rw;
|
|
|
|
|
|
|
|
rw = slist_container(RegisteredBgWorker, rw_lnode, siter.cur);
|
|
|
|
if (rw->rw_pid == 0)
|
2013-05-29 22:58:43 +02:00
|
|
|
continue; /* not running */
|
2012-12-06 18:57:52 +01:00
|
|
|
if (rw->rw_pid == pid)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Found entry for freshly-dead worker, so remove it.
|
|
|
|
*/
|
|
|
|
(void) ReleasePostmasterChildSlot(rw->rw_child_slot);
|
2015-09-01 21:30:19 +02:00
|
|
|
dlist_delete(&rw->rw_backend->elem);
|
2012-12-06 18:57:52 +01:00
|
|
|
#ifdef EXEC_BACKEND
|
2015-09-01 21:30:19 +02:00
|
|
|
ShmemBackendArrayRemove(rw->rw_backend);
|
2012-12-06 18:57:52 +01:00
|
|
|
#endif
|
2015-09-01 21:30:19 +02:00
|
|
|
free(rw->rw_backend);
|
|
|
|
rw->rw_backend = NULL;
|
2012-12-06 18:57:52 +01:00
|
|
|
rw->rw_pid = 0;
|
|
|
|
rw->rw_child_slot = 0;
|
|
|
|
/* don't reset crashed_at */
|
2013-08-28 20:08:13 +02:00
|
|
|
/* don't report child stop, either */
|
2012-12-06 18:57:52 +01:00
|
|
|
/* Keep looping so we can signal remaining workers */
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
/*
|
2014-05-06 18:12:18 +02:00
|
|
|
* This worker is still alive. Unless we did so already, tell it
|
2012-12-06 18:57:52 +01:00
|
|
|
* to commit hara-kiri.
|
|
|
|
*
|
|
|
|
* SIGQUIT is the special signal that says exit without proc_exit
|
|
|
|
* and let the user know what's going on. But if SendStop is set
|
|
|
|
* (-s on command line), then we send SIGSTOP instead, so that we
|
|
|
|
* can get core dumps from all backends by hand.
|
|
|
|
*/
|
2013-06-28 23:20:53 +02:00
|
|
|
if (take_action)
|
2012-12-06 18:57:52 +01:00
|
|
|
{
|
|
|
|
ereport(DEBUG2,
|
|
|
|
(errmsg_internal("sending %s to process %d",
|
|
|
|
(SendStop ? "SIGSTOP" : "SIGQUIT"),
|
|
|
|
(int) rw->rw_pid)));
|
|
|
|
signal_child(rw->rw_pid, (SendStop ? SIGSTOP : SIGQUIT));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2004-05-30 00:48:23 +02:00
|
|
|
/* Process regular backends */
|
2012-10-16 22:36:30 +02:00
|
|
|
dlist_foreach_modify(iter, &BackendList)
|
1997-09-07 07:04:48 +02:00
|
|
|
{
|
2012-10-16 22:36:30 +02:00
|
|
|
bp = dlist_container(Backend, elem, iter.cur);
|
|
|
|
|
2004-05-30 00:48:23 +02:00
|
|
|
if (bp->pid == pid)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Found entry for freshly-dead backend, so remove it.
|
|
|
|
*/
|
2007-08-09 03:18:43 +02:00
|
|
|
if (!bp->dead_end)
|
2009-05-05 21:59:00 +02:00
|
|
|
{
|
|
|
|
(void) ReleasePostmasterChildSlot(bp->child_slot);
|
|
|
|
#ifdef EXEC_BACKEND
|
|
|
|
ShmemBackendArrayRemove(bp);
|
2007-08-09 03:18:43 +02:00
|
|
|
#endif
|
2009-05-05 21:59:00 +02:00
|
|
|
}
|
2012-10-19 01:04:20 +02:00
|
|
|
dlist_delete(iter.cur);
|
2004-05-30 00:48:23 +02:00
|
|
|
free(bp);
|
|
|
|
/* Keep looping so we can signal remaining backends */
|
|
|
|
}
|
|
|
|
		else
		{
			/*
			 * This backend is still alive.  Unless we did so already, tell it
			 * to commit hara-kiri.
			 *
			 * SIGQUIT is the special signal that says exit without proc_exit
			 * and let the user know what's going on.  But if SendStop is set
			 * (-s on command line), then we send SIGSTOP instead, so that we
			 * can get core dumps from all backends by hand.
			 *
			 * We could exclude dead_end children here, but at least in the
			 * SIGSTOP case it seems better to include them.
			 *
			 * Background workers were already processed above; ignore them
			 * here.
			 */
			if (bp->bkend_type == BACKEND_TYPE_BGWORKER)
				continue;

			if (take_action)
			{
				ereport(DEBUG2,
						(errmsg_internal("sending %s to process %d",
										 (SendStop ? "SIGSTOP" : "SIGQUIT"),
										 (int) bp->pid)));
				signal_child(bp->pid, (SendStop ? SIGSTOP : SIGQUIT));
			}
		}
	}
	/* Take care of the startup process too */
	if (pid == StartupPID)
	{
		StartupPID = 0;
		StartupStatus = STARTUP_CRASHED;
	}
	else if (StartupPID != 0 && take_action)
	{
		ereport(DEBUG2,
				(errmsg_internal("sending %s to process %d",
								 (SendStop ? "SIGSTOP" : "SIGQUIT"),
								 (int) StartupPID)));
		signal_child(StartupPID, (SendStop ? SIGSTOP : SIGQUIT));
		StartupStatus = STARTUP_SIGNALED;
	}
	/* Take care of the bgwriter too */
	if (pid == BgWriterPID)
		BgWriterPID = 0;
	else if (BgWriterPID != 0 && take_action)
	{
		ereport(DEBUG2,
				(errmsg_internal("sending %s to process %d",
								 (SendStop ? "SIGSTOP" : "SIGQUIT"),
								 (int) BgWriterPID)));
		signal_child(BgWriterPID, (SendStop ? SIGSTOP : SIGQUIT));
	}
	/* Take care of the checkpointer too */
	if (pid == CheckpointerPID)
		CheckpointerPID = 0;
	else if (CheckpointerPID != 0 && take_action)
	{
		ereport(DEBUG2,
				(errmsg_internal("sending %s to process %d",
								 (SendStop ? "SIGSTOP" : "SIGQUIT"),
								 (int) CheckpointerPID)));
		signal_child(CheckpointerPID, (SendStop ? SIGSTOP : SIGQUIT));
	}
	/* Take care of the walwriter too */
	if (pid == WalWriterPID)
		WalWriterPID = 0;
	else if (WalWriterPID != 0 && take_action)
	{
		ereport(DEBUG2,
				(errmsg_internal("sending %s to process %d",
								 (SendStop ? "SIGSTOP" : "SIGQUIT"),
								 (int) WalWriterPID)));
		signal_child(WalWriterPID, (SendStop ? SIGSTOP : SIGQUIT));
	}
	/* Take care of the walreceiver too */
	if (pid == WalReceiverPID)
		WalReceiverPID = 0;
	else if (WalReceiverPID != 0 && take_action)
	{
		ereport(DEBUG2,
				(errmsg_internal("sending %s to process %d",
								 (SendStop ? "SIGSTOP" : "SIGQUIT"),
								 (int) WalReceiverPID)));
		signal_child(WalReceiverPID, (SendStop ? SIGSTOP : SIGQUIT));
	}
	/* Take care of the autovacuum launcher too */
	if (pid == AutoVacPID)
		AutoVacPID = 0;
	else if (AutoVacPID != 0 && take_action)
	{
		ereport(DEBUG2,
				(errmsg_internal("sending %s to process %d",
								 (SendStop ? "SIGSTOP" : "SIGQUIT"),
								 (int) AutoVacPID)));
		signal_child(AutoVacPID, (SendStop ? SIGSTOP : SIGQUIT));
	}
	/*
	 * Force a power-cycle of the pgarch process too.  (This isn't absolutely
	 * necessary, but it seems like a good idea for robustness, and it
	 * simplifies the state-machine logic in the case where a shutdown request
	 * arrives during crash processing.)
	 */
	if (PgArchPID != 0 && take_action)
	{
		ereport(DEBUG2,
				(errmsg_internal("sending %s to process %d",
								 "SIGQUIT",
								 (int) PgArchPID)));
		signal_child(PgArchPID, SIGQUIT);
	}
	/*
	 * Force a power-cycle of the pgstat process too.  (This isn't absolutely
	 * necessary, but it seems like a good idea for robustness, and it
	 * simplifies the state-machine logic in the case where a shutdown request
	 * arrives during crash processing.)
	 */
	if (PgStatPID != 0 && take_action)
	{
		ereport(DEBUG2,
				(errmsg_internal("sending %s to process %d",
								 "SIGQUIT",
								 (int) PgStatPID)));
		signal_child(PgStatPID, SIGQUIT);
		allow_immediate_pgstat_restart();
	}
	/* We do NOT restart the syslogger */

	if (Shutdown != ImmediateShutdown)
		FatalError = true;
2007-08-09 03:18:43 +02:00
|
|
|
/* We now transit into a state of waiting for children to die */
|
Start background writer during archive recovery. Background writer now performs
its usual buffer cleaning duties during archive recovery, and it's responsible
for performing restartpoints.
This requires some changes in postmaster. When the startup process has done
all the initialization and is ready to start WAL redo, it signals the
postmaster to launch the background writer. The postmaster is signaled again
when the point in recovery is reached where we know that the database is in a
consistent state. The postmaster isn't interested in that at the moment, but
that's the point where we could let other backends in to perform read-only
queries. The postmaster is signaled a third time when recovery has ended,
so that the postmaster knows it's safe to start accepting connections.
The startup process now traps SIGTERM and performs a "clean" shutdown. If
you do a fast shutdown during recovery, a shutdown restartpoint is performed,
like a shutdown checkpoint, and the postmaster kills the processes cleanly. You
still have to continue the recovery at the next startup, though.
Currently, the background writer is only launched during archive recovery.
We could launch it during crash recovery as well, but it seems better to keep
that codepath as simple as possible, for the sake of robustness. And it
couldn't do any restartpoints during crash recovery anyway, so it wouldn't be
that useful.
log_restartpoints is gone. Use log_checkpoints instead. This is yet to be
documented.
This whole operation is a prerequisite for Hot Standby, but has some value of
its own whether the hot standby patch makes 8.4 or not.
Simon Riggs, with lots of modifications by me.
2009-02-18 16:58:41 +01:00
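The three startup-process notifications described in this commit message can be modeled as a tiny progression check. This is only a sketch: the enum and helper names below are hypothetical illustrations, not the actual PMSIGNAL interface.

```c
#include <stdbool.h>

/* Hypothetical model of the three notifications the startup process sends. */
typedef enum
{
	RECOVERY_NOT_STARTED,
	RECOVERY_STARTED,			/* ready to begin WAL redo: launch bgwriter */
	RECOVERY_CONSISTENT,		/* database consistent: could allow read-only */
	RECOVERY_FINISHED			/* recovery done: accept normal connections */
} RecoveryProgress;

/* May the postmaster accept ordinary (read-write) connections yet? */
static bool
can_accept_connections(RecoveryProgress p)
{
	return p == RECOVERY_FINISHED;
}

/* Should the background writer be running at this stage? */
static bool
bgwriter_should_run(RecoveryProgress p)
{
	return p >= RECOVERY_STARTED;
}
```

The ordering of the enum encodes the monotonic progression the commit message describes: each signal only ever moves recovery forward.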
	if (pmState == PM_RECOVERY ||
		pmState == PM_HOT_STANDBY ||
		pmState == PM_RUN ||
		pmState == PM_WAIT_BACKUP ||
		pmState == PM_WAIT_READONLY ||
		pmState == PM_SHUTDOWN)
		pmState = PM_WAIT_BACKENDS;
Send SIGKILL to children if they don't die quickly in immediate shutdown
On immediate shutdown, or during a restart-after-crash sequence,
postmaster used to send SIGQUIT (and then abandon ship if shutdown); but
this is not a good strategy if backends don't die because of that
signal. (This might happen, for example, if a backend gets tangled
trying to malloc() due to gettext(), as in an example illustrated by
MauMau.) This causes problems when later trying to restart the server,
because some processes are still attached to the shared memory segment.
Instead of just abandoning such backends to their fates, we now have
postmaster hang around for a little while longer, send a SIGKILL after
some reasonable waiting period, and then exit. This makes immediate
shutdown more reliable.
There is disagreement on whether it's best for postmaster to exit after
sending SIGKILL, or to stick around until all children have reported
death. If this controversy is resolved differently than what this patch
implements, it's an easy change to make.
Bug reported by MauMau in message 20DAEA8949EC4E2289C6E8E58560DEC0@maumau
MauMau and Álvaro Herrera
2013-06-28 23:20:53 +02:00
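The grace-period escalation this commit adds can be reduced to a pure timeout predicate. A minimal sketch, under stated assumptions: the constant value and the helper name are illustrative, not the exact source.

```c
#include <stdbool.h>
#include <time.h>

/* Assumed grace period before escalating to SIGKILL (illustrative value). */
#define SIGKILL_CHILDREN_AFTER_SECS 5

/*
 * Hypothetical helper: abort_start is when the immediate-shutdown or
 * crash sequence began (0 = no abort in progress).  Returns true once
 * the grace period has expired and the children should be SIGKILLed.
 */
static bool
sigkill_due(time_t abort_start, time_t now)
{
	return abort_start != 0 &&
		(now - abort_start) >= SIGKILL_CHILDREN_AFTER_SECS;
}
```

The postmaster's server loop would call such a check on every iteration; recording the start time once (as `AbortStartTime` does below) keeps the escalation decision stateless.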

		/*
		 * ... and if this doesn't happen quickly enough, now the clock is
		 * ticking for us to kill them without mercy.
		 */
		if (AbortStartTime == 0)
			AbortStartTime = time(NULL);
	}

/*
 * Log the death of a child process.
 */
static void
LogChildExit(int lev, const char *procname, int pid, int exitstatus)
{
	/*
	 * size of activity_buffer is arbitrary, but set equal to default
	 * track_activity_query_size
	 */
	char		activity_buffer[1024];
	const char *activity = NULL;

	if (!EXIT_STATUS_0(exitstatus))
		activity = pgstat_get_crashed_backend_activity(pid,
													   activity_buffer,
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
													   sizeof(activity_buffer));

	if (WIFEXITED(exitstatus))
		ereport(lev,

		/*------
		  translator: %s is a noun phrase describing a child process, such as
		  "server process" */
				(errmsg("%s (PID %d) exited with exit code %d",
						procname, pid, WEXITSTATUS(exitstatus)),
				 activity ? errdetail("Failed process was running: %s", activity) : 0));
	else if (WIFSIGNALED(exitstatus))
#if defined(WIN32)
		ereport(lev,

		/*------
		  translator: %s is a noun phrase describing a child process, such as
		  "server process" */
				(errmsg("%s (PID %d) was terminated by exception 0x%X",
						procname, pid, WTERMSIG(exitstatus)),
				 errhint("See C include file \"ntstatus.h\" for a description of the hexadecimal value."),
				 activity ? errdetail("Failed process was running: %s", activity) : 0));
#elif defined(HAVE_DECL_SYS_SIGLIST) && HAVE_DECL_SYS_SIGLIST
		ereport(lev,

		/*------
		  translator: %s is a noun phrase describing a child process, such as
		  "server process" */
				(errmsg("%s (PID %d) was terminated by signal %d: %s",
						procname, pid, WTERMSIG(exitstatus),
						WTERMSIG(exitstatus) < NSIG ?
						sys_siglist[WTERMSIG(exitstatus)] : "(unknown)"),
				 activity ? errdetail("Failed process was running: %s", activity) : 0));
#else
		ereport(lev,

		/*------
		  translator: %s is a noun phrase describing a child process, such as
		  "server process" */
				(errmsg("%s (PID %d) was terminated by signal %d",
						procname, pid, WTERMSIG(exitstatus)),
				 activity ? errdetail("Failed process was running: %s", activity) : 0));
#endif
	else
		ereport(lev,

		/*------
		  translator: %s is a noun phrase describing a child process, such as
		  "server process" */
				(errmsg("%s (PID %d) exited with unrecognized status %d",
						procname, pid, exitstatus),
				 activity ? errdetail("Failed process was running: %s", activity) : 0));
}
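LogChildExit's three branches mirror the standard POSIX wait-status macros. The following standalone sketch performs the same three-way decoding; the message strings are simplified stand-ins for the ereport text above.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Decode a wait()/waitpid() status the same three-way LogChildExit does. */
static void
describe_exit(int exitstatus, char *buf, size_t buflen)
{
	if (WIFEXITED(exitstatus))
		snprintf(buf, buflen, "exited with exit code %d",
				 WEXITSTATUS(exitstatus));
	else if (WIFSIGNALED(exitstatus))
		snprintf(buf, buflen, "was terminated by signal %d",
				 WTERMSIG(exitstatus));
	else
		snprintf(buf, buflen, "exited with unrecognized status %d",
				 exitstatus);
}
```

Forking a child that calls `_exit(3)` and waiting on it yields "exited with exit code 3"; a child killed by a signal takes the WIFSIGNALED branch instead.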


/*
 * Advance the postmaster's state machine and take actions as appropriate
 *
 * This is common code for pmdie(), reaper() and sigusr1_handler(), which
 * receive the signals that might mean we need to change state.
 */
static void
PostmasterStateMachine(void)
{
	if (pmState == PM_WAIT_BACKUP)
	{
		/*
		 * PM_WAIT_BACKUP state ends when online backup mode is not active.
		 */
		if (!BackupInProgress())
			pmState = PM_WAIT_BACKENDS;
	}

	if (pmState == PM_WAIT_READONLY)
	{
		/*
		 * PM_WAIT_READONLY state ends when we have no regular backends that
		 * have been started during recovery.  We kill the startup and
		 * walreceiver processes and transition to PM_WAIT_BACKENDS.  Ideally,
		 * we might like to kill these processes first and then wait for
		 * backends to die off, but that doesn't work at present because
		 * killing the startup process doesn't release its locks.
		 */
		if (CountChildren(BACKEND_TYPE_NORMAL) == 0)
		{
			if (StartupPID != 0)
				signal_child(StartupPID, SIGTERM);
			if (WalReceiverPID != 0)
				signal_child(WalReceiverPID, SIGTERM);
			pmState = PM_WAIT_BACKENDS;
		}
	}

	/*
	 * If we are in a state-machine state that implies waiting for backends to
	 * exit, see if they're all gone, and change state if so.
	 */
	if (pmState == PM_WAIT_BACKENDS)
	{
		/*
		 * PM_WAIT_BACKENDS state ends when we have no regular backends
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and the postmaster would still be able to quickly service
external connection requests. Also, the shutdown sequence should not be
impacted by a worker process that's reasonably well behaved (i.e.
promptly responds to termination signals).
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when a consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, the actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just as if it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
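The "only one such process is started on each ServerLoop iteration" policy above can be sketched as a scan that picks a single pending worker per pass. The struct and function names here are hypothetical illustrations, not the actual bgworker API.

```c
#include <stdbool.h>

/* Hypothetical per-worker bookkeeping. */
typedef struct
{
	bool		registered;		/* registered via _PG_init() */
	bool		running;		/* already forked */
} WorkerSlot;

/*
 * Hypothetical helper: return the index of the first registered but
 * not-yet-running worker, or -1 if none.  Calling this once per
 * server-loop iteration and starting only that one worker keeps the
 * loop responsive even when many workers are waiting to be launched.
 */
static int
next_worker_to_start(const WorkerSlot *slots, int n)
{
	for (int i = 0; i < n; i++)
	{
		if (slots[i].registered && !slots[i].running)
			return i;
	}
	return -1;
}
```

Starting at most one worker per pass trades a slightly longer ramp-up for guaranteed responsiveness to incoming connection requests, which is the design choice the commit message describes.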
		 * (including autovac workers), no bgworkers (including unconnected
		 * ones), and no walwriter, autovac launcher or bgwriter.  If we are
		 * doing crash recovery or an immediate shutdown then we expect the
		 * checkpointer to exit as well, otherwise not.  The archiver, stats,
		 * and syslogger processes are disregarded since they are not
		 * connected to shared memory; we also disregard dead_end children
		 * here.  Walsenders are also disregarded, they will be terminated
		 * later after writing the checkpoint record, like the archiver
		 * process.
		 */
		if (CountChildren(BACKEND_TYPE_NORMAL | BACKEND_TYPE_WORKER) == 0 &&
			StartupPID == 0 &&
			WalReceiverPID == 0 &&
			BgWriterPID == 0 &&
			(CheckpointerPID == 0 ||
			 (!FatalError && Shutdown < ImmediateShutdown)) &&
			WalWriterPID == 0 &&
			AutoVacPID == 0)
		{
			if (Shutdown >= ImmediateShutdown || FatalError)
			{
				/*
				 * Start waiting for dead_end children to die.  This state
				 * change causes ServerLoop to stop creating new ones.
				 */
				pmState = PM_WAIT_DEAD_END;

				/*
				 * We already SIGQUIT'd the archiver and stats processes, if
				 * any, when we started immediate shutdown or entered
				 * FatalError state.
				 */
			}
			else
			{
				/*
				 * If we get here, we are proceeding with normal shutdown. All
				 * the regular children are gone, and it's time to tell the
				 * checkpointer to do a shutdown checkpoint.
				 */
				Assert(Shutdown > NoShutdown);
				/* Start the checkpointer if not running */
				if (CheckpointerPID == 0)
					CheckpointerPID = StartCheckpointer();
				/* And tell it to shut down */
				if (CheckpointerPID != 0)
				{
					signal_child(CheckpointerPID, SIGUSR2);
					pmState = PM_SHUTDOWN;
				}
				else
				{
					/*
					 * If we failed to fork a checkpointer, just shut down.
					 * Any required cleanup will happen at next restart. We
					 * set FatalError so that an "abnormal shutdown" message
					 * gets logged when we exit.
					 */
					FatalError = true;
					pmState = PM_WAIT_DEAD_END;

					/* Kill the walsenders, archiver and stats collector too */
					SignalChildren(SIGQUIT);
					if (PgArchPID != 0)
						signal_child(PgArchPID, SIGQUIT);
					if (PgStatPID != 0)
						signal_child(PgStatPID, SIGQUIT);
				}
			}
		}
	}

	if (pmState == PM_SHUTDOWN_2)
	{
		/*
		 * PM_SHUTDOWN_2 state ends when there are no children other than
		 * dead_end children left. There shouldn't be any regular backends
		 * left by now anyway; what we're really waiting for is walsenders and
		 * archiver.
		 *
		 * Walreceiver should normally be dead by now, but not when a fast
		 * shutdown is performed during recovery.
		 */
		if (PgArchPID == 0 && CountChildren(BACKEND_TYPE_ALL) == 0 &&
			WalReceiverPID == 0)
		{
			pmState = PM_WAIT_DEAD_END;
		}
	}

	if (pmState == PM_WAIT_DEAD_END)
	{
		/*
		 * PM_WAIT_DEAD_END state ends when the BackendList is entirely empty
		 * (ie, no dead_end children remain), and the archiver and stats
		 * collector are gone too.
		 *
		 * The reason we wait for those two is to protect them against a new
		 * postmaster starting conflicting subprocesses; this isn't an
		 * ironclad protection, but it at least helps in the
		 * shutdown-and-immediately-restart scenario.  Note that they have
		 * already been sent appropriate shutdown signals, either during a
		 * normal state transition leading up to PM_WAIT_DEAD_END, or during
		 * FatalError processing.
		 */
		if (dlist_is_empty(&BackendList) &&
			PgArchPID == 0 && PgStatPID == 0)
		{
			/* These other guys should be dead already */
			Assert(StartupPID == 0);
			Assert(WalReceiverPID == 0);
			Assert(BgWriterPID == 0);
			Assert(CheckpointerPID == 0);
			Assert(WalWriterPID == 0);
			Assert(AutoVacPID == 0);
			/* syslogger is not considered here */
			pmState = PM_NO_CHILDREN;
		}
	}

	/*
	 * If we've been told to shut down, we exit as soon as there are no
	 * remaining children.  If there was a crash, cleanup will occur at the
	 * next startup. (Before PostgreSQL 8.3, we tried to recover from the
	 * crash before exiting, but that seems unwise if we are quitting because
	 * we got SIGTERM from init --- there may well not be time for recovery
	 * before init decides to SIGKILL us.)
	 *
	 * Note that the syslogger continues to run.  It will exit when it sees
	 * EOF on its input pipe, which happens when there are no more upstream
	 * processes.
	 */
	if (Shutdown > NoShutdown && pmState == PM_NO_CHILDREN)
	{
		if (FatalError)
		{
			ereport(LOG, (errmsg("abnormal database system shutdown")));
			ExitPostmaster(1);
		}
		else
		{
			/*
			 * Terminate exclusive backup mode to avoid recovery after a clean
			 * fast shutdown.  Since an exclusive backup can only be taken
			 * during normal running (and not, for example, while running
			 * under Hot Standby) it only makes sense to do this if we reached
			 * normal running. If we're still in recovery, the backup file is
			 * one we're recovering *from*, and we must keep it around so that
			 * recovery restarts from the right place.
			 */
			if (ReachedNormalRunning)
				CancelBackup();

			/* Normal exit from the postmaster is here */
			ExitPostmaster(0);
		}
	}

	/*
	 * If the startup process failed, or the user does not want an automatic
	 * restart after backend crashes, wait for all non-syslogger children to
	 * exit, and then exit postmaster. We don't try to reinitialize when the
	 * startup process fails, because more than likely it will just fail again
	 * and we will keep trying forever.
	 */
	if (pmState == PM_NO_CHILDREN &&
		(StartupStatus == STARTUP_CRASHED || !restart_after_crash))
		ExitPostmaster(1);

	/*
	 * If we need to recover from a crash, wait for all non-syslogger children
	 * to exit, then reset shmem and StartupDataBase.
	 */
	if (FatalError && pmState == PM_NO_CHILDREN)
	{
		ereport(LOG,
				(errmsg("all server processes terminated; reinitializing")));

		/* allow background workers to immediately restart */
		ResetBackgroundWorkerCrashTimes();

		shmem_exit(1);
		reset_shared(PostPortNumber);

		StartupPID = StartupDataBase();
		Assert(StartupPID != 0);
		StartupStatus = STARTUP_RUNNING;
		pmState = PM_STARTUP;
		/* crash recovery started, reset SIGKILL flag */
		AbortStartTime = 0;
	}
}
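The heart of the state machine above is the branch taken once PM_WAIT_BACKENDS empties out: on immediate shutdown or FatalError the shutdown checkpoint is skipped and we go straight to PM_WAIT_DEAD_END, otherwise the checkpointer is told to write it first. A drastically simplified model of just that decision (the enum and helper are illustrative only, not the real pmState machinery):

```c
#include <stdbool.h>

/* Illustrative subset of the postmaster states used in this file. */
typedef enum
{
	MODEL_PM_WAIT_BACKENDS,
	MODEL_PM_SHUTDOWN,			/* checkpointer writes shutdown checkpoint */
	MODEL_PM_WAIT_DEAD_END		/* skip checkpoint, wait for dead_end kids */
} ModelPMState;

/*
 * Where PM_WAIT_BACKENDS goes once all regular backends are gone:
 * skip the shutdown checkpoint on crash or immediate shutdown.
 */
static ModelPMState
after_backends_gone(bool immediate_or_fatal)
{
	return immediate_or_fatal ? MODEL_PM_WAIT_DEAD_END : MODEL_PM_SHUTDOWN;
}
```

Skipping the checkpoint in the crash/immediate case is deliberate: cleanup then happens via WAL replay at the next startup, as the comments in PostmasterStateMachine note.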


/*
 * Send a signal to a postmaster child process
 *
 * On systems that have setsid(), each child process sets itself up as a
 * process group leader.  For signals that are generally interpreted in the
 * appropriate fashion, we signal the entire process group not just the
 * direct child process.  This allows us to, for example, SIGQUIT a blocked
 * archive_recovery script, or SIGINT a script being run by a backend via
 * system().
 *
 * There is a race condition for recently-forked children: they might not
 * have executed setsid() yet.  So we signal the child directly as well as
 * the group.  We assume such a child will handle the signal before trying
 * to spawn any grandchild processes.  We also assume that signaling the
 * child twice will not cause any problems.
 */
static void
signal_child(pid_t pid, int signal)
{
	if (kill(pid, signal) < 0)
		elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) pid, signal);
#ifdef HAVE_SETSID
	switch (signal)
	{
		case SIGINT:
		case SIGTERM:
		case SIGQUIT:
		case SIGSTOP:
Send SIGKILL to children if they don't die quickly in immediate shutdown
On immediate shutdown, or during a restart-after-crash sequence,
postmaster used to send SIGQUIT (and then abandon ship if shutdown); but
this is not a good strategy if backends don't die because of that
signal. (This might happen, for example, if a backend gets tangled
trying to malloc() due to gettext(), as in an example illustrated by
MauMau.) This causes problems when later trying to restart the server,
because some processes are still attached to the shared memory segment.
Instead of just abandoning such backends to their fates, we now have
postmaster hang around for a little while longer, send a SIGKILL after
some reasonable waiting period, and then exit. This makes immediate
shutdown more reliable.
There is disagreement on whether it's best for postmaster to exit after
sending SIGKILL, or to stick around until all children have reported
death. If this controversy is resolved differently than what this patch
implements, it's an easy change to make.
Bug reported by MauMau in message 20DAEA8949EC4E2289C6E8E58560DEC0@maumau
MauMau and Álvaro Herrera
2013-06-28 23:20:53 +02:00
|
|
|
case SIGKILL:
|
2006-11-21 21:59:53 +01:00
|
|
|
if (kill(-pid, signal) < 0)
|
|
|
|
elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) (-pid), signal);
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
}
/*
 * Send a signal to the targeted children (but NOT special children;
 * dead_end children are never signaled, either).
 */
static bool
SignalSomeChildren(int signal, int target)
{
	dlist_iter	iter;
	bool		signaled = false;

	dlist_foreach(iter, &BackendList)
	{
		Backend    *bp = dlist_container(Backend, elem, iter.cur);

		if (bp->dead_end)
			continue;

		/*
		 * Since target == BACKEND_TYPE_ALL is the most common case, we test
		 * it first and avoid touching shared memory for every child.
		 */
		if (target != BACKEND_TYPE_ALL)
		{
			/*
			 * Assign bkend_type for any recently announced WAL Sender
			 * processes.
			 */
			if (bp->bkend_type == BACKEND_TYPE_NORMAL &&
				IsPostmasterChildWalSender(bp->child_slot))
				bp->bkend_type = BACKEND_TYPE_WALSND;

			if (!(target & bp->bkend_type))
				continue;
		}

		ereport(DEBUG4,
				(errmsg_internal("sending signal %d to process %d",
								 signal, (int) bp->pid)));
		signal_child(bp->pid, signal);
		signaled = true;
	}
	return signaled;
}

/*
 * Send a termination signal to children.  This considers all of our children
 * processes, except syslogger and dead_end backends.
 */
static void
TerminateChildren(int signal)
{
	SignalChildren(signal);
	if (StartupPID != 0)
	{
		signal_child(StartupPID, signal);
		if (signal == SIGQUIT || signal == SIGKILL)
			StartupStatus = STARTUP_SIGNALED;
	}
	if (BgWriterPID != 0)
		signal_child(BgWriterPID, signal);
	if (CheckpointerPID != 0)
		signal_child(CheckpointerPID, signal);
	if (WalWriterPID != 0)
		signal_child(WalWriterPID, signal);
	if (WalReceiverPID != 0)
		signal_child(WalReceiverPID, signal);
	if (AutoVacPID != 0)
		signal_child(AutoVacPID, signal);
	if (PgArchPID != 0)
		signal_child(PgArchPID, signal);
	if (PgStatPID != 0)
		signal_child(PgStatPID, signal);
}

/*
|
|
|
|
* BackendStartup -- start backend process
|
|
|
|
*
|
2002-03-02 21:46:12 +01:00
|
|
|
* returns: STATUS_ERROR if the fork failed, STATUS_OK otherwise.
|
2007-02-16 00:23:23 +01:00
|
|
|
*
|
|
|
|
* Note: if you change this code, also consider StartAutovacuumWorker.
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
1997-08-19 23:40:56 +02:00
|
|
|
static int
|
1998-01-26 02:42:53 +01:00
|
|
|
BackendStartup(Port *port)
|
1996-07-09 08:22:35 +02:00
|
|
|
{
|
1997-09-08 04:41:22 +02:00
|
|
|
Backend *bn; /* for backend cleanup */
|
2001-06-20 20:07:56 +02:00
|
|
|
pid_t pid;
|
2002-09-04 22:31:48 +02:00
|
|
|
|
1998-06-09 06:06:12 +02:00
|
|
|
/*
|
2009-06-11 16:49:15 +02:00
|
|
|
* Create backend data structure. Better before the fork() so we can
|
|
|
|
* handle failure cleanly.
|
2001-10-21 05:25:36 +02:00
|
|
|
*/
|
|
|
|
bn = (Backend *) malloc(sizeof(Backend));
|
|
|
|
if (!bn)
|
|
|
|
{
|
2003-07-22 21:00:12 +02:00
|
|
|
ereport(LOG,
|
|
|
|
(errcode(ERRCODE_OUT_OF_MEMORY),
|
|
|
|
errmsg("out of memory")));
|
2001-10-21 05:25:36 +02:00
|
|
|
return STATUS_ERROR;
|
|
|
|
}
|
|
|
|
|
Install a "dead man switch" to allow the postmaster to detect cases where
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory. We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead. Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft). If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.
The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway. This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.
This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.
Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
2009-05-05 21:59:00 +02:00
|
|
|
/*
|
|
|
|
* Compute the cancel key that will be assigned to this backend. The
|
|
|
|
* backend will have its own copy in the forked-off process' value of
|
|
|
|
* MyCancelKey, so that it can transmit the key to the frontend.
|
|
|
|
*/
|
Replace PostmasterRandom() with a stronger source, second attempt.
This adds a new routine, pg_strong_random() for generating random bytes,
for use in both frontend and backend. At the moment, it's only used in
the backend, but the upcoming SCRAM authentication patches need strong
random numbers in libpq as well.
pg_strong_random() is based on, and replaces, the existing implementation
in pgcrypto. It can acquire strong random numbers from a number of sources,
depending on what's available:
- OpenSSL RAND_bytes(), if built with OpenSSL
- On Windows, the native cryptographic functions are used
- /dev/urandom
Unlike the current pgcrypto function, the source is chosen by configure.
That makes it easier to test different implementations, and ensures that
we don't accidentally fall back to a less secure implementation, if the
primary source fails. All of those methods are quite reliable, it would be
pretty surprising for them to fail, so we'd rather find out by failing
hard.
If no strong random source is available, we fall back to using erand48(),
seeded from current timestamp, like PostmasterRandom() was. That isn't
cryptographically secure, but allows us to still work on platforms that
don't have any of the above stronger sources. Because it's not very secure,
the built-in implementation is only used if explicitly requested with
--disable-strong-random.
This replaces the more complicated Fortuna algorithm we used to have in
pgcrypto, which is unfortunate, but all modern platforms have /dev/urandom,
so it doesn't seem worth the maintenance effort to keep that. pgcrypto
functions that require strong random numbers will be disabled with
--disable-strong-random.
Original patch by Magnus Hagander, tons of further work by Michael Paquier
and me.
Discussion: https://www.postgresql.org/message-id/CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAB7nPqRWkNYRRPJA7-cF+LfroYV10pvjdz6GNvxk-Eee9FypKA@mail.gmail.com
2016-12-05 12:42:59 +01:00
|
|
|
if (!RandomCancelKey(&MyCancelKey))
|
|
|
|
{
|
2016-12-12 08:58:32 +01:00
|
|
|
free(bn);
|
Replace PostmasterRandom() with a stronger source, second attempt.
This adds a new routine, pg_strong_random() for generating random bytes,
for use in both frontend and backend. At the moment, it's only used in
the backend, but the upcoming SCRAM authentication patches need strong
random numbers in libpq as well.
pg_strong_random() is based on, and replaces, the existing implementation
in pgcrypto. It can acquire strong random numbers from a number of sources,
depending on what's available:
- OpenSSL RAND_bytes(), if built with OpenSSL
- On Windows, the native cryptographic functions are used
- /dev/urandom
Unlike the current pgcrypto function, the source is chosen by configure.
That makes it easier to test different implementations, and ensures that
we don't accidentally fall back to a less secure implementation, if the
primary source fails. All of those methods are quite reliable, it would be
pretty surprising for them to fail, so we'd rather find out by failing
hard.
If no strong random source is available, we fall back to using erand48(),
seeded from current timestamp, like PostmasterRandom() was. That isn't
cryptographically secure, but allows us to still work on platforms that
don't have any of the above stronger sources. Because it's not very secure,
the built-in implementation is only used if explicitly requested with
--disable-strong-random.
This replaces the more complicated Fortuna algorithm we used to have in
pgcrypto, which is unfortunate, but all modern platforms have /dev/urandom,
so it doesn't seem worth the maintenance effort to keep that. pgcrypto
functions that require strong random numbers will be disabled with
--disable-strong-random.
Original patch by Magnus Hagander, tons of further work by Michael Paquier
and me.
Discussion: https://www.postgresql.org/message-id/CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAB7nPqRWkNYRRPJA7-cF+LfroYV10pvjdz6GNvxk-Eee9FypKA@mail.gmail.com
2016-12-05 12:42:59 +01:00
|
|
|
ereport(LOG,
|
2016-12-12 10:55:32 +01:00
|
|
|
(errcode(ERRCODE_INTERNAL_ERROR),
|
|
|
|
errmsg("could not generate random cancel key")));
|
Replace PostmasterRandom() with a stronger source, second attempt.
This adds a new routine, pg_strong_random() for generating random bytes,
for use in both frontend and backend. At the moment, it's only used in
the backend, but the upcoming SCRAM authentication patches need strong
random numbers in libpq as well.
pg_strong_random() is based on, and replaces, the existing implementation
in pgcrypto. It can acquire strong random numbers from a number of sources,
depending on what's available:
- OpenSSL RAND_bytes(), if built with OpenSSL
- On Windows, the native cryptographic functions are used
- /dev/urandom
Unlike the current pgcrypto function, the source is chosen by configure.
That makes it easier to test different implementations, and ensures that
we don't accidentally fall back to a less secure implementation, if the
primary source fails. All of those methods are quite reliable, it would be
pretty surprising for them to fail, so we'd rather find out by failing
hard.
If no strong random source is available, we fall back to using erand48(),
seeded from current timestamp, like PostmasterRandom() was. That isn't
cryptographically secure, but allows us to still work on platforms that
don't have any of the above stronger sources. Because it's not very secure,
the built-in implementation is only used if explicitly requested with
--disable-strong-random.
This replaces the more complicated Fortuna algorithm we used to have in
pgcrypto, which is unfortunate, but all modern platforms have /dev/urandom,
so it doesn't seem worth the maintenance effort to keep that. pgcrypto
functions that require strong random numbers will be disabled with
--disable-strong-random.
Original patch by Magnus Hagander, tons of further work by Michael Paquier
and me.
Discussion: https://www.postgresql.org/message-id/CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAB7nPqRWkNYRRPJA7-cF+LfroYV10pvjdz6GNvxk-Eee9FypKA@mail.gmail.com
2016-12-05 12:42:59 +01:00
|
|
|
return STATUS_ERROR;
|
|
|
|
}
|
|
|
|
|
Install a "dead man switch" to allow the postmaster to detect cases where
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory. We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead. Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft). If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.
The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway. This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.
This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.
Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
2009-05-05 21:59:00 +02:00
|
|
|
bn->cancel_key = MyCancelKey;
|
|
|
|
|
2008-04-27 00:47:40 +02:00
|
|
|
/* Pass down canAcceptConnections state */
|
2004-05-28 07:13:32 +02:00
|
|
|
port->canAcceptConnections = canAcceptConnections();
|
Install a "dead man switch" to allow the postmaster to detect cases where
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory. We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead. Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft). If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.
The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway. This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.
This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.
Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
2009-05-05 21:59:00 +02:00
|
|
|
bn->dead_end = (port->canAcceptConnections != CAC_OK &&
|
|
|
|
port->canAcceptConnections != CAC_WAITBACKUP);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unless it's a dead_end child, assign it a child slot number
|
|
|
|
*/
|
|
|
|
if (!bn->dead_end)
|
|
|
|
bn->child_slot = MyPMChildSlot = AssignPostmasterChildSlot();
|
|
|
|
else
|
|
|
|
bn->child_slot = 0;
|
2004-05-28 07:13:32 +02:00
|
|
|
|
2013-08-28 20:08:13 +02:00
|
|
|
/* Hasn't asked to be notified about any bgworkers yet */
|
|
|
|
bn->bgworker_notify = false;
|
|
|
|
|
2004-05-28 07:13:32 +02:00
|
|
|
#ifdef EXEC_BACKEND
|
|
|
|
pid = backend_forkexec(port);
|
2004-08-29 07:07:03 +02:00
|
|
|
#else /* !EXEC_BACKEND */
|
2005-03-10 08:14:03 +01:00
|
|
|
pid = fork_process();
|
2001-06-20 20:07:56 +02:00
|
|
|
if (pid == 0) /* child */
|
2003-05-28 21:36:28 +02:00
|
|
|
{
|
|
|
|
free(bn);
|
2006-01-04 22:06:32 +01:00
|
|
|
|
2015-01-13 13:12:37 +01:00
|
|
|
/* Detangle from postmaster */
|
|
|
|
InitPostmasterChild();
|
2006-01-04 22:06:32 +01:00
|
|
|
|
|
|
|
/* Close the postmaster's sockets */
|
|
|
|
ClosePostmasterPorts(false);
|
|
|
|
|
2009-08-29 21:26:52 +02:00
|
|
|
/* Perform additional initialization and collect startup packet */
|
2006-01-04 22:06:32 +01:00
|
|
|
BackendInitialize(port);
|
|
|
|
|
|
|
|
/* And run the backend */
|
2012-06-25 20:25:26 +02:00
|
|
|
BackendRun(port);
|
2003-05-28 21:36:28 +02:00
|
|
|
}
|
Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
|
|
|
#endif /* EXEC_BACKEND */
|
2004-05-28 07:13:32 +02:00
|
|
|
|
1997-09-07 07:04:48 +02:00
|
|
|
if (pid < 0)
|
|
|
|
{
|
2004-05-28 07:13:32 +02:00
|
|
|
/* in parent, fork failed */
|
2002-01-06 22:40:02 +01:00
|
|
|
int save_errno = errno;
|
|
|
|
|
2009-08-24 20:09:37 +02:00
|
|
|
if (!bn->dead_end)
|
|
|
|
(void) ReleasePostmasterChildSlot(bn->child_slot);
|
2002-01-06 22:40:02 +01:00
|
|
|
free(bn);
|
2003-07-22 21:00:12 +02:00
|
|
|
errno = save_errno;
|
|
|
|
ereport(LOG,
|
2005-10-15 04:49:52 +02:00
|
|
|
(errmsg("could not fork new process for connection: %m")));
|
2002-01-06 22:40:02 +01:00
|
|
|
report_fork_failure_to_client(port, save_errno);
|
1998-09-01 05:29:17 +02:00
|
|
|
return STATUS_ERROR;
|
1997-09-07 07:04:48 +02:00
|
|
|
}

	/* in parent, successful fork */
	ereport(DEBUG2,
			(errmsg_internal("forked new backend, pid=%d socket=%d",
							 (int) pid, (int) port->sock)));

	/*
	 * Everything's been successful, it's safe to add this backend to our list
	 * of backends.
	 */
	bn->pid = pid;
	bn->bkend_type = BACKEND_TYPE_NORMAL;	/* Can change later to WALSND */
	dlist_push_head(&BackendList, &bn->elem);

#ifdef EXEC_BACKEND
	if (!bn->dead_end)
		ShmemBackendArrayAdd(bn);
#endif

	return STATUS_OK;
}


/*
 * Try to report backend fork() failure to client before we close the
 * connection.  Since we do not care to risk blocking the postmaster on
 * this connection, we set the connection to non-blocking and try only once.
 *
 * This is grungy special-purpose code; we cannot use backend libpq since
 * it's not up and running.
 */
static void
report_fork_failure_to_client(Port *port, int errnum)
{
	char		buffer[1000];
	int			rc;

	/* Format the error message packet (always V2 protocol) */
	snprintf(buffer, sizeof(buffer), "E%s%s\n",
			 _("could not fork new process for connection: "),
			 strerror(errnum));

	/* Set port to non-blocking.  Don't do send() if this fails */
	if (!pg_set_noblock(port->sock))
		return;

	/* We'll retry after EINTR, but ignore all other failures */
	do
	{
		rc = send(port->sock, buffer, strlen(buffer) + 1, 0);
	} while (rc < 0 && errno == EINTR);
}


/*
 * BackendInitialize -- initialize an interactive (postmaster-child)
 *		backend process, and collect the client's startup packet.
 *
 * returns: nothing.  Will not return at all if there's any failure.
 *
 * Note: this code does not depend on having any access to shared memory.
 * In the EXEC_BACKEND case, we are physically attached to shared memory
 * but have not yet set up most of our local pointers to shmem structures.
 */
static void
BackendInitialize(Port *port)
{
	int			status;
	int			ret;
	char		remote_host[NI_MAXHOST];
	char		remote_port[NI_MAXSERV];
	char		remote_ps_data[NI_MAXHOST];

	/* Save port etc. for ps status */
	MyProcPort = port;

	/*
	 * PreAuthDelay is a debugging aid for investigating problems in the
	 * authentication cycle: it can be set in postgresql.conf to allow time to
	 * attach to the newly-forked backend with a debugger.  (See also
	 * PostAuthDelay, which we allow clients to pass through PGOPTIONS, but it
	 * is not honored until after authentication.)
	 */
	if (PreAuthDelay > 0)
		pg_usleep(PreAuthDelay * 1000000L);

	/* This flag will remain set until InitPostgres finishes authentication */
	ClientAuthInProgress = true;	/* limit visibility of log messages */

	/* save process start time */
	port->SessionStartTime = GetCurrentTimestamp();
	MyStartTime = timestamptz_to_time_t(port->SessionStartTime);

	/* set these to empty in case they are needed before we set them up */
	port->remote_host = "";
	port->remote_port = "";

	/*
	 * Initialize libpq and enable reporting of ereport errors to the client.
	 * Must do this now because authentication uses libpq to send messages.
	 */
	pq_init();					/* initialize libpq to talk to client */
	whereToSendOutput = DestRemote; /* now safe to ereport to client */

	/*
	 * We arrange for a simple exit(1) if we receive SIGTERM or SIGQUIT or
	 * timeout while trying to collect the startup packet.  Otherwise the
	 * postmaster cannot shutdown the database FAST or IMMED cleanly if a
	 * buggy client fails to send the packet promptly.  XXX it follows that
	 * the remainder of this function must tolerate losing control at any
	 * instant.  Likewise, any pg_on_exit_callback registered before or during
	 * this function must be prepared to execute at any instant between here
	 * and the end of this function.  Furthermore, affected callbacks execute
	 * partially or not at all when a second exit-inducing signal arrives
	 * after proc_exit_prepare() decrements on_proc_exit_index.  (Thanks to
	 * that mechanic, callbacks need not anticipate more than one call.)  This
	 * is fragile; it ought to instead follow the norm of handling interrupts
	 * at selected, safe opportunities.
	 */
	pqsignal(SIGTERM, startup_die);
	pqsignal(SIGQUIT, startup_die);
	InitializeTimeouts();		/* establishes SIGALRM handler */
	PG_SETMASK(&StartupBlockSig);

	/*
	 * Get the remote host name and port for logging and status display.
	 */
	remote_host[0] = '\0';
	remote_port[0] = '\0';
	if ((ret = pg_getnameinfo_all(&port->raddr.addr, port->raddr.salen,
								  remote_host, sizeof(remote_host),
								  remote_port, sizeof(remote_port),
								  (log_hostname ? 0 : NI_NUMERICHOST) | NI_NUMERICSERV)) != 0)
		ereport(WARNING,
				(errmsg_internal("pg_getnameinfo_all() failed: %s",
								 gai_strerror(ret))));
	if (remote_port[0] == '\0')
		snprintf(remote_ps_data, sizeof(remote_ps_data), "%s", remote_host);
	else
		snprintf(remote_ps_data, sizeof(remote_ps_data), "%s(%s)", remote_host, remote_port);

	/*
	 * Save remote_host and remote_port in port structure (after this, they
	 * will appear in log_line_prefix data for log messages).
	 */
	port->remote_host = strdup(remote_host);
	port->remote_port = strdup(remote_port);

	/* And now we can issue the Log_connections message, if wanted */
	if (Log_connections)
	{
		if (remote_port[0])
			ereport(LOG,
					(errmsg("connection received: host=%s port=%s",
							remote_host,
							remote_port)));
		else
			ereport(LOG,
					(errmsg("connection received: host=%s",
							remote_host)));
	}

	/*
	 * If we did a reverse lookup to name, we might as well save the results
	 * rather than possibly repeating the lookup during authentication.
	 *
	 * Note that we don't want to specify NI_NAMEREQD above, because then we'd
	 * get nothing useful for a client without an rDNS entry.  Therefore, we
	 * must check whether we got a numeric IPv4 or IPv6 address, and not save
	 * it into remote_hostname if so.  (This test is conservative and might
	 * sometimes classify a hostname as numeric, but an error in that
	 * direction is safe; it only results in a possible extra lookup.)
	 */
	if (log_hostname &&
		ret == 0 &&
		strspn(remote_host, "0123456789.") < strlen(remote_host) &&
		strspn(remote_host, "0123456789ABCDEFabcdef:") < strlen(remote_host))
		port->remote_hostname = strdup(remote_host);

	/*
	 * Ready to begin client interaction.  We will give up and exit(1) after a
	 * time delay, so that a broken client can't hog a connection
	 * indefinitely.  PreAuthDelay and any DNS interactions above don't count
	 * against the time limit.
	 *
	 * Note: AuthenticationTimeout is applied here while waiting for the
	 * startup packet, and then again in InitPostgres for the duration of any
	 * authentication operations.  So a hostile client could tie up the
	 * process for nearly twice AuthenticationTimeout before we kick him off.
	 *
	 * Note: because PostgresMain will call InitializeTimeouts again, the
	 * registration of STARTUP_PACKET_TIMEOUT will be lost.  This is okay
	 * since we never use it again after this function.
	 */
	RegisterTimeout(STARTUP_PACKET_TIMEOUT, StartupPacketTimeoutHandler);
	enable_timeout_after(STARTUP_PACKET_TIMEOUT, AuthenticationTimeout * 1000);

	/*
	 * Receive the startup packet (which might turn out to be a cancel request
	 * packet).
	 */
	status = ProcessStartupPacket(port, false);

	/*
	 * Stop here if it was bad or a cancel packet.  ProcessStartupPacket
	 * already did any appropriate error reporting.
	 */
	if (status != STATUS_OK)
		proc_exit(0);

	/*
	 * Now that we have the user and database name, we can set the process
	 * title for ps.  It's good to do this as early as possible in startup.
	 *
	 * For a walsender, the ps display is set in the following form:
	 *
	 * postgres: wal sender process <user> <host> <activity>
	 *
	 * To achieve that, we pass "wal sender process" as username and username
	 * as dbname to init_ps_display().  XXX: should add a new variant of
	 * init_ps_display() to avoid abusing the parameters like this.
	 */
	if (am_walsender)
		init_ps_display("wal sender process", port->user_name, remote_ps_data,
						update_process_title ? "authentication" : "");
	else
		init_ps_display(port->user_name, port->database_name, remote_ps_data,
						update_process_title ? "authentication" : "");

	/*
	 * Disable the timeout, and prevent SIGTERM/SIGQUIT again.
	 */
	disable_timeout(STARTUP_PACKET_TIMEOUT, false);
	PG_SETMASK(&BlockSig);
}


/*
 * BackendRun -- set up the backend's argument list and invoke PostgresMain()
 *
 * returns:
 *		Shouldn't return at all.
 *		If PostgresMain() fails, return status.
 */
static void
BackendRun(Port *port)
{
	char	  **av;
	int			maxac;
	int			ac;
	long		secs;
	int			usecs;
	int			i;

	/*
	 * Don't want backend to be able to see the postmaster random number
	 * generator state.  We have to clobber the static random_seed *and* start
	 * a new random sequence in the random() library function.
	 */
#ifndef HAVE_STRONG_RANDOM
	random_seed = 0;
	random_start_time.tv_usec = 0;
#endif
	/* slightly hacky way to convert timestamptz into integers */
	TimestampDifference(0, port->SessionStartTime, &secs, &usecs);
	srandom((unsigned int) (MyProcPid ^ (usecs << 12) ^ secs));

	/*
	 * Now, build the argv vector that will be given to PostgresMain.
	 *
	 * The maximum possible number of commandline arguments that could come
	 * from ExtraOptions is (strlen(ExtraOptions) + 1) / 2; see
	 * pg_split_opts().
	 */
	maxac = 2;					/* for fixed args supplied below */
	maxac += (strlen(ExtraOptions) + 1) / 2;

	av = (char **) MemoryContextAlloc(TopMemoryContext,
									  maxac * sizeof(char *));
	ac = 0;

	av[ac++] = "postgres";

	/*
	 * Pass any backend switches specified with -o on the postmaster's own
	 * command line.  We assume these are secure.
	 */
	pg_split_opts(av, &ac, ExtraOptions);

	av[ac] = NULL;

	Assert(ac < maxac);

	/*
	 * Debug: print arguments being passed to backend
	 */
	ereport(DEBUG3,
			(errmsg_internal("%s child[%d]: starting with (",
							 progname, (int) getpid())));
	for (i = 0; i < ac; ++i)
		ereport(DEBUG3,
				(errmsg_internal("\t%s", av[i])));
	ereport(DEBUG3,
			(errmsg_internal(")")));

	/*
	 * Make sure we aren't in PostmasterContext anymore.  (We can't delete it
	 * just yet, though, because InitPostgres will need the HBA data.)
	 */
	MemoryContextSwitchTo(TopMemoryContext);

	PostgresMain(ac, av, port->database_name, port->user_name);
}


#ifdef EXEC_BACKEND

/*
 * postmaster_forkexec -- fork and exec a postmaster subprocess
 *
 * The caller must have set up the argv array already, except for argv[2]
 * which will be filled with the name of the temp variable file.
 *
 * Returns the child process PID, or -1 on fork failure (a suitable error
 * message has been logged on failure).
 *
 * All uses of this routine will dispatch to SubPostmasterMain in the
 * child process.
 */
pid_t
postmaster_forkexec(int argc, char *argv[])
{
	Port		port;

	/* This entry point passes dummy values for the Port variables */
	memset(&port, 0, sizeof(port));
	return internal_forkexec(argc, argv, &port);
}

/*
 * backend_forkexec -- fork/exec off a backend process
 *
 * Some operating systems (WIN32) don't have fork() so we have to simulate
 * it by storing parameters that need to be passed to the child and
 * then create a new child process.
 *
 * returns the pid of the fork/exec'd process, or -1 on failure
 */
static pid_t
backend_forkexec(Port *port)
{
	char	   *av[4];
	int			ac = 0;

	av[ac++] = "postgres";
	av[ac++] = "--forkbackend";
	av[ac++] = NULL;			/* filled in by internal_forkexec */

	av[ac] = NULL;
	Assert(ac < lengthof(av));

	return internal_forkexec(ac, av, port);
}

#ifndef WIN32

/*
 * internal_forkexec non-win32 implementation
 *
 * - writes out backend variables to the parameter file
 * - fork():s, and then exec():s the child process
 */
static pid_t
internal_forkexec(int argc, char *argv[], Port *port)
{
	static unsigned long tmpBackendFileNum = 0;
	pid_t		pid;
	char		tmpfilename[MAXPGPATH];
	BackendParameters param;
	FILE	   *fp;

	if (!save_backend_variables(&param, port))
		return -1;				/* log made by save_backend_variables */

	/* Calculate name for temp file */
	snprintf(tmpfilename, MAXPGPATH, "%s/%s.backend_var.%d.%lu",
			 PG_TEMP_FILES_DIR, PG_TEMP_FILE_PREFIX,
			 MyProcPid, ++tmpBackendFileNum);

	/* Open file */
	fp = AllocateFile(tmpfilename, PG_BINARY_W);
	if (!fp)
	{
		/*
		 * As in OpenTemporaryFileInTablespace, try to make the temp-file
		 * directory
		 */
		mkdir(PG_TEMP_FILES_DIR, S_IRWXU);

		fp = AllocateFile(tmpfilename, PG_BINARY_W);
		if (!fp)
		{
			ereport(LOG,
					(errcode_for_file_access(),
					 errmsg("could not create file \"%s\": %m",
							tmpfilename)));
			return -1;
		}
	}

	if (fwrite(&param, sizeof(param), 1, fp) != 1)
	{
		ereport(LOG,
				(errcode_for_file_access(),
				 errmsg("could not write to file \"%s\": %m", tmpfilename)));
		FreeFile(fp);
		return -1;
	}

	/* Release file */
	if (FreeFile(fp))
	{
		ereport(LOG,
				(errcode_for_file_access(),
				 errmsg("could not write to file \"%s\": %m", tmpfilename)));
		return -1;
	}

	/* Make sure caller set up argv properly */
	Assert(argc >= 3);
	Assert(argv[argc] == NULL);
	Assert(strncmp(argv[1], "--fork", 6) == 0);
	Assert(argv[2] == NULL);

	/* Insert temp file name after --fork argument */
	argv[2] = tmpfilename;

	/* Fire off execv in child */
	if ((pid = fork_process()) == 0)
	{
		if (execv(postgres_exec_path, argv) < 0)
		{
			ereport(LOG,
					(errmsg("could not execute server process \"%s\": %m",
							postgres_exec_path)));
			/* We're already in the child process here, can't return */
			exit(1);
		}
	}

	return pid;					/* Parent returns pid, or -1 on fork failure */
}

#else							/* WIN32 */

/*
 * internal_forkexec win32 implementation
 *
 * - starts backend using CreateProcess(), in suspended state
 * - writes out backend variables to the parameter file
 *	 - during this, duplicates handles and sockets required for
 *	   inheritance into the new process
 * - resumes execution of the new process once the backend parameter
 *	 file is complete.
 */
static pid_t
internal_forkexec(int argc, char *argv[], Port *port)
{
	int			retry_count = 0;
	STARTUPINFO si;
	PROCESS_INFORMATION pi;
	int			i;
	int			j;
	char		cmdLine[MAXPGPATH * 2];
	HANDLE		paramHandle;
	BackendParameters *param;
	SECURITY_ATTRIBUTES sa;
	char		paramHandleStr[32];
	win32_deadchild_waitinfo *childinfo;

	/* Make sure caller set up argv properly */
	Assert(argc >= 3);
	Assert(argv[argc] == NULL);
	Assert(strncmp(argv[1], "--fork", 6) == 0);
	Assert(argv[2] == NULL);

	/* Resume here if we need to retry */
retry:

	/* Set up shared memory for parameter passing */
	ZeroMemory(&sa, sizeof(sa));
	sa.nLength = sizeof(sa);
	sa.bInheritHandle = TRUE;
	paramHandle = CreateFileMapping(INVALID_HANDLE_VALUE,
									&sa,
									PAGE_READWRITE,
									0,
									sizeof(BackendParameters),
									NULL);
	if (paramHandle == INVALID_HANDLE_VALUE)
	{
		elog(LOG, "could not create backend parameter file mapping: error code %lu",
			 GetLastError());
		return -1;
	}

	param = MapViewOfFile(paramHandle, FILE_MAP_WRITE, 0, 0, sizeof(BackendParameters));
	if (!param)
	{
		elog(LOG, "could not map backend parameter memory: error code %lu",
			 GetLastError());
		CloseHandle(paramHandle);
		return -1;
	}

	/* Insert temp file name after --fork argument */
#ifdef _WIN64
	sprintf(paramHandleStr, "%llu", (LONG_PTR) paramHandle);
#else
	sprintf(paramHandleStr, "%lu", (DWORD) paramHandle);
#endif
	argv[2] = paramHandleStr;

	/* Format the cmd line */
	cmdLine[sizeof(cmdLine) - 1] = '\0';
	cmdLine[sizeof(cmdLine) - 2] = '\0';
	snprintf(cmdLine, sizeof(cmdLine) - 1, "\"%s\"", postgres_exec_path);
	i = 0;
	while (argv[++i] != NULL)
	{
		j = strlen(cmdLine);
		snprintf(cmdLine + j, sizeof(cmdLine) - 1 - j, " \"%s\"", argv[i]);
	}
	if (cmdLine[sizeof(cmdLine) - 2] != '\0')
	{
		elog(LOG, "subprocess command line too long");
		return -1;
	}

	memset(&pi, 0, sizeof(pi));
	memset(&si, 0, sizeof(si));
	si.cb = sizeof(si);

	/*
	 * Create the subprocess in a suspended state.  This will be resumed
	 * later, once we have written out the parameter file.
	 */
	if (!CreateProcess(NULL, cmdLine, NULL, NULL, TRUE, CREATE_SUSPENDED,
					   NULL, NULL, &si, &pi))
	{
		elog(LOG, "CreateProcess call failed: %m (error code %lu)",
			 GetLastError());
		return -1;
	}

	if (!save_backend_variables(param, port, pi.hProcess, pi.dwProcessId))
	{
		/*
		 * log made by save_backend_variables, but we have to clean up the
		 * mess with the half-started process
		 */
		if (!TerminateProcess(pi.hProcess, 255))
			ereport(LOG,
					(errmsg_internal("could not terminate unstarted process: error code %lu",
									 GetLastError())));
		CloseHandle(pi.hProcess);
		CloseHandle(pi.hThread);
		return -1;				/* log made by save_backend_variables */
	}

	/* Drop the parameter shared memory that is now inherited to the backend */
	if (!UnmapViewOfFile(param))
		elog(LOG, "could not unmap view of backend parameter file: error code %lu",
			 GetLastError());
	if (!CloseHandle(paramHandle))
		elog(LOG, "could not close handle to backend parameter file: error code %lu",
			 GetLastError());

	/*
	 * Reserve the memory region used by our main shared memory segment before
	 * we resume the child process.  Normally this should succeed, but if ASLR
	 * is active then it might sometimes fail due to the stack or heap having
	 * gotten mapped into that range.  In that case, just terminate the
	 * process and retry.
	 */
	if (!pgwin32_ReserveSharedMemoryRegion(pi.hProcess))
	{
		/* pgwin32_ReserveSharedMemoryRegion already made a log entry */
		if (!TerminateProcess(pi.hProcess, 255))
			ereport(LOG,
					(errmsg_internal("could not terminate process that failed to reserve memory: error code %lu",
									 GetLastError())));
		CloseHandle(pi.hProcess);
		CloseHandle(pi.hThread);
		if (++retry_count < 100)
			goto retry;
		ereport(LOG,
				(errmsg("giving up after too many tries to reserve shared memory"),
				 errhint("This might be caused by ASLR or antivirus software.")));
		return -1;
	}

	/*
	 * Now that the backend variables are written out, we start the child
	 * thread so it can start initializing while we set up the rest of the
	 * parent state.
	 */
	if (ResumeThread(pi.hThread) == -1)
	{
		if (!TerminateProcess(pi.hProcess, 255))
		{
			ereport(LOG,
					(errmsg_internal("could not terminate unstartable process: error code %lu",
									 GetLastError())));
			CloseHandle(pi.hProcess);
			CloseHandle(pi.hThread);
			return -1;
		}
		CloseHandle(pi.hProcess);
		CloseHandle(pi.hThread);
		ereport(LOG,
				(errmsg_internal("could not resume thread of unstarted process: error code %lu",
								 GetLastError())));
		return -1;
	}

	/*
	 * Queue a waiter to signal when this child dies.  The wait will be
	 * handled automatically by an operating system thread pool.
	 *
	 * Note: use malloc instead of palloc, since it needs to be thread-safe.
	 * Struct will be free():d from the callback function that runs on a
	 * different thread.
	 */
	childinfo = malloc(sizeof(win32_deadchild_waitinfo));
	if (!childinfo)
		ereport(FATAL,
				(errcode(ERRCODE_OUT_OF_MEMORY),
				 errmsg("out of memory")));

	childinfo->procHandle = pi.hProcess;
	childinfo->procId = pi.dwProcessId;

	if (!RegisterWaitForSingleObject(&childinfo->waitHandle,
									 pi.hProcess,
									 pgwin32_deadchild_callback,
									 childinfo,
									 INFINITE,
									 WT_EXECUTEONLYONCE | WT_EXECUTEINWAITTHREAD))
		ereport(FATAL,
				(errmsg_internal("could not register process for wait: error code %lu",
								 GetLastError())));

	/* Don't close pi.hProcess here - the wait thread needs access to it */
	CloseHandle(pi.hThread);

	return pi.dwProcessId;
}

#endif							/* WIN32 */


/*
 * SubPostmasterMain -- Get the fork/exec'd process into a state equivalent
 *			to what it would be if we'd simply forked on Unix, and then
 *			dispatch to the appropriate place.
 *
 * The first two command line arguments are expected to be "--forkFOO"
 * (where FOO indicates which postmaster child we are to become), and
 * the name of a variables file that we can read to load data that would
 * have been inherited by fork() on Unix.  Remaining arguments go to the
 * subprocess FooMain() routine.
 */
void
SubPostmasterMain(int argc, char *argv[])
{
	Port		port;

	/* In EXEC_BACKEND case we will not have inherited these settings */
	IsPostmasterEnvironment = true;
	whereToSendOutput = DestNone;

	/* Setup as postmaster child */
	InitPostmasterChild();

	/* Setup essential subsystems (to ensure elog() behaves sanely) */
	InitializeGUCOptions();

	/* Check we got appropriate args */
	if (argc < 3)
		elog(FATAL, "invalid subpostmaster invocation");

	/* Read in the variables file */
	memset(&port, 0, sizeof(Port));
	read_backend_variables(argv[2], &port);

	/* Close the postmaster's sockets (as soon as we know them) */
	ClosePostmasterPorts(strcmp(argv[1], "--forklog") == 0);

	/*
	 * Set reference point for stack-depth checking
	 */
	set_stack_base();

	/*
	 * Set up memory area for GSS information.  Mirrors the code in ConnCreate
	 * for the non-exec case.
	 */
#if defined(ENABLE_GSS) || defined(ENABLE_SSPI)
	port.gss = (pg_gssinfo *) calloc(1, sizeof(pg_gssinfo));
	if (!port.gss)
		ereport(FATAL,
				(errcode(ERRCODE_OUT_OF_MEMORY),
				 errmsg("out of memory")));
#endif

	/*
	 * If appropriate, physically re-attach to shared memory segment.  We want
	 * to do this before going any further to ensure that we can attach at the
	 * same address the postmaster used.  On the other hand, if we choose not
	 * to re-attach, we may have other cleanup to do.
	 *
	 * If testing EXEC_BACKEND on Linux, you should run this as root before
	 * starting the postmaster:
	 *
	 * echo 0 >/proc/sys/kernel/randomize_va_space
	 *
	 * This prevents using randomized stack and code addresses that cause the
	 * child process's memory map to be different from the parent's, making it
	 * sometimes impossible to attach to shared memory at the desired address.
	 * Return the setting to its old value (usually '1' or '2') when finished.
	 */
	if (strcmp(argv[1], "--forkbackend") == 0 ||
		strcmp(argv[1], "--forkavlauncher") == 0 ||
		strcmp(argv[1], "--forkavworker") == 0 ||
		strcmp(argv[1], "--forkboot") == 0 ||
		strncmp(argv[1], "--forkbgworker=", 15) == 0)
		PGSharedMemoryReAttach();
	else
		PGSharedMemoryNoReAttach();

	/* autovacuum needs this set before calling InitProcess */
	if (strcmp(argv[1], "--forkavlauncher") == 0)
		AutovacuumLauncherIAm();
	if (strcmp(argv[1], "--forkavworker") == 0)
		AutovacuumWorkerIAm();

	/*
	 * Start our win32 signal implementation.  This has to be done after we
	 * read the backend variables, because we need to pick up the signal pipe
	 * from the parent process.
	 */
#ifdef WIN32
	pgwin32_signal_initialize();
#endif

	/* In EXEC_BACKEND case we will not have inherited these settings */
	pqinitmask();
	PG_SETMASK(&BlockSig);

	/* Read in remaining GUC variables */
	read_nondefault_variables();

	/* (re-)read control file (contains config) */
	LocalProcessControlFile();

	/*
	 * Reload any libraries that were preloaded by the postmaster.  Since we
	 * exec'd this process, those libraries didn't come along with us; but we
	 * should load them into all child processes to be consistent with the
	 * non-EXEC_BACKEND behavior.
	 */
	process_shared_preload_libraries();

	/* Run backend or appropriate child */
	if (strcmp(argv[1], "--forkbackend") == 0)
	{
		Assert(argc == 3);		/* shouldn't be any more args */

		/*
		 * Need to reinitialize the SSL library in the backend, since the
		 * context structures contain function pointers and cannot be passed
		 * through the parameter file.
		 *
		 * If for some reason reload fails (maybe the user installed broken
		 * key files), soldier on without SSL; that's better than all
		 * connections becoming impossible.
		 *
		 * XXX should we do this in all child processes?  For the moment it's
		 * enough to do it in backend children.
		 */
#ifdef USE_SSL
		if (EnableSSL)
		{
			if (secure_initialize(false) == 0)
|
|
|
|
LoadedSSL = true;
|
|
|
|
else
|
|
|
|
ereport(LOG,
|
2017-01-04 18:43:52 +01:00
|
|
|
(errmsg("SSL configuration could not be loaded in child process")));
|
2017-01-03 03:37:12 +01:00
|
|
|
}
|
2004-10-06 11:35:23 +02:00
|
|
|
#endif
|
|
|
|
|
2006-01-04 22:06:32 +01:00
|
|
|
/*
|
2009-08-29 21:26:52 +02:00
|
|
|
* Perform additional initialization and collect startup packet.
|
2006-01-04 22:06:32 +01:00
|
|
|
*
|
2006-10-04 02:30:14 +02:00
|
|
|
* We want to do this before InitProcess() for a couple of reasons: 1.
|
|
|
|
* so that we aren't eating up a PGPROC slot while waiting on the
|
|
|
|
* client. 2. so that if InitProcess() fails due to being out of
|
|
|
|
* PGPROC slots, we have already initialized libpq and are able to
|
|
|
|
* report the error to the client.
|
2006-01-04 22:06:32 +01:00
|
|
|
*/
|
|
|
|
BackendInitialize(&port);
|
|
|
|
|
|
|
|
/* Restore basic shared memory pointers */
|
|
|
|
InitShmemAccess(UsedShmemSegAddr);
|
|
|
|
|
|
|
|
/* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
|
|
|
|
InitProcess();
|
|
|
|
|
2016-10-01 23:15:09 +02:00
|
|
|
/* Attach process to shared data structures */
|
2006-01-04 22:06:32 +01:00
|
|
|
CreateSharedMemoryAndSemaphores(false, 0);
|
|
|
|
|
|
|
|
/* And run the backend */
|
2012-06-25 20:25:26 +02:00
|
|
|
BackendRun(&port); /* does not return */
|
2004-05-28 07:13:32 +02:00
|
|
|
}
|
2006-06-18 17:38:37 +02:00
|
|
|
if (strcmp(argv[1], "--forkboot") == 0)
|
2004-05-28 07:13:32 +02:00
|
|
|
{
|
2006-01-04 22:06:32 +01:00
|
|
|
/* Restore basic shared memory pointers */
|
|
|
|
InitShmemAccess(UsedShmemSegAddr);
|
|
|
|
|
|
|
|
/* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
|
2007-03-07 14:35:03 +01:00
|
|
|
InitAuxiliaryProcess();
|
2006-01-04 22:06:32 +01:00
|
|
|
|
2004-12-29 22:36:09 +01:00
|
|
|
/* Attach process to shared data structures */
|
2005-06-18 00:32:51 +02:00
|
|
|
CreateSharedMemoryAndSemaphores(false, 0);
|
2004-05-28 07:13:32 +02:00
|
|
|
|
Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
|
|
|
AuxiliaryProcessMain(argc - 2, argv + 2); /* does not return */
|
2004-05-28 07:13:32 +02:00
|
|
|
}
|
2007-02-16 00:23:23 +01:00
|
|
|
if (strcmp(argv[1], "--forkavlauncher") == 0)
|
|
|
|
{
|
|
|
|
/* Restore basic shared memory pointers */
|
|
|
|
InitShmemAccess(UsedShmemSegAddr);
|
|
|
|
|
|
|
|
/* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
|
2009-08-31 21:41:00 +02:00
|
|
|
InitProcess();
|
2007-02-16 00:23:23 +01:00
|
|
|
|
|
|
|
/* Attach process to shared data structures */
|
|
|
|
CreateSharedMemoryAndSemaphores(false, 0);
|
|
|
|
|
2017-06-21 21:18:54 +02:00
|
|
|
AutoVacLauncherMain(argc - 2, argv + 2); /* does not return */
|
2007-02-16 00:23:23 +01:00
|
|
|
}
|
|
|
|
if (strcmp(argv[1], "--forkavworker") == 0)
|
2005-07-14 07:13:45 +02:00
|
|
|
{
|
2006-01-04 22:06:32 +01:00
|
|
|
/* Restore basic shared memory pointers */
|
|
|
|
InitShmemAccess(UsedShmemSegAddr);
|
|
|
|
|
|
|
|
/* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
|
|
|
|
InitProcess();
|
|
|
|
|
2005-07-29 21:30:09 +02:00
|
|
|
/* Attach process to shared data structures */
|
2005-07-14 07:13:45 +02:00
|
|
|
CreateSharedMemoryAndSemaphores(false, 0);
|
|
|
|
|
2013-05-29 22:58:43 +02:00
|
|
|
AutoVacWorkerMain(argc - 2, argv + 2); /* does not return */
|
2005-07-14 07:13:45 +02:00
|
|
|
}
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just as if it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
if (strncmp(argv[1], "--forkbgworker=", 15) == 0)
|
|
|
|
{
|
Allow background workers to be started dynamically.
There is a new API, RegisterDynamicBackgroundWorker, which allows
an ordinary user backend to register a new background worker during
normal running. This means that it's no longer necessary for all
background workers to be registered during processing of
shared_preload_libraries, although the option of registering workers
at that time remains available.
When a background worker exits and will not be restarted, the
slot previously used by that background worker is automatically
released and becomes available for reuse. Slots used by background
workers that are configured for automatic restart can't (yet) be
released without shutting down the system.
This commit adds a new source file, bgworker.c, and moves some
of the existing control logic for background workers there.
Previously, there was little enough logic that it made sense to
keep everything in postmaster.c, but not any more.
This commit also makes the worker_spi contrib module into an
extension and adds a new function, worker_spi_launch, which can
be used to demonstrate the new facility.
2013-07-16 19:02:15 +02:00
|
|
|
int shmem_slot;
|
2012-12-06 18:57:52 +01:00
|
|
|
|
2014-07-30 17:25:58 +02:00
|
|
|
/* do this as early as possible; in particular, before InitProcess() */
|
|
|
|
IsBackgroundWorker = true;
|
|
|
|
|
2012-12-06 18:57:52 +01:00
|
|
|
/* Restore basic shared memory pointers */
|
|
|
|
InitShmemAccess(UsedShmemSegAddr);
|
|
|
|
|
|
|
|
/* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
|
|
|
|
InitProcess();
|
|
|
|
|
|
|
|
/* Attach process to shared data structures */
|
|
|
|
CreateSharedMemoryAndSemaphores(false, 0);
|
|
|
|
|
2016-08-03 00:39:14 +02:00
|
|
|
/* Fetch MyBgworkerEntry from shared memory */
|
2013-07-16 19:02:15 +02:00
|
|
|
shmem_slot = atoi(argv[1] + 15);
|
|
|
|
MyBgworkerEntry = BackgroundWorkerEntry(shmem_slot);
|
2016-08-03 00:39:14 +02:00
|
|
|
|
2013-08-16 21:14:54 +02:00
|
|
|
StartBackgroundWorker();
|
2012-12-06 18:57:52 +01:00
|
|
|
}
|
2006-06-18 17:38:37 +02:00
|
|
|
if (strcmp(argv[1], "--forkarch") == 0)
|
2004-07-19 04:47:16 +02:00
|
|
|
{
|
|
|
|
/* Do not want to attach to shared memory */
|
|
|
|
|
2017-06-21 21:18:54 +02:00
|
|
|
PgArchiverMain(argc, argv); /* does not return */
|
2004-07-19 04:47:16 +02:00
|
|
|
}
|
2006-06-29 22:00:08 +02:00
|
|
|
if (strcmp(argv[1], "--forkcol") == 0)
|
2004-05-28 07:13:32 +02:00
|
|
|
{
|
|
|
|
/* Do not want to attach to shared memory */
|
|
|
|
|
2017-06-21 21:18:54 +02:00
|
|
|
PgstatCollectorMain(argc, argv); /* does not return */
|
2004-05-28 07:13:32 +02:00
|
|
|
}
|
2006-06-18 17:38:37 +02:00
|
|
|
if (strcmp(argv[1], "--forklog") == 0)
|
2004-08-06 01:32:13 +02:00
|
|
|
{
|
|
|
|
/* Do not want to attach to shared memory */
|
|
|
|
|
2017-06-21 21:18:54 +02:00
|
|
|
SysLoggerMain(argc, argv); /* does not return */
|
2004-08-06 01:32:13 +02:00
|
|
|
}
|
2004-05-28 07:13:32 +02:00
|
|
|
|
2012-06-25 20:25:26 +02:00
|
|
|
abort(); /* shouldn't get here */
|
2004-01-07 00:15:22 +01:00
|
|
|
}
|
2017-06-21 21:18:54 +02:00
|
|
|
#endif /* EXEC_BACKEND */
|
2004-01-07 00:15:22 +01:00
|
|
|
|
|
|
|
|
1996-07-09 08:22:35 +02:00
|
|
|
/*
|
|
|
|
* ExitPostmaster -- cleanup
|
2000-11-29 21:59:54 +01:00
|
|
|
*
|
|
|
|
* Do NOT call exit() directly --- always go through here!
|
1996-07-09 08:22:35 +02:00
|
|
|
*/
|
2003-07-27 23:49:55 +02:00
|
|
|
static void
|
1996-07-09 08:22:35 +02:00
|
|
|
ExitPostmaster(int status)
|
|
|
|
{
|
2015-01-08 04:35:44 +01:00
|
|
|
#ifdef HAVE_PTHREAD_IS_THREADED_NP
|
|
|
|
|
|
|
|
/*
|
|
|
|
* There is no known cause for a postmaster to become multithreaded after
|
|
|
|
* startup. Recheck to account for the possibility of unknown causes.
|
2015-01-08 04:46:59 +01:00
|
|
|
* This message uses LOG level, because an unclean shutdown at this point
|
|
|
|
* would usually not look much different from a clean shutdown.
|
2015-01-08 04:35:44 +01:00
|
|
|
*/
|
|
|
|
if (pthread_is_threaded_np() != 0)
|
|
|
|
ereport(LOG,
|
2015-01-08 04:46:59 +01:00
|
|
|
(errcode(ERRCODE_INTERNAL_ERROR),
|
|
|
|
errmsg_internal("postmaster became multithreaded"),
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
errdetail("Please report this to <pgsql-bugs@postgresql.org>.")));
|
2015-01-08 04:35:44 +01:00
|
|
|
#endif
|
|
|
|
|
1997-09-07 07:04:48 +02:00
|
|
|
/* should cleanup shared memory and kill all backends */
|
|
|
|
|
|
|
|
/*
|
2014-05-06 18:12:18 +02:00
|
|
|
* Not sure of the semantics here. When the Postmaster dies, should the
|
2005-10-15 04:49:52 +02:00
|
|
|
* backends all be killed? probably not.
|
1999-10-06 23:58:18 +02:00
|
|
|
*
|
|
|
|
* MUST -- vadim 05-10-1999
|
1997-09-07 07:04:48 +02:00
|
|
|
*/
|
2000-05-26 03:38:08 +02:00
|
|
|
|
1998-06-27 06:53:49 +02:00
|
|
|
proc_exit(status);
|
1996-07-09 08:22:35 +02:00
|
|
|
}
|
|
|
|
|
Start background writer during archive recovery. Background writer now performs
its usual buffer cleaning duties during archive recovery, and it's responsible
for performing restartpoints.
This requires some changes in postmaster. When the startup process has done
all the initialization and is ready to start WAL redo, it signals the
postmaster to launch the background writer. The postmaster is signaled again
when the point in recovery is reached where we know that the database is in
consistent state. Postmaster isn't interested in that at the moment, but
that's the point where we could let other backends in to perform read-only
queries. The postmaster is signaled a third time when the recovery has ended,
so that postmaster knows that it's safe to start accepting connections.
The startup process now traps SIGTERM, and performs a "clean" shutdown. If
you do a fast shutdown during recovery, a shutdown restartpoint is performed,
like a shutdown checkpoint, and postmaster kills the processes cleanly. You
still have to continue the recovery at next startup, though.
Currently, the background writer is only launched during archive recovery.
We could launch it during crash recovery as well, but it seems better to keep
that codepath as simple as possible, for the sake of robustness. And it
couldn't do any restartpoints during crash recovery anyway, so it wouldn't be
that useful.
log_restartpoints is gone. Use log_checkpoints instead. This is yet to be
documented.
This whole operation is a prerequisite for Hot Standby, but has some value of
its own whether the hot standby patch makes 8.4 or not.
Simon Riggs, with lots of modifications by me.
2009-02-18 16:58:41 +01:00
|
|
|
/*
|
2009-02-23 10:28:50 +01:00
|
|
|
* sigusr1_handler - handle signal conditions from child processes
|
2009-02-18 16:58:41 +01:00
|
|
|
*/
|
|
|
|
static void
|
2009-02-23 10:28:50 +01:00
|
|
|
sigusr1_handler(SIGNAL_ARGS)
|
2009-02-18 16:58:41 +01:00
|
|
|
{
|
2009-02-23 10:28:50 +01:00
|
|
|
int save_errno = errno;
|
2009-02-18 16:58:41 +01:00
|
|
|
|
2009-02-23 10:28:50 +01:00
|
|
|
PG_SETMASK(&BlockSig);
|
2009-02-18 16:58:41 +01:00
|
|
|
|
Allow background workers to be started dynamically.
There is a new API, RegisterDynamicBackgroundWorker, which allows
an ordinary user backend to register a new background worker during
normal running. This means that it's no longer necessary for all
background workers to be registered during processing of
shared_preload_libraries, although the option of registering workers
at that time remains available.
When a background worker exits and will not be restarted, the
slot previously used by that background worker is automatically
released and becomes available for reuse. Slots used by background
workers that are configured for automatic restart can't (yet) be
released without shutting down the system.
This commit adds a new source file, bgworker.c, and moves some
of the existing control logic for background workers there.
Previously, there was little enough logic that it made sense to
keep everything in postmaster.c, but not any more.
This commit also makes the worker_spi contrib module into an
extension and adds a new function, worker_spi_launch, which can
be used to demonstrate the new facility.
2013-07-16 19:02:15 +02:00
|
|
|
/* Process background worker state change. */
|
|
|
|
if (CheckPostmasterSignal(PMSIGNAL_BACKGROUND_WORKER_CHANGE))
|
|
|
|
{
|
|
|
|
BackgroundWorkerStateChange();
|
Eliminate one background-worker-related flag variable.
Teach sigusr1_handler() to use the same test for whether a worker
might need to be started as ServerLoop(). Aside from being perhaps
a bit simpler, this prevents a potentially-unbounded delay when
starting a background worker. On some platforms, select() doesn't
return when interrupted by a signal, but is instead restarted,
including a reset of the timeout to the originally-requested value.
If signals arrive often enough, but no connection requests arrive,
sigusr1_handler() will be executed repeatedly, but the body of
ServerLoop() won't be reached. This change ensures that, even in
that case, background workers will eventually get launched.
This is far from a perfect fix; really, we need select() to return
control to ServerLoop() after an interrupt, either via the self-pipe
trick or some other mechanism. But that's going to require more
work and discussion, so let's do this for now to at least mitigate
the damage.
Per investigation of test_shm_mq failures on buildfarm member anole.
2014-10-05 03:25:41 +02:00
|
|
|
StartWorkerNeeded = true;
|
2013-07-16 19:02:15 +02:00
|
|
|
}
|
|
|
|
|
2009-02-23 10:28:50 +01:00
|
|
|
/*
|
2010-05-15 22:01:32 +02:00
|
|
|
* RECOVERY_STARTED and BEGIN_HOT_STANDBY signals are ignored in
|
2009-02-23 10:28:50 +01:00
|
|
|
* unexpected states. If the startup process quickly starts up, completes
|
|
|
|
	 * recovery, and exits, we might process the death of the startup process
|
|
|
|
* first. We don't want to go back to recovery in that case.
|
|
|
|
*/
|
|
|
|
if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_STARTED) &&
|
Don't launch new child processes after we've been told to shut down.
Once we've received a shutdown signal (SIGINT or SIGTERM), we should not
launch any more child processes, even if we get signals requesting such.
The normal code path for spawning backends has always understood that,
but the postmaster's infrastructure for hot standby and autovacuum didn't
get the memo. As reported by Hari Babu in bug #7643, this could lead to
failure to shut down at all in some cases, such as when SIGINT is received
just before the startup process sends PMSIGNAL_RECOVERY_STARTED: we'd
launch a bgwriter and checkpointer, and then those processes would have no
idea that they ought to quit. Similarly, launching a new autovacuum worker
would result in waiting till it finished before shutting down.
Also, switch the order of the code blocks in reaper() that detect startup
process crash versus shutdown termination. Once we've sent it a signal,
we should not consider that exit(1) is surprising. This is just a cosmetic
fix since shutdown occurs correctly anyway, but better not to log a phony
complaint about startup process crash.
Back-patch to 9.0. Some parts of this might be applicable before that,
but given the lack of prior complaints I'm not going to worry too much
about older branches.
2012-11-21 21:18:38 +01:00
|
|
|
pmState == PM_STARTUP && Shutdown == NoShutdown)
|
2009-02-18 16:58:41 +01:00
|
|
|
{
|
2009-02-23 10:28:50 +01:00
|
|
|
/* WAL redo has started. We're out of reinitialization. */
|
|
|
|
FatalError = false;
|
2013-10-06 04:24:50 +02:00
|
|
|
Assert(AbortStartTime == 0);
|
2009-02-23 10:28:50 +01:00
|
|
|
|
|
|
|
/*
|
2012-05-11 23:46:08 +02:00
|
|
|
			 * Crank up the background tasks. It doesn't matter if this fails;
|
2009-06-11 16:49:15 +02:00
|
|
|
* we'll just try again later.
|
2009-02-23 10:28:50 +01:00
|
|
|
*/
|
2012-05-11 23:46:08 +02:00
|
|
|
Assert(CheckpointerPID == 0);
|
|
|
|
CheckpointerPID = StartCheckpointer();
|
2012-06-01 09:25:17 +02:00
|
|
|
Assert(BgWriterPID == 0);
|
|
|
|
BgWriterPID = StartBackgroundWriter();
|
2009-02-23 10:28:50 +01:00
|
|
|
|
2015-05-15 17:55:24 +02:00
|
|
|
/*
|
|
|
|
* Start the archiver if we're responsible for (re-)archiving received
|
|
|
|
* files.
|
|
|
|
*/
|
|
|
|
Assert(PgArchPID == 0);
|
2015-06-12 16:11:51 +02:00
|
|
|
if (XLogArchivingAlways())
|
2015-05-15 17:55:24 +02:00
|
|
|
PgArchPID = pgarch_start();
|
|
|
|
|
Change pg_ctl to detect server-ready by watching status in postmaster.pid.
Traditionally, "pg_ctl start -w" has waited for the server to become
ready to accept connections by attempting a connection once per second.
That has the major problem that connection issues (for instance, a
kernel packet filter blocking traffic) can't be reliably told apart
from server startup issues, and the minor problem that if server startup
isn't quick, we accumulate "the database system is starting up" spam
in the server log. We've hacked around many of the possible connection
issues, but it resulted in ugly and complicated code in pg_ctl.c.
In commit c61559ec3, I changed the probe rate to every tenth of a second.
That prompted Jeff Janes to complain that the log-spam problem had become
much worse. In the ensuing discussion, Andres Freund pointed out that
we could dispense with connection attempts altogether if the postmaster
were changed to report its status in postmaster.pid, which "pg_ctl start"
already relies on being able to read. This patch implements that, teaching
postmaster.c to report a status string into the pidfile at the same
state-change points already identified as being of interest for systemd
status reporting (cf commit 7d17e683f). pg_ctl no longer needs to link
with libpq at all; all its functions now depend on reading server files.
In support of this, teach AddToDataDirLockFile() to allow addition of
postmaster.pid lines in not-necessarily-sequential order. This is needed
on Windows where the SHMEM_KEY line will never be written at all. We still
have the restriction that we don't want to truncate the pidfile; document
the reasons for that a bit better.
Also, fix the pg_ctl TAP tests so they'll notice if "start -w" mode
is broken --- before, they'd just wait out the sixty seconds until
the loop gives up, and then report success anyway. (Yes, I found that
out the hard way.)
While at it, arrange for pg_ctl to not need to #include miscadmin.h;
as a rather low-level backend header, requiring that to be compilable
client-side is pretty dubious. This requires moving the #define's
associated with the pidfile into a new header file, and moving
PG_BACKEND_VERSIONSTR someplace else. For lack of a clearly better
"someplace else", I put it into port.h, beside the declaration of
find_other_exec(), since most users of that macro are passing the value to
find_other_exec(). (initdb still depends on miscadmin.h, but at least
pg_ctl and pg_upgrade no longer do.)
In passing, fix main.c so that PG_BACKEND_VERSIONSTR actually defines the
output of "postgres -V", which remarkably it had never done before.
Discussion: https://postgr.es/m/CAMkU=1xJW8e+CTotojOMBd-yzUvD0e_JZu2xHo=MnuZ4__m7Pg@mail.gmail.com
2017-06-28 23:31:24 +02:00
|
|
|
/*
|
|
|
|
* If we aren't planning to enter hot standby mode later, treat
|
|
|
|
* RECOVERY_STARTED as meaning we're out of startup, and report status
|
|
|
|
* accordingly.
|
|
|
|
*/
|
2015-11-17 12:46:17 +01:00
|
|
|
if (!EnableHotStandby)
|
2017-06-28 23:31:24 +02:00
|
|
|
{
|
|
|
|
AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STANDBY);
|
|
|
|
#ifdef USE_SYSTEMD
|
2015-11-17 12:46:17 +01:00
|
|
|
sd_notify(0, "READY=1");
|
|
|
|
#endif
|
2017-06-28 23:31:24 +02:00
|
|
|
}
|
2015-11-17 12:46:17 +01:00
|
|
|
|
2009-02-23 10:28:50 +01:00
|
|
|
pmState = PM_RECOVERY;
|
2009-02-18 16:58:41 +01:00
|
|
|
}
|
2010-05-15 22:01:32 +02:00
|
|
|
if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
|
2012-11-21 21:18:38 +01:00
|
|
|
pmState == PM_RECOVERY && Shutdown == NoShutdown)
|
2009-02-18 16:58:41 +01:00
|
|
|
{
|
2009-02-23 10:28:50 +01:00
|
|
|
/*
|
|
|
|
* Likewise, start other special children as needed.
|
|
|
|
*/
|
|
|
|
Assert(PgStatPID == 0);
|
|
|
|
PgStatPID = pgstat_start();
|
XLOG (and related) changes:
* Store two past checkpoint locations, not just one, in pg_control.
On startup, we fall back to the older checkpoint if the newer one
is unreadable. Also, a physical copy of the newest checkpoint record
is kept in pg_control for possible use in disaster recovery (ie,
complete loss of pg_xlog). Also add a version number for pg_control
itself. Remove archdir from pg_control; it ought to be a GUC
parameter, not a special case (not that it's implemented yet anyway).
* Suppress successive checkpoint records when nothing has been entered
in the WAL log since the last one. This is not so much to avoid I/O
as to make it actually useful to keep track of the last two
checkpoints. If the things are right next to each other then there's
not a lot of redundancy gained...
* Change CRC scheme to a true 64-bit CRC, not a pair of 32-bit CRCs
on alternate bytes. Polynomial borrowed from ECMA DLT1 standard.
* Fix XLOG record length handling so that it will work at BLCKSZ = 32k.
* Change XID allocation to work more like OID allocation. (This is of
dubious necessity, but I think it's a good idea anyway.)
* Fix a number of minor bugs, such as off-by-one logic for XLOG file
wraparound at the 4 gig mark.
* Add documentation and clean up some coding infelicities; move file
format declarations out to include files where planned contrib
utilities can get at them.
* Checkpoint will now occur every CHECKPOINT_SEGMENTS log segments or
every CHECKPOINT_TIMEOUT seconds, whichever comes first. It is also
possible to force a checkpoint by sending SIGUSR1 to the postmaster
(undocumented feature...)
* Defend against kill -9 postmaster by storing shmem block's key and ID
in postmaster.pid lockfile, and checking at startup to ensure that no
processes are still connected to old shmem block (if it still exists).
* Switch backends to accept SIGQUIT rather than SIGUSR1 for emergency
stop, for symmetry with postmaster and xlog utilities. Clean up signal
handling in bootstrap.c so that xlog utilities launched by postmaster
will react to signals better.
* Standalone bootstrap now grabs lockfile in target directory, as added
insurance against running it in parallel with live postmaster.
2001-03-13 02:17:06 +01:00
|
|
|
|
Allow read only connections during recovery, known as Hot Standby.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read-only queries. Recovery must reach a consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict with, and in some cases deadlock against, queries during recovery; these conflicts result in query cancellation after max_standby_delay seconds have expired. The infrastructure changes have only minor effects on normal running, though they introduce four new types of WAL record.
A new test mode, "make standbycheck", allows regression testing of static command behaviour on a standby server while it is in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port-specific behaviours have been used, though primary testing has so far been on Linux only.
This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
2009-12-19 02:32:45 +01:00
|
|
|
ereport(LOG,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
(errmsg("database system is ready to accept read only connections")));
|
2001-03-13 02:17:06 +01:00
|
|
|
|
Change pg_ctl to detect server-ready by watching status in postmaster.pid.
Traditionally, "pg_ctl start -w" has waited for the server to become
ready to accept connections by attempting a connection once per second.
That has the major problem that connection issues (for instance, a
kernel packet filter blocking traffic) can't be reliably told apart
from server startup issues, and the minor problem that if server startup
isn't quick, we accumulate "the database system is starting up" spam
in the server log. We've hacked around many of the possible connection
issues, but it resulted in ugly and complicated code in pg_ctl.c.
In commit c61559ec3, I changed the probe rate to every tenth of a second.
That prompted Jeff Janes to complain that the log-spam problem had become
much worse. In the ensuing discussion, Andres Freund pointed out that
we could dispense with connection attempts altogether if the postmaster
were changed to report its status in postmaster.pid, which "pg_ctl start"
already relies on being able to read. This patch implements that, teaching
postmaster.c to report a status string into the pidfile at the same
state-change points already identified as being of interest for systemd
status reporting (cf commit 7d17e683f). pg_ctl no longer needs to link
with libpq at all; all its functions now depend on reading server files.
In support of this, teach AddToDataDirLockFile() to allow addition of
postmaster.pid lines in not-necessarily-sequential order. This is needed
on Windows where the SHMEM_KEY line will never be written at all. We still
have the restriction that we don't want to truncate the pidfile; document
the reasons for that a bit better.
Also, fix the pg_ctl TAP tests so they'll notice if "start -w" mode
is broken --- before, they'd just wait out the sixty seconds until
the loop gives up, and then report success anyway. (Yes, I found that
out the hard way.)
While at it, arrange for pg_ctl to not need to #include miscadmin.h;
as a rather low-level backend header, requiring that to be compilable
client-side is pretty dubious. This requires moving the #define's
associated with the pidfile into a new header file, and moving
PG_BACKEND_VERSIONSTR someplace else. For lack of a clearly better
"someplace else", I put it into port.h, beside the declaration of
find_other_exec(), since most users of that macro are passing the value to
find_other_exec(). (initdb still depends on miscadmin.h, but at least
pg_ctl and pg_upgrade no longer do.)
In passing, fix main.c so that PG_BACKEND_VERSIONSTR actually defines the
output of "postgres -V", which remarkably it had never done before.
Discussion: https://postgr.es/m/CAMkU=1xJW8e+CTotojOMBd-yzUvD0e_JZu2xHo=MnuZ4__m7Pg@mail.gmail.com
2017-06-28 23:31:24 +02:00
|
|
|
/* Report status */
|
|
|
|
AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_READY);
|
2015-11-17 12:46:17 +01:00
|
|
|
#ifdef USE_SYSTEMD
|
|
|
|
sd_notify(0, "READY=1");
|
|
|
|
#endif
|
|
|
|
|
2010-05-15 22:01:32 +02:00
|
|
|
pmState = PM_HOT_STANDBY;
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. one that promptly
responds to termination signals).
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just as if it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
/* Some workers may be scheduled to start now */
|
Eliminate one background-worker-related flag variable.
Teach sigusr1_handler() to use the same test for whether a worker
might need to be started as ServerLoop(). Aside from being perhaps
a bit simpler, this prevents a potentially-unbounded delay when
starting a background worker. On some platforms, select() doesn't
return when interrupted by a signal, but is instead restarted,
including a reset of the timeout to the originally-requested value.
If signals arrive often enough, but no connection requests arrive,
sigusr1_handler() will be executed repeatedly, but the body of
ServerLoop() won't be reached. This change ensures that, even in
that case, background workers will eventually get launched.
This is far from a perfect fix; really, we need select() to return
control to ServerLoop() after an interrupt, either via the self-pipe
trick or some other mechanism. But that's going to require more
work and discussion, so let's do this for now to at least mitigate
the damage.
Per investigation of test_shm_mq failures on buildfarm member anole.
2014-10-05 03:25:41 +02:00
|
|
|
StartWorkerNeeded = true;
|
2009-02-23 10:28:50 +01:00
|
|
|
}
|
Start background writer during archive recovery. Background writer now performs
its usual buffer cleaning duties during archive recovery, and it's responsible
for performing restartpoints.
This requires some changes in postmaster. When the startup process has done
all the initialization and is ready to start WAL redo, it signals the
postmaster to launch the background writer. The postmaster is signaled again
when the point in recovery is reached where we know that the database is in
consistent state. Postmaster isn't interested in that at the moment, but
that's the point where we could let other backends in to perform read-only
queries. The postmaster is signaled third time when the recovery has ended,
so that postmaster knows that it's safe to start accepting connections.
The startup process now traps SIGTERM, and performs a "clean" shutdown. If
you do a fast shutdown during recovery, a shutdown restartpoint is performed,
like a shutdown checkpoint, and postmaster kills the processes cleanly. You
still have to continue the recovery at next startup, though.
Currently, the background writer is only launched during archive recovery.
We could launch it during crash recovery as well, but it seems better to keep
that codepath as simple as possible, for the sake of robustness. And it
couldn't do any restartpoints during crash recovery anyway, so it wouldn't be
that useful.
log_restartpoints is gone. Use log_checkpoints instead. This is yet to be
documented.
This whole operation is a pre-requisite for Hot Standby, but has some value of
its own whether the hot standby patch makes 8.4 or not.
Simon Riggs, with lots of modifications by me.
2009-02-18 16:58:41 +01:00
|
|
|
|
2014-10-05 03:25:41 +02:00
|
|
|
if (StartWorkerNeeded || HaveCrashedWorker)
|
Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long. However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once. On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives. (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.
As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.
There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).
Backpatch to 9.6 where parallel query was added. The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.
Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 22:17:29 +02:00
|
|
|
maybe_start_bgworkers();
|
Allow background workers to be started dynamically.
There is a new API, RegisterDynamicBackgroundWorker, which allows
an ordinary user backend to register a new background worker during
normal running. This means that it's no longer necessary for all
background workers to be registered during processing of
shared_preload_libraries, although the option of registering workers
at that time remains available.
When a background worker exits and will not be restarted, the
slot previously used by that background worker is automatically
released and becomes available for reuse. Slots used by background
workers that are configured for automatic restart can't (yet) be
released without shutting down the system.
This commit adds a new source file, bgworker.c, and moves some
of the existing control logic for background workers there.
Previously, there was little enough logic that it made sense to
keep everything in postmaster.c, but not any more.
This commit also makes the worker_spi contrib module into an
extension and adds a new function, worker_spi_launch, which can
be used to demonstrate the new facility.
2013-07-16 19:02:15 +02:00
|
|
|
|
2005-08-12 20:23:56 +02:00
|
|
|
if (CheckPostmasterSignal(PMSIGNAL_WAKEN_ARCHIVER) &&
|
2008-01-11 01:54:09 +01:00
|
|
|
PgArchPID != 0)
|
2004-07-19 04:47:16 +02:00
|
|
|
{
|
2005-08-12 20:23:56 +02:00
|
|
|
/*
|
2005-10-15 04:49:52 +02:00
|
|
|
* Send SIGUSR1 to archiver process, to wake it up and begin archiving
|
2017-05-12 17:49:56 +02:00
|
|
|
* next WAL file.
|
2005-08-12 20:23:56 +02:00
|
|
|
*/
|
2006-11-21 21:59:53 +01:00
|
|
|
signal_child(PgArchPID, SIGUSR1);
|
2004-08-29 07:07:03 +02:00
|
|
|
}
|
2001-11-04 20:55:31 +01:00
|
|
|
|
2005-08-12 20:23:56 +02:00
|
|
|
if (CheckPostmasterSignal(PMSIGNAL_ROTATE_LOGFILE) &&
|
|
|
|
SysLoggerPID != 0)
|
|
|
|
{
|
|
|
|
/* Tell syslogger to rotate logfile */
|
2006-11-21 21:59:53 +01:00
|
|
|
signal_child(SysLoggerPID, SIGUSR1);
|
2005-08-12 20:23:56 +02:00
|
|
|
}
|
2005-08-12 05:25:13 +02:00
|
|
|
|
Don't launch new child processes after we've been told to shut down.
Once we've received a shutdown signal (SIGINT or SIGTERM), we should not
launch any more child processes, even if we get signals requesting such.
The normal code path for spawning backends has always understood that,
but the postmaster's infrastructure for hot standby and autovacuum didn't
get the memo. As reported by Hari Babu in bug #7643, this could lead to
failure to shut down at all in some cases, such as when SIGINT is received
just before the startup process sends PMSIGNAL_RECOVERY_STARTED: we'd
launch a bgwriter and checkpointer, and then those processes would have no
idea that they ought to quit. Similarly, launching a new autovacuum worker
would result in waiting till it finished before shutting down.
Also, switch the order of the code blocks in reaper() that detect startup
process crash versus shutdown termination. Once we've sent it a signal,
we should not consider that exit(1) is surprising. This is just a cosmetic
fix since shutdown occurs correctly anyway, but better not to log a phony
complaint about startup process crash.
Back-patch to 9.0. Some parts of this might be applicable before that,
but given the lack of prior complaints I'm not going to worry too much
about older branches.
2012-11-21 21:18:38 +01:00
|
|
|
if (CheckPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER) &&
|
|
|
|
Shutdown == NoShutdown)
|
2006-07-10 18:20:52 +02:00
|
|
|
{
|
Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios. We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases. Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId. Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done. Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database. initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 23:42:10 +01:00
|
|
|
/*
|
|
|
|
* Start one iteration of the autovacuum daemon, even if autovacuuming
|
|
|
|
* is nominally not enabled. This is so we can have an active defense
|
|
|
|
* against transaction ID wraparound. We set a flag for the main loop
|
|
|
|
* to do it rather than trying to do it here --- this is because the
|
|
|
|
* autovac process itself may send the signal, and we want to handle
|
|
|
|
* that by launching another iteration as soon as the current one
|
|
|
|
* completes.
|
|
|
|
*/
|
2007-02-16 00:23:23 +01:00
|
|
|
start_autovac_launcher = true;
|
2006-07-10 18:20:52 +02:00
|
|
|
}
|
|
|
|
|
2012-11-21 21:18:38 +01:00
|
|
|
if (CheckPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER) &&
|
|
|
|
Shutdown == NoShutdown)
|
2007-07-24 06:54:09 +02:00
|
|
|
{
|
|
|
|
/* The autovacuum launcher wants us to start a worker process. */
|
2007-02-16 00:23:23 +01:00
|
|
|
StartAutovacuumWorker();
|
2007-07-24 06:54:09 +02:00
|
|
|
}
|
2007-02-16 00:23:23 +01:00
|
|
|
|
Don't lose walreceiver start requests due to race condition in postmaster.
When a walreceiver dies, the startup process will notice that and send
a PMSIGNAL_START_WALRECEIVER signal to the postmaster, asking for a new
walreceiver to be launched. There's a race condition, which at least
in HEAD is very easy to hit, whereby the postmaster might see that
signal before it processes the SIGCHLD from the walreceiver process.
In that situation, sigusr1_handler() just dropped the start request
on the floor, reasoning that it must be redundant. Eventually, after
10 seconds (WALRCV_STARTUP_TIMEOUT), the startup process would make a
fresh request --- but that's a long time if the connection could have
been re-established almost immediately.
Fix it by setting a state flag inside the postmaster that we won't
clear until we do launch a walreceiver. In cases where that results
in an extra walreceiver launch, it's up to the walreceiver to realize
it's unwanted and go away --- but we have, and need, that logic anyway
for the opposite race case.
I came across this through investigating unexpected delays in the
src/test/recovery TAP tests: it manifests there in test cases where
a master server is stopped and restarted while leaving streaming
slaves active.
This logic has been broken all along, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/21344.1498494720@sss.pgh.pa.us
2017-06-26 23:31:56 +02:00
|
|
|
if (CheckPostmasterSignal(PMSIGNAL_START_WALRECEIVER))
|
2010-01-15 10:19:10 +01:00
|
|
|
{
|
|
|
|
/* Startup Process wants us to start the walreceiver process. */
|
2017-06-26 23:31:56 +02:00
|
|
|
/* Start immediately if possible, else remember request for later. */
|
|
|
|
WalReceiverRequested = true;
|
|
|
|
MaybeStartWalReceiver();
|
2010-01-15 10:19:10 +01:00
|
|
|
}
|
|
|
|
|
2011-04-04 01:42:00 +02:00
|
|
|
if (CheckPostmasterSignal(PMSIGNAL_ADVANCE_STATE_MACHINE) &&
|
|
|
|
(pmState == PM_WAIT_BACKUP || pmState == PM_WAIT_BACKENDS))
|
|
|
|
{
|
|
|
|
/* Advance postmaster's state machine */
|
|
|
|
PostmasterStateMachine();
|
|
|
|
}
|
|
|
|
|
2011-02-16 03:28:48 +01:00
|
|
|
if (CheckPromoteSignal() && StartupPID != 0 &&
|
|
|
|
(pmState == PM_STARTUP || pmState == PM_RECOVERY ||
|
|
|
|
pmState == PM_HOT_STANDBY || pmState == PM_WAIT_READONLY))
|
|
|
|
{
|
|
|
|
/* Tell startup process to finish recovery */
|
|
|
|
signal_child(StartupPID, SIGUSR2);
|
|
|
|
}
|
|
|
|
|
2001-11-04 20:55:31 +01:00
|
|
|
PG_SETMASK(&UnBlockSig);
|
|
|
|
|
2001-03-13 02:17:06 +01:00
|
|
|
errno = save_errno;
|
|
|
|
}
|
|
|
|
|
2009-08-29 21:26:52 +02:00
|
|
|
/*
|
Introduce timeout handling framework
Management of timeouts was getting a little cumbersome; what we
originally had was more than enough back when we were only concerned
about deadlocks and query cancel; however, when we added timeouts for
standby processes, the code got considerably messier. Since there are
plans to add more complex timeouts, this seems a good time to introduce
a central timeout handling module.
External modules register their timeout handlers during process
initialization, and later enable and disable them as they see fit using
a simple API; timeout.c is in charge of keeping track of which timeouts
are in effect at any time, installing a common SIGALRM signal handler,
and calling setitimer() as appropriate to ensure timely firing of
external handlers.
timeout.c additionally supports pluggable modules to add their own
timeouts, though this capability isn't exercised anywhere yet.
Additionally, as of this commit, walsender processes are aware of
timeouts; we had a preexisting bug there that made those ignore SIGALRM,
thus being subject to unhandled deadlocks, particularly during the
authentication phase. This has already been fixed in back branches in
commit 0bf8eb2a, which see for more details.
Main author: Zoltán Böszörményi
Some review and cleanup by Álvaro Herrera
Extensive reworking by Tom Lane
2012-07-17 00:43:21 +02:00
|
|
|
* SIGTERM or SIGQUIT while processing startup packet.
|
|
|
|
* Clean up and exit(1).
|
2009-08-29 21:26:52 +02:00
|
|
|
*
|
|
|
|
* XXX: possible future improvement: try to send a message indicating
|
|
|
|
* why we are disconnecting. Problem is to be sure we don't block while
|
|
|
|
* doing so, nor mess up SSL initialization. In practice, if the client
|
|
|
|
* has wedged here, it probably couldn't do anything with the message anyway.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
startup_die(SIGNAL_ARGS)
|
|
|
|
{
|
|
|
|
proc_exit(1);
|
|
|
|
}
|
1997-12-04 01:28:15 +01:00
|
|
|
|
2001-11-04 20:55:31 +01:00
|
|
|
/*
|
|
|
|
* Dummy signal handler
|
|
|
|
*
|
|
|
|
* We use this for signals that we don't actually use in the postmaster,
|
2004-05-30 00:48:23 +02:00
|
|
|
* but we do use in backends. If we were to SIG_IGN such signals in the
|
|
|
|
* postmaster, then a newly started backend might drop a signal that arrives
|
|
|
|
* before it's able to reconfigure its signal processing. (See notes in
|
|
|
|
* tcop/postgres.c.)
|
2001-11-04 20:55:31 +01:00
|
|
|
*/
|
|
|
|
static void
|
|
|
|
dummy_handler(SIGNAL_ARGS)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
2012-07-17 00:43:21 +02:00
/*
 * Timeout while processing startup packet.
 * As for startup_die(), we clean up and exit(1).
 */
static void
StartupPacketTimeoutHandler(void)
{
	proc_exit(1);
}

Replace PostmasterRandom() with a stronger source, second attempt.
This adds a new routine, pg_strong_random() for generating random bytes,
for use in both frontend and backend. At the moment, it's only used in
the backend, but the upcoming SCRAM authentication patches need strong
random numbers in libpq as well.
pg_strong_random() is based on, and replaces, the existing implementation
in pgcrypto. It can acquire strong random numbers from a number of sources,
depending on what's available:
- OpenSSL RAND_bytes(), if built with OpenSSL
- On Windows, the native cryptographic functions are used
- /dev/urandom
Unlike the current pgcrypto function, the source is chosen by configure.
That makes it easier to test different implementations, and ensures that
we don't accidentally fall back to a less secure implementation, if the
primary source fails. All of those methods are quite reliable, it would be
pretty surprising for them to fail, so we'd rather find out by failing
hard.
If no strong random source is available, we fall back to using erand48(),
seeded from current timestamp, like PostmasterRandom() was. That isn't
cryptographically secure, but allows us to still work on platforms that
don't have any of the above stronger sources. Because it's not very secure,
the built-in implementation is only used if explicitly requested with
--disable-strong-random.
This replaces the more complicated Fortuna algorithm we used to have in
pgcrypto, which is unfortunate, but all modern platforms have /dev/urandom,
so it doesn't seem worth the maintenance effort to keep that. pgcrypto
functions that require strong random numbers will be disabled with
--disable-strong-random.
Original patch by Magnus Hagander, tons of further work by Michael Paquier
and me.
Discussion: https://www.postgresql.org/message-id/CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAB7nPqRWkNYRRPJA7-cF+LfroYV10pvjdz6GNvxk-Eee9FypKA@mail.gmail.com
2016-12-05 12:42:59 +01:00

/*
 * Generate a random cancel key.
 */
static bool
RandomCancelKey(int32 *cancel_key)
{
#ifdef HAVE_STRONG_RANDOM
	return pg_strong_random((char *) cancel_key, sizeof(int32));
#else

	/*
	 * If built with --disable-strong-random, use plain old erand48.
	 *
	 * We cannot use pg_backend_random() in postmaster, because it stores its
	 * state in shared memory.
	 */
	static unsigned short seed[3];

	/*
	 * Select a random seed at the time of first receiving a request.
	 */
	if (random_seed == 0)
	{
		struct timeval random_stop_time;

		gettimeofday(&random_stop_time, NULL);

		seed[0] = (unsigned short) random_start_time.tv_usec;
		seed[1] = (unsigned short) (random_stop_time.tv_usec) ^ (random_start_time.tv_usec >> 16);
		seed[2] = (unsigned short) (random_stop_time.tv_usec >> 16);

		random_seed = 1;
	}

	*cancel_key = pg_jrand48(seed);

	return true;
#endif
}

Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00

/*
 * Count up number of child processes of specified types (dead_end children
 * are always excluded).
 */
static int
CountChildren(int target)
{
	dlist_iter	iter;
	int			cnt = 0;

	dlist_foreach(iter, &BackendList)
	{
		Backend    *bp = dlist_container(Backend, elem, iter.cur);

		if (bp->dead_end)
			continue;

		/*
		 * Since target == BACKEND_TYPE_ALL is the most common case, we test
		 * it first and avoid touching shared memory for every child.
		 */
		if (target != BACKEND_TYPE_ALL)
		{
			/*
			 * Assign bkend_type for any recently announced WAL Sender
			 * processes.
			 */
			if (bp->bkend_type == BACKEND_TYPE_NORMAL &&
				IsPostmasterChildWalSender(bp->child_slot))
				bp->bkend_type = BACKEND_TYPE_WALSND;

			if (!(target & bp->bkend_type))
				continue;
		}

		cnt++;
	}
	return cnt;
}

/*
 * StartChildProcess -- start an auxiliary process for the postmaster
 *
 * "type" determines what kind of child will be started.  All child types
 * initially go to AuxiliaryProcessMain, which will handle common setup.
 *
 * Return value of StartChildProcess is subprocess' PID, or 0 if failed
 * to start subprocess.
 */
static pid_t
StartChildProcess(AuxProcType type)
{
    pid_t       pid;
    char       *av[10];
    int         ac = 0;
    char        typebuf[32];

    /*
     * Set up command-line arguments for subprocess
     */
    av[ac++] = "postgres";

#ifdef EXEC_BACKEND
    av[ac++] = "--forkboot";
    av[ac++] = NULL;            /* filled in by postmaster_forkexec */
#endif

    snprintf(typebuf, sizeof(typebuf), "-x%d", type);
    av[ac++] = typebuf;

    av[ac] = NULL;
    Assert(ac < lengthof(av));

#ifdef EXEC_BACKEND
    pid = postmaster_forkexec(ac, av);
#else                           /* !EXEC_BACKEND */
    pid = fork_process();

    if (pid == 0)               /* child */
    {
        InitPostmasterChild();

        /* Close the postmaster's sockets */
        ClosePostmasterPorts(false);

        /* Release postmaster's working memory context */
        MemoryContextSwitchTo(TopMemoryContext);
        MemoryContextDelete(PostmasterContext);
        PostmasterContext = NULL;

        AuxiliaryProcessMain(ac, av);
        ExitPostmaster(0);
    }
#endif                          /* EXEC_BACKEND */

    if (pid < 0)
    {
        /* in parent, fork failed */
        int         save_errno = errno;

        errno = save_errno;
        switch (type)
        {
            case StartupProcess:
                ereport(LOG,
                        (errmsg("could not fork startup process: %m")));
                break;
            case BgWriterProcess:
                ereport(LOG,
                        (errmsg("could not fork background writer process: %m")));
                break;
            case CheckpointerProcess:
                ereport(LOG,
                        (errmsg("could not fork checkpointer process: %m")));
                break;
            case WalWriterProcess:
                ereport(LOG,
                        (errmsg("could not fork WAL writer process: %m")));
                break;
            case WalReceiverProcess:
                ereport(LOG,
                        (errmsg("could not fork WAL receiver process: %m")));
                break;
            default:
                ereport(LOG,
                        (errmsg("could not fork process: %m")));
                break;
        }

        /*
         * fork failure is fatal during startup, but there's no need to choke
         * immediately if starting other child types fails.
         */
        if (type == StartupProcess)
            ExitPostmaster(1);
        return 0;
    }

    /*
     * in parent, successful fork
     */
    return pid;
}

/*
 * StartAutovacuumWorker
 *		Start an autovac worker process.
 *
 * This function is here because it enters the resulting PID into the
 * postmaster's private backends list.
 *
 * NB -- this code very roughly matches BackendStartup.
 */
static void
StartAutovacuumWorker(void)
{
    Backend    *bn;

    /*
     * If not in condition to run a process, don't try, but handle it like a
     * fork failure.  This does not normally happen, since the signal is only
     * supposed to be sent by autovacuum launcher when it's OK to do it, but
     * we have to check to avoid race-condition problems during DB state
     * changes.
     */
    if (canAcceptConnections() == CAC_OK)
    {
        /*
         * Compute the cancel key that will be assigned to this session. We
         * probably don't need cancel keys for autovac workers, but we'd
         * better have something random in the field to prevent unfriendly
         * people from sending cancels to them.
         */
        if (!RandomCancelKey(&MyCancelKey))
        {
            ereport(LOG,
                    (errcode(ERRCODE_INTERNAL_ERROR),
                     errmsg("could not generate random cancel key")));
            return;
        }

        bn = (Backend *) malloc(sizeof(Backend));
        if (bn)
        {
            bn->cancel_key = MyCancelKey;

            /* Autovac workers are not dead_end and need a child slot */
            bn->dead_end = false;
            bn->child_slot = MyPMChildSlot = AssignPostmasterChildSlot();
            bn->bgworker_notify = false;

            bn->pid = StartAutoVacWorker();
            if (bn->pid > 0)
            {
                bn->bkend_type = BACKEND_TYPE_AUTOVAC;
                dlist_push_head(&BackendList, &bn->elem);
#ifdef EXEC_BACKEND
                ShmemBackendArrayAdd(bn);
#endif
                /* all OK */
                return;
            }

            /*
             * fork failed, fall through to report -- actual error message was
             * logged by StartAutoVacWorker
             */
            (void) ReleasePostmasterChildSlot(bn->child_slot);
            free(bn);
        }
        else
            ereport(LOG,
                    (errcode(ERRCODE_OUT_OF_MEMORY),
                     errmsg("out of memory")));
    }

    /*
     * Report the failure to the launcher, if it's running.  (If it's not, we
     * might not even be connected to shared memory, so don't try to call
     * AutoVacWorkerFailed.)  Note that we also need to signal it so that it
     * responds to the condition, but we don't do that here, instead waiting
     * for ServerLoop to do it.  This way we avoid a ping-pong signalling in
     * quick succession between the autovac launcher and postmaster in case
     * things get ugly.
     */
    if (AutoVacPID != 0)
    {
        AutoVacWorkerFailed();
        avlauncher_needs_signal = true;
    }
}
/*
 * MaybeStartWalReceiver
 *		Start the WAL receiver process, if not running and our state allows.
 */
static void
MaybeStartWalReceiver(void)
{
    if (WalReceiverPID == 0 &&
        (pmState == PM_STARTUP || pmState == PM_RECOVERY ||
         pmState == PM_HOT_STANDBY || pmState == PM_WAIT_READONLY) &&
        Shutdown == NoShutdown)
    {
        WalReceiverPID = StartWalReceiver();
        WalReceiverRequested = false;
    }
}

/*
 * Create the opts file
 */
static bool
CreateOptsFile(int argc, char *argv[], char *fullprogname)
{
    FILE       *fp;
    int         i;

#define OPTS_FILE   "postmaster.opts"

    if ((fp = fopen(OPTS_FILE, "w")) == NULL)
    {
        elog(LOG, "could not create file \"%s\": %m", OPTS_FILE);
        return false;
    }

    fprintf(fp, "%s", fullprogname);
    for (i = 1; i < argc; i++)
        fprintf(fp, " \"%s\"", argv[i]);
    fputs("\n", fp);

    if (fclose(fp))
    {
        elog(LOG, "could not write file \"%s\": %m", OPTS_FILE);
        return false;
    }

    return true;
}
|
2001-06-03 16:53:56 +02:00
|
|
|
|
2003-12-20 18:31:21 +01:00
|
|
|
|
Install a "dead man switch" to allow the postmaster to detect cases where
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory. We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead. Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft). If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.
The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway. This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.
This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.
Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00

Add new GUC, max_worker_processes, limiting number of bgworkers.
In 9.3, there's no particular limit on the number of bgworkers;
instead, we just count up the number that are actually registered,
and use that to set MaxBackends. However, that approach causes
problems for Hot Standby, which needs both MaxBackends and the
size of the lock table to be the same on the standby as on the
master, yet it may not be desirable to run the same bgworkers in
both places. 9.3 handles that by failing to notice the problem,
which will probably work fine in nearly all cases anyway, but is
not theoretically sound.
A further problem with simply counting the number of registered
workers is that new workers can't be registered without a
postmaster restart. This is inconvenient for administrators,
since bouncing the postmaster causes an interruption of service.
Moreover, there are a number of applications for background
processes where, by necessity, the background process must be
started on the fly (e.g. parallel query). While this patch
doesn't actually make it possible to register new background
workers after startup time, it's a necessary prerequisite.
Patch by me. Review by Michael Paquier.
2013-07-04 17:24:24 +02:00

/*
 * MaxLivePostmasterChildren
 *
 * This reports the number of entries needed in per-child-process arrays
 * (the PMChildFlags array, and if EXEC_BACKEND the ShmemBackendArray).
 * These arrays include regular backends, autovac workers, walsenders
 * and background workers, but not special children nor dead_end children.
 * This allows the arrays to have a fixed maximum size, to wit the same
 * too-many-children limit enforced by canAcceptConnections().  The exact
 * value isn't too critical as long as it's more than MaxBackends.
 */
int
MaxLivePostmasterChildren(void)
{
	return 2 * (MaxConnections + autovacuum_max_workers + 1 +
				max_worker_processes);
}
/*
 * Connect background worker to a database.
 */
void
BackgroundWorkerInitializeConnection(char *dbname, char *username)
{
	BackgroundWorker *worker = MyBgworkerEntry;

	/* XXX is this the right errcode? */
	if (!(worker->bgw_flags & BGWORKER_BACKEND_DATABASE_CONNECTION))
		ereport(FATAL,
				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
				 errmsg("database connection requirement not indicated during registration")));

	InitPostgres(dbname, InvalidOid, username, InvalidOid, NULL);

	/* it had better not have gotten out of "init" mode yet */
	if (!IsInitProcessingMode())
		ereport(ERROR,
				(errmsg("invalid processing mode in background worker")));
	SetProcessingMode(NormalProcessing);
}
/*
 * Connect background worker to a database using OIDs.
 */
void
BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid)
{
	BackgroundWorker *worker = MyBgworkerEntry;

	/* XXX is this the right errcode? */
	if (!(worker->bgw_flags & BGWORKER_BACKEND_DATABASE_CONNECTION))
		ereport(FATAL,
				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
				 errmsg("database connection requirement not indicated during registration")));

	InitPostgres(NULL, dboid, NULL, useroid, NULL);

	/* it had better not have gotten out of "init" mode yet */
	if (!IsInitProcessingMode())
		ereport(ERROR,
				(errmsg("invalid processing mode in background worker")));
	SetProcessingMode(NormalProcessing);
}
/*
 * Block/unblock signals in a background worker
 */
void
BackgroundWorkerBlockSignals(void)
{
	PG_SETMASK(&BlockSig);
}

void
BackgroundWorkerUnblockSignals(void)
{
	PG_SETMASK(&UnBlockSig);
}
Allow background workers to be started dynamically.
There is a new API, RegisterDynamicBackgroundWorker, which allows
an ordinary user backend to register a new background worker during
normal running. This means that it's no longer necessary for all
background workers to be registered during processing of
shared_preload_libraries, although the option of registering workers
at that time remains available.
When a background worker exits and will not be restarted, the
slot previously used by that background worker is automatically
released and becomes available for reuse. Slots used by background
workers that are configured for automatic restart can't (yet) be
released without shutting down the system.
This commit adds a new source file, bgworker.c, and moves some
of the existing control logic for background workers there.
Previously, there was little enough logic that it made sense to
keep everything in postmaster.c, but not any more.
This commit also makes the worker_spi contrib module into an
extension and adds a new function, worker_spi_launch, which can
be used to demonstrate the new facility.
2013-07-16 19:02:15 +02:00

#ifdef EXEC_BACKEND
static pid_t
bgworker_forkexec(int shmem_slot)
{
	char	   *av[10];
	int			ac = 0;
	char		forkav[MAXPGPATH];

	snprintf(forkav, MAXPGPATH, "--forkbgworker=%d", shmem_slot);

	av[ac++] = "postgres";
	av[ac++] = forkav;
	av[ac++] = NULL;			/* filled in by postmaster_forkexec */
	av[ac] = NULL;

	Assert(ac < lengthof(av));

	return postmaster_forkexec(ac, av);
}
#endif
/*
|
|
|
|
* Start a new bgworker.
|
|
|
|
* Starting time conditions must have been checked already.
|
|
|
|
*
|
2017-04-24 18:16:58 +02:00
|
|
|
* Returns true on success, false on failure.
|
|
|
|
* In either case, update the RegisteredBgWorker's state appropriately.
|
|
|
|
*
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
* This code is heavily based on autovacuum.c, q.v.
|
|
|
|
*/
|
2017-04-24 18:16:58 +02:00
|
|
|
static bool
|
2013-08-16 21:14:54 +02:00
|
|
|
do_start_bgworker(RegisteredBgWorker *rw)
{
	pid_t		worker_pid;

	Assert(rw->rw_pid == 0);
	/*
	 * Allocate and assign the Backend element.  Note we must do this before
	 * forking, so that we can handle out of memory properly.
	 *
	 * Treat failure as though the worker had crashed.  That way, the
	 * postmaster will wait a bit before attempting to start it again; if it
	 * tried again right away, most likely it'd find itself repeating the
	 * out-of-memory or fork failure condition.
	 */
	if (!assign_backendlist_entry(rw))
	{
		rw->rw_crashed_at = GetCurrentTimestamp();
		return false;
	}

	ereport(DEBUG1,
			(errmsg("starting background worker process \"%s\"",
					rw->rw_worker.bgw_name)));

#ifdef EXEC_BACKEND
Allow background workers to be started dynamically.
There is a new API, RegisterDynamicBackgroundWorker, which allows
an ordinary user backend to register a new background writer during
normal running. This means that it's no longer necessary for all
background workers to be registered during processing of
shared_preload_libraries, although the option of registering workers
at that time remains available.
When a background worker exits and will not be restarted, the
slot previously used by that background worker is automatically
released and becomes available for reuse. Slots used by background
workers that are configured for automatic restart can't (yet) be
released without shutting down the system.
This commit adds a new source file, bgworker.c, and moves some
of the existing control logic for background workers there.
Previously, there was little enough logic that it made sense to
keep everything in postmaster.c, but not any more.
This commit also makes the worker_spi contrib module into an
extension and adds a new function, worker_spi_launch, which can
be used to demonstrate the new facility.
2013-07-16 19:02:15 +02:00
	switch ((worker_pid = bgworker_forkexec(rw->rw_shmem_slot)))
#else
	switch ((worker_pid = fork_process()))
#endif
	{
		case -1:
			/* in postmaster, fork failed ... */
			ereport(LOG,
					(errmsg("could not fork worker process: %m")));

			/* undo what assign_backendlist_entry did */
			ReleasePostmasterChildSlot(rw->rw_child_slot);
			rw->rw_child_slot = 0;
			free(rw->rw_backend);
			rw->rw_backend = NULL;

			/* mark entry as crashed, so we'll try again later */
			rw->rw_crashed_at = GetCurrentTimestamp();
			break;
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
|
|
|
|
#ifndef EXEC_BACKEND
|
|
|
|
case 0:
|
|
|
|
/* in postmaster child ... */
|
2015-01-13 13:12:37 +01:00
|
|
|
InitPostmasterChild();

			/* Close the postmaster's sockets */
			ClosePostmasterPorts(false);

			/*
			 * Before blowing away PostmasterContext, save this bgworker's
			 * data where it can find it.
			 */
			MyBgworkerEntry = (BackgroundWorker *)
				MemoryContextAlloc(TopMemoryContext, sizeof(BackgroundWorker));
			memcpy(MyBgworkerEntry, &rw->rw_worker, sizeof(BackgroundWorker));

			/* Release postmaster's working memory context */
			MemoryContextSwitchTo(TopMemoryContext);
			MemoryContextDelete(PostmasterContext);
			PostmasterContext = NULL;

			StartBackgroundWorker();

			exit(1);			/* should not get here */
			break;
#endif

		default:
			/* in postmaster, fork successful ... */
			rw->rw_pid = worker_pid;
			rw->rw_backend->pid = rw->rw_pid;
			ReportBackgroundWorkerPID(rw);

			/* add new worker to lists of backends */
			dlist_push_head(&BackendList, &rw->rw_backend->elem);
#ifdef EXEC_BACKEND
			ShmemBackendArrayAdd(rw->rw_backend);
#endif
			return true;
	}

	return false;
}

/*
 * Does the current postmaster state require starting a worker with the
 * specified start_time?
 */
static bool
bgworker_should_start_now(BgWorkerStartTime start_time)
{
	switch (pmState)
	{
		case PM_NO_CHILDREN:
		case PM_WAIT_DEAD_END:
		case PM_SHUTDOWN_2:
		case PM_SHUTDOWN:
		case PM_WAIT_BACKENDS:
		case PM_WAIT_READONLY:
		case PM_WAIT_BACKUP:
			break;

		case PM_RUN:
			if (start_time == BgWorkerStart_RecoveryFinished)
				return true;
			/* fall through */

		case PM_HOT_STANDBY:
			if (start_time == BgWorkerStart_ConsistentState)
				return true;
			/* fall through */

		case PM_RECOVERY:
		case PM_STARTUP:
		case PM_INIT:
			if (start_time == BgWorkerStart_PostmasterStart)
				return true;
			/* fall through */
	}

	return false;
}

/*
 * Allocate the Backend struct for a connected background worker, but don't
 * add it to the list of backends just yet.
 *
 * On failure, return false without changing any worker state.
 *
 * Some info from the Backend is copied into the passed rw.
 */
static bool
assign_backendlist_entry(RegisteredBgWorker *rw)
{
Replace PostmasterRandom() with a stronger source, second attempt.
This adds a new routine, pg_strong_random() for generating random bytes,
for use in both frontend and backend. At the moment, it's only used in
the backend, but the upcoming SCRAM authentication patches need strong
random numbers in libpq as well.
pg_strong_random() is based on, and replaces, the existing implementation
in pgcrypto. It can acquire strong random numbers from a number of sources,
depending on what's available:
- OpenSSL RAND_bytes(), if built with OpenSSL
- On Windows, the native cryptographic functions are used
- /dev/urandom
Unlike the current pgcrypto function, the source is chosen by configure.
That makes it easier to test different implementations, and ensures that
we don't accidentally fall back to a less secure implementation, if the
primary source fails. All of those methods are quite reliable, it would be
pretty surprising for them to fail, so we'd rather find out by failing
hard.
If no strong random source is available, we fall back to using erand48(),
seeded from current timestamp, like PostmasterRandom() was. That isn't
cryptographically secure, but allows us to still work on platforms that
don't have any of the above stronger sources. Because it's not very secure,
the built-in implementation is only used if explicitly requested with
--disable-strong-random.
This replaces the more complicated Fortuna algorithm we used to have in
pgcrypto, which is unfortunate, but all modern platforms have /dev/urandom,
so it doesn't seem worth the maintenance effort to keep that. pgcrypto
functions that require strong random numbers will be disabled with
--disable-strong-random.
Original patch by Magnus Hagander, tons of further work by Michael Paquier
and me.
Discussion: https://www.postgresql.org/message-id/CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAB7nPqRWkNYRRPJA7-cF+LfroYV10pvjdz6GNvxk-Eee9FypKA@mail.gmail.com
2016-12-05 12:42:59 +01:00
	Backend    *bn;
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
	/*
	 * Compute the cancel key that will be assigned to this session. We
	 * probably don't need cancel keys for background workers, but we'd better
	 * have something random in the field to prevent unfriendly people from
	 * sending cancels to them.
	 */
	if (!RandomCancelKey(&MyCancelKey))
	{
		ereport(LOG,
				(errcode(ERRCODE_INTERNAL_ERROR),
				 errmsg("could not generate random cancel key")));
		return false;
	}

	bn = malloc(sizeof(Backend));
	if (bn == NULL)
	{
		ereport(LOG,
				(errcode(ERRCODE_OUT_OF_MEMORY),
				 errmsg("out of memory")));
		return false;
	}

	bn->cancel_key = MyCancelKey;
	bn->child_slot = MyPMChildSlot = AssignPostmasterChildSlot();
	bn->bkend_type = BACKEND_TYPE_BGWORKER;
	bn->dead_end = false;
	bn->bgworker_notify = false;

	rw->rw_backend = bn;
	rw->rw_child_slot = bn->child_slot;

	return true;
}

/*
Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long. However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once. On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives. (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.
As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.
There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).
Backpatch to 9.6 where parallel query was added. The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.
Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 22:17:29 +02:00
 * If the time is right, start background worker(s).
 *
 * As a side effect, the bgworker control variables are set or reset
 * depending on whether more workers may need to be started.
 *
 * We limit the number of workers started per call, to avoid consuming the
 * postmaster's attention for too long when many such requests are pending.
 * As long as StartWorkerNeeded is true, ServerLoop will not block and will
 * call this function again after dealing with any other issues.
 */
|
|
|
|
static void
|
Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long. However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once. On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives. (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.
As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.
There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).
Backpatch to 9.6 where parallel query was added. The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.
Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 22:17:29 +02:00
|
|
|
maybe_start_bgworkers(void)
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
{
|
Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long. However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once. On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives. (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.
As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.
There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).
Backpatch to 9.6 where parallel query was added. The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.
Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 22:17:29 +02:00
|
|
|
#define MAX_BGWORKERS_TO_LAUNCH 100
|
|
|
|
int num_launched = 0;
|
Background worker processes
Background workers are postmaster subprocesses that run arbitrary
user-specified code. They can request shared memory access as well as
backend database connections; or they can just use plain libpq frontend
database connections.
Modules listed in shared_preload_libraries can register background
workers in their _PG_init() function; this is early enough that it's not
necessary to provide an extra GUC option, because the necessary extra
resources can be allocated early on. Modules can install more than one
bgworker, if necessary.
Care is taken that these extra processes do not interfere with other
postmaster tasks: only one such process is started on each ServerLoop
iteration. This means a large number of them could be waiting to be
started up and postmaster is still able to quickly service external
connection requests. Also, shutdown sequence should not be impacted by
a worker process that's reasonably well behaved (i.e. promptly responds
to termination signals.)
The current implementation lets worker processes specify their start
time, i.e. at what point in the server startup process they are to be
started: right after postmaster start (in which case they mustn't ask
for shared memory access), when consistent state has been reached
(useful during recovery in a HOT standby server), or when recovery has
terminated (i.e. when normal backends are allowed).
In case of a bgworker crash, actions to take depend on registration
data: if shared memory was requested, then all other connections are
taken down (as well as other bgworkers), just like it were a regular
backend crashing. The bgworker itself is restarted, too, within a
configurable timeframe (which can be configured to be never).
More features to add to this framework can be imagined without much
effort, and have been discussed, but this seems good enough as a useful
unit already.
An elementary sample module is supplied.
Author: Álvaro Herrera
This patch is loosely based on prior patches submitted by KaiGai Kohei,
and unsubmitted code by Simon Riggs.
Reviewed by: KaiGai Kohei, Markus Wanner, Andres Freund,
Heikki Linnakangas, Simon Riggs, Amit Kapila
2012-12-06 18:57:52 +01:00
|
|
|
TimestampTz now = 0;
|
Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long. However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once. On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives. (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.
As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.
There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).
Backpatch to 9.6 where parallel query was added. The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.
Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 22:17:29 +02:00
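The compromise described above (launch up to 100 workers per call, remembering whether any remain) can be sketched standalone. The cap name matches the commit's choice, but the queue representation and function here are illustrative, not postmaster.c's:

```c
/* Sketch of the capped per-call launch policy; names are illustrative. */
#include <assert.h>
#include <stdbool.h>

#define MAX_BGWORKERS_TO_LAUNCH 100

static bool StartWorkerNeeded;

/* Launch up to the cap from 'pending'; return how many were launched. */
static int
launch_some_workers(int *pending)
{
	int			num_launched = 0;

	StartWorkerNeeded = false;
	while (*pending > 0 && num_launched < MAX_BGWORKERS_TO_LAUNCH)
	{
		(*pending)--;			/* stand-in for starting one bgworker */
		num_launched++;
	}
	if (*pending > 0)
		StartWorkerNeeded = true;	/* retry on a later ServerLoop iteration */
	return num_launched;
}
```

With 250 pending workers, three calls drain the queue: 100, 100, then 50, with the flag left set between calls.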
	slist_mutable_iter iter;
	/*
	 * During crash recovery, we have no need to be called until the state
	 * transitions out of recovery.
	 */
	if (FatalError)
	{
		StartWorkerNeeded = false;
		HaveCrashedWorker = false;
		return;
	}

	/* Don't need to be called again unless we find a reason for it below */
	StartWorkerNeeded = false;
	HaveCrashedWorker = false;

Allow background workers to be started dynamically.
There is a new API, RegisterDynamicBackgroundWorker, which allows
an ordinary user backend to register a new background worker during
normal running. This means that it's no longer necessary for all
background workers to be registered during processing of
shared_preload_libraries, although the option of registering workers
at that time remains available.
When a background worker exits and will not be restarted, the
slot previously used by that background worker is automatically
released and becomes available for reuse. Slots used by background
workers that are configured for automatic restart can't (yet) be
released without shutting down the system.
This commit adds a new source file, bgworker.c, and moves some
of the existing control logic for background workers there.
Previously, there was little enough logic that it made sense to
keep everything in postmaster.c, but not any more.
This commit also makes the worker_spi contrib module into an
extension and adds a new function, worker_spi_launch, which can
be used to demonstrate the new facility.
2013-07-16 19:02:15 +02:00
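Releasing a slot when a worker exits means the registration list must support unlinking an entry while it is being walked, which is what slist_foreach_modify provides in the code below. A self-contained imitation of that pattern (not ilist.h itself): keep a pointer to the previous link so the current node can be spliced out without losing one's place.

```c
/* Self-contained model of safe removal during list traversal;
 * mirrors the slist_foreach_modify + ForgetBackgroundWorker pattern
 * but is not PostgreSQL's ilist implementation. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Worker
{
	bool		terminate;		/* marked for death, like rw_terminate */
	struct Worker *next;
} Worker;

/* Unlink every node marked terminate; return how many were removed. */
static int
forget_terminated(Worker **head)
{
	Worker	  **link = head;
	int			removed = 0;

	while (*link != NULL)
	{
		Worker	   *cur = *link;

		if (cur->terminate)
		{
			*link = cur->next;	/* splice out, keeping our place */
			removed++;
		}
		else
			link = &cur->next;	/* advance only when nothing was removed */
	}
	return removed;
}
```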
	slist_foreach_modify(iter, &BackgroundWorkerList)
	{
		RegisteredBgWorker *rw;

		rw = slist_container(RegisteredBgWorker, rw_lnode, iter.cur);

		/* ignore if already running */
		if (rw->rw_pid != 0)
			continue;

		/* if marked for death, clean up and remove from list */
		if (rw->rw_terminate)
		{
			ForgetBackgroundWorker(&iter);
			continue;
		}

		/*
		 * If this worker has crashed previously, maybe it needs to be
		 * restarted (unless on registration it specified it doesn't want to
		 * be restarted at all).  Check how long ago the last crash happened.
		 * If the last crash is too recent, don't start it right away; let it
		 * be restarted once enough time has passed.
		 */
		if (rw->rw_crashed_at != 0)
		{
			if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART)
			{
				ForgetBackgroundWorker(&iter);
				continue;
			}

			/* read system time only when needed */
			if (now == 0)
				now = GetCurrentTimestamp();

			if (!TimestampDifferenceExceeds(rw->rw_crashed_at, now,
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
											rw->rw_worker.bgw_restart_time * 1000))
			{
				/* Set flag to remember that we have workers to start later */
2012-12-06 18:57:52 +01:00
|
|
|
HaveCrashedWorker = true;
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
        if (bgworker_should_start_now(rw->rw_worker.bgw_start_time))
        {
            /* reset crash time before trying to start worker */
            rw->rw_crashed_at = 0;

            /*
             * Try to start the worker.
             *
             * On failure, give up processing workers for now, but set
             * StartWorkerNeeded so we'll come back here on the next iteration
             * of ServerLoop to try again.  (We don't want to wait, because
             * there might be additional ready-to-run workers.)  We could set
             * HaveCrashedWorker as well, since this worker is now marked
             * crashed, but there's no need because the next run of this
             * function will do that.
             */
            if (!do_start_bgworker(rw))
            {
                StartWorkerNeeded = true;
                return;
            }

Allow multiple bgworkers to be launched per postmaster iteration.
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long. However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once. On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives. (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.
As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.
There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).
Backpatch to 9.6 where parallel query was added. The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.
Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
2017-04-26 22:17:29 +02:00

            /*
             * If we've launched as many workers as allowed, quit, but have
             * ServerLoop call us again to look for additional ready-to-run
             * workers.  There might not be any, but we'll find out the next
             * time we run.
             */
            if (++num_launched >= MAX_BGWORKERS_TO_LAUNCH)
            {
                StartWorkerNeeded = true;
                return;
            }
        }
    }
}
Install a "dead man switch" to allow the postmaster to detect cases where
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory. We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead. Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft). If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.
The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway. This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.
This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.
Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
2009-05-05 21:59:00 +02:00

/*
 * When a backend asks to be notified about worker state changes, we
 * set a flag in its backend entry.  The background worker machinery needs
 * to know when such backends exit.
 */
bool
PostmasterMarkPIDForWorkerNotify(int pid)
{
    dlist_iter  iter;
    Backend    *bp;

    dlist_foreach(iter, &BackendList)
    {
        bp = dlist_container(Backend, elem, iter.cur);
        if (bp->pid == pid)
        {
            bp->bgworker_notify = true;
            return true;
        }
    }
    return false;
}

#ifdef EXEC_BACKEND

/*
 * The following need to be available to the save/restore_backend_variables
 * functions.  They are marked NON_EXEC_STATIC in their home modules.
 */
extern slock_t *ShmemLock;
extern slock_t *ProcStructLock;
extern PGPROC *AuxiliaryProcs;
extern PMSignalData *PMSignalState;
extern pgsocket pgStatSock;
extern pg_time_t first_syslogger_file_time;

#ifndef WIN32
#define write_inheritable_socket(dest, src, childpid) ((*(dest) = (src)), true)
#define read_inheritable_socket(dest, src) (*(dest) = *(src))
#else
static bool write_duplicated_handle(HANDLE *dest, HANDLE src, HANDLE child);
static bool write_inheritable_socket(InheritableSocket *dest, SOCKET src,
                                     pid_t childPid);
static void read_inheritable_socket(SOCKET *dest, InheritableSocket *src);
#endif

/* Save critical backend variables into the BackendParameters struct */
#ifndef WIN32
static bool
save_backend_variables(BackendParameters *param, Port *port)
#else
static bool
save_backend_variables(BackendParameters *param, Port *port,
                       HANDLE childProcess, pid_t childPid)
#endif
{
    memcpy(&param->port, port, sizeof(Port));
    if (!write_inheritable_socket(&param->portsocket, port->sock, childPid))
        return false;

    strlcpy(param->DataDir, DataDir, MAXPGPATH);

    memcpy(&param->ListenSocket, &ListenSocket, sizeof(ListenSocket));

    param->MyCancelKey = MyCancelKey;
    param->MyPMChildSlot = MyPMChildSlot;

    param->UsedShmemSegID = UsedShmemSegID;
    param->UsedShmemSegAddr = UsedShmemSegAddr;

    param->ShmemLock = ShmemLock;
    param->ShmemVariableCache = ShmemVariableCache;
    param->ShmemBackendArray = ShmemBackendArray;

Reduce the number of semaphores used under --disable-spinlocks.
Instead of allocating a semaphore from the operating system for every
spinlock, allocate a fixed number of semaphores (by default, 1024)
from the operating system and multiplex all the spinlocks that get
created onto them. This could self-deadlock if a process attempted
to acquire more than one spinlock at a time, but since processes
aren't supposed to execute anything other than short stretches of
straight-line code while holding a spinlock, that shouldn't happen.
One motivation for this change is that, with the introduction of
dynamic shared memory, it may be desirable to create spinlocks that
last for less than the lifetime of the server. Without this change,
attempting to use such facilities under --disable-spinlocks would
quickly exhaust any supply of available semaphores. Quite apart
from that, it's desirable to contain the quantity of semaphores
needed to run the server simply on convenience grounds, since using
too many may make it harder to get PostgreSQL running on a new
platform, which is mostly the point of --disable-spinlocks in the
first place.
Patch by me; review by Tom Lane.
2014-01-09 00:49:14 +01:00

#ifndef HAVE_SPINLOCKS
    param->SpinlockSemaArray = SpinlockSemaArray;
#endif
    param->NamedLWLockTrancheRequests = NamedLWLockTrancheRequests;
    param->NamedLWLockTrancheArray = NamedLWLockTrancheArray;
    param->MainLWLockArray = MainLWLockArray;
    param->ProcStructLock = ProcStructLock;
    param->ProcGlobal = ProcGlobal;
    param->AuxiliaryProcs = AuxiliaryProcs;
    param->PreparedXactProcs = PreparedXactProcs;
    param->PMSignalState = PMSignalState;
    if (!write_inheritable_socket(&param->pgStatSock, pgStatSock, childPid))
        return false;

    param->PostmasterPid = PostmasterPid;
    param->PgStartTime = PgStartTime;
    param->PgReloadTime = PgReloadTime;
    param->first_syslogger_file_time = first_syslogger_file_time;

    param->redirection_done = redirection_done;
    param->IsBinaryUpgrade = IsBinaryUpgrade;
    param->max_safe_fds = max_safe_fds;

    param->MaxBackends = MaxBackends;

#ifdef WIN32
    param->PostmasterHandle = PostmasterHandle;
    if (!write_duplicated_handle(&param->initial_signal_pipe,
                                 pgwin32_create_signal_listener(childPid),
                                 childProcess))
        return false;
Introduce a pipe between postmaster and each backend, which can be used to
detect postmaster death. Postmaster keeps the write-end of the pipe open,
so when it dies, children get EOF in the read-end. That can conveniently
be waited for in select(), which allows eliminating some of the polling
loops that check for postmaster death. This patch doesn't yet change all
the loops to use the new mechanism, expect a follow-on patch to do that.
This changes the interface to WaitLatch, so that it takes as argument a
bitmask of events that it waits for. Possible events are latch set, timeout,
postmaster death, and socket becoming readable or writeable.
The pipe method behaves slightly differently from the kill() method
previously used in PostmasterIsAlive() in the case that postmaster has died,
but its parent has not yet read its exit code with waitpid(). The pipe
returns EOF as soon as the process dies, but kill() continues to return
true until waitpid() has been called (IOW while the process is a zombie).
Because of that, change PostmasterIsAlive() to use the pipe too, otherwise
WaitLatch() would return immediately with WL_POSTMASTER_DEATH, while
PostmasterIsAlive() would claim it's still alive. That could easily lead to
busy-waiting while postmaster is in zombie state.
Peter Geoghegan with further changes by me, reviewed by Fujii Masao and
Florian Pflug.
2011-07-08 17:27:49 +02:00
#else
	memcpy(&param->postmaster_alive_fds, &postmaster_alive_fds,
		   sizeof(postmaster_alive_fds));
#endif

	memcpy(&param->syslogPipe, &syslogPipe, sizeof(syslogPipe));

	strlcpy(param->my_exec_path, my_exec_path, MAXPGPATH);

	strlcpy(param->pkglib_path, pkglib_path, MAXPGPATH);

	strlcpy(param->ExtraOptions, ExtraOptions, MAXPGPATH);

	return true;
}

#ifdef WIN32
/*
 * Duplicate a handle for usage in a child process, and write the child
 * process instance of the handle to the parameter file.
 */
static bool
write_duplicated_handle(HANDLE *dest, HANDLE src, HANDLE childProcess)
{
	HANDLE		hChild = INVALID_HANDLE_VALUE;

	if (!DuplicateHandle(GetCurrentProcess(),
						 src,
						 childProcess,
						 &hChild,
						 0,
						 TRUE,
						 DUPLICATE_CLOSE_SOURCE | DUPLICATE_SAME_ACCESS))
	{
		ereport(LOG,
				(errmsg_internal("could not duplicate handle to be written to backend parameter file: error code %lu",
								 GetLastError())));
		return false;
	}

	*dest = hChild;
	return true;
}

/*
 * Duplicate a socket for usage in a child process, and write the resulting
 * structure to the parameter file.
 *
 * This is required because a number of LSPs (Layered Service Providers) very
 * common on Windows (antivirus, firewalls, download managers etc) break
 * straight socket inheritance.
 */
static bool
write_inheritable_socket(InheritableSocket *dest, SOCKET src, pid_t childpid)
{
	dest->origsocket = src;
	if (src != 0 && src != PGINVALID_SOCKET)
	{
		/* Actual socket */
		if (WSADuplicateSocket(src, childpid, &dest->wsainfo) != 0)
		{
			ereport(LOG,
					(errmsg("could not duplicate socket %d for use in backend: error code %d",
							(int) src, WSAGetLastError())));
			return false;
		}
	}
	return true;
}

/*
 * Read a duplicated socket structure back, and get the socket descriptor.
 */
static void
read_inheritable_socket(SOCKET *dest, InheritableSocket *src)
{
	SOCKET		s;

	if (src->origsocket == PGINVALID_SOCKET || src->origsocket == 0)
	{
		/* Not a real socket! */
		*dest = src->origsocket;
	}
	else
	{
		/* Actual socket, so create from structure */
		s = WSASocket(FROM_PROTOCOL_INFO,
					  FROM_PROTOCOL_INFO,
					  FROM_PROTOCOL_INFO,
					  &src->wsainfo,
					  0,
					  0);
		if (s == INVALID_SOCKET)
		{
			write_stderr("could not create inherited socket: error code %d\n",
						 WSAGetLastError());
			exit(1);
		}
		*dest = s;

		/*
		 * To make sure we don't get two references to the same socket, close
		 * the original one.  (This would happen when inheritance actually
		 * works..)
		 */
		closesocket(src->origsocket);
	}
}
#endif

static void
read_backend_variables(char *id, Port *port)
{
	BackendParameters param;

#ifndef WIN32
	/* Non-win32 implementation reads from file */
	FILE	   *fp;

	/* Open file */
	fp = AllocateFile(id, PG_BINARY_R);
	if (!fp)
	{
		write_stderr("could not open backend variables file \"%s\": %s\n",
					 id, strerror(errno));
		exit(1);
	}

	if (fread(&param, sizeof(param), 1, fp) != 1)
	{
		write_stderr("could not read from backend variables file \"%s\": %s\n",
					 id, strerror(errno));
		exit(1);
	}

	/* Release file */
	FreeFile(fp);
	if (unlink(id) != 0)
	{
		write_stderr("could not remove file \"%s\": %s\n",
					 id, strerror(errno));
		exit(1);
	}
#else
	/* Win32 version uses mapped file */
	HANDLE		paramHandle;
	BackendParameters *paramp;

#ifdef _WIN64
	paramHandle = (HANDLE) _atoi64(id);
#else
	paramHandle = (HANDLE) atol(id);
#endif
	paramp = MapViewOfFile(paramHandle, FILE_MAP_READ, 0, 0, 0);
	if (!paramp)
	{
		write_stderr("could not map view of backend variables: error code %lu\n",
					 GetLastError());
		exit(1);
	}

	memcpy(&param, paramp, sizeof(BackendParameters));

	if (!UnmapViewOfFile(paramp))
	{
		write_stderr("could not unmap view of backend variables: error code %lu\n",
					 GetLastError());
		exit(1);
	}

	if (!CloseHandle(paramHandle))
	{
		write_stderr("could not close handle to backend parameter variables: error code %lu\n",
					 GetLastError());
		exit(1);
	}
#endif

	restore_backend_variables(&param, port);
}

/* Restore critical backend variables from the BackendParameters struct */
static void
restore_backend_variables(BackendParameters *param, Port *port)
{
	memcpy(port, &param->port, sizeof(Port));
	read_inheritable_socket(&port->sock, &param->portsocket);

	SetDataDir(param->DataDir);

	memcpy(&ListenSocket, &param->ListenSocket, sizeof(ListenSocket));

	MyCancelKey = param->MyCancelKey;
	MyPMChildSlot = param->MyPMChildSlot;

	UsedShmemSegID = param->UsedShmemSegID;
	UsedShmemSegAddr = param->UsedShmemSegAddr;

	ShmemLock = param->ShmemLock;
	ShmemVariableCache = param->ShmemVariableCache;
	ShmemBackendArray = param->ShmemBackendArray;

#ifndef HAVE_SPINLOCKS
	SpinlockSemaArray = param->SpinlockSemaArray;
#endif
	NamedLWLockTrancheRequests = param->NamedLWLockTrancheRequests;
	NamedLWLockTrancheArray = param->NamedLWLockTrancheArray;
	MainLWLockArray = param->MainLWLockArray;
	ProcStructLock = param->ProcStructLock;
	ProcGlobal = param->ProcGlobal;
	AuxiliaryProcs = param->AuxiliaryProcs;
	PreparedXactProcs = param->PreparedXactProcs;
	PMSignalState = param->PMSignalState;
	read_inheritable_socket(&pgStatSock, &param->pgStatSock);

	PostmasterPid = param->PostmasterPid;
	PgStartTime = param->PgStartTime;
	PgReloadTime = param->PgReloadTime;
	first_syslogger_file_time = param->first_syslogger_file_time;

	redirection_done = param->redirection_done;
	IsBinaryUpgrade = param->IsBinaryUpgrade;
	max_safe_fds = param->max_safe_fds;

	MaxBackends = param->MaxBackends;

#ifdef WIN32
	PostmasterHandle = param->PostmasterHandle;
	pgwin32_initial_signal_pipe = param->initial_signal_pipe;
#else
	memcpy(&postmaster_alive_fds, &param->postmaster_alive_fds,
		   sizeof(postmaster_alive_fds));
#endif

	memcpy(&syslogPipe, &param->syslogPipe, sizeof(syslogPipe));

	strlcpy(my_exec_path, param->my_exec_path, MAXPGPATH);

	strlcpy(pkglib_path, param->pkglib_path, MAXPGPATH);

	strlcpy(ExtraOptions, param->ExtraOptions, MAXPGPATH);
}


Size
ShmemBackendArraySize(void)
{
	return mul_size(MaxLivePostmasterChildren(), sizeof(Backend));
}

void
ShmemBackendArrayAllocation(void)
{
	Size		size = ShmemBackendArraySize();

	ShmemBackendArray = (Backend *) ShmemAlloc(size);
	/* Mark all slots as empty */
	memset(ShmemBackendArray, 0, size);
}

static void
ShmemBackendArrayAdd(Backend *bn)
{
	/* The array slot corresponding to my PMChildSlot should be free */
	int			i = bn->child_slot - 1;

	Assert(ShmemBackendArray[i].pid == 0);
	ShmemBackendArray[i] = *bn;
}

static void
ShmemBackendArrayRemove(Backend *bn)
{
	int			i = bn->child_slot - 1;

	Assert(ShmemBackendArray[i].pid == bn->pid);
	/* Mark the slot as empty */
	ShmemBackendArray[i].pid = 0;
}

#endif							/* EXEC_BACKEND */


#ifdef WIN32

/*
 * Subset implementation of waitpid() for Windows.  We assume pid is -1
 * (that is, check all child processes) and options is WNOHANG (don't wait).
 */
static pid_t
waitpid(pid_t pid, int *exitstatus, int options)
{
	DWORD		dwd;
	ULONG_PTR	key;
	OVERLAPPED *ovl;

	/*
	 * Check if there are any dead children. If there are, return the pid of
	 * the first one that died.
	 */
	if (GetQueuedCompletionStatus(win32ChildQueue, &dwd, &key, &ovl, 0))
	{
		*exitstatus = (int) key;
		return dwd;
	}

	return -1;
}

/*
 * Note! Code below executes on a thread pool! All operations must
 * be thread safe! Note that elog() and friends must *not* be used.
 */
static void WINAPI
pgwin32_deadchild_callback(PVOID lpParameter, BOOLEAN TimerOrWaitFired)
{
	win32_deadchild_waitinfo *childinfo = (win32_deadchild_waitinfo *) lpParameter;
	DWORD		exitcode;

	if (TimerOrWaitFired)
		return;					/* timeout. Should never happen, since we use
								 * INFINITE as timeout value. */

	/*
	 * Remove handle from wait - required even though it's set to wait only
	 * once
	 */
	UnregisterWaitEx(childinfo->waitHandle, NULL);

	if (!GetExitCodeProcess(childinfo->procHandle, &exitcode))
	{
		/*
		 * Should never happen. Inform user and set a fixed exitcode.
		 */
		write_stderr("could not read exit code for process\n");
		exitcode = 255;
	}

	if (!PostQueuedCompletionStatus(win32ChildQueue, childinfo->procId, (ULONG_PTR) exitcode, NULL))
		write_stderr("could not post child completion status\n");

	/*
	 * Handle is per-process, so we close it here instead of in the
	 * originating thread
	 */
	CloseHandle(childinfo->procHandle);

	/*
	 * Free struct that was allocated before the call to
	 * RegisterWaitForSingleObject()
	 */
	free(childinfo);

	/* Queue SIGCHLD signal */
	pg_queue_signal(SIGCHLD);
}

#endif							/* WIN32 */

/*
 * Initialize one and only handle for monitoring postmaster death.
 *
 * Called once in the postmaster, so that child processes can subsequently
 * monitor if their parent is dead.
 */
static void
InitPostmasterDeathWatchHandle(void)
{
#ifndef WIN32

	/*
	 * Create a pipe. Postmaster holds the write end of the pipe open
	 * (POSTMASTER_FD_OWN), and children hold the read end. Children can pass
	 * the read file descriptor to select() to wake up in case postmaster
	 * dies, or check for postmaster death with a (read() == 0). Children must
	 * close the write end as soon as possible after forking, because EOF
	 * won't be signaled in the read end until all processes have closed the
	 * write fd. That is taken care of in ClosePostmasterPorts().
	 */
	Assert(MyProcPid == PostmasterPid);
	if (pipe(postmaster_alive_fds) < 0)
		ereport(FATAL,
				(errcode_for_file_access(),
				 errmsg_internal("could not create pipe to monitor postmaster death: %m")));

	/*
	 * Set O_NONBLOCK to allow testing for the fd's presence with a read()
	 * call.
	 */
	if (fcntl(postmaster_alive_fds[POSTMASTER_FD_WATCH], F_SETFL, O_NONBLOCK) == -1)
		ereport(FATAL,
				(errcode_for_socket_access(),
				 errmsg_internal("could not set postmaster death monitoring pipe to nonblocking mode: %m")));
#else

	/*
	 * On Windows, we use a process handle for the same purpose.
	 */
	if (DuplicateHandle(GetCurrentProcess(),
						GetCurrentProcess(),
						GetCurrentProcess(),
						&PostmasterHandle,
						0,
						TRUE,
						DUPLICATE_SAME_ACCESS) == 0)
		ereport(FATAL,
				(errmsg_internal("could not duplicate postmaster handle: error code %lu",
								 GetLastError())));
#endif /* WIN32 */
}