2007-07-24 06:54:09 +02:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
|
|
|
* walwriter.c
|
|
|
|
*
|
|
|
|
* The WAL writer background process is new as of Postgres 8.3. It attempts
|
|
|
|
* to keep regular backends from having to write out (and fsync) WAL pages.
|
|
|
|
* Also, it guarantees that transaction commit records that weren't synced
|
|
|
|
* to disk immediately upon commit (i.e., were "asynchronously committed")
|
|
|
|
* will reach disk within a knowable time --- which, as it happens, is at
|
|
|
|
* most three times the wal_writer_delay cycle time.
|
|
|
|
*
|
|
|
|
* Note that as with the bgwriter for shared buffers, regular backends are
|
|
|
|
* still empowered to issue WAL writes and fsyncs when the walwriter doesn't
|
2011-11-13 10:00:57 +01:00
|
|
|
* keep up. This means that the walwriter is not an essential process and
|
|
|
|
* can shut down quickly when requested.
|
2007-07-24 06:54:09 +02:00
|
|
|
*
|
|
|
|
* Because the walwriter's cycle is directly linked to the maximum delay
|
|
|
|
* before async-commit transactions are guaranteed committed, it's probably
|
|
|
|
* unwise to load additional functionality onto it. For instance, if you've
|
|
|
|
* got a yen to create xlog segments further in advance, that'd be better done
|
|
|
|
* in bgwriter than in walwriter.
|
|
|
|
*
|
|
|
|
* The walwriter is started by the postmaster as soon as the startup subprocess
|
|
|
|
* finishes. It remains alive until the postmaster commands it to terminate.
|
|
|
|
* Normal termination is by SIGTERM, which instructs the walwriter to exit(0).
|
|
|
|
* Emergency termination is by SIGQUIT; like any backend, the walwriter will
|
|
|
|
* simply abort and exit on SIGQUIT.
|
|
|
|
*
|
|
|
|
* If the walwriter exits unexpectedly, the postmaster treats that the same
|
|
|
|
* as a backend crash: shared memory may be corrupted, so remaining backends
|
|
|
|
* should be killed by SIGQUIT and then a recovery cycle started.
|
|
|
|
*
|
|
|
|
*
|
2024-01-04 02:49:05 +01:00
|
|
|
* Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
|
2007-07-24 06:54:09 +02:00
|
|
|
*
|
|
|
|
*
|
|
|
|
* IDENTIFICATION
|
2010-09-20 22:08:53 +02:00
|
|
|
* src/backend/postmaster/walwriter.c
|
2007-07-24 06:54:09 +02:00
|
|
|
*
|
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
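The header comment bounds the time an asynchronously committed record can wait before reaching disk at three wal_writer_delay cycles. The arithmetic can be sketched as follows; `async_commit_max_delay_ms` is a hypothetical helper written for illustration and is not part of walwriter.c.

```c
/*
 * Illustrative only: upper bound, in milliseconds, on how long an
 * asynchronously committed record may wait before it is on disk,
 * following the "at most three times the wal_writer_delay cycle time"
 * statement in the header comment.
 */
long
async_commit_max_delay_ms(int wal_writer_delay_ms)
{
    return 3L * wal_writer_delay_ms;
}
```

With the default of 200 ms used for WalWriterDelay in this file, the bound works out to 600 ms.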
|
|
|
|
#include "postgres.h"
|
|
|
|
|
|
|
|
#include <signal.h>
|
|
|
|
#include <unistd.h>
|
|
|
|
|
|
|
|
#include "access/xlog.h"
|
|
|
|
#include "libpq/pqsignal.h"
|
|
|
|
#include "miscadmin.h"
|
2016-03-10 18:44:09 +01:00
|
|
|
#include "pgstat.h"
|
2024-03-18 10:35:08 +01:00
|
|
|
#include "postmaster/auxprocess.h"
|
2019-12-17 19:14:28 +01:00
|
|
|
#include "postmaster/interrupt.h"
|
2007-07-24 06:54:09 +02:00
|
|
|
#include "postmaster/walwriter.h"
|
|
|
|
#include "storage/bufmgr.h"
|
2016-11-22 20:26:40 +01:00
|
|
|
#include "storage/condition_variable.h"
|
2012-08-29 00:02:07 +02:00
|
|
|
#include "storage/fd.h"
|
2007-07-24 06:54:09 +02:00
|
|
|
#include "storage/ipc.h"
|
2011-09-04 07:13:16 +02:00
|
|
|
#include "storage/lwlock.h"
|
Reduce idle power consumption of walwriter and checkpointer processes.
This patch modifies the walwriter process so that, when it has not found
anything useful to do for many consecutive wakeup cycles, it extends its
sleep time to reduce the server's idle power consumption. It reverts to
normal as soon as it's done any successful flushes. It's still true that
during any async commit, backends check for completed, unflushed pages of
WAL and signal the walwriter if there are any; so that in practice the
walwriter can get awakened and returned to normal operation sooner than the
sleep time might suggest.
Also, improve the checkpointer so that it uses a latch and a computed delay
time to not wake up at all except when it has something to do, replacing a
previous hardcoded 0.5 sec wakeup cycle. This also is primarily useful for
reducing the server's power consumption when idle.
In passing, get rid of the dedicated latch for signaling the walwriter in
favor of using its procLatch, since that comports better with possible
generic signal handlers using that latch. Also, fix a pre-existing bug
with failure to save/restore errno in walwriter's signal handlers.
Peter Geoghegan, somewhat simplified by Tom
2012-05-09 02:03:26 +02:00
|
|
|
#include "storage/proc.h"
|
2019-11-25 22:08:53 +01:00
|
|
|
#include "storage/procsignal.h"
|
2007-07-24 06:54:09 +02:00
|
|
|
#include "storage/smgr.h"
|
2011-09-04 07:13:16 +02:00
|
|
|
#include "utils/guc.h"
|
|
|
|
#include "utils/hsearch.h"
|
2007-07-24 06:54:09 +02:00
|
|
|
#include "utils/memutils.h"
|
|
|
|
#include "utils/resowner.h"
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
* GUC parameters
|
|
|
|
*/
|
|
|
|
int WalWriterDelay = 200;
|
2023-05-15 00:45:19 +02:00
|
|
|
int WalWriterFlushAfter = DEFAULT_WAL_WRITER_FLUSH_AFTER;
|
2007-07-24 06:54:09 +02:00
|
|
|
|
2012-05-09 02:03:26 +02:00
|
|
|
/*
|
|
|
|
* Number of do-nothing loops before lengthening the delay time, and the
|
|
|
|
* multiplier to apply to WalWriterDelay when we do decide to hibernate.
|
|
|
|
* (Perhaps these need to be configurable?)
|
|
|
|
*/
|
|
|
|
#define LOOPS_UNTIL_HIBERNATE 50
|
|
|
|
#define HIBERNATE_FACTOR 25
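A minimal, standalone sketch of how these two constants interact with the countdown used in the main loop. The constants are repeated so the block compiles on its own; the helper names (`update_countdown`, `next_timeout_ms`, `might_hibernate`, `idle_ms_before_hibernate`) are hypothetical and only mirror the loop's logic, assuming no latch wakeups cut a sleep short.

```c
#include <stdbool.h>

#define LOOPS_UNTIL_HIBERNATE 50
#define HIBERNATE_FACTOR      25

/* Countdown after one flush attempt: reset on useful work, else decrement. */
int
update_countdown(int left_till_hibernate, bool did_work)
{
    if (did_work)
        return LOOPS_UNTIL_HIBERNATE;
    if (left_till_hibernate > 0)
        return left_till_hibernate - 1;
    return left_till_hibernate;
}

/* Sleep timeout in ms: normal delay, or the lengthened one when hibernating. */
long
next_timeout_ms(int left_till_hibernate, int wal_writer_delay_ms)
{
    if (left_till_hibernate > 0)
        return wal_writer_delay_ms;
    return (long) wal_writer_delay_ms * HIBERNATE_FACTOR;
}

/* Value of the "might hibernate in this cycle" flag advertised to backends. */
bool
might_hibernate(int left_till_hibernate)
{
    return left_till_hibernate <= 1;
}

/* Total ms spent in normal-length sleeps before the first long sleep. */
long
idle_ms_before_hibernate(int wal_writer_delay_ms)
{
    int  left = LOOPS_UNTIL_HIBERNATE;
    long elapsed = 0;

    for (;;)
    {
        left = update_countdown(left, false);   /* nothing to flush */
        if (left == 0)
            break;                              /* next sleep is the long one */
        elapsed += next_timeout_ms(left, wal_writer_delay_ms);
    }
    return elapsed;
}
```

Under these assumptions and a 200 ms delay, an idle walwriter takes 49 normal sleeps (9800 ms) and then switches to 5000 ms sleeps until a flush finds work again.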
|
|
|
|
|
2007-07-24 06:54:09 +02:00
|
|
|
/*
|
|
|
|
* Main entry point for walwriter process
|
|
|
|
*
|
Fix management of pendingOpsTable in auxiliary processes.
mdinit() was misusing IsBootstrapProcessingMode() to decide whether to
create an fsync pending-operations table in the current process. This led
to creating a table not only in the startup and checkpointer processes as
intended, but also in the bgwriter process, not to mention other auxiliary
processes such as walwriter and walreceiver. Creation of the table in the
bgwriter is fatal, because it absorbs fsync requests that should have gone
to the checkpointer; instead they just sit in bgwriter local memory and are
never acted on. So writes performed by the bgwriter were not being fsync'd
which could result in data loss after an OS crash. I think there is no
live bug with respect to walwriter and walreceiver because those never
perform any writes of shared buffers; but the potential is there for
future breakage in those processes too.
To fix, make AuxiliaryProcessMain() export the current process's
AuxProcType as a global variable, and then make mdinit() test directly for
the types of aux process that should have a pendingOpsTable. Having done
that, we might as well also get rid of the random bool flags such as
am_walreceiver that some of the aux processes had grown. (Note that we
could not have fixed the bug by examining those variables in mdinit(),
because it's called from BaseInit() which is run by AuxiliaryProcessMain()
before entering any of the process-type-specific code.)
Back-patch to 9.2, where the problem was introduced by the split-up of
bgwriter and checkpointer processes. The bogus pendingOpsTable exists
in walwriter and walreceiver processes in earlier branches, but absent
any evidence that it causes actual problems there, I'll leave the older
branches alone.
2012-07-18 21:28:10 +02:00
|
|
|
* This is invoked from AuxiliaryProcessMain, which has already created the
|
|
|
|
* basic execution environment, but not enabled signals yet.
|
2007-07-24 06:54:09 +02:00
|
|
|
*/
|
|
|
|
void
|
2024-03-18 10:35:08 +01:00
|
|
|
WalWriterMain(char *startup_data, size_t startup_data_len)
|
2007-07-24 06:54:09 +02:00
|
|
|
{
|
|
|
|
sigjmp_buf local_sigjmp_buf;
|
|
|
|
MemoryContext walwriter_context;
|
2012-05-09 02:03:26 +02:00
|
|
|
int left_till_hibernate;
|
2012-05-09 05:05:58 +02:00
|
|
|
bool hibernating;
|
2011-11-13 10:00:57 +01:00
|
|
|
|
2024-03-18 10:35:08 +01:00
|
|
|
Assert(startup_data_len == 0);
|
|
|
|
|
|
|
|
MyBackendType = B_WAL_WRITER;
|
|
|
|
AuxiliaryProcessMainCommon();
|
|
|
|
|
2007-07-24 06:54:09 +02:00
|
|
|
/*
|
|
|
|
* Properly accept or ignore signals the postmaster might send us
|
|
|
|
*
|
|
|
|
* We have no particular use for SIGINT at the moment, but it seems
* reasonable to treat it like SIGTERM.
|
|
|
|
*/
|
2019-12-17 19:14:28 +01:00
|
|
|
pqsignal(SIGHUP, SignalHandlerForConfigReload);
|
|
|
|
pqsignal(SIGINT, SignalHandlerForShutdownRequest);
|
|
|
|
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
|
Centralize setup of SIGQUIT handling for postmaster child processes.
We decided that the policy established in commit 7634bd4f6 for
the bgwriter, checkpointer, walwriter, and walreceiver processes,
namely that they should accept SIGQUIT at all times, really ought
to apply uniformly to all postmaster children. Therefore, get
rid of the duplicative and inconsistent per-process code for
establishing that signal handler and removing SIGQUIT from BlockSig.
Instead, make InitPostmasterChild do it.
The handler set up by InitPostmasterChild is SignalHandlerForCrashExit,
which just summarily does _exit(2). In interactive backends, we
almost immediately replace that with quickdie, since we would prefer
to try to tell the client that we're dying. However, this patch is
changing the behavior of autovacuum (both launcher and workers), as
well as walsenders. Those processes formerly also used quickdie,
but AFAICS that was just mindless copy-and-paste: they don't have
any interactive client that's likely to benefit from being told this.
The stats collector continues to be an outlier, in that it thinks
SIGQUIT means normal exit. That should probably be changed for
consistency, but there's another patch set where that's being
dealt with, so I didn't do so here.
Discussion: https://postgr.es/m/644875.1599933441@sss.pgh.pa.us
2020-09-16 22:04:36 +02:00
|
|
|
/* SIGQUIT handler was already set up by InitPostmasterChild */
|
2007-07-24 06:54:09 +02:00
|
|
|
pqsignal(SIGALRM, SIG_IGN);
|
|
|
|
pqsignal(SIGPIPE, SIG_IGN);
|
2019-11-25 22:08:53 +01:00
|
|
|
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
|
2007-07-24 06:54:09 +02:00
|
|
|
pqsignal(SIGUSR2, SIG_IGN); /* not used */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Reset some signals that are accepted by postmaster but not here
|
|
|
|
*/
|
|
|
|
pqsignal(SIGCHLD, SIG_DFL);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Create a memory context that we will do all our work in. We do this so
|
|
|
|
* that we can reset the context during error recovery and thereby avoid
|
|
|
|
* possible memory leaks. Formerly this code just ran in
|
|
|
|
* TopMemoryContext, but resetting that would be a really bad idea.
|
|
|
|
*/
|
|
|
|
walwriter_context = AllocSetContextCreate(TopMemoryContext,
|
|
|
|
"Wal Writer",
|
Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 23:50:38 +02:00
|
|
|
ALLOCSET_DEFAULT_SIZES);
|
2007-07-24 06:54:09 +02:00
|
|
|
MemoryContextSwitchTo(walwriter_context);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If an exception is encountered, processing resumes here.
|
|
|
|
*
|
Accept SIGQUIT during error recovery in auxiliary processes.
The bgwriter, checkpointer, walwriter, and walreceiver processes
claimed to allow SIGQUIT "at all times". In reality SIGQUIT
would get re-blocked during error recovery, because we didn't
update the actual signal mask immediately, so sigsetjmp() would
save and reinstate a mask that includes SIGQUIT.
This appears to be simply a coding oversight. There's never a
good reason to hold off SIGQUIT in these processes, because it's
going to just call _exit(2) which should be safe enough, especially
since the postmaster is going to tear down shared memory afterwards.
Hence, stick in PG_SETMASK() calls to install the modified BlockSig
mask immediately.
Also try to improve the comments around sigsetjmp blocks. Most of
them were just referencing postgres.c, which is misleading because
actually postgres.c manages the signals differently.
No back-patch, since there's no evidence that this is causing any
problems in the field.
Discussion: https://postgr.es/m/CALDaNm1d1hHPZUg3xU4XjtWBOLCrA+-2cJcLpw-cePZ=GgDVfA@mail.gmail.com
2020-09-11 22:01:28 +02:00
|
|
|
* You might wonder why this isn't coded as an infinite loop around a
|
|
|
|
* PG_TRY construct. The reason is that this is the bottom of the
|
|
|
|
* exception stack, and so with PG_TRY there would be no exception handler
|
|
|
|
* in force at all during the CATCH part. By leaving the outermost setjmp
|
|
|
|
* always active, we have at least some chance of recovering from an error
|
|
|
|
* during error recovery. (If we get into an infinite loop thereby, it
|
|
|
|
* will soon be stopped by overflow of elog.c's internal state stack.)
|
|
|
|
*
|
|
|
|
* Note that we use sigsetjmp(..., 1), so that the prevailing signal mask
|
|
|
|
* (to wit, BlockSig) will be restored when longjmp'ing to here. Thus,
|
|
|
|
* signals other than SIGQUIT will be blocked until we complete error
|
|
|
|
* recovery. It might seem that this policy makes the HOLD_INTERRUPTS()
|
|
|
|
* call redundant, but it is not since InterruptPending might be set
|
|
|
|
* already.
|
2007-07-24 06:54:09 +02:00
|
|
|
*/
|
|
|
|
if (sigsetjmp(local_sigjmp_buf, 1) != 0)
|
|
|
|
{
|
|
|
|
/* Since not using PG_TRY, must reset error stack by hand */
|
|
|
|
error_context_stack = NULL;
|
|
|
|
|
|
|
|
/* Prevent interrupts while cleaning up */
|
|
|
|
HOLD_INTERRUPTS();
|
|
|
|
|
|
|
|
/* Report the error to the server log */
|
|
|
|
EmitErrorReport();
|
|
|
|
|
|
|
|
/*
|
|
|
|
* These operations are really just a minimal subset of
|
|
|
|
* AbortTransaction(). We don't have very many resources to worry
|
|
|
|
* about in walwriter, but we do have LWLocks, and perhaps buffers?
|
|
|
|
*/
|
|
|
|
LWLockReleaseAll();
|
2016-11-22 20:26:40 +01:00
|
|
|
ConditionVariableCancelSleep();
|
2016-03-10 18:44:09 +01:00
|
|
|
pgstat_report_wait_end();
|
2007-07-24 06:54:09 +02:00
|
|
|
UnlockBuffers();
|
Use a ResourceOwner to track buffer pins in all cases.
Historically, we've allowed auxiliary processes to take buffer pins without
tracking them in a ResourceOwner. However, that creates problems for error
recovery. In particular, we've seen multiple reports of assertion crashes
in the startup process when it gets an error while holding a buffer pin,
as for example if it gets ENOSPC during a write. In a non-assert build,
the process would simply exit without releasing the pin at all. We've
gotten away with that so far just because a failure exit of the startup
process translates to a database crash anyhow; but any similar behavior
in other aux processes could result in stuck pins and subsequent problems
in vacuum.
To improve this, institute a policy that we must *always* have a resowner
backing any attempt to pin a buffer, which we can enforce just by removing
the previous special-case code in resowner.c. Add infrastructure to make
it easy to create a process-lifespan AuxProcessResourceOwner and clear
out its contents at appropriate times. Replace existing ad-hoc resowner
management in bgwriter.c and other aux processes with that. (Thus, while
the startup process gains a resowner where it had none at all before, some
other aux process types are replacing an ad-hoc resowner with this code.)
Also use the AuxProcessResourceOwner to manage buffer pins taken during
StartupXLOG and ShutdownXLOG, even when those are being run in a bootstrap
process or a standalone backend rather than a true auxiliary process.
In passing, remove some other ad-hoc resource owner creations that had
gotten cargo-culted into various other places. As far as I can tell
that was all unnecessary, and if it had been necessary it was incomplete,
due to lacking any provision for clearing those resowners later.
(Also worth noting in this connection is that a process that hasn't called
InitBufferPoolBackend has no business accessing buffers; so there's more
to do than just add the resowner if we want to touch buffers in processes
not covered by this patch.)
Although this fixes a very old bug, no back-patch, because there's no
evidence of any significant problem in non-assert builds.
Patch by me, pursuant to a report from Justin Pryzby. Thanks to
Robert Haas and Kyotaro Horiguchi for reviews.
Discussion: https://postgr.es/m/20180627233939.GA10276@telsasoft.com
2018-07-18 18:15:16 +02:00
|
|
|
ReleaseAuxProcessResources(false);
|
2007-07-24 06:54:09 +02:00
|
|
|
AtEOXact_Buffers(false);
|
2012-10-17 18:38:21 +02:00
|
|
|
AtEOXact_SMgr();
|
2018-04-28 23:45:02 +02:00
|
|
|
AtEOXact_Files(false);
|
2007-09-11 19:15:33 +02:00
|
|
|
AtEOXact_HashTables(false);
|
2007-07-24 06:54:09 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Now return to normal top-level context and clear ErrorContext for
|
|
|
|
* next time.
|
|
|
|
*/
|
|
|
|
MemoryContextSwitchTo(walwriter_context);
|
|
|
|
FlushErrorState();
|
|
|
|
|
|
|
|
/* Flush any leaked data in the top-level context */
|
2023-11-15 20:42:30 +01:00
|
|
|
MemoryContextReset(walwriter_context);
|
2007-07-24 06:54:09 +02:00
|
|
|
|
|
|
|
/* Now we can allow interrupts again */
|
|
|
|
RESUME_INTERRUPTS();
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Sleep at least 1 second after any error. A write error is likely
|
|
|
|
* to be repeated, and we don't want to be filling the error logs as
|
|
|
|
* fast as we can.
|
|
|
|
*/
|
|
|
|
pg_usleep(1000000L);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* We can now handle ereport(ERROR) */
|
|
|
|
PG_exception_stack = &local_sigjmp_buf;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unblock signals (they were blocked when the postmaster forked us)
|
|
|
|
*/
|
2023-02-02 22:34:56 +01:00
|
|
|
sigprocmask(SIG_SETMASK, &UnBlockSig, NULL);
|
2007-07-24 06:54:09 +02:00
|
|
|
|
2012-05-09 02:03:26 +02:00
|
|
|
/*
|
|
|
|
* Reset hibernation state after any error.
|
|
|
|
*/
|
|
|
|
left_till_hibernate = LOOPS_UNTIL_HIBERNATE;
|
2012-05-09 05:05:58 +02:00
|
|
|
hibernating = false;
|
|
|
|
SetWalWriterSleeping(false);
|
2012-05-09 02:03:26 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Advertise our latch that backends can use to wake us up while we're
|
|
|
|
* sleeping.
|
|
|
|
*/
|
|
|
|
ProcGlobal->walwriterLatch = &MyProc->procLatch;
|
|
|
|
|
2007-07-24 06:54:09 +02:00
|
|
|
/*
|
|
|
|
* Loop forever
|
|
|
|
*/
|
|
|
|
for (;;)
|
|
|
|
{
|
2012-05-09 02:03:26 +02:00
|
|
|
long cur_timeout;
|
|
|
|
|
2012-05-09 05:05:58 +02:00
|
|
|
/*
|
|
|
|
* Advertise whether we might hibernate in this cycle. We do this
|
|
|
|
* before resetting the latch to ensure that any async commits will
|
|
|
|
* see the flag set if they might possibly need to wake us up, and
|
|
|
|
* that we won't miss any signal they send us. (If we discover work
|
|
|
|
* to do in the last cycle before we would hibernate, the global flag
|
|
|
|
* will be set unnecessarily, but little harm is done.) But avoid
|
|
|
|
* touching the global flag if it doesn't need to change.
|
|
|
|
*/
|
|
|
|
if (hibernating != (left_till_hibernate <= 1))
|
|
|
|
{
|
|
|
|
hibernating = (left_till_hibernate <= 1);
|
|
|
|
SetWalWriterSleeping(hibernating);
|
|
|
|
}
|
|
|
|
|
2012-05-09 02:03:26 +02:00
|
|
|
/* Clear any already-pending wakeups */
|
2015-01-14 18:45:22 +01:00
|
|
|
ResetLatch(MyLatch);
|
2007-07-24 06:54:09 +02:00
|
|
|
|
2021-03-12 05:29:59 +01:00
|
|
|
/* Process any signals received recently */
|
2024-01-25 04:50:08 +01:00
|
|
|
HandleMainLoopInterrupts();
|
2007-07-24 06:54:09 +02:00
|
|
|
|
|
|
|
/*
|
2012-05-09 02:03:26 +02:00
|
|
|
* Do what we're here for; then, if XLogBackgroundFlush() found useful
|
|
|
|
* work to do, reset hibernation counter.
|
2007-07-24 06:54:09 +02:00
|
|
|
*/
|
2012-05-09 02:03:26 +02:00
|
|
|
if (XLogBackgroundFlush())
|
|
|
|
left_till_hibernate = LOOPS_UNTIL_HIBERNATE;
|
|
|
|
else if (left_till_hibernate > 0)
|
|
|
|
left_till_hibernate--;
|
2007-07-24 06:54:09 +02:00
|
|
|
|
2022-04-06 22:56:06 +02:00
|
|
|
/* report pending statistics to the cumulative stats system */
|
2022-04-06 23:08:57 +02:00
|
|
|
pgstat_report_wal(false);
|
Track total amounts of times spent writing and syncing WAL data to disk.
This commit adds new GUC track_wal_io_timing. When this is enabled,
the total amounts of time XLogWrite writes and issue_xlog_fsync syncs
WAL data to disk are counted in pg_stat_wal. This information would be
useful to check how much WAL write and sync affect the performance.
Enabling track_wal_io_timing will make the server query the operating
system for the current time every time WAL is written or synced,
which may cause significant overhead on some platforms. To avoid such
additional overhead in the server with track_io_timing enabled,
this commit introduces track_wal_io_timing as a separate parameter from
track_io_timing.
Note that WAL write and sync activity by walreceiver has not been tracked yet.
This commit makes the server also track the numbers of times XLogWrite
writes and issue_xlog_fsync syncs WAL data to disk, in pg_stat_wal,
regardless of the setting of track_wal_io_timing. These counters can be
used to calculate the WAL write and sync time per request, for example.
Bump PGSTAT_FILE_FORMAT_ID.
Bump catalog version.
Author: Masahiro Ikeda
Reviewed-By: Japin Li, Hayato Kuroda, Masahiko Sawada, David Johnston, Fujii Masao
Discussion: https://postgr.es/m/0509ad67b585a5b86a83d445dfa75392@oss.nttdata.com
2021-03-09 08:52:06 +01:00
|
|
|
|
2012-05-09 02:03:26 +02:00
|
|
|
/*
|
|
|
|
* Sleep until we are signaled or WalWriterDelay has elapsed. If we
|
|
|
|
* haven't done anything useful for quite some time, lengthen the
|
|
|
|
* sleep time so as to reduce the server's idle power consumption.
|
|
|
|
*/
|
|
|
|
if (left_till_hibernate > 0)
|
|
|
|
cur_timeout = WalWriterDelay; /* in ms */
|
|
|
|
else
|
|
|
|
cur_timeout = WalWriterDelay * HIBERNATE_FACTOR;
|
|
|
|
|
Add WL_EXIT_ON_PM_DEATH pseudo-event.
Users of the WaitEventSet and WaitLatch() APIs can now choose between
asking for WL_POSTMASTER_DEATH and then handling it explicitly, or asking
for WL_EXIT_ON_PM_DEATH to trigger immediate exit on postmaster death.
This reduces code duplication, since almost all callers want the latter.
Repair all code that was previously ignoring postmaster death completely,
or requesting the event but ignoring it, or requesting the event but then
doing an unconditional PostmasterIsAlive() call every time through its
event loop (which is an expensive syscall on platforms for which we don't
have USE_POSTMASTER_DEATH_SIGNAL support).
Assert that callers of WaitLatchXXX() under the postmaster remember to
ask for either WL_POSTMASTER_DEATH or WL_EXIT_ON_PM_DEATH, to prevent
future bugs.
The only process that doesn't handle postmaster death is syslogger. It
waits until all backends holding the write end of the syslog pipe
(including the postmaster) have closed it by exiting, to be sure to
capture any parting messages. By using the WaitEventSet API directly
it avoids the new assertion, and as a by-product it may be slightly
more efficient on platforms that have epoll().
Author: Thomas Munro
Reviewed-by: Kyotaro Horiguchi, Heikki Linnakangas, Tom Lane
Discussion: https://postgr.es/m/CAEepm%3D1TCviRykkUb69ppWLr_V697rzd1j3eZsRMmbXvETfqbQ%40mail.gmail.com,
https://postgr.es/m/CAEepm=2LqHzizbe7muD7-2yHUbTOoF7Q+qkSD5Q41kuhttRTwA@mail.gmail.com
2018-11-23 08:16:41 +01:00
|
|
|
(void) WaitLatch(MyLatch,
|
|
|
|
WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
|
|
|
|
cur_timeout,
|
|
|
|
WAIT_EVENT_WAL_WRITER_MAIN);
|
2007-07-24 06:54:09 +02:00
|
|
|
}
|
|
|
|
}
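The error-recovery structure of WalWriterMain (one outermost sigsetjmp that stays active, with the work loop re-entered after cleanup) can be illustrated outside PostgreSQL. This is a minimal sketch using only POSIX sigsetjmp/siglongjmp; `exception_stack`, `raise_error`, and `run_with_recovery` are hypothetical stand-ins for PG_exception_stack, ereport(ERROR), and the main loop, not PostgreSQL APIs.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <setjmp.h>
#include <stddef.h>

/* Stand-in for PG_exception_stack: where an error should jump to. */
static sigjmp_buf *exception_stack = NULL;

/* Stand-in for ereport(ERROR): transfer control to the active handler. */
static void
raise_error(void)
{
    if (exception_stack != NULL)
        siglongjmp(*exception_stack, 1);
}

/*
 * Perform `iterations` units of work; every `fail_every`-th unit raises an
 * error (never, if fail_every is 0).  Returns the number of errors that
 * were recovered from.
 */
int
run_with_recovery(int iterations, int fail_every)
{
    sigjmp_buf   local_buf;
    volatile int done = 0;      /* volatile: modified between setjmp and longjmp */
    volatile int errors = 0;

    /* savemask = 1, so the signal mask in force here is restored on longjmp */
    if (sigsetjmp(local_buf, 1) != 0)
    {
        /* Recovery path: a real process would release its resources here */
        errors++;
    }

    /* From here on, errors can be handled */
    exception_stack = &local_buf;

    while (done < iterations)
    {
        done++;
        if (fail_every > 0 && done % fail_every == 0)
            raise_error();
    }

    exception_stack = NULL;
    return errors;
}
```

Because the jump buffer is set up once and remains valid for the whole function, an error raised during a later iteration lands in the same recovery block, which is the property the comment in WalWriterMain relies on when it explains why PG_TRY is not used at the bottom of the exception stack.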
|
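The signal setup in WalWriterMain routes SIGINT and SIGTERM to the same shutdown-request handler, which only records the request for the main loop to act on. A minimal sketch of that flag-setting pattern in ISO C follows; `handle_shutdown_request`, `install_signal_handlers`, and `shutdown_pending` are hypothetical names, and the real code uses pqsignal() with PostgreSQL's own handlers rather than signal().

```c
#include <signal.h>
#include <stdbool.h>

/* Set by the handler, polled by the main loop. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Handler does the minimum possible: record that shutdown was requested. */
static void
handle_shutdown_request(int signo)
{
    (void) signo;
    shutdown_requested = 1;
}

/* Treat SIGINT like SIGTERM.  Returns true if both handlers were installed. */
bool
install_signal_handlers(void)
{
    if (signal(SIGINT, handle_shutdown_request) == SIG_ERR)
        return false;
    if (signal(SIGTERM, handle_shutdown_request) == SIG_ERR)
        return false;
    return true;
}

/* What a main loop would check once per cycle before doing more work. */
bool
shutdown_pending(void)
{
    return shutdown_requested != 0;
}
```

Keeping the handler down to a single flag assignment avoids calling functions that are not async-signal-safe; the loop decides when it is safe to exit.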