Fix the fallback memory barrier implementation to be reentrant.

This has essentially been broken since 0c8eda62, but until recently
(14e8803f) barrier usage in signal handlers was infrequent.

The failure to be reentrant was noticed because test_shm_mq, which
uses memory barriers at a high frequency, occasionally got stuck on some
Solaris buildfarm animals. It turns out those machines use Sun Studio
12.1, which doesn't yet have efficient memory barrier support. A machine
with a newer Sun Studio did not fail. Forcing the barrier fallback to
be used on x86 makes the problem reproducible.
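
As a rough illustration of why a spinlock-based barrier is not reentrant,
the sketch below takes a lock, lets a signal arrive while it is held, and
has the handler probe the same lock. All names (demo_lock, demo_handler,
demo_spinlock_hazard) are hypothetical and not part of the PostgreSQL
source, and a C11 atomic_flag stands in for the real spinlock. The handler
tries the lock once instead of spinning, because a real spin would never
terminate.

```c
#include <signal.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the dummy spinlock behind the old fallback. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Set by the handler when it finds the lock already held. */
static volatile sig_atomic_t handler_saw_lock_held = 0;

static void
demo_handler(int signo)
{
    (void) signo;

    /*
     * A real spinlock would loop here until the flag clears. The holder is
     * the interrupted code, which cannot resume until this handler returns,
     * so that loop would never end. We try once instead of spinning.
     */
    if (atomic_flag_test_and_set(&demo_lock))
        handler_saw_lock_held = 1;
    else
        atomic_flag_clear(&demo_lock);
}

/* Returns true if a barrier issued from the handler would have deadlocked. */
bool
demo_spinlock_hazard(void)
{
    handler_saw_lock_held = 0;
    signal(SIGINT, demo_handler);

    atomic_flag_test_and_set(&demo_lock);   /* plays the role of S_LOCK */
    raise(SIGINT);                          /* signal arrives while the lock is held */
    atomic_flag_clear(&demo_lock);          /* plays the role of S_UNLOCK */

    return handler_saw_lock_held != 0;
}
```

Calling demo_spinlock_hazard() in a single-threaded program reports that
the handler found the lock held, which is the situation in which a
spinning barrier would hang.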

The new fallback is to use kill(PostmasterPid, 0), based on the theory
that it will always imply a barrier, due to checking the liveness of
PostmasterPid, on systems old enough to need fallback support. It's hard
to come up with a good and performant fallback.
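
A minimal sketch of the same idea outside PostgreSQL, assuming a POSIX
system: kill() with signal 0 only performs the existence and permission
checks and delivers nothing, and POSIX lists kill() among the
async-signal-safe functions, so it may be called from a handler. The names
demo_target_pid, demo_fallback_barrier and demo_barrier_from_handler are
hypothetical. The sketch probes its own pid in place of PostmasterPid.
Whether the call really acts as a memory barrier depends on the kernel, as
the commit message says. The sketch only shows that the call is safe to
reenter.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for PostmasterPid. */
static pid_t demo_target_pid;

/* Counts barrier calls issued from inside the signal handler. */
static volatile sig_atomic_t barriers_in_handler = 0;

static void
demo_fallback_barrier(void)
{
    /* Signal 0: no signal is sent, only the pid lookup is performed. */
    (void) kill(demo_target_pid, 0);
}

static void
demo_handler(int signo)
{
    (void) signo;

    /* No lock is taken, so calling the barrier here cannot self-deadlock. */
    demo_fallback_barrier();
    barriers_in_handler++;
}

/* Issues one barrier normally and one from a handler; returns the handler count. */
int
demo_barrier_from_handler(void)
{
    demo_target_pid = getpid();
    barriers_in_handler = 0;
    signal(SIGINT, demo_handler);

    demo_fallback_barrier();    /* ordinary call site */
    raise(SIGINT);              /* handler issues a second barrier */

    return barriers_in_handler;
}
```

Because the function holds no state between entry and exit, an
interrupting handler can call it at any point without waiting on the
interrupted caller.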

I'm not backpatching this for now: the problem isn't active in the back
branches, and we haven't backpatched barrier changes so far.
Additionally, master looks entirely different from the back branches
due to the new atomics abstraction. It seems better to let this rest in
master, where the non-reentrancy actively causes a problem, and then
consider backpatching.

Found-By: Robert Haas
Discussion: 55626265.3060800@dunslane.net
Andres Freund 2015-06-26 17:00:01 +02:00
parent 5ca611841b
commit 1b468a131b
1 changed file with 19 additions and 2 deletions


@@ -20,15 +20,32 @@
  */
 #define ATOMICS_INCLUDE_DEFINITIONS
+#include "miscadmin.h"
 #include "port/atomics.h"
 #include "storage/spin.h"
+#ifdef PG_HAVE_MEMORY_BARRIER_EMULATION
+#ifdef WIN32
+#error "barriers are required (and provided) on WIN32 platforms"
+#endif
+#include <sys/types.h>
+#include <signal.h>
+#endif
 #ifdef PG_HAVE_MEMORY_BARRIER_EMULATION
 void
 pg_spinlock_barrier(void)
 {
-	S_LOCK(&dummy_spinlock);
-	S_UNLOCK(&dummy_spinlock);
+	/*
+	 * NB: we have to be reentrant here, some barriers are placed in signal
+	 * handlers.
+	 *
+	 * We use kill(0) for the fallback barrier as we assume that kernels on
+	 * systems old enough to require fallback barrier support will include an
+	 * appropriate barrier while checking the existence of the postmaster
+	 * pid.
+	 */
+	(void) kill(PostmasterPid, 0);
 }
 #endif