Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.
For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.
The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.
To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.
The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.
As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.
Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
20131015123303.GH5300@awork2.anarazel.de,
20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
|
|
|
* fallback.h
|
|
|
|
* Fallback for platforms without spinlock and/or atomics support. Slower
|
|
|
|
* than native atomics support, but not unusably slow.
|
|
|
|
*
|
2022-01-08 01:04:57 +01:00
|
|
|
* Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
|
Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.
For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.
The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.
To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.
The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.
As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.
Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
20131015123303.GH5300@awork2.anarazel.de,
20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
|
|
|
* Portions Copyright (c) 1994, Regents of the University of California
|
|
|
|
*
|
|
|
|
* src/include/port/atomics/fallback.h
|
|
|
|
*
|
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
|
|
|
|
/* intentionally no include guards, should only be included by atomics.h */
|
|
|
|
#ifndef INSIDE_ATOMICS_H
|
|
|
|
# error "should be included via atomics.h"
|
|
|
|
#endif
|
|
|
|
|
|
|
|
#ifndef pg_memory_barrier_impl
|
|
|
|
/*
|
|
|
|
* If we have no memory barrier implementation for this architecture, we
|
|
|
|
* fall back to acquiring and releasing a spinlock. This might, in turn,
|
|
|
|
* fall back to the semaphore-based spinlock implementation, which will be
|
|
|
|
* amazingly slow.
|
|
|
|
*
|
|
|
|
* It's not self-evident that every possible legal implementation of a
|
|
|
|
* spinlock acquire-and-release would be equivalent to a full memory barrier.
|
|
|
|
* For example, I'm not sure that Itanium's acq and rel add up to a full
|
|
|
|
* fence. But all of our actual implementations seem OK in this regard.
|
|
|
|
*/
|
|
|
|
#define PG_HAVE_MEMORY_BARRIER_EMULATION
|
|
|
|
|
|
|
|
extern void pg_spinlock_barrier(void);
|
|
|
|
#define pg_memory_barrier_impl pg_spinlock_barrier
|
|
|
|
#endif
|
|
|
|
|
2015-01-11 01:15:29 +01:00
|
|
|
#ifndef pg_compiler_barrier_impl
|
|
|
|
/*
|
|
|
|
* If the compiler/arch combination does not provide compiler barriers,
|
2015-01-13 09:29:16 +01:00
|
|
|
* provide a fallback. The fallback simply consists of a function call into
|
|
|
|
* an externally defined function. That should guarantee compiler barrier
|
2015-01-11 01:15:29 +01:00
|
|
|
* semantics except for compilers that do inter translation unit/global
|
|
|
|
* optimization - those better provide an actual compiler barrier.
|
|
|
|
*
|
2015-01-13 09:29:16 +01:00
|
|
|
* A native compiler barrier for sure is a lot faster than this...
|
2015-01-11 01:15:29 +01:00
|
|
|
*/
|
|
|
|
#define PG_HAVE_COMPILER_BARRIER_EMULATION
|
|
|
|
extern void pg_extern_compiler_barrier(void);
|
|
|
|
#define pg_compiler_barrier_impl pg_extern_compiler_barrier
|
|
|
|
#endif
|
|
|
|
|
|
|
|
|
Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.
For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.
The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.
To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.
The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.
As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.
Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
20131015123303.GH5300@awork2.anarazel.de,
20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
|
|
|
/*
|
2015-01-13 09:29:16 +01:00
|
|
|
* If we have atomics implementation for this platform, fall back to providing
|
Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.
For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.
The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.
To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.
The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.
As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.
Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
20131015123303.GH5300@awork2.anarazel.de,
20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
|
|
|
* the atomics API using a spinlock to protect the internal state. Possibly
|
|
|
|
* the spinlock implementation uses semaphores internally...
|
|
|
|
*
|
|
|
|
* We have to be a bit careful here, as it's not guaranteed that atomic
|
|
|
|
* variables are mapped to the same address in every process (e.g. dynamic
|
|
|
|
* shared memory segments). We can't just hash the address and use that to map
|
|
|
|
* to a spinlock. Instead assign a spinlock on initialization of the atomic
|
|
|
|
* variable.
|
|
|
|
*/
|
|
|
|
#if !defined(PG_HAVE_ATOMIC_FLAG_SUPPORT) && !defined(PG_HAVE_ATOMIC_U32_SUPPORT)
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_FLAG_SIMULATION
|
|
|
|
#define PG_HAVE_ATOMIC_FLAG_SUPPORT
|
|
|
|
|
|
|
|
typedef struct pg_atomic_flag
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* To avoid circular includes we can't use s_lock as a type here. Instead
|
|
|
|
* just reserve enough space for all spinlock types. Some platforms would
|
|
|
|
* be content with just one byte instead of 4, but that's not too much
|
|
|
|
* waste.
|
|
|
|
*/
|
|
|
|
#if defined(__hppa) || defined(__hppa__) /* HP PA-RISC, GCC and HP compilers */
|
|
|
|
int sema[4];
|
|
|
|
#else
|
|
|
|
int sema;
|
|
|
|
#endif
|
2018-04-07 04:55:32 +02:00
|
|
|
volatile bool value;
|
Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.
For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.
The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.
To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.
The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.
As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.
Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
20131015123303.GH5300@awork2.anarazel.de,
20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
|
|
|
} pg_atomic_flag;
|
|
|
|
|
|
|
|
#endif /* PG_HAVE_ATOMIC_FLAG_SUPPORT */
|
|
|
|
|
|
|
|
#if !defined(PG_HAVE_ATOMIC_U32_SUPPORT)
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_U32_SIMULATION
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_U32_SUPPORT
|
|
|
|
typedef struct pg_atomic_uint32
|
|
|
|
{
|
|
|
|
/* Check pg_atomic_flag's definition above for an explanation */
|
2022-07-08 01:17:47 +02:00
|
|
|
#if defined(__hppa) || defined(__hppa__) /* HP PA-RISC */
|
Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.
For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.
The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.
To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.
The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.
As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.
Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
20131015123303.GH5300@awork2.anarazel.de,
20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
|
|
|
int sema[4];
|
|
|
|
#else
|
|
|
|
int sema;
|
|
|
|
#endif
|
|
|
|
volatile uint32 value;
|
|
|
|
} pg_atomic_uint32;
|
|
|
|
|
|
|
|
#endif /* PG_HAVE_ATOMIC_U32_SUPPORT */
|
|
|
|
|
2017-04-07 23:44:47 +02:00
|
|
|
#if !defined(PG_HAVE_ATOMIC_U64_SUPPORT)
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_U64_SIMULATION
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_U64_SUPPORT
|
|
|
|
typedef struct pg_atomic_uint64
|
|
|
|
{
|
|
|
|
/* Check pg_atomic_flag's definition above for an explanation */
|
2022-07-08 01:17:47 +02:00
|
|
|
#if defined(__hppa) || defined(__hppa__) /* HP PA-RISC */
|
2017-04-07 23:44:47 +02:00
|
|
|
int sema[4];
|
|
|
|
#else
|
|
|
|
int sema;
|
|
|
|
#endif
|
|
|
|
volatile uint64 value;
|
|
|
|
} pg_atomic_uint64;
|
|
|
|
|
|
|
|
#endif /* PG_HAVE_ATOMIC_U64_SUPPORT */
|
|
|
|
|
Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.
For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.
The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.
To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.
The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.
As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.
Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
20131015123303.GH5300@awork2.anarazel.de,
20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
|
|
|
#ifdef PG_HAVE_ATOMIC_FLAG_SIMULATION
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_INIT_FLAG
|
|
|
|
extern void pg_atomic_init_flag_impl(volatile pg_atomic_flag *ptr);
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_TEST_SET_FLAG
|
|
|
|
extern bool pg_atomic_test_set_flag_impl(volatile pg_atomic_flag *ptr);
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_CLEAR_FLAG
|
|
|
|
extern void pg_atomic_clear_flag_impl(volatile pg_atomic_flag *ptr);
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_UNLOCKED_TEST_FLAG
|
2018-04-07 04:55:32 +02:00
|
|
|
extern bool pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr);
|
Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.
For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.
The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.
To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.
The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.
As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.
Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
20131015123303.GH5300@awork2.anarazel.de,
20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
|
|
|
|
|
|
|
#endif /* PG_HAVE_ATOMIC_FLAG_SIMULATION */
|
|
|
|
|
|
|
|
#ifdef PG_HAVE_ATOMIC_U32_SIMULATION
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_INIT_U32
|
|
|
|
extern void pg_atomic_init_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val_);
|
|
|
|
|
Fix fallback implementation of pg_atomic_write_u32().
I somehow had assumed that in the spinlock (in turn possibly using
semaphores) based fallback atomics implementation 32 bit writes could be
done without a lock. As far as the write goes that's correct, since
postgres supports only platforms with single-copy atomicity for aligned
32bit writes. But writing without holding the spinlock breaks
read-modify-write operations like pg_atomic_compare_exchange_u32(),
since they'll potentially "miss" a concurrent write, which can't happen
in actual hardware implementations.
In 9.6+ when using the fallback atomics implementation this could lead
to buffer header locks not being properly marked as released, and
potentially some related state corruption. I don't see a related danger
in 9.5 (earliest release with the API), because pg_atomic_write_u32()
wasn't used in a concurrent manner there.
The state variable of local buffers, before this change, were
manipulated using pg_atomic_write_u32(), to avoid unnecessary
synchronization overhead. As that'd not be the case anymore, introduce
and use pg_atomic_unlocked_write_u32(), which does not correctly
interact with RMW operations.
This bug only caused issues when postgres is compiled on platforms
without atomics support (i.e. no common new platform), or when compiled
with --disable-atomics, which explains why this wasn't noticed in
testing.
Reported-By: Tom Lane
Discussion: <14947.1475690465@sss.pgh.pa.us>
Backpatch: 9.5-, where the atomic operations API was introduced.
2016-10-08 01:55:15 +02:00
|
|
|
#define PG_HAVE_ATOMIC_WRITE_U32
|
|
|
|
extern void pg_atomic_write_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val);
|
|
|
|
|
Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.
For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.
The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.
To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.
The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.
As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.
Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
20131015123303.GH5300@awork2.anarazel.de,
20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
|
|
|
#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
|
|
|
|
extern bool pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
|
|
|
|
uint32 *expected, uint32 newval);
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_FETCH_ADD_U32
|
|
|
|
extern uint32 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_);
|
|
|
|
|
|
|
|
#endif /* PG_HAVE_ATOMIC_U32_SIMULATION */
|
2017-04-07 23:44:47 +02:00
|
|
|
|
|
|
|
|
|
|
|
#ifdef PG_HAVE_ATOMIC_U64_SIMULATION
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_INIT_U64
|
|
|
|
extern void pg_atomic_init_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 val_);
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
|
|
|
|
extern bool pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
|
|
|
|
uint64 *expected, uint64 newval);
|
|
|
|
|
|
|
|
#define PG_HAVE_ATOMIC_FETCH_ADD_U64
|
|
|
|
extern uint64 pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_);
|
|
|
|
|
|
|
|
#endif /* PG_HAVE_ATOMIC_U64_SIMULATION */
|