Commit Graph

37 Commits

Author SHA1 Message Date
Heikki Linnakangas 936546dcbc Optimize pg_comp_crc32c_sse42 routine slightly, and also use it on x86.
Eliminate the separate 'len' variable from the loops, and also use the 4
byte instruction. This shaves off a few more cycles. Even though this
routine that uses the special SSE 4.2 instructions is much faster than a
generic routine, it's still a hot spot, so let's make it as fast as
possible.

Change the configure test to not test _mm_crc32_u64. That variant is only
available in the 64-bit x86-64 architecture, not in 32-bit x86. Modify
pg_comp_crc32c_sse42 so that it only uses _mm_crc32_u64 on x86-64. With
these changes, the SSE accelerated CRC-32C implementation can also be used
on 32-bit x86 systems.

This also fixes the 32-bit MSVC build.
2015-04-14 23:58:16 +03:00
Heikki Linnakangas 3dc2d62d04 Use Intel SSE 4.2 CRC instructions where available.
Modern x86 and x86-64 processors with SSE 4.2 support have special
instructions, crc32b and crc32q, for calculating CRC-32C. They greatly
speed up CRC calculation.

Whether the instructions can be used or not depends on the compiler and the
target architecture. If generation of SSE 4.2 instructions is allowed for
the target (-msse4.2 flag on gcc and clang), use them. If they are not
allowed by default, but the compiler supports the -msse4.2 flag to enable
them, compile just the CRC-32C function with -msse4.2 flag, and check at
runtime whether the processor we're running on supports it. If it doesn't,
fall back to the slicing-by-8 algorithm. (With the common defaults on
current operating systems, the runtime-check variant is what you get in
practice.)

Abhijit Menon-Sen, heavily modified by me, reviewed by Andres Freund.
2015-04-14 17:05:03 +03:00
Andres Freund 8122e1437e Add, optional, support for 128bit integers.
We will, for the foreseeable future, not expose 128 bit datatypes to
SQL. But being able to use 128bit math will allow us, in a later patch,
to use 128bit accumulators for some aggregates; leading to noticeable
speedups over using numeric.

So far we only detect a gcc/clang extension that supports 128bit math,
but no 128bit literals, and no *printf support. We might want to expand
this in the future to further compilers; if there are any that that
provide similar support.

Discussion: 544BB5F1.50709@proxel.se
Author: Andreas Karlsson, with significant editorializing by me
Reviewed-By: Peter Geoghegan, Oskari Saarenmaa
2015-03-20 10:26:17 +01:00
Heikki Linnakangas 025c02420d Speed up CRC calculation using slicing-by-8 algorithm.
This speeds up WAL generation and replay. The new algorithm is
significantly faster with large inputs, like full-page images or when
inserting wide rows. It is slower with tiny inputs, i.e. less than 10 bytes
or so, but the speedup with longer inputs more than make up for that. Even
small WAL records at least have 24 byte header in the front.

The output is identical to the current byte-at-a-time computation, so this
does not affect compatibility. The new algorithm is only used for the
CRC-32C variant, not the legacy version used in tsquery or the
"traditional" CRC-32 used in hstore and ltree. Those are not as performance
critical, and are usually only applied over small inputs, so it seems
better to not carry around the extra lookup tables to speed up those rare
cases.

Abhijit Menon-Sen
2015-02-10 10:54:40 +02:00
Noah Misch b779168ffe Detect PG_PRINTF_ATTRIBUTE automatically.
This eliminates gobs of "unrecognized format function type" warnings
under MinGW compilers predating GCC 4.4.
2014-11-23 09:34:03 -05:00
Andres Freund b64d92f1a5 Add a basic atomic ops API abstracting away platform/architecture details.
Several upcoming performance/scalability improvements require atomic
operations. This new API avoids the need to splatter compiler and
architecture dependent code over all the locations employing atomic
ops.

For several of the potential usages it'd be problematic to maintain
both, a atomics using implementation and one using spinlocks or
similar. In all likelihood one of the implementations would not get
tested regularly under concurrency. To avoid that scenario the new API
provides a automatic fallback of atomic operations to spinlocks. All
properties of atomic operations are maintained. This fallback -
obviously - isn't as fast as just using atomic ops, but it's not bad
either. For one of the future users the atomics ontop spinlocks
implementation was actually slightly faster than the old purely
spinlock using implementation. That's important because it reduces the
fear of regressing older platforms when improving the scalability for
new ones.

The API, loosely modeled after the C11 atomics support, currently
provides 'atomic flags' and 32 bit unsigned integers. If the platform
efficiently supports atomic 64 bit unsigned integers those are also
provided.

To implement atomics support for a platform/architecture/compiler for
a type of atomics 32bit compare and exchange needs to be
implemented. If available and more efficient native support for flags,
32 bit atomic addition, and corresponding 64 bit operations may also
be provided. Additional useful atomic operations are implemented
generically ontop of these.

The implementation for various versions of gcc, msvc and sun studio have
been tested. Additional existing stub implementations for
* Intel icc
* HUPX acc
* IBM xlc
are included but have never been tested. These will likely require
fixes based on buildfarm and user feedback.

As atomic operations also require barriers for some operations the
existing barrier support has been moved into the atomics code.

Author: Andres Freund with contributions from Oskari Saarenmaa
Reviewed-By: Amit Kapila, Robert Haas, Heikki Linnakangas and Álvaro Herrera
Discussion: CA+TgmoYBW+ux5-8Ja=Mcyuy8=VXAnVRHp3Kess6Pn3DMXAPAEA@mail.gmail.com,
    20131015123303.GH5300@awork2.anarazel.de,
    20131028205522.GI20248@awork2.anarazel.de
2014-09-25 23:49:05 +02:00
Tom Lane 4c8aa8b5ae Fix "quiet inline" configure test for newer clang compilers.
This test used to just define an unused static inline function and check
whether that causes a warning.  But newer clang versions warn about
unused static inline functions when defined inside a .c file, but not
when defined in an included header, which is the case we care about.
Change the test to cope.

Andres Freund
2014-05-01 16:16:36 -04:00
Simon Riggs fdea2530bd Compiler optimizations for page checksum code.
Ants Aasma and Jeff Davis
2013-04-30 06:59:26 +01:00
Tom Lane b853eb9718 Improve handling of ereport(ERROR) and elog(ERROR).
In commit 71450d7fd6, we added code to inform
suitably-intelligent compilers that ereport() doesn't return if the elevel
is ERROR or higher.  This patch extends that to elog(), and also fixes a
double-evaluation hazard that the previous commit created in ereport(),
as well as reducing the emitted code size.

The elog() improvement requires the compiler to support __VA_ARGS__, which
should be available in just about anything nowadays since it's required by
C99.  But our minimum language baseline is still C89, so add a configure
test for that.

The previous commit assumed that ereport's elevel could be evaluated twice,
which isn't terribly safe --- there are already counterexamples in xlog.c.
On compilers that have __builtin_constant_p, we can use that to protect the
second test, since there's no possible optimization gain if the compiler
doesn't know the value of elevel.  Otherwise, use a local variable inside
the macros to prevent double evaluation.  The local-variable solution is
inferior because (a) it leads to useless code being emitted when elevel
isn't constant, and (b) it increases the optimization level needed for the
compiler to recognize that subsequent code is unreachable.  But it seems
better than not teaching non-gcc compilers about unreachability at all.

Lastly, if the compiler has __builtin_unreachable(), we can use that
instead of abort(), resulting in a noticeable code savings since no
function call is actually emitted.  However, it seems wise to do this only
in non-assert builds.  In an assert build, continue to use abort(), so that
the behavior will be predictable and debuggable if the "impossible"
happens.

These changes involve making the ereport and elog macros emit do-while
statement blocks not just expressions, which forces small changes in
a few call sites.

Andres Freund, Tom Lane, Heikki Linnakangas
2013-01-13 18:40:09 -05:00
Alvaro Herrera f46baf601d Rename USE_INLINE to PG_USE_INLINE
The former name was too likely to conflict with symbols from external
headers; and, as seen in recent buildfarm failures in member spoonbill,
it has now happened at least in plpython.
2012-10-09 11:17:33 -03:00
Tom Lane ea473fb2de Add infrastructure for compile-time assertions about variable types.
Currently, the macros only work with fairly recent gcc versions, but there
is room to expand them to other compilers that have comparable features.

Heavily revised and autoconfiscated version of a patch by Andres Freund.
2012-09-30 14:38:31 -04:00
Tom Lane 44404f3945 Adjust configure to use "+Olibmerrno" with HP-UX C compiler, if possible.
This is reported to be necessary on some versions of that OS.  In service
of this, cause PGAC_PROG_CC_CFLAGS_OPT to reject switches that result in
compiler warnings, since on yet other versions of that OS, the switch does
nothing except provoke a warning.

Report and patch by Ibrar Ahmed, further tweaking by me.
2011-05-26 17:29:33 -04:00
Peter Eisentraut 804a786c95 Add/fix caching on some configure checks 2010-09-29 22:38:04 +03:00
Magnus Hagander 9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Peter Eisentraut 3f11971916 Remove extra newlines at end and beginning of files, add missing newlines
at end of files.
2010-08-19 05:57:36 +00:00
Michael Meskes 29259531c7 Replace self written 'long long int' configure test by standard 'AC_TYPE_LONG_LONG_INT' macro call. 2010-05-25 17:28:20 +00:00
Michael Meskes 555a02f910 Added a configure test for "long long" datatypes. So far this is only used in ecpg and replaces the old test that was kind of hackish. 2010-05-25 14:32:55 +00:00
Tom Lane e08ab7c312 Support inlining various small performance-critical functions on non-GCC
compilers, by applying a configure check to see if the compiler will accept
an unreferenced "static inline foo ..." function without warnings.  It is
believed that such warnings are the only reason not to declare inlined
functions in headers, if the compiler understands "inline" at all.

Kurt Harriman
2010-02-13 02:34:16 +00:00
Tom Lane 623f8a0969 Modify the recently-added probe for -Wl,--as-needed some more, because RHEL-4
vintage Linux is even more broken than we realized: a link to libreadline
will succeed, and fail only at runtime.  It seems that an AC_TRY_RUN test
is the only reliable way to check whether this is really safe.  Per report
from Tatsuo.
2008-06-27 00:36:16 +00:00
Tom Lane 1ac1bea076 Adjust -Wl,--asneeded test to avoid using the switch if it breaks
libreadline.  What we will do for compatibility :-(
2008-05-20 03:30:22 +00:00
Tom Lane 2dad10f467 Make another try at using -Wl,--as-needed to suppress linking of unnecessary
shared libraries.  We've tried this before and had problems with libreadline
not linking properly on some platforms, but that seems to be a libreadline
bug that may have been fixed by now.  In any case, it's early enough in the
8.4 devel cycle that we can afford to have some transient breakage while
we work out any portability problems.

On Darwin, we try -Wl,-dead_strip_dylibs, which seems to be the equivalent
incantation there.
2008-05-18 20:13:12 +00:00
Alvaro Herrera 7861d72ea2 Modify the float4 datatype to be pass-by-val. Along the way, remove the last
uses of the long-deprecated float32 in contrib/seg; the definitions themselves
are still there, but no longer used.  fmgr/README updated to match.

I added a CREATE FUNCTION to account for existing seg_center() code in seg.c
too, and some tests for it and the neighbor functions.  At the same time,
remove checks for NULL which are not needed (because the functions are declared
STRICT).

I had to do some adjustments to contrib's btree_gist too.  The choices for
representation there are not ideal for changing the underlying types :-(

Original patch by Zoltan Boszormenyi, with some adjustments by me.
2008-04-18 18:43:09 +00:00
Peter Eisentraut b120382353 Upgrade to Autoconf 2.61:
- Change configure.in to use Autoconf 2.61 and update generated files.
- Update build system and documentation to support now directory variables
  offered by Autoconf 2.61.
- Replace usages of PGAC_CHECK_ALIGNOF by AC_CHECK_ALIGNOF, now available
  in Autoconf 2.61.
- Drop our patched version of AC_C_INLINE, as Autoconf now has the change.
2008-02-17 16:36:43 +00:00
Bruce Momjian b5498167d7 Allow AIX to use --enable-thread-safety by passing PTHREAD_LIBS to
binary compiles, and adjust configure tests for AIX.
2004-12-16 17:48:29 +00:00
Neil Conway 857e210ea9 When using GCC, change the default CFLAGS to:
-O2 -Wall -Wmissing-prototypes -Wpointer-arith

Check whether the version of GCC we are using supports any of:

  -Wdeclaration-after-statement
  -Wendif-labels
  -Wold-style-definition

And add the supported flags to CFLAGS.
2004-10-20 02:12:07 +00:00
Tom Lane 6f295328e5 Do not let external specification of CFLAGS stop us from adding
-fno-strict-aliasing.
2004-02-02 04:07:18 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Peter Eisentraut 801427abc2 Unset CFLAGS before reading template. This should be more robust.
When --enable-debug is used, then the default CFLAGS for non-GCC is just
-g without -O.

Backpatch enhancement of Autoconf inline test that detects problems with
the HP C compiler.
2003-11-01 20:48:51 +00:00
Peter Eisentraut 378f59904a Fix CFLAGS selection to actually work. Add test to detect whether gcc's
option -fno-strict-aliasing is available.
2003-10-25 15:32:11 +00:00
Tom Lane f690920a75 Infrastructure for upgraded error reporting mechanism. elog.c is
rewritten and the protocol is changed, but most elog calls are still
elog calls.  Also, we need to contemplate mechanisms for controlling
all this functionality --- eg, how much stuff should appear in the
postmaster log?  And what API should libpq expose for it?
2003-04-24 21:16:45 +00:00
Peter Eisentraut cb1d036acb Generate pg_config.h.in by autoheader. Separate out manually editable
parts.  Standardize spelling of comments in pg_config.h.
2003-04-06 22:45:23 +00:00
Peter Eisentraut 955a1f81a7 Factor out the code that detects the long long int snprintf format into a
separate macro.  Also add support for %I64d which is the way on Windows.

The code that checks for the 64-bit int type now gives more reasonable
results when cross-compiling: In that case we just take the compiler's
information and trust that the arithmetic works.  Disabling int64 is too
pessimistic.
2003-01-28 21:57:12 +00:00
Peter Eisentraut 7c1ff35410 Upgrade to Autoconf version 2.53. Replaced many custom macro
calls with new or now-built-in versions.  Make sure that all
calls to AC_DEFINE have a third argument, for possible use of
autoheader in the future.
2002-03-29 17:32:55 +00:00
Peter Eisentraut 15abc7788e More correct way to check for existence of types, which allows to specify
which include files to consider.  Should fix BeOS problems with int8 types.
2001-12-02 11:38:40 +00:00
Peter Eisentraut ef6164de1d Revert removal of signed, volatile, and signal handler arg type tests. 2000-08-29 09:36:51 +00:00
Peter Eisentraut 79abd73eee Remove configure tests for `signed', `volatile', and signal handler args;
the harm potential outweighs the possible benefits.
2000-08-27 19:00:41 +00:00
Peter Eisentraut 06cd0f1a32 Substituted new configure test for types of accept()
Interfaced a lot of the custom tests to the config.cache, in the process
made them separate macros and grouped them out into files. Made naming
adjustments.

Removed a couple of useless/unused configure tests.

Disabled C++ by default. C++ is no more special than Perl, Python, and Tcl.
And it breaks equally often. :(
2000-06-11 11:40:09 +00:00