Commit Graph

2 Commits

Author SHA1 Message Date
Nathan Bossart 598e0114a3 Fix code for probing availability of AVX-512.
This commit fixes a few things:
* Instead of checking for CPU support of the "xsave" extension, we
  need to check for OS support of XGETBV instructions via the
  "osxsave" flag.
* We must check that additional XCR0 bits are set to be sure the
  ZMM registers are fully enabled.
* We should use the recommended ordering of steps.  Specifically,
  we need to check that the ZMM registers are enabled prior to
  checking for AVX-512 via CPUID.

In passing, split this code into separate functions to improve
readability.

Reported-by: Andrew Kane
Reviewed-by: Akash Shankaran, Raghuveer Devulapalli
Discussion: https://postgr.es/m/20240418024459.GA3385227%40nathanxps13
2024-04-23 10:54:04 -05:00
Nathan Bossart 792752af4e Optimize pg_popcount() with AVX-512 instructions.
Presently, pg_popcount() processes data in 32-bit or 64-bit chunks
when possible.  Newer hardware that supports AVX-512 instructions
can use 512-bit chunks, which provides a nice speedup, especially
for larger buffers.  This commit introduces the infrastructure
required to detect compiler and CPU support for the required
AVX-512 intrinsic functions, and it adds a new pg_popcount()
implementation that uses these functions.  If CPU support for this
optimized implementation is detected at runtime, a function pointer
is updated so that it is used by subsequent calls to pg_popcount().

Most of the existing in-tree calls to pg_popcount() should benefit
from these instructions, and calls with smaller buffers should at
least not regress compared to v16.  The new infrastructure
introduced by this commit can also be used to optimize
visibilitymap_count(), but that is left for a follow-up commit.

Co-authored-by: Paul Amonson, Ants Aasma
Reviewed-by: Matthias van de Meent, Tom Lane, Noah Misch, Akash Shankaran, Alvaro Herrera, Andres Freund, David Rowley
Discussion: https://postgr.es/m/BL1PR11MB5304097DF7EA81D04C33F3D1DCA6A%40BL1PR11MB5304.namprd11.prod.outlook.com
2024-04-06 21:56:23 -05:00