Use ARMv8 CRC instructions where available.

ARMv8 introduced special CPU instructions for calculating CRC-32C. Use
them, when available, for speed.

As with the similar Intel CRC instructions, several factors affect
whether the instructions can be used. The compiler intrinsics for them must
be supported by the compiler, and the instructions must be supported by the
target architecture. If the compilation target architecture does not
support the instructions, but adding "-march=armv8-a+crc" makes them
available, then we compile the code with a runtime check to determine
whether the host we're running on supports them.

For the runtime check, use glibc's getauxval() function. Unfortunately,
that's not very portable, but I couldn't find any more portable way to do
it. If getauxval() is not available, the CRC instructions will still be
used if the target architecture supports them without any additional
compiler flags, but the runtime check will not be available.

Original patch by Yuqi Gu, heavily modified by me. Reviewed by Andres
Freund, Thomas Munro.

Discussion: https://www.postgresql.org/message-id/HE1PR0801MB1323D171938EABC04FFE7FA9E3110%40HE1PR0801MB1323.eurprd08.prod.outlook.com
2018-04-04 11:22:45 +02:00
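The runtime check mentioned in the commit message could look roughly like the sketch below. It is not the code from this commit: the function name crc32c_armv8_available() is hypothetical, and the sketch assumes a Linux/glibc system where <sys/auxv.h> and <asm/hwcap.h> provide getauxval() and the HWCAP_CRC32 / HWCAP2_CRC32 bits. On any other platform it simply reports that the instructions are unavailable.

```c
#include <stdbool.h>

#if defined(__linux__) && (defined(__aarch64__) || defined(__arm__))
#include <sys/auxv.h>			/* getauxval(), AT_HWCAP, AT_HWCAP2 */
#include <asm/hwcap.h>			/* HWCAP_CRC32 or HWCAP2_CRC32 */
#endif

/*
 * Hypothetical helper: returns true if the kernel reports that the CPU
 * supports the ARMv8 CRC instructions, false if it does not or if the
 * platform offers no way to ask.
 */
static bool
crc32c_armv8_available(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_CRC32)
	/* 64-bit ARM: the CRC bit lives in the first hwcap word. */
	return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
#elif defined(__linux__) && defined(__arm__) && defined(HWCAP2_CRC32)
	/* 32-bit ARM: the CRC bit lives in the second hwcap word. */
	return (getauxval(AT_HWCAP2) & HWCAP2_CRC32) != 0;
#else
	/* No getauxval() or no CRC bit defined: assume unavailable. */
	return false;
#endif
}
```

A caller would typically evaluate this once and then route all CRC computations either to the function in this file or to a software fallback.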
/*-------------------------------------------------------------------------
 *
 * pg_crc32c_armv8.c
 *	  Compute CRC-32C checksum using ARMv8 CRC Extension instructions
 *
 * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *	  src/port/pg_crc32c_armv8.c
 *
 *-------------------------------------------------------------------------
 */
#include "c.h"

#include "port/pg_crc32c.h"

#include <arm_acle.h>

pg_crc32c
pg_comp_crc32c_armv8(pg_crc32c crc, const void *data, size_t len)
{
	const unsigned char *p = data;
	const unsigned char *pend = p + len;

	/*
	 * ARMv8 doesn't require alignment, but aligned memory access is
	 * significantly faster. Process leading bytes so that the loop below
	 * starts with a pointer aligned to eight bytes.
	 */
	if (!PointerIsAligned(p, uint16) &&
		p + 1 <= pend)
	{
		crc = __crc32cb(crc, *p);
		p += 1;
	}
	if (!PointerIsAligned(p, uint32) &&
		p + 2 <= pend)
	{
		crc = __crc32ch(crc, *(uint16 *) p);
		p += 2;
	}
	if (!PointerIsAligned(p, uint64) &&
		p + 4 <= pend)
	{
		crc = __crc32cw(crc, *(uint32 *) p);
		p += 4;
	}

	/* Process eight bytes at a time, as far as we can. */
	while (p + 8 <= pend)
	{
		crc = __crc32cd(crc, *(uint64 *) p);
		p += 8;
	}

	/* Process remaining 0-7 bytes. */
	if (p + 4 <= pend)
	{
		crc = __crc32cw(crc, *(uint32 *) p);
		p += 4;
	}
	if (p + 2 <= pend)
	{
		crc = __crc32ch(crc, *(uint16 *) p);
		p += 2;
	}
	if (p < pend)
	{
		crc = __crc32cb(crc, *p);
	}

	return crc;
}
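For cross-checking the hardware path on a machine without the CRC extension, a bit-at-a-time software version can serve as a reference. The sketch below is not part of this file: the names crc32c_reference_update() and crc32c_reference() are hypothetical, and it assumes the usual CRC-32C convention of an all-ones initial value and a final inversion, which the function above leaves to its caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the CRC-32C (Castagnoli) polynomial 0x1EDC6F41. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Hypothetical reference update step: folds len bytes into crc, one bit
 * at a time, without applying the initial or final inversion.
 */
static uint32_t
crc32c_reference_update(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len-- > 0)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
		{
			/* The mask is all ones when the low bit is set, else zero. */
			uint32_t	mask = 0u - (crc & 1u);

			crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
		}
	}
	return crc;
}

/* Hypothetical one-shot wrapper: initial value, update, final inversion. */
static uint32_t
crc32c_reference(const void *data, size_t len)
{
	return crc32c_reference_update(0xFFFFFFFFu, data, len) ^ 0xFFFFFFFFu;
}
```

With this convention, the standard check input "123456789" yields 0xE3069283. Feeding the same buffer to both implementations, at several starting offsets and lengths, exercises the leading-byte and trailing-byte branches of the hardware version.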