postgresql/src/port/erand48.c

137 lines
3.9 KiB
C
Raw Normal View History

/*-------------------------------------------------------------------------
*
* erand48.c
*
* This file supplies pg_erand48() and related functions, which except
* for the names are just like the POSIX-standard erand48() family.
* (We don't supply the full set though, only the ones we have found use
* for in Postgres. In particular, we do *not* implement lcong48(), so
* that there is no need for the multiplier and addend to be variable.)
*
* We used to test for an operating system version rather than
* unconditionally using our own, but (1) some versions of Cygwin have a
* buggy erand48() that always returns zero and (2) as of 2011, glibc's
* erand48() is strangely coded to be almost-but-not-quite thread-safe,
* which doesn't matter for the backend but is important for pgbench.
*
* Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
*
* Portions Copyright (c) 1993 Martin Birgmeier
* All rights reserved.
*
* You may redistribute unmodified or modified versions of this source
* code provided that the above copyright notice and this and the
* following conditions are retained.
*
* This software is provided ``as is'', and comes with no warranties
* of any kind. I shall in no event be liable for anything that happens
* to anyone/anything when using this software.
*
* IDENTIFICATION
2010-09-20 22:08:53 +02:00
* src/port/erand48.c
*
*-------------------------------------------------------------------------
*/
#include "c.h"
#include <math.h>
/* These values are specified by POSIX */
#define RAND48_MULT UINT64CONST(0x0005deece66d)
#define RAND48_ADD UINT64CONST(0x000b)
/* POSIX specifies 0x330e's use in srand48, but the other bits are arbitrary */
Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
#define RAND48_SEED_0 (0x330e)
#define RAND48_SEED_1 (0xabcd)
#define RAND48_SEED_2 (0x1234)
static unsigned short _rand48_seed[3] = {
RAND48_SEED_0,
RAND48_SEED_1,
RAND48_SEED_2
};
/*
* Advance the 48-bit value stored in xseed[] to the next "random" number.
*
* Also returns the value of that number --- without masking it to 48 bits.
* If caller uses the result, it must mask off the bits it wants.
*/
static uint64
_dorand48(unsigned short xseed[3])
{
/*
* We do the arithmetic in uint64; any type wider than 48 bits would work.
*/
uint64 in;
uint64 out;
in = (uint64) xseed[2] << 32 | (uint64) xseed[1] << 16 | (uint64) xseed[0];
out = in * RAND48_MULT + RAND48_ADD;
xseed[0] = out & 0xFFFF;
xseed[1] = (out >> 16) & 0xFFFF;
xseed[2] = (out >> 32) & 0xFFFF;
return out;
}
/*
* Generate a random floating-point value using caller-supplied state.
* Values are uniformly distributed over the interval [0.0, 1.0).
*/
double
pg_erand48(unsigned short xseed[3])
{
uint64 x = _dorand48(xseed);
return ldexp((double) (x & UINT64CONST(0xFFFFFFFFFFFF)), -48);
}
/*
* Generate a random non-negative integral value using internal state.
* Values are uniformly distributed over the interval [0, 2^31).
*/
long
pg_lrand48(void)
{
uint64 x = _dorand48(_rand48_seed);
return (x >> 17) & UINT64CONST(0x7FFFFFFF);
}
/*
* Generate a random signed integral value using caller-supplied state.
* Values are uniformly distributed over the interval [-2^31, 2^31).
*/
Replace PostmasterRandom() with a stronger source, second attempt. This adds a new routine, pg_strong_random() for generating random bytes, for use in both frontend and backend. At the moment, it's only used in the backend, but the upcoming SCRAM authentication patches need strong random numbers in libpq as well. pg_strong_random() is based on, and replaces, the existing implementation in pgcrypto. It can acquire strong random numbers from a number of sources, depending on what's available: - OpenSSL RAND_bytes(), if built with OpenSSL - On Windows, the native cryptographic functions are used - /dev/urandom Unlike the current pgcrypto function, the source is chosen by configure. That makes it easier to test different implementations, and ensures that we don't accidentally fall back to a less secure implementation, if the primary source fails. All of those methods are quite reliable, it would be pretty surprising for them to fail, so we'd rather find out by failing hard. If no strong random source is available, we fall back to using erand48(), seeded from current timestamp, like PostmasterRandom() was. That isn't cryptographically secure, but allows us to still work on platforms that don't have any of the above stronger sources. Because it's not very secure, the built-in implementation is only used if explicitly requested with --disable-strong-random. This replaces the more complicated Fortuna algorithm we used to have in pgcrypto, which is unfortunate, but all modern platforms have /dev/urandom, so it doesn't seem worth the maintenance effort to keep that. pgcrypto functions that require strong random numbers will be disabled with --disable-strong-random. Original patch by Magnus Hagander, tons of further work by Michael Paquier and me. Discussion: https://www.postgresql.org/message-id/CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAB7nPqRWkNYRRPJA7-cF+LfroYV10pvjdz6GNvxk-Eee9FypKA@mail.gmail.com
2016-12-05 12:42:59 +01:00
long
pg_jrand48(unsigned short xseed[3])
{
uint64 x = _dorand48(xseed);
return (int32) ((x >> 16) & UINT64CONST(0xFFFFFFFF));
Replace PostmasterRandom() with a stronger source, second attempt. This adds a new routine, pg_strong_random() for generating random bytes, for use in both frontend and backend. At the moment, it's only used in the backend, but the upcoming SCRAM authentication patches need strong random numbers in libpq as well. pg_strong_random() is based on, and replaces, the existing implementation in pgcrypto. It can acquire strong random numbers from a number of sources, depending on what's available: - OpenSSL RAND_bytes(), if built with OpenSSL - On Windows, the native cryptographic functions are used - /dev/urandom Unlike the current pgcrypto function, the source is chosen by configure. That makes it easier to test different implementations, and ensures that we don't accidentally fall back to a less secure implementation, if the primary source fails. All of those methods are quite reliable, it would be pretty surprising for them to fail, so we'd rather find out by failing hard. If no strong random source is available, we fall back to using erand48(), seeded from current timestamp, like PostmasterRandom() was. That isn't cryptographically secure, but allows us to still work on platforms that don't have any of the above stronger sources. Because it's not very secure, the built-in implementation is only used if explicitly requested with --disable-strong-random. This replaces the more complicated Fortuna algorithm we used to have in pgcrypto, which is unfortunate, but all modern platforms have /dev/urandom, so it doesn't seem worth the maintenance effort to keep that. pgcrypto functions that require strong random numbers will be disabled with --disable-strong-random. Original patch by Magnus Hagander, tons of further work by Michael Paquier and me. Discussion: https://www.postgresql.org/message-id/CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAB7nPqRWkNYRRPJA7-cF+LfroYV10pvjdz6GNvxk-Eee9FypKA@mail.gmail.com
2016-12-05 12:42:59 +01:00
}
/*
* Initialize the internal state using the given seed.
*
* Per POSIX, this uses only 32 bits from "seed" even if "long" is wider.
* Hence, the set of possible seed values is smaller than it could be.
* Better practice is to use caller-supplied state and initialize it with
* random bits obtained from a high-quality source of random bits.
*
* Note: POSIX specifies a function seed48() that allows all 48 bits
* of the internal state to be set, but we don't currently support that.
*/
void
pg_srand48(long seed)
{
_rand48_seed[0] = RAND48_SEED_0;
_rand48_seed[1] = (unsigned short) seed;
_rand48_seed[2] = (unsigned short) (seed >> 16);
}