postgresql/src/backend/utils/misc
Peter Eisentraut 721856ff24 Remove distprep
A PostgreSQL release tarball contains a number of prebuilt files, in
particular files produced by bison, flex, perl, and well as html and
man documentation.  We have done this consistent with established
practice at the time to not require these tools for building from a
tarball.  Some of these tools were hard to get, or get the right
version of, from time to time, and shipping the prebuilt output was a
convenience to users.

Now this has at least two problems:

One, we have to make the build system(s) work in two modes: Building
from a git checkout and building from a tarball.  This is pretty
complicated, but it works so far for autoconf/make.  It does not
currently work for meson; you can currently only build with meson from
a git checkout.  Making meson builds work from a tarball seems very
difficult or impossible.  One particular problem is that since meson
requires a separate build directory, we cannot make the build update
files like gram.h in the source tree.  So if you were to build from a
tarball and update gram.y, you will have a gram.h in the source tree
and one in the build tree, but the way things work is that the
compiler will always use the one in the source tree.  So you cannot,
for example, make any gram.y changes when building from a tarball.
This seems impossible to fix in a non-horrible way.

Second, there is increased interest nowadays in precisely tracking the
origin of software.  We can reasonably track contributions into the
git tree, and users can reasonably track the path from a tarball to
packages and downloads and installs.  But what happens between the git
tree and the tarball is obscure and in some cases non-reproducible.

The solution for both of these issues is to get rid of the step that
adds prebuilt files to the tarball.  The tarball now only contains
what is in the git tree (*).  Getting the additional build
dependencies is no longer a problem nowadays, and the complications to
keep these dual build modes working are significant.  And of course we
want to get the meson build system working universally.

This commit removes the make distprep target altogether.  The make
dist target continues to do its job, it just doesn't call distprep
anymore.

(*) - The tarball also contains the INSTALL file that is built at make
dist time, but not by distprep.  This is unchanged for now.

The make maintainer-clean target, whose job it is to remove the
prebuilt files in addition to what make distclean does, is now just an
alias to make distprep.  (In practice, it is probably obsolete given
that git clean is available.)

The following programs are now hard build requirements in configure
(they were already required by meson.build):

- bison
- flex
- perl

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e07408d9-e5f2-d9fd-5672-f53354e9305e@eisentraut.org
2023-11-06 15:18:04 +01:00
..
.gitignore Convert cvsignore to gitignore, and add .gitignore for build targets. 2010-09-22 12:57:04 +02:00
Makefile Remove distprep 2023-11-06 15:18:04 +01:00
README Store GUC data in a memory context, instead of using malloc(). 2022-10-14 12:10:48 -04:00
conffiles.c Update copyright for 2023 2023-01-02 15:00:37 -05:00
guc-file.l Update copyright for 2023 2023-01-02 15:00:37 -05:00
guc.c Make GetConfigOption/GetConfigOptionResetString return "" for NULL. 2023-11-02 11:53:36 -04:00
guc_funcs.c Revert "Add USER SET parameter values for pg_db_role_setting" 2023-05-17 20:28:57 +03:00
guc_internal.h Fix typos in comments, code and documentation 2023-01-03 16:26:14 +09:00
guc_tables.c Remove useless self-joins 2023-10-25 12:59:16 +03:00
help_config.c Update copyright for 2023 2023-01-02 15:00:37 -05:00
meson.build Move queryjumble.c code to src/backend/nodes/ 2023-01-21 11:48:37 +09:00
pg_config.c Update copyright for 2023 2023-01-02 15:00:37 -05:00
pg_controldata.c Acquire ControlFileLock in relevant SQL functions. 2023-10-16 10:43:47 +13:00
pg_rusage.c Update copyright for 2023 2023-01-02 15:00:37 -05:00
postgresql.conf.sample improve alignment of postgresql.conf comments 2023-10-31 08:51:36 -04:00
ps_status.c Fix various typos 2023-04-18 13:23:23 +12:00
queryenvironment.c Update copyright for 2023 2023-01-02 15:00:37 -05:00
rls.c Update copyright for 2023 2023-01-02 15:00:37 -05:00
sampling.c Update copyright for 2023 2023-01-02 15:00:37 -05:00
superuser.c Update copyright for 2023 2023-01-02 15:00:37 -05:00
timeout.c Update copyright for 2023 2023-01-02 15:00:37 -05:00
tzparser.c Update copyright for 2023 2023-01-02 15:00:37 -05:00

README

src/backend/utils/misc/README

GUC Implementation Notes
========================

The GUC (Grand Unified Configuration) module implements configuration
variables of multiple types (currently boolean, enum, int, real, and string).
Variable settings can come from various places, with a priority ordering
determining which setting is used.


Per-Variable Hooks
------------------

Each variable known to GUC can optionally have a check_hook, an
assign_hook, and/or a show_hook to provide customized behavior.
Check hooks are used to perform validity checking on variable values
(above and beyond what GUC can do), to compute derived settings when
nontrivial work is needed to do that, and optionally to "canonicalize"
user-supplied values.  Assign hooks are used to update any derived state
that needs to change when a GUC variable is set.  Show hooks are used to
modify the default SHOW display for a variable.


If a check_hook is provided, it points to a function of the signature
	bool check_hook(datatype *newvalue, void **extra, GucSource source)
The "newvalue" argument is of type bool *, int *, double *, or char **
for bool, int/enum, real, or string variables respectively.  The check
function should validate the proposed new value, and return true if it is
OK or false if not.  The function can optionally do a few other things:

* When rejecting a bad proposed value, it may be useful to append some
additional information to the generic "invalid value for parameter FOO"
complaint that guc.c will emit.  To do that, call
	void GUC_check_errdetail(const char *format, ...)
where the format string and additional arguments follow the rules for
errdetail() arguments.  The resulting string will be emitted as the
DETAIL line of guc.c's error report, so it should follow the message style
guidelines for DETAIL messages.  There is also
	void GUC_check_errhint(const char *format, ...)
which can be used in the same way to append a HINT message.
Occasionally it may even be appropriate to override guc.c's generic primary
message or error code, which can be done with
	void GUC_check_errcode(int sqlerrcode)
	void GUC_check_errmsg(const char *format, ...)
In general, check_hooks should avoid throwing errors directly if possible,
though this may be impractical to avoid for some corner cases such as
out-of-memory.

* Since the newvalue is pass-by-reference, the function can modify it.
This might be used for example to canonicalize the spelling of a string
value, round off a buffer size to the nearest supported value, or replace
a special value such as "-1" with a computed default value.  If the
function wishes to replace a string value, it must guc_malloc (not palloc)
the replacement value, and be sure to guc_free() the previous value.

* Derived information, such as the role OID represented by a user name,
can be stored for use by the assign hook.  To do this, guc_malloc (not palloc)
storage space for the information, and return its address at *extra.
guc.c will automatically guc_free() this space when the associated GUC setting
is no longer of interest.  *extra is initialized to NULL before call, so
it can be ignored if not needed.

The "source" argument indicates the source of the proposed new value,
If it is >= PGC_S_INTERACTIVE, then we are performing an interactive
assignment (e.g., a SET command).  But when source < PGC_S_INTERACTIVE,
we are reading a non-interactive option source, such as postgresql.conf.
This is sometimes needed to determine whether a setting should be
allowed.  The check_hook might also look at the current actual value of
the variable to determine what is allowed.

Note that check hooks are sometimes called just to validate a value,
without any intention of actually changing the setting.  Therefore the
check hook must *not* take any action based on the assumption that an
assignment will occur.


If an assign_hook is provided, it points to a function of the signature
	void assign_hook(datatype newvalue, void *extra)
where the type of "newvalue" matches the kind of variable, and "extra"
is the derived-information pointer returned by the check_hook (always
NULL if there is no check_hook).  This function is called immediately
before actually setting the variable's value (so it can look at the actual
variable to determine the old value, for example to avoid doing work when
the value isn't really changing).

Note that there is no provision for a failure result code.  assign_hooks
should never fail except under the most dire circumstances, since a failure
may for example result in GUC settings not being rolled back properly during
transaction abort.  In general, try to do anything that could conceivably
fail in a check_hook instead, and pass along the results in an "extra"
struct, so that the assign hook has little to do beyond copying the data to
someplace.  This applies particularly to catalog lookups: any required
lookups must be done in the check_hook, since the assign_hook may be
executed during transaction rollback when lookups will be unsafe.

Note that check_hooks are sometimes called outside any transaction, too.
This happens when processing the wired-in "bootstrap" value, values coming
from the postmaster command line or environment, or values coming from
postgresql.conf.  Therefore, any catalog lookups done in a check_hook
should be guarded with an IsTransactionState() test, and there must be a
fallback path to allow derived values to be computed during the first
subsequent use of the GUC setting within a transaction.  A typical
arrangement is for the catalog values computed by the check_hook and
installed by the assign_hook to be used only for the remainder of the
transaction in which the new setting is made.  Each subsequent transaction
looks up the values afresh on first use.  This arrangement is useful to
prevent use of stale catalog values, independently of the problem of
needing to check GUC values outside a transaction.


If a show_hook is provided, it points to a function of the signature
	const char *show_hook(void)
This hook allows variable-specific computation of the value displayed
by SHOW (and other SQL features for showing GUC variable values).
The return value can point to a static buffer, since show functions are
not used reentrantly.


Saving/Restoring GUC Variable Values
------------------------------------

Prior values of configuration variables must be remembered in order to deal
with several special cases: RESET (a/k/a SET TO DEFAULT), rollback of SET
on transaction abort, rollback of SET LOCAL at transaction end (either
commit or abort), and save/restore around a function that has a SET option.
RESET is defined as selecting the value that would be effective had there
never been any SET commands in the current session.

To handle these cases we must keep track of many distinct values for each
variable.  The primary values are:

* actual variable contents	always the current effective value

* reset_val			the value to use for RESET

(Each GUC entry also has a boot_val which is the wired-in default value.
This is assigned to the reset_val and the actual variable during
InitializeGUCOptions().  The boot_val is also consulted to restore the
correct reset_val if SIGHUP processing discovers that a variable formerly
specified in postgresql.conf is no longer set there.)

In addition to the primary values, there is a stack of former effective
values that might need to be restored in future.  Stacking and unstacking
is controlled by the GUC "nest level", which is zero when outside any
transaction, one at top transaction level, and incremented for each
open subtransaction or function call with a SET option.  A stack entry
is made whenever a GUC variable is first modified at a given nesting level.
(Note: the reset_val need not be stacked because it is only changed by
non-transactional operations.)

A stack entry has a state, a prior value of the GUC variable, a remembered
source of that prior value, and depending on the state may also have a
"masked" value.  The masked value is needed when SET followed by SET LOCAL
occur at the same nest level: the SET's value is masked but must be
remembered to restore after transaction commit.

During initialization we set the actual value and reset_val based on
whichever non-interactive source has the highest priority.  They will
have the same value.

The possible transactional operations on a GUC value are:

Entry to a function with a SET option:

	Push a stack entry with the prior variable value and state SAVE,
	then set the variable.

Plain SET command:

	If no stack entry of current level:
		Push new stack entry w/prior value and state SET
	else if stack entry's state is SAVE, SET, or LOCAL:
		change stack state to SET, don't change saved value
		(here we are forgetting effects of prior set action)
	else (entry must have state SET+LOCAL):
		discard its masked value, change state to SET
		(here we are forgetting effects of prior SET and SET LOCAL)
	Now set new value.

SET LOCAL command:

	If no stack entry of current level:
		Push new stack entry w/prior value and state LOCAL
	else if stack entry's state is SAVE or LOCAL or SET+LOCAL:
		no change to stack entry
		(in SAVE case, SET LOCAL will be forgotten at func exit)
	else (entry must have state SET):
		put current active into its masked slot, set state SET+LOCAL
	Now set new value.

Transaction or subtransaction abort:

	Pop stack entries, restoring prior value, until top < subxact depth

Transaction or subtransaction commit (incl. successful function exit):

	While stack entry level >= subxact depth

		if entry's state is SAVE:
			pop, restoring prior value
		else if level is 1 and entry's state is SET+LOCAL:
			pop, restoring *masked* value
		else if level is 1 and entry's state is SET:
			pop, discarding old value
		else if level is 1 and entry's state is LOCAL:
			pop, restoring prior value
		else if there is no entry of exactly level N-1:
			decrement entry's level, no other state change
		else
			merge entries of level N-1 and N as specified below

The merged entry will have level N-1 and prior = older prior, so easiest
to keep older entry and free newer.  There are 12 possibilities since
we already handled level N state = SAVE:

N-1		N

SAVE		SET		discard top prior, set state SET
SAVE		LOCAL		discard top prior, no change to stack entry
SAVE		SET+LOCAL	discard top prior, copy masked, state S+L

SET		SET		discard top prior, no change to stack entry
SET		LOCAL		copy top prior to masked, state S+L
SET		SET+LOCAL	discard top prior, copy masked, state S+L

LOCAL		SET		discard top prior, set state SET
LOCAL		LOCAL		discard top prior, no change to stack entry
LOCAL		SET+LOCAL	discard top prior, copy masked, state S+L

SET+LOCAL	SET		discard top prior and second masked, state SET
SET+LOCAL	LOCAL		discard top prior, no change to stack entry
SET+LOCAL	SET+LOCAL	discard top prior, copy masked, state S+L


RESET is executed like a SET, but using the reset_val as the desired new
value.  (We do not provide a RESET LOCAL command, but SET LOCAL TO DEFAULT
has the same behavior that RESET LOCAL would.)  The source associated with
the reset_val also becomes associated with the actual value.

If SIGHUP is received, the GUC code rereads the postgresql.conf
configuration file (this does not happen in the signal handler, but at
next return to main loop; note that it can be executed while within a
transaction).  New values from postgresql.conf are assigned to actual
variable, reset_val, and stacked actual values, but only if each of
these has a current source priority <= PGC_S_FILE.  (It is thus possible
for reset_val to track the config-file setting even if there is
currently a different interactive value of the actual variable.)

The check_hook, assign_hook and show_hook routines work only with the
actual variable, and are not directly aware of the additional values
maintained by GUC.


GUC Memory Handling
-------------------

String variable values are allocated with guc_malloc or guc_strdup,
which ensure that the values are kept in a long-lived context, and provide
more control over handling out-of-memory failures than bare palloc.

We allow a string variable's actual value, reset_val, boot_val, and stacked
values to point at the same storage.  This makes it slightly harder to free
space (we must test whether a value to be freed isn't equal to any of the
other pointers in the GUC entry or associated stack items).  The main
advantage is that we never need to malloc during transaction commit/abort,
so cannot cause an out-of-memory failure there.

"Extra" structs returned by check_hook routines are managed in the same
way as string values.  Note that we support "extra" structs for all types
of GUC variables, although they are mainly useful with strings.


GUC and Null String Variables
-----------------------------

A GUC string variable can have a boot_val of NULL.  guc.c handles this
unsurprisingly, assigning the NULL to the underlying C variable.  Any code
using such a variable, as well as any hook functions for it, must then be
prepared to deal with a NULL value.

However, it is not possible to assign a NULL value to a GUC string
variable in any other way: values coming from SET, postgresql.conf, etc,
might be empty strings, but they'll never be NULL.  And SHOW displays
a NULL the same as an empty string.  It is therefore not appropriate to
treat a NULL value as a distinct user-visible setting.  A typical use
for a NULL boot_val is to denote that a value hasn't yet been set for
a variable that will receive a real value later in startup.

If it's undesirable for code using the underlying C variable to have to
worry about NULL values ever, the variable can be given a non-null static
initializer as well as a non-null boot_val.  guc.c will overwrite the
static initializer pointer with a copy of the boot_val during
InitializeGUCOptions, but the variable will never contain a NULL.