/*-------------------------------------------------------------------------
 *
 * typcache.h
 *	  Type cache definitions.
 *
 * The type cache exists to speed lookup of certain information about data
 * types that is not directly available from a type's pg_type row.
 *
 * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/utils/typcache.h
 *
 *-------------------------------------------------------------------------
 */
#ifndef TYPCACHE_H
#define TYPCACHE_H

#include "access/tupdesc.h"
#include "fmgr.h"
#include "storage/dsm.h"
#include "utils/dsa.h"


/* DomainConstraintCache is an opaque struct known only within typcache.c */
typedef struct DomainConstraintCache DomainConstraintCache;

/* TypeCacheEnumData is an opaque struct known only within typcache.c */
struct TypeCacheEnumData;

typedef struct TypeCacheEntry
{
	/* type_id is the hash lookup key and MUST BE FIRST */
	Oid			type_id;		/* OID of the data type */

	uint32		type_id_hash;	/* hashed value of the OID */

	/* some subsidiary information copied from the pg_type row */
	int16		typlen;
	bool		typbyval;
	char		typalign;
	char		typstorage;
	char		typtype;
	Oid			typrelid;
	Oid			typelem;
	Oid			typcollation;

	/*
	 * Information obtained from opfamily entries
	 *
	 * These will be InvalidOid if no match could be found, or if the
	 * information hasn't yet been requested. Also note that for array and
	 * composite types, typcache.c checks that the contained types are
	 * comparable or hashable before allowing eq_opr etc to become set.
	 */
	Oid			btree_opf;		/* the default btree opclass' family */
	Oid			btree_opintype; /* the default btree opclass' opcintype */
	Oid			hash_opf;		/* the default hash opclass' family */
	Oid			hash_opintype;	/* the default hash opclass' opcintype */
	Oid			eq_opr;			/* the equality operator */
	Oid			lt_opr;			/* the less-than operator */
	Oid			gt_opr;			/* the greater-than operator */
	Oid			cmp_proc;		/* the btree comparison function */
	Oid			hash_proc;		/* the hash calculation function */
	Oid			hash_extended_proc; /* the extended hash calculation function */

	/*
	 * Pre-set-up fmgr call info for the equality operator, the btree
	 * comparison function, and the hash calculation function. These are kept
	 * in the type cache to avoid problems with memory leaks in repeated calls
	 * to functions such as array_eq, array_cmp, hash_array. There is not
	 * currently a need to maintain call info for the lt_opr or gt_opr.
	 */
	FmgrInfo	eq_opr_finfo;
	FmgrInfo	cmp_proc_finfo;
	FmgrInfo	hash_proc_finfo;
	FmgrInfo	hash_extended_proc_finfo;

	/*
	 * Tuple descriptor if it's a composite type (row type). NULL if not
	 * composite or information hasn't yet been requested. (NOTE: this is a
	 * reference-counted tupledesc.)
	 *
	 * To simplify caching dependent info, tupDesc_identifier is an identifier
	 * for this tupledesc that is unique for the life of the process, and
	 * changes anytime the tupledesc does. Zero if not yet determined.
	 */
	TupleDesc	tupDesc;
	uint64		tupDesc_identifier;

	/*
	 * Fields computed when TYPECACHE_RANGE_INFO is requested. Zeroes if not
	 * a range type or information hasn't yet been requested. Note that
	 * rng_cmp_proc_finfo could be different from the element type's default
	 * btree comparison function.
	 */
	struct TypeCacheEntry *rngelemtype; /* range's element type */
	Oid			rng_collation;	/* collation for comparisons, if any */
	FmgrInfo	rng_cmp_proc_finfo; /* comparison function */
	FmgrInfo	rng_canonical_finfo;	/* canonicalization function, if any */
	FmgrInfo	rng_subdiff_finfo;	/* difference function, if any */

	/*
	 * Domain's base type and typmod if it's a domain type. Zeroes if not
	 * domain, or if information hasn't been requested.
	 */
	Oid			domainBaseType;
	int32		domainBaseTypmod;

	/*
	 * Domain constraint data if it's a domain type. NULL if not domain, or
	 * if domain has no constraints, or if information hasn't been requested.
	 */
	DomainConstraintCache *domainData;

	/* Private data, for internal use of typcache.c only */
	int			flags;			/* flags about what we've computed */

	/*
	 * Private information about an enum type. NULL if not enum or
	 * information hasn't been requested.
	 */
	struct TypeCacheEnumData *enumData;

	/* We also maintain a list of all known domain-type cache entries */
	struct TypeCacheEntry *nextDomain;
} TypeCacheEntry;

/* Bit flags to indicate which fields a given caller needs to have set */
#define TYPECACHE_EQ_OPR                   0x0001
#define TYPECACHE_LT_OPR                   0x0002
#define TYPECACHE_GT_OPR                   0x0004
#define TYPECACHE_CMP_PROC                 0x0008
#define TYPECACHE_HASH_PROC                0x0010
#define TYPECACHE_EQ_OPR_FINFO             0x0020
#define TYPECACHE_CMP_PROC_FINFO           0x0040
#define TYPECACHE_HASH_PROC_FINFO          0x0080
#define TYPECACHE_TUPDESC                  0x0100
#define TYPECACHE_BTREE_OPFAMILY           0x0200
#define TYPECACHE_HASH_OPFAMILY            0x0400
#define TYPECACHE_RANGE_INFO               0x0800
#define TYPECACHE_DOMAIN_BASE_INFO         0x1000
#define TYPECACHE_DOMAIN_CONSTR_INFO       0x2000
#define TYPECACHE_HASH_EXTENDED_PROC       0x4000
#define TYPECACHE_HASH_EXTENDED_PROC_FINFO 0x8000
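
/*
 * Illustrative sketch only: this helper is not part of the PostgreSQL
 * sources, and its name is hypothetical.  Callers OR together the request
 * bits defined above, so checking whether a request mask asks for any of a
 * group of fields is a plain bitwise test, for example
 * typcache_example_any_requested(flags,
 *                                TYPECACHE_EQ_OPR | TYPECACHE_EQ_OPR_FINFO).
 * It assumes only that every request flag occupies its own distinct bit,
 * as the values above do.
 */
static inline int
typcache_example_any_requested(int request_flags, int mask)
{
	/* 1 if at least one bit of "mask" is set in "request_flags", else 0 */
	return (request_flags & mask) != 0;
}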
|
Use the typcache to cache constraints for domain types.
Previously, we cached domain constraints for the life of a query, or
really for the life of the FmgrInfo struct that was used to invoke
domain_in() or domain_check(). But plpgsql (and probably other places)
are set up to cache such FmgrInfos for the whole lifespan of a session,
which meant they could be enforcing really stale sets of constraints.
On the other hand, searching pg_constraint once per query gets kind of
expensive too: testing says that as much as half the runtime of a
trivial query such as "SELECT 0::domaintype" went into that.
To fix this, delegate the responsibility for tracking a domain's
constraints to the typcache, which has the infrastructure needed to
detect syscache invalidation events that signal possible changes.
This not only removes unnecessary repeat reads of pg_constraint,
but ensures that we never apply stale constraint data: whatever we
use is the current data according to syscache rules.
Unfortunately, the current configuration of the system catalogs means
we have to flush cached domain-constraint data whenever either pg_type
or pg_constraint changes, which happens rather a lot (eg, creation or
deletion of a temp table will do it). It might be worth rearranging
things to split pg_constraint into two catalogs, of which the domain
constraint one would probably be very low-traffic. That's a job for
another patch though, and in any case this patch should improve matters
materially even with that handicap.
This patch makes use of the recently-added memory context reset callback
feature to manage the lifespan of domain constraint caches, so that we
don't risk deleting a cache that might be in the midst of evaluation.
Although this is a bug fix as well as a performance improvement, no
back-patch. There haven't been many if any field complaints about
stale domain constraint checks, so it doesn't seem worth taking the
risk of modifying data structures as basic as MemoryContexts in back
branches.
2015-03-01 20:06:50 +01:00
|
|
|
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
/* This value will not equal any valid tupledesc identifier, nor 0 */
|
|
|
|
#define INVALID_TUPLEDESC_IDENTIFIER ((uint64) 1)
|
|
|
|
|
Use the typcache to cache constraints for domain types.
Previously, we cached domain constraints for the life of a query, or
really for the life of the FmgrInfo struct that was used to invoke
domain_in() or domain_check(). But plpgsql (and probably other places)
are set up to cache such FmgrInfos for the whole lifespan of a session,
which meant they could be enforcing really stale sets of constraints.
On the other hand, searching pg_constraint once per query gets kind of
expensive too: testing says that as much as half the runtime of a
trivial query such as "SELECT 0::domaintype" went into that.
To fix this, delegate the responsibility for tracking a domain's
constraints to the typcache, which has the infrastructure needed to
detect syscache invalidation events that signal possible changes.
This not only removes unnecessary repeat reads of pg_constraint,
but ensures that we never apply stale constraint data: whatever we
use is the current data according to syscache rules.
Unfortunately, the current configuration of the system catalogs means
we have to flush cached domain-constraint data whenever either pg_type
or pg_constraint changes, which happens rather a lot (eg, creation or
deletion of a temp table will do it). It might be worth rearranging
things to split pg_constraint into two catalogs, of which the domain
constraint one would probably be very low-traffic. That's a job for
another patch though, and in any case this patch should improve matters
materially even with that handicap.
This patch makes use of the recently-added memory context reset callback
feature to manage the lifespan of domain constraint caches, so that we
don't risk deleting a cache that might be in the midst of evaluation.
Although this is a bug fix as well as a performance improvement, no
back-patch. There haven't been many if any field complaints about
stale domain constraint checks, so it doesn't seem worth taking the
risk of modifying data structures as basic as MemoryContexts in back
branches.
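The lifespan rule described above can be illustrated with a small standalone sketch. It is an assumption-laden model, not PostgreSQL code: DemoContext stands in for a memory context with one reset callback, and all Demo* types and demo_* functions are hypothetical. The point shown is that a reference held through a context keeps the cache alive until that context is reset, even after the owner has dropped its own reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted constraint cache. */
typedef struct DemoConstraintCache
{
	int			refcount;
	int			nconstraints;
} DemoConstraintCache;

/* Hypothetical context that supports a single reset callback. */
typedef struct DemoContext
{
	void		(*callback) (void *arg);
	void	   *arg;
} DemoContext;

/* Hypothetical long-lived reference, analogous in spirit to a constraint ref. */
typedef struct DemoRef
{
	DemoConstraintCache *cache;
} DemoRef;

static int	demo_caches_freed = 0;

static DemoConstraintCache *
demo_cache_create(int nconstraints)
{
	DemoConstraintCache *cache = malloc(sizeof(DemoConstraintCache));

	assert(cache != NULL);
	cache->refcount = 1;		/* the owner's own reference */
	cache->nconstraints = nconstraints;
	return cache;
}

/* Drop one reference; free the cache when the last one goes away. */
static void
demo_cache_release(DemoConstraintCache *cache)
{
	if (--cache->refcount == 0)
	{
		free(cache);
		demo_caches_freed++;
	}
}

/* Reset callback: release the reference held through the context. */
static void
demo_ref_callback(void *arg)
{
	DemoRef    *ref = arg;

	if (ref->cache != NULL)
	{
		demo_cache_release(ref->cache);
		ref->cache = NULL;
	}
}

/* Take a reference and tie its release to the context's reset. */
static void
demo_ref_init(DemoRef *ref, DemoConstraintCache *cache, DemoContext *ctx)
{
	cache->refcount++;
	ref->cache = cache;
	ctx->callback = demo_ref_callback;
	ctx->arg = ref;
}

static void
demo_context_reset(DemoContext *ctx)
{
	if (ctx->callback != NULL)
	{
		ctx->callback(ctx->arg);
		ctx->callback = NULL;
	}
}
```

In this model, an invalidation corresponds to the owner calling demo_cache_release(): the cache survives as long as some context still holds a reference, which is the property that protects a cache in the midst of evaluation.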
2015-03-01 20:06:50 +01:00
|
|
|
/*
|
|
|
|
* Callers wishing to maintain a long-lived reference to a domain's constraint
|
|
|
|
* set must store it in one of these. Use InitDomainConstraintRef() and
|
|
|
|
* UpdateDomainConstraintRef() to manage it. Note: DomainConstraintState is
|
|
|
|
* considered an executable expression type, so it's defined in execnodes.h.
|
|
|
|
*/
|
|
|
|
typedef struct DomainConstraintRef
|
|
|
|
{
|
|
|
|
List *constraints; /* list of DomainConstraintState nodes */
|
2015-11-30 00:18:42 +01:00
|
|
|
MemoryContext refctx; /* context holding DomainConstraintRef */
|
2016-12-22 21:01:27 +01:00
|
|
|
TypeCacheEntry *tcache; /* typcache entry for domain type */
|
Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.
This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.
The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
out operation metadata sequentially; including the avoidance of
nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
constant re-checks at evaluation time
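As a rough illustration of non-recursive, opcode-dispatch evaluation over sequentially laid-out steps, here is a toy interpreter. It is only a sketch: the DemoOpcode, DemoStep and demo_eval names are hypothetical, and the small operand stack is a simplification chosen for brevity, not a description of how the executor stores intermediate results.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy expression language. */
typedef enum DemoOpcode
{
	DEMO_OP_CONST,				/* push a constant */
	DEMO_OP_ADD,				/* pop two values, push their sum */
	DEMO_OP_MUL,				/* pop two values, push their product */
	DEMO_OP_DONE				/* stop and return the result */
} DemoOpcode;

/* One step; steps sit in a flat array, not in a linked tree. */
typedef struct DemoStep
{
	DemoOpcode	op;
	long		value;			/* used by DEMO_OP_CONST only */
} DemoStep;

#define DEMO_STACK_SIZE 16

/* Evaluate the steps in order with a single loop and no recursion. */
static long
demo_eval(const DemoStep *steps)
{
	long		stack[DEMO_STACK_SIZE];
	size_t		depth = 0;

	for (const DemoStep *step = steps;; step++)
	{
		switch (step->op)
		{
			case DEMO_OP_CONST:
				assert(depth < DEMO_STACK_SIZE);
				stack[depth++] = step->value;
				break;
			case DEMO_OP_ADD:
				assert(depth >= 2);
				stack[depth - 2] += stack[depth - 1];
				depth--;
				break;
			case DEMO_OP_MUL:
				assert(depth >= 2);
				stack[depth - 2] *= stack[depth - 1];
				depth--;
				break;
			case DEMO_OP_DONE:
				assert(depth == 1);
				return stack[0];
		}
	}
}
```

The expression (2 + 3) * 4 would be flattened at initialization time into the steps CONST 2, CONST 3, ADD, CONST 4, MUL, DONE; evaluation then walks that array once.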
Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.
The new framework allows for significant future optimizations. E.g.:
- basic infrastructure to later reduce the per executor-startup
overhead of expression evaluation, by caching state in prepared
statements. That'd be helpful in OLTPish scenarios where
initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
work have already been made.
- optimizing the interpreter. Similarly a number of proposals have
been made here too.
The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
initialization, whereas previously they were done during
execution. In edge cases this can lead to errors being raised that
previously wouldn't have been, e.g. a NULL array being coerced to a
different array type previously didn't perform checks.
- The set of domain constraints to be checked is now evaluated once
during expression initialization; previously it was re-built
every time a domain check was evaluated. For normal queries this
doesn't change much, but e.g. for plpgsql functions, which cache
ExprStates, the old set could stick around longer. The behavior
around this might still change.
Author: Andres Freund, with significant changes by Tom Lane,
changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-14 23:45:36 +01:00
|
|
|
bool need_exprstate; /* does caller need check_exprstate? */
|
|
|
|
|
|
|
|
/* Management data --- treat these fields as private to typcache.c */
|
|
|
|
DomainConstraintCache *dcc; /* current constraints, or NULL if none */
|
Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
|
|
|
MemoryContextCallback callback; /* used to release refcount when done */
|
|
|
|
} DomainConstraintRef;
|
|
|
|
|
2017-09-15 04:59:21 +02:00
|
|
|
typedef struct SharedRecordTypmodRegistry SharedRecordTypmodRegistry;
|
2003-08-17 21:58:06 +02:00
|
|
|
|
|
|
|
extern TypeCacheEntry *lookup_type_cache(Oid type_id, int flags);
|
|
|
|
|
|
|
|
extern void InitDomainConstraintRef(Oid type_id, DomainConstraintRef *ref,
|
2019-05-22 19:04:48 +02:00
|
|
|
MemoryContext refctx, bool need_exprstate);
|
|
|
|
|
|
|
|
extern void UpdateDomainConstraintRef(DomainConstraintRef *ref);
|
|
|
|
|
|
|
|
extern bool DomainHasConstraints(Oid type_id);
|
|
|
|
|
2004-04-01 23:28:47 +02:00
|
|
|
extern TupleDesc lookup_rowtype_tupdesc(Oid type_id, int32 typmod);
|
|
|
|
|
2004-06-05 03:55:05 +02:00
|
|
|
extern TupleDesc lookup_rowtype_tupdesc_noerror(Oid type_id, int32 typmod,
|
2019-05-22 19:04:48 +02:00
|
|
|
bool noError);
|
2004-06-05 03:55:05 +02:00
|
|
|
|
2006-06-16 20:42:24 +02:00
|
|
|
extern TupleDesc lookup_rowtype_tupdesc_copy(Oid type_id, int32 typmod);
|
|
|
|
|
2017-10-26 19:47:45 +02:00
|
|
|
extern TupleDesc lookup_rowtype_tupdesc_domain(Oid type_id, int32 typmod,
|
2019-05-22 19:04:48 +02:00
|
|
|
bool noError);
|
2017-10-26 19:47:45 +02:00
|
|
|
|
2004-04-01 23:28:47 +02:00
|
|
|
extern void assign_record_type_typmod(TupleDesc tupDesc);
|
|
|
|
|
|
|
|
extern uint64 assign_record_type_identifier(Oid type_id, int32 typmod);
|
|
|
|
|
2010-10-25 05:04:37 +02:00
|
|
|
extern int compare_values_of_enum(TypeCacheEntry *tcache, Oid arg1, Oid arg2);
|
|
|
|
|
2017-09-15 04:59:21 +02:00
|
|
|
extern size_t SharedRecordTypmodRegistryEstimate(void);
|
|
|
|
|
|
|
|
extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
|
2019-05-22 19:04:48 +02:00
|
|
|
dsm_segment *segment, dsa_area *area);
|
2017-09-15 04:59:21 +02:00
|
|
|
|
|
|
|
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
|
|
|
|
|
|
|
|
#endif /* TYPCACHE_H */
|