postgresql/src/include/utils/typcache.h

203 lines
7.2 KiB
C
Raw Normal View History

/*-------------------------------------------------------------------------
*
* typcache.h
* Type cache definitions.
*
* The type cache exists to speed lookup of certain information about data
* types that is not directly available from a type's pg_type row.
*
* Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
2010-09-20 22:08:53 +02:00
* src/include/utils/typcache.h
*
*-------------------------------------------------------------------------
*/
#ifndef TYPCACHE_H
#define TYPCACHE_H
#include "access/tupdesc.h"
#include "fmgr.h"
#include "storage/dsm.h"
#include "utils/dsa.h"
Use the typcache to cache constraints for domain types. Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that. To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules. Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap. This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation. Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.
2015-03-01 20:06:50 +01:00
/* DomainConstraintCache is an opaque struct known only within typcache.c */
typedef struct DomainConstraintCache DomainConstraintCache;
/* TypeCacheEnumData is an opaque struct known only within typcache.c */
struct TypeCacheEnumData;
typedef struct TypeCacheEntry
{
/* typeId is the hash lookup key and MUST BE FIRST */
Oid type_id; /* OID of the data type */
uint32 type_id_hash; /* hashed value of the OID */
/* some subsidiary information copied from the pg_type row */
int16 typlen;
bool typbyval;
char typalign;
char typstorage;
char typtype;
Oid typrelid;
Make PL/Python handle domain-type conversions correctly. Fix PL/Python so that it can handle domains over composite, and so that it enforces domain constraints correctly in other cases that were not always done properly before. Notably, it didn't do arrays of domains right (oversight in commit c12d570fa), and it failed to enforce domain constraints when returning a composite type containing a domain field, and if a transform function is being used for a domain's base type then it failed to enforce domain constraints on the result. Also, in many places it missed checking domain constraints on null values, because the plpy_typeio code simply wasn't called for Py_None. Rather than try to band-aid these problems, I made a significant refactoring of the plpy_typeio logic. The existing design of recursing for array and composite members is extended to also treat domains as containers requiring recursion, and the APIs for the module are cleaned up and simplified. The patch also modifies plpy_typeio to rely on the typcache more than it did before (which was pretty much not at all). This reduces the need for repetitive lookups, and lets us get rid of an ad-hoc scheme for detecting changes in composite types. I added a couple of small features to typcache to help with that. Although some of this is fixing bugs that long predate v11, I don't think we should risk a back-patch: it's a significant amount of code churn, and there've been no complaints from the field about the bugs. Tom Lane, reviewed by Anthony Bykov Discussion: https://postgr.es/m/24449.1509393613@sss.pgh.pa.us
2017-11-16 22:22:57 +01:00
Oid typelem;
Make pg_statistic and related code account more honestly for collations. When we first put in collations support, we basically punted on teaching pg_statistic, ANALYZE, and the planner selectivity functions about that. They've just used DEFAULT_COLLATION_OID independently of the actual collation of the data. It's time to improve that, so: * Add columns to pg_statistic that record the specific collation associated with each statistics slot. * Teach ANALYZE to use the column's actual collation when comparing values for statistical purposes, and record this in the appropriate slot. (Note that type-specific typanalyze functions are now expected to fill stats->stacoll with the appropriate collation, too.) * Teach assorted selectivity functions to use the actual collation of the stats they are looking at, instead of just assuming it's DEFAULT_COLLATION_OID. This should give noticeably better results in selectivity estimates for columns with nondefault collations, at least for query clauses that use that same collation (which would be the default behavior in most cases). It's still true that comparisons with explicit COLLATE clauses different from the stored data's collation won't be well-estimated, but that's no worse than before. Also, this patch does make the first step towards doing better with that, which is that it's now theoretically possible to collect stats for a collation other than the column's own collation. Patch by me; thanks to Peter Eisentraut for review. Discussion: https://postgr.es/m/14706.1544630227@sss.pgh.pa.us
2018-12-14 18:52:49 +01:00
Oid typcollation;
/*
* Information obtained from opfamily entries
*
* These will be InvalidOid if no match could be found, or if the
* information hasn't yet been requested. Also note that for array and
* composite types, typcache.c checks that the contained types are
* comparable or hashable before allowing eq_opr etc to become set.
*/
Oid btree_opf; /* the default btree opclass' family */
2007-11-15 22:14:46 +01:00
Oid btree_opintype; /* the default btree opclass' opcintype */
Oid hash_opf; /* the default hash opclass' family */
Oid hash_opintype; /* the default hash opclass' opcintype */
Oid eq_opr; /* the equality operator */
Oid lt_opr; /* the less-than operator */
Oid gt_opr; /* the greater-than operator */
Oid cmp_proc; /* the btree comparison function */
Oid hash_proc; /* the hash calculation function */
Oid hash_extended_proc; /* the extended hash calculation function */
/*
* Pre-set-up fmgr call info for the equality operator, the btree
* comparison function, and the hash calculation function. These are kept
* in the type cache to avoid problems with memory leaks in repeated calls
* to functions such as array_eq, array_cmp, hash_array. There is not
* currently a need to maintain call info for the lt_opr or gt_opr.
*/
FmgrInfo eq_opr_finfo;
FmgrInfo cmp_proc_finfo;
FmgrInfo hash_proc_finfo;
FmgrInfo hash_extended_proc_finfo;
/*
* Tuple descriptor if it's a composite type (row type). NULL if not
2006-10-04 02:30:14 +02:00
* composite or information hasn't yet been requested. (NOTE: this is a
Make plpgsql use its DTYPE_REC code paths for composite-type variables. Formerly, DTYPE_REC was used only for variables declared as "record"; variables of named composite types used DTYPE_ROW, which is faster for some purposes but much less flexible. In particular, the ROW code paths are entirely incapable of dealing with DDL-caused changes to the number or data types of the columns of a row variable, once a particular plpgsql function has been parsed for the first time in a session. And, since the stored representation of a ROW isn't a tuple, there wasn't any easy way to deal with variables of domain-over-composite types, since the domain constraint checking code would expect the value to be checked to be a tuple. A lesser, but still real, annoyance is that ROW format cannot represent a true NULL composite value, only a row of per-field NULL values, which is not exactly the same thing. Hence, switch to using DTYPE_REC for all composite-typed variables, whether "record", named composite type, or domain over named composite type. DTYPE_ROW remains but is used only for its native purpose, to represent a fixed-at-compile-time list of variables, for instance the targets of an INTO clause. To accomplish this without taking significant performance losses, introduce infrastructure that allows storing composite-type variables as "expanded objects", similar to the "expanded array" infrastructure introduced in commit 1dc5ebc90. A composite variable's value is thereby kept (most of the time) in the form of separate Datums, so that field accesses and updates are not much more expensive than they were in the ROW format. This holds the line, more or less, on performance of variables of named composite types in field-access-intensive microbenchmarks, and makes variables declared "record" perform much better than before in similar tests. In addition, the logic involved with enforcing composite-domain constraints against updates of individual fields is in the expanded record infrastructure not plpgsql proper, so that it might be reusable for other purposes. In further support of this, introduce a typcache feature for assigning a unique-within-process identifier to each distinct tuple descriptor of interest; in particular, DDL alterations on composite types result in a new identifier for that type. This allows very cheap detection of the need to refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea I had in commit 687f096ea: that assigned identifying sequence numbers to successive versions of individual composite types, but the numbers were not unique across different types, nor was there support for assigning numbers to registered record types. In passing, allow plpgsql functions to accept as well as return type "record". There was no good reason for the old restriction, and it was out of step with most of the other PLs. Tom Lane, reviewed by Pavel Stehule Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
* reference-counted tupledesc.)
*
* To simplify caching dependent info, tupDesc_identifier is an identifier
* for this tupledesc that is unique for the life of the process, and
* changes anytime the tupledesc does. Zero if not yet determined.
*/
TupleDesc tupDesc;
Make plpgsql use its DTYPE_REC code paths for composite-type variables. Formerly, DTYPE_REC was used only for variables declared as "record"; variables of named composite types used DTYPE_ROW, which is faster for some purposes but much less flexible. In particular, the ROW code paths are entirely incapable of dealing with DDL-caused changes to the number or data types of the columns of a row variable, once a particular plpgsql function has been parsed for the first time in a session. And, since the stored representation of a ROW isn't a tuple, there wasn't any easy way to deal with variables of domain-over-composite types, since the domain constraint checking code would expect the value to be checked to be a tuple. A lesser, but still real, annoyance is that ROW format cannot represent a true NULL composite value, only a row of per-field NULL values, which is not exactly the same thing. Hence, switch to using DTYPE_REC for all composite-typed variables, whether "record", named composite type, or domain over named composite type. DTYPE_ROW remains but is used only for its native purpose, to represent a fixed-at-compile-time list of variables, for instance the targets of an INTO clause. To accomplish this without taking significant performance losses, introduce infrastructure that allows storing composite-type variables as "expanded objects", similar to the "expanded array" infrastructure introduced in commit 1dc5ebc90. A composite variable's value is thereby kept (most of the time) in the form of separate Datums, so that field accesses and updates are not much more expensive than they were in the ROW format. This holds the line, more or less, on performance of variables of named composite types in field-access-intensive microbenchmarks, and makes variables declared "record" perform much better than before in similar tests. In addition, the logic involved with enforcing composite-domain constraints against updates of individual fields is in the expanded record infrastructure not plpgsql proper, so that it might be reusable for other purposes. In further support of this, introduce a typcache feature for assigning a unique-within-process identifier to each distinct tuple descriptor of interest; in particular, DDL alterations on composite types result in a new identifier for that type. This allows very cheap detection of the need to refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea I had in commit 687f096ea: that assigned identifying sequence numbers to successive versions of individual composite types, but the numbers were not unique across different types, nor was there support for assigning numbers to registered record types. In passing, allow plpgsql functions to accept as well as return type "record". There was no good reason for the old restriction, and it was out of step with most of the other PLs. Tom Lane, reviewed by Pavel Stehule Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
uint64 tupDesc_identifier;
/*
* Fields computed when TYPECACHE_RANGE_INFO is requested. Zeroes if not
* a range type or information hasn't yet been requested. Note that
* rng_cmp_proc_finfo could be different from the element type's default
* btree comparison function.
*/
struct TypeCacheEntry *rngelemtype; /* range's element type */
Oid rng_collation; /* collation for comparisons, if any */
Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
FmgrInfo rng_cmp_proc_finfo; /* comparison function */
FmgrInfo rng_canonical_finfo; /* canonicalization function, if any */
Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
FmgrInfo rng_subdiff_finfo; /* difference function, if any */
/*
* Domain's base type and typmod if it's a domain type. Zeroes if not
* domain, or if information hasn't been requested.
*/
Oid domainBaseType;
int32 domainBaseTypmod;
Use the typcache to cache constraints for domain types. Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that. To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules. Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap. This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation. Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.
2015-03-01 20:06:50 +01:00
/*
* Domain constraint data if it's a domain type. NULL if not domain, or
* if domain has no constraints, or if information hasn't been requested.
*/
DomainConstraintCache *domainData;
/* Private data, for internal use of typcache.c only */
int flags; /* flags about what we've computed */
/*
* Private information about an enum type. NULL if not enum or
* information hasn't been requested.
*/
struct TypeCacheEnumData *enumData;
Use the typcache to cache constraints for domain types. Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that. To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules. Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap. This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation. Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.
2015-03-01 20:06:50 +01:00
/* We also maintain a list of all known domain-type cache entries */
struct TypeCacheEntry *nextDomain;
} TypeCacheEntry;
/* Bit flags to indicate which fields a given caller needs to have set */
#define TYPECACHE_EQ_OPR 0x0001
#define TYPECACHE_LT_OPR 0x0002
#define TYPECACHE_GT_OPR 0x0004
#define TYPECACHE_CMP_PROC 0x0008
#define TYPECACHE_HASH_PROC 0x0010
#define TYPECACHE_EQ_OPR_FINFO 0x0020
#define TYPECACHE_CMP_PROC_FINFO 0x0040
#define TYPECACHE_HASH_PROC_FINFO 0x0080
#define TYPECACHE_TUPDESC 0x0100
#define TYPECACHE_BTREE_OPFAMILY 0x0200
#define TYPECACHE_HASH_OPFAMILY 0x0400
#define TYPECACHE_RANGE_INFO 0x0800
#define TYPECACHE_DOMAIN_BASE_INFO 0x1000
#define TYPECACHE_DOMAIN_CONSTR_INFO 0x2000
#define TYPECACHE_HASH_EXTENDED_PROC 0x4000
#define TYPECACHE_HASH_EXTENDED_PROC_FINFO 0x8000
Use the typcache to cache constraints for domain types. Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that. To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules. Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap. This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation. Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.
2015-03-01 20:06:50 +01:00
Make plpgsql use its DTYPE_REC code paths for composite-type variables. Formerly, DTYPE_REC was used only for variables declared as "record"; variables of named composite types used DTYPE_ROW, which is faster for some purposes but much less flexible. In particular, the ROW code paths are entirely incapable of dealing with DDL-caused changes to the number or data types of the columns of a row variable, once a particular plpgsql function has been parsed for the first time in a session. And, since the stored representation of a ROW isn't a tuple, there wasn't any easy way to deal with variables of domain-over-composite types, since the domain constraint checking code would expect the value to be checked to be a tuple. A lesser, but still real, annoyance is that ROW format cannot represent a true NULL composite value, only a row of per-field NULL values, which is not exactly the same thing. Hence, switch to using DTYPE_REC for all composite-typed variables, whether "record", named composite type, or domain over named composite type. DTYPE_ROW remains but is used only for its native purpose, to represent a fixed-at-compile-time list of variables, for instance the targets of an INTO clause. To accomplish this without taking significant performance losses, introduce infrastructure that allows storing composite-type variables as "expanded objects", similar to the "expanded array" infrastructure introduced in commit 1dc5ebc90. A composite variable's value is thereby kept (most of the time) in the form of separate Datums, so that field accesses and updates are not much more expensive than they were in the ROW format. This holds the line, more or less, on performance of variables of named composite types in field-access-intensive microbenchmarks, and makes variables declared "record" perform much better than before in similar tests. In addition, the logic involved with enforcing composite-domain constraints against updates of individual fields is in the expanded record infrastructure not plpgsql proper, so that it might be reusable for other purposes. In further support of this, introduce a typcache feature for assigning a unique-within-process identifier to each distinct tuple descriptor of interest; in particular, DDL alterations on composite types result in a new identifier for that type. This allows very cheap detection of the need to refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea I had in commit 687f096ea: that assigned identifying sequence numbers to successive versions of individual composite types, but the numbers were not unique across different types, nor was there support for assigning numbers to registered record types. In passing, allow plpgsql functions to accept as well as return type "record". There was no good reason for the old restriction, and it was out of step with most of the other PLs. Tom Lane, reviewed by Pavel Stehule Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
/* This value will not equal any valid tupledesc identifier, nor 0 */
#define INVALID_TUPLEDESC_IDENTIFIER ((uint64) 1)
Use the typcache to cache constraints for domain types. Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that. To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules. Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap. This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation. Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.
2015-03-01 20:06:50 +01:00
/*
* Callers wishing to maintain a long-lived reference to a domain's constraint
* set must store it in one of these. Use InitDomainConstraintRef() and
* UpdateDomainConstraintRef() to manage it. Note: DomainConstraintState is
* considered an executable expression type, so it's defined in execnodes.h.
*/
typedef struct DomainConstraintRef
{
List *constraints; /* list of DomainConstraintState nodes */
MemoryContext refctx; /* context holding DomainConstraintRef */
TypeCacheEntry *tcache; /* typcache entry for domain type */
Faster expression evaluation and targetlist projection. This replaces the old, recursive tree-walk based evaluation, with non-recursive, opcode dispatch based, expression evaluation. Projection is now implemented as part of expression evaluation. This both leads to significant performance improvements, and makes future just-in-time compilation of expressions easier. The speed gains primarily come from: - non-recursive implementation reduces stack usage / overhead - simple sub-expressions are implemented with a single jump, without function calls - sharing some state between different sub-expressions - reduced amount of indirect/hard to predict memory accesses by laying out operation metadata sequentially; including the avoidance of nearly all of the previously used linked lists - more code has been moved to expression initialization, avoiding constant re-checks at evaluation time Future just-in-time compilation (JIT) has become easier, as demonstrated by released patches intended to be merged in a later release, for primarily two reasons: Firstly, due to a stricter split between expression initialization and evaluation, less code has to be handled by the JIT. Secondly, due to the non-recursive nature of the generated "instructions", less performance-critical code-paths can easily be shared between interpreted and compiled evaluation. The new framework allows for significant future optimizations. E.g.: - basic infrastructure for to later reduce the per executor-startup overhead of expression evaluation, by caching state in prepared statements. That'd be helpful in OLTPish scenarios where initialization overhead is measurable. - optimizing the generated "code". A number of proposals for potential work has already been made. - optimizing the interpreter. Similarly a number of proposals have been made here too. The move of logic into the expression initialization step leads to some backward-incompatible changes: - Function permission checks are now done during expression initialization, whereas previously they were done during execution. In edge cases this can lead to errors being raised that previously wouldn't have been, e.g. a NULL array being coerced to a different array type previously didn't perform checks. - The set of domain constraints to be checked, is now evaluated once during expression initialization, previously it was re-built every time a domain check was evaluated. For normal queries this doesn't change much, but e.g. for plpgsql functions, which caches ExprStates, the old set could stick around longer. The behavior around might still change. Author: Andres Freund, with significant changes by Tom Lane, changes by Heikki Linnakangas Reviewed-By: Tom Lane, Heikki Linnakangas Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-14 23:45:36 +01:00
bool need_exprstate; /* does caller need check_exprstate? */
Use the typcache to cache constraints for domain types. Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that. To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules. Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap. This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation. Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.
2015-03-01 20:06:50 +01:00
/* Management data --- treat these fields as private to typcache.c */
DomainConstraintCache *dcc; /* current constraints, or NULL if none */
Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
MemoryContextCallback callback; /* used to release refcount when done */
Use the typcache to cache constraints for domain types. Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that. To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules. Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap. This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation. Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.
2015-03-01 20:06:50 +01:00
} DomainConstraintRef;
typedef struct SharedRecordTypmodRegistry SharedRecordTypmodRegistry;
extern TypeCacheEntry *lookup_type_cache(Oid type_id, int flags);
Use the typcache to cache constraints for domain types. Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that. To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules. Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap. This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation. Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.
2015-03-01 20:06:50 +01:00
extern void InitDomainConstraintRef(Oid type_id, DomainConstraintRef *ref,
MemoryContext refctx, bool need_exprstate);
Use the typcache to cache constraints for domain types. Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that. To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules. Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap. This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation. Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.
2015-03-01 20:06:50 +01:00
extern void UpdateDomainConstraintRef(DomainConstraintRef *ref);
extern bool DomainHasConstraints(Oid type_id);
extern TupleDesc lookup_rowtype_tupdesc(Oid type_id, int32 typmod);
extern TupleDesc lookup_rowtype_tupdesc_noerror(Oid type_id, int32 typmod,
bool noError);
extern TupleDesc lookup_rowtype_tupdesc_copy(Oid type_id, int32 typmod);
extern TupleDesc lookup_rowtype_tupdesc_domain(Oid type_id, int32 typmod,
bool noError);
extern void assign_record_type_typmod(TupleDesc tupDesc);
Make plpgsql use its DTYPE_REC code paths for composite-type variables. Formerly, DTYPE_REC was used only for variables declared as "record"; variables of named composite types used DTYPE_ROW, which is faster for some purposes but much less flexible. In particular, the ROW code paths are entirely incapable of dealing with DDL-caused changes to the number or data types of the columns of a row variable, once a particular plpgsql function has been parsed for the first time in a session. And, since the stored representation of a ROW isn't a tuple, there wasn't any easy way to deal with variables of domain-over-composite types, since the domain constraint checking code would expect the value to be checked to be a tuple. A lesser, but still real, annoyance is that ROW format cannot represent a true NULL composite value, only a row of per-field NULL values, which is not exactly the same thing. Hence, switch to using DTYPE_REC for all composite-typed variables, whether "record", named composite type, or domain over named composite type. DTYPE_ROW remains but is used only for its native purpose, to represent a fixed-at-compile-time list of variables, for instance the targets of an INTO clause. To accomplish this without taking significant performance losses, introduce infrastructure that allows storing composite-type variables as "expanded objects", similar to the "expanded array" infrastructure introduced in commit 1dc5ebc90. A composite variable's value is thereby kept (most of the time) in the form of separate Datums, so that field accesses and updates are not much more expensive than they were in the ROW format. This holds the line, more or less, on performance of variables of named composite types in field-access-intensive microbenchmarks, and makes variables declared "record" perform much better than before in similar tests. In addition, the logic involved with enforcing composite-domain constraints against updates of individual fields is in the expanded record infrastructure not plpgsql proper, so that it might be reusable for other purposes. In further support of this, introduce a typcache feature for assigning a unique-within-process identifier to each distinct tuple descriptor of interest; in particular, DDL alterations on composite types result in a new identifier for that type. This allows very cheap detection of the need to refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea I had in commit 687f096ea: that assigned identifying sequence numbers to successive versions of individual composite types, but the numbers were not unique across different types, nor was there support for assigning numbers to registered record types. In passing, allow plpgsql functions to accept as well as return type "record". There was no good reason for the old restriction, and it was out of step with most of the other PLs. Tom Lane, reviewed by Pavel Stehule Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
extern uint64 assign_record_type_identifier(Oid type_id, int32 typmod);
extern int compare_values_of_enum(TypeCacheEntry *tcache, Oid arg1, Oid arg2);
extern size_t SharedRecordTypmodRegistryEstimate(void);
extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
dsm_segment *segment, dsa_area *area);
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:18:54 +02:00
#endif /* TYPCACHE_H */