Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
|
|
|
* expandedrecord.h
|
|
|
|
* Declarations for composite expanded objects.
|
|
|
|
*
|
2019-01-02 18:44:25 +01:00
|
|
|
* Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
* Portions Copyright (c) 1994, Regents of the University of California
|
|
|
|
*
|
|
|
|
* src/include/utils/expandedrecord.h
|
|
|
|
*
|
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
#ifndef EXPANDEDRECORD_H
|
|
|
|
#define EXPANDEDRECORD_H
|
|
|
|
|
|
|
|
#include "access/htup.h"
|
|
|
|
#include "access/tupdesc.h"
|
|
|
|
#include "fmgr.h"
|
|
|
|
#include "utils/expandeddatum.h"
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
* An expanded record is contained within a private memory context (as
|
|
|
|
* all expanded objects must be) and has a control structure as below.
|
|
|
|
*
|
|
|
|
* The expanded record might contain a regular "flat" tuple if that was the
|
|
|
|
* original input and we've not modified it. Otherwise, the contents are
|
|
|
|
* represented by Datum/isnull arrays plus type information. We could also
|
|
|
|
* have both forms, if we've deconstructed the original tuple for access
|
|
|
|
* purposes but not yet changed it. For pass-by-reference field types, the
|
|
|
|
* Datums would point into the flat tuple in this situation. Once we start
|
|
|
|
* modifying tuple fields, new pass-by-ref fields are separately palloc'd
|
|
|
|
* within the memory context.
|
|
|
|
*
|
|
|
|
* It's possible to build an expanded record that references a "flat" tuple
|
|
|
|
* stored externally, if the caller can guarantee that that tuple will not
|
|
|
|
* change for the lifetime of the expanded record. (This frammish is mainly
|
|
|
|
* meant to avoid unnecessary data copying in trigger functions.)
|
|
|
|
*/
|
|
|
|
#define ER_MAGIC 1384727874 /* ID for debugging crosschecks */
|
|
|
|
|
|
|
|
typedef struct ExpandedRecordHeader
|
|
|
|
{
|
|
|
|
/* Standard header for expanded objects */
|
|
|
|
ExpandedObjectHeader hdr;
|
|
|
|
|
|
|
|
/* Magic value identifying an expanded record (for debugging only) */
|
|
|
|
int er_magic;
|
|
|
|
|
|
|
|
/* Assorted flag bits */
|
|
|
|
int flags;
|
|
|
|
#define ER_FLAG_FVALUE_VALID 0x0001 /* fvalue is up to date? */
|
|
|
|
#define ER_FLAG_FVALUE_ALLOCED 0x0002 /* fvalue is local storage? */
|
|
|
|
#define ER_FLAG_DVALUES_VALID 0x0004 /* dvalues/dnulls are up to date? */
|
|
|
|
#define ER_FLAG_DVALUES_ALLOCED 0x0008 /* any field values local storage? */
|
|
|
|
#define ER_FLAG_HAVE_EXTERNAL 0x0010 /* any field values are external? */
|
|
|
|
#define ER_FLAG_TUPDESC_ALLOCED 0x0020 /* tupdesc is local storage? */
|
|
|
|
#define ER_FLAG_IS_DOMAIN 0x0040 /* er_decltypeid is domain? */
|
|
|
|
#define ER_FLAG_IS_DUMMY 0x0080 /* this header is dummy (see below) */
|
|
|
|
/* flag bits that are not to be cleared when replacing tuple data: */
|
|
|
|
#define ER_FLAGS_NON_DATA \
|
|
|
|
(ER_FLAG_TUPDESC_ALLOCED | ER_FLAG_IS_DOMAIN | ER_FLAG_IS_DUMMY)
|
|
|
|
|
|
|
|
/* Declared type of the record variable (could be a domain type) */
|
|
|
|
Oid er_decltypeid;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Actual composite type/typmod; never a domain (if ER_FLAG_IS_DOMAIN,
|
|
|
|
* these identify the composite base type). These will match
|
|
|
|
* er_tupdesc->tdtypeid/tdtypmod, as well as the header fields of
|
|
|
|
* composite datums made from or stored in this expanded record.
|
|
|
|
*/
|
|
|
|
Oid er_typeid; /* type OID of the composite type */
|
|
|
|
int32 er_typmod; /* typmod of the composite type */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Tuple descriptor, if we have one, else NULL. This may point to a
|
|
|
|
* reference-counted tupdesc originally belonging to the typcache, in
|
|
|
|
* which case we use a memory context reset callback to release the
|
|
|
|
* refcount. It can also be locally allocated in this object's private
|
|
|
|
* context (in which case ER_FLAG_TUPDESC_ALLOCED is set).
|
|
|
|
*/
|
|
|
|
TupleDesc er_tupdesc;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unique-within-process identifier for the tupdesc (see typcache.h). This
|
|
|
|
* field will never be equal to INVALID_TUPLEDESC_IDENTIFIER.
|
|
|
|
*/
|
|
|
|
uint64 er_tupdesc_id;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we have a Datum-array representation of the record, it's kept here;
|
|
|
|
* else ER_FLAG_DVALUES_VALID is not set, and dvalues/dnulls may be NULL
|
|
|
|
* if they've not yet been allocated. If allocated, the dvalues and
|
|
|
|
* dnulls arrays are palloc'd within the object private context, and are
|
|
|
|
* of length matching er_tupdesc->natts. For pass-by-ref field types,
|
|
|
|
* dvalues entries might point either into the fstartptr..fendptr area, or
|
|
|
|
* to separately palloc'd chunks.
|
|
|
|
*/
|
|
|
|
Datum *dvalues; /* array of Datums */
|
|
|
|
bool *dnulls; /* array of is-null flags for Datums */
|
|
|
|
int nfields; /* length of above arrays */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* flat_size is the current space requirement for the flat equivalent of
|
|
|
|
* the expanded record, if known; otherwise it's 0. We store this to make
|
|
|
|
* consecutive calls of get_flat_size cheap. If flat_size is not 0, the
|
|
|
|
* component values data_len, hoff, and hasnull must be valid too.
|
|
|
|
*/
|
|
|
|
Size flat_size;
|
|
|
|
|
|
|
|
Size data_len; /* data len within flat_size */
|
|
|
|
int hoff; /* header offset */
|
|
|
|
bool hasnull; /* null bitmap needed? */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* fvalue points to the flat representation if we have one, else it is
|
|
|
|
* NULL. If the flat representation is valid (up to date) then
|
|
|
|
* ER_FLAG_FVALUE_VALID is set. Even if we've outdated the flat
|
|
|
|
* representation due to changes of user fields, it can still be used to
|
|
|
|
* fetch system column values. If we have a flat representation then
|
|
|
|
* fstartptr/fendptr point to the start and end+1 of its data area; this
|
|
|
|
* is so that we can tell which Datum pointers point into the flat
|
|
|
|
* representation rather than being pointers to separately palloc'd data.
|
|
|
|
*/
|
|
|
|
HeapTuple fvalue; /* might or might not be private storage */
|
|
|
|
char *fstartptr; /* start of its data area */
|
|
|
|
char *fendptr; /* end+1 of its data area */
|
|
|
|
|
Detoast plpgsql variables if they might live across a transaction boundary.
Up to now, it's been safe for plpgsql to store TOAST pointers in its
variables because the ActiveSnapshot for whatever query called the plpgsql
function will surely protect such TOAST values from being vacuumed away,
even if the owning table rows are committed dead. With the introduction of
procedures, that assumption is no longer good in "non atomic" executions
of plpgsql code. We adopt the slightly brute-force solution of detoasting
all TOAST pointers at the time they are stored into variables, if we're in
a non-atomic context, just in case the owning row goes away.
Some care is needed to avoid long-term memory leaks, since plpgsql tends
to run with CurrentMemoryContext pointing to its call-lifespan context,
but we shouldn't assume that no memory is leaked by heap_tuple_fetch_attr.
In plpgsql proper, we can do the detoasting work in the "eval_mcontext".
Most of the code thrashing here is due to the need to add this capability
to expandedrecord.c as well as plpgsql proper. In expandedrecord.c,
we can't assume that the caller's context is short-lived, so make use of
the short-term sub-context that was already invented for checking domain
constraints. In view of this repurposing, it seems good to rename that
variable and associated code from "domain_check_cxt" to "short_term_cxt".
Peter Eisentraut and Tom Lane
Discussion: https://postgr.es/m/5AC06865.9050005@anastigmatix.net
2018-05-16 20:56:52 +02:00
|
|
|
/* Some operations on the expanded record need a short-lived context */
|
|
|
|
MemoryContext er_short_term_cxt; /* short-term memory context */
|
|
|
|
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
/* Working state for domain checking, used if ER_FLAG_IS_DOMAIN is set */
|
|
|
|
struct ExpandedRecordHeader *er_dummy_header; /* dummy record header */
|
|
|
|
void *er_domaininfo; /* cache space for domain_check() */
|
|
|
|
|
|
|
|
/* Callback info (it's active if er_mcb.arg is not NULL) */
|
|
|
|
MemoryContextCallback er_mcb;
|
|
|
|
} ExpandedRecordHeader;
|
|
|
|
|
|
|
|
/* fmgr macros for expanded record objects */
|
|
|
|
#define PG_GETARG_EXPANDED_RECORD(n) DatumGetExpandedRecord(PG_GETARG_DATUM(n))
|
|
|
|
#define ExpandedRecordGetDatum(erh) EOHPGetRWDatum(&(erh)->hdr)
|
|
|
|
#define ExpandedRecordGetRODatum(erh) EOHPGetRODatum(&(erh)->hdr)
|
|
|
|
#define PG_RETURN_EXPANDED_RECORD(x) PG_RETURN_DATUM(ExpandedRecordGetDatum(x))
|
|
|
|
|
|
|
|
/* assorted other macros */
|
|
|
|
#define ExpandedRecordIsEmpty(erh) \
|
|
|
|
(((erh)->flags & (ER_FLAG_DVALUES_VALID | ER_FLAG_FVALUE_VALID)) == 0)
|
|
|
|
#define ExpandedRecordIsDomain(erh) \
|
|
|
|
(((erh)->flags & ER_FLAG_IS_DOMAIN) != 0)
|
|
|
|
|
|
|
|
/* this can substitute for TransferExpandedObject() when we already have erh */
|
|
|
|
#define TransferExpandedRecord(erh, cxt) \
|
|
|
|
MemoryContextSetParent((erh)->hdr.eoh_context, cxt)
|
|
|
|
|
|
|
|
/* information returned by expanded_record_lookup_field() */
|
|
|
|
typedef struct ExpandedRecordFieldInfo
|
|
|
|
{
|
|
|
|
int fnumber; /* field's attr number in record */
|
|
|
|
Oid ftypeid; /* field's type/typmod info */
|
|
|
|
int32 ftypmod;
|
|
|
|
Oid fcollation; /* field's collation if any */
|
|
|
|
} ExpandedRecordFieldInfo;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* prototypes for functions defined in expandedrecord.c
|
|
|
|
*/
|
|
|
|
extern ExpandedRecordHeader *make_expanded_record_from_typeid(Oid type_id, int32 typmod,
|
2019-05-22 19:04:48 +02:00
|
|
|
MemoryContext parentcontext);
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
extern ExpandedRecordHeader *make_expanded_record_from_tupdesc(TupleDesc tupdesc,
|
2019-05-22 19:04:48 +02:00
|
|
|
MemoryContext parentcontext);
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
extern ExpandedRecordHeader *make_expanded_record_from_exprecord(ExpandedRecordHeader *olderh,
|
2019-05-22 19:04:48 +02:00
|
|
|
MemoryContext parentcontext);
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
extern void expanded_record_set_tuple(ExpandedRecordHeader *erh,
|
2019-05-22 19:04:48 +02:00
|
|
|
HeapTuple tuple, bool copy, bool expand_external);
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
extern Datum make_expanded_record_from_datum(Datum recorddatum,
|
2019-05-22 19:04:48 +02:00
|
|
|
MemoryContext parentcontext);
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
extern TupleDesc expanded_record_fetch_tupdesc(ExpandedRecordHeader *erh);
|
|
|
|
extern HeapTuple expanded_record_get_tuple(ExpandedRecordHeader *erh);
|
|
|
|
extern ExpandedRecordHeader *DatumGetExpandedRecord(Datum d);
|
|
|
|
extern void deconstruct_expanded_record(ExpandedRecordHeader *erh);
|
|
|
|
extern bool expanded_record_lookup_field(ExpandedRecordHeader *erh,
|
2019-05-22 19:04:48 +02:00
|
|
|
const char *fieldname,
|
|
|
|
ExpandedRecordFieldInfo *finfo);
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
extern Datum expanded_record_fetch_field(ExpandedRecordHeader *erh, int fnumber,
|
2019-05-22 19:04:48 +02:00
|
|
|
bool *isnull);
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
extern void expanded_record_set_field_internal(ExpandedRecordHeader *erh,
|
2019-05-22 19:04:48 +02:00
|
|
|
int fnumber,
|
|
|
|
Datum newValue, bool isnull,
|
|
|
|
bool expand_external,
|
|
|
|
bool check_constraints);
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
extern void expanded_record_set_fields(ExpandedRecordHeader *erh,
|
2019-05-22 19:04:48 +02:00
|
|
|
const Datum *newValues, const bool *isnulls,
|
|
|
|
bool expand_external);
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
|
|
|
|
/* outside code should never call expanded_record_set_field_internal as such */
|
Detoast plpgsql variables if they might live across a transaction boundary.
Up to now, it's been safe for plpgsql to store TOAST pointers in its
variables because the ActiveSnapshot for whatever query called the plpgsql
function will surely protect such TOAST values from being vacuumed away,
even if the owning table rows are committed dead. With the introduction of
procedures, that assumption is no longer good in "non atomic" executions
of plpgsql code. We adopt the slightly brute-force solution of detoasting
all TOAST pointers at the time they are stored into variables, if we're in
a non-atomic context, just in case the owning row goes away.
Some care is needed to avoid long-term memory leaks, since plpgsql tends
to run with CurrentMemoryContext pointing to its call-lifespan context,
but we shouldn't assume that no memory is leaked by heap_tuple_fetch_attr.
In plpgsql proper, we can do the detoasting work in the "eval_mcontext".
Most of the code thrashing here is due to the need to add this capability
to expandedrecord.c as well as plpgsql proper. In expandedrecord.c,
we can't assume that the caller's context is short-lived, so make use of
the short-term sub-context that was already invented for checking domain
constraints. In view of this repurposing, it seems good to rename that
variable and associated code from "domain_check_cxt" to "short_term_cxt".
Peter Eisentraut and Tom Lane
Discussion: https://postgr.es/m/5AC06865.9050005@anastigmatix.net
2018-05-16 20:56:52 +02:00
|
|
|
#define expanded_record_set_field(erh, fnumber, newValue, isnull, expand_external) \
|
|
|
|
expanded_record_set_field_internal(erh, fnumber, newValue, isnull, expand_external, true)
|
Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-14 00:52:21 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Inline-able fast cases. The expanded_record_fetch_xxx functions above
|
|
|
|
* handle the general cases.
|
|
|
|
*/
|
|
|
|
|
|
|
|
/* Get the tupdesc for the expanded record's actual type */
|
|
|
|
static inline TupleDesc
|
|
|
|
expanded_record_get_tupdesc(ExpandedRecordHeader *erh)
|
|
|
|
{
|
|
|
|
if (likely(erh->er_tupdesc != NULL))
|
|
|
|
return erh->er_tupdesc;
|
|
|
|
else
|
|
|
|
return expanded_record_fetch_tupdesc(erh);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Get value of record field */
|
|
|
|
static inline Datum
|
|
|
|
expanded_record_get_field(ExpandedRecordHeader *erh, int fnumber,
|
|
|
|
bool *isnull)
|
|
|
|
{
|
|
|
|
if ((erh->flags & ER_FLAG_DVALUES_VALID) &&
|
|
|
|
likely(fnumber > 0 && fnumber <= erh->nfields))
|
|
|
|
{
|
|
|
|
*isnull = erh->dnulls[fnumber - 1];
|
|
|
|
return erh->dvalues[fnumber - 1];
|
|
|
|
}
|
|
|
|
else
|
|
|
|
return expanded_record_fetch_field(erh, fnumber, isnull);
|
|
|
|
}
|
|
|
|
|
|
|
|
#endif /* EXPANDEDRECORD_H */
|