postgresql/src/backend/nodes/readfuncs.c


/*-------------------------------------------------------------------------
*
* readfuncs.c
* Reader functions for Postgres tree nodes.
*
* Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
*
* IDENTIFICATION
* src/backend/nodes/readfuncs.c
*
* NOTES
* Parse location fields are written out by outfuncs.c, but only for
* debugging use. When reading a location field, we normally discard
* the stored value and set the location field to -1 (ie, "unknown").
* This is because nodes coming from a stored rule should not be thought
* to have a known location in the current query's text.
*
* However, if restore_location_fields is true, we do restore location
* fields from the string. This is currently intended only for use by the
* WRITE_READ_PARSE_PLAN_TREES test code, which doesn't want to cause
* any change in the node contents.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include <math.h>
#include "miscadmin.h"
#include "nodes/bitmapset.h"
#include "nodes/readfuncs.h"
/*
* Macros to simplify reading of different kinds of fields. Use these
* wherever possible to reduce the chance for silly typos. Note that these
* hard-wire conventions about the names of the local variables in a Read
* routine.
*/
/* Macros for declaring appropriate local variables */
/* A few guys need only local_node */
#define READ_LOCALS_NO_FIELDS(nodeTypeName) \
nodeTypeName *local_node = makeNode(nodeTypeName)
/* And a few guys need only the pg_strtok support fields */
#define READ_TEMP_LOCALS() \
const char *token; \
int length
/* ... but most need both */
#define READ_LOCALS(nodeTypeName) \
READ_LOCALS_NO_FIELDS(nodeTypeName); \
READ_TEMP_LOCALS()
/* Read an integer field (anything written as ":fldname %d") */
#define READ_INT_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
local_node->fldname = atoi(token)
/* Read an unsigned integer field (anything written as ":fldname %u") */
#define READ_UINT_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
local_node->fldname = atoui(token)
/* Read an unsigned integer field (anything written using UINT64_FORMAT) */
#define READ_UINT64_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
local_node->fldname = strtou64(token, NULL, 10)
/* Read a long integer field (anything written as ":fldname %ld") */
#define READ_LONG_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
local_node->fldname = atol(token)
/* Read an OID field (don't hard-wire assumption that OID is same as uint) */
#define READ_OID_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
local_node->fldname = atooid(token)
/* Read a char field (ie, one ascii character) */
#define READ_CHAR_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
/* avoid overhead of calling debackslash() for one char */ \
local_node->fldname = (length == 0) ? '\0' : (token[0] == '\\' ? token[1] : token[0])
/* Read an enumerated-type field that was written as an integer code */
#define READ_ENUM_FIELD(fldname, enumtype) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
local_node->fldname = (enumtype) atoi(token)
/* Read a float field */
#define READ_FLOAT_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
local_node->fldname = atof(token)
/* Read a boolean field */
#define READ_BOOL_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
local_node->fldname = strtobool(token)
/* Read a character-string field */
#define READ_STRING_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
local_node->fldname = nullable_string(token, length)
/* Read a parse location field (and possibly throw away the value) */
#ifdef WRITE_READ_PARSE_PLAN_TREES
#define READ_LOCATION_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
local_node->fldname = restore_location_fields ? atoi(token) : -1
#else
#define READ_LOCATION_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
token = pg_strtok(&length); /* get field value */ \
(void) token; /* in case not used elsewhere */ \
local_node->fldname = -1 /* set field to "unknown" */
#endif
/* Read a Node field */
#define READ_NODE_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
(void) token; /* in case not used elsewhere */ \
local_node->fldname = nodeRead(NULL, 0)
/* Read a bitmapset field */
#define READ_BITMAPSET_FIELD(fldname) \
token = pg_strtok(&length); /* skip :fldname */ \
(void) token; /* in case not used elsewhere */ \
local_node->fldname = _readBitmapset()
/* Read an attribute number array */
#define READ_ATTRNUMBER_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readAttrNumberCols(len)
/* Read an oid array */
#define READ_OID_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readOidCols(len)
/* Read an int array */
#define READ_INT_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readIntCols(len)
/* Read a bool array */
#define READ_BOOL_ARRAY(fldname, len) \
token = pg_strtok(&length); /* skip :fldname */ \
local_node->fldname = readBoolCols(len)
/* Routine exit */
#define READ_DONE() \
return local_node
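/*
 * Illustrative sketch (not part of the original file): a typical reader
 * assembled from the macros above, for a hypothetical node type "Foo"
 * with an int, an Oid, and a Node field.  Readers for most real node
 * types are generated into readfuncs.funcs.c from the struct definitions.
 *
 *	static Foo *
 *	_readFoo(void)
 *	{
 *		READ_LOCALS(Foo);
 *
 *		READ_INT_FIELD(count);
 *		READ_OID_FIELD(relid);
 *		READ_NODE_FIELD(args);
 *
 *		READ_DONE();
 *	}
 */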
/*
* NOTE: use atoi() to read values written with %d, or atoui() to read
* values written with %u in outfuncs.c. An exception is OID values,
* for which use atooid(). (As of 7.1, outfuncs.c writes OIDs as %u,
* but this will probably change in the future.)
*/
#define atoui(x) ((unsigned int) strtoul((x), NULL, 10))
#define strtobool(x) ((*(x) == 't') ? true : false)
static char *
nullable_string(const char *token, int length)
{
/* outToken emits <> for NULL, and pg_strtok makes that an empty string */
if (length == 0)
return NULL;
/* outToken emits "" for empty string */
if (length == 2 && token[0] == '"' && token[1] == '"')
return pstrdup("");
/* otherwise, we must remove protective backslashes added by outToken */
return debackslash(token, length);
}
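/*
 * For example (assumed token forms, per the conventions above): a "<>"
 * token reaches here with length 0 and yields NULL, the two-character
 * token "" yields an empty string, and a token that outToken wrote as
 * "foo\ bar" is debackslashed back to "foo bar".
 */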
/*
* _readBitmapset
*
* Note: this code is used in contexts where we know that a Bitmapset
* is expected. There is equivalent code in nodeRead() that can read a
* Bitmapset when we come across one in other contexts.
*/
static Bitmapset *
_readBitmapset(void)
{
Bitmapset *result = NULL;
READ_TEMP_LOCALS();
token = pg_strtok(&length);
if (token == NULL)
elog(ERROR, "incomplete Bitmapset structure");
if (length != 1 || token[0] != '(')
elog(ERROR, "unrecognized token: \"%.*s\"", length, token);
token = pg_strtok(&length);
if (token == NULL)
elog(ERROR, "incomplete Bitmapset structure");
if (length != 1 || token[0] != 'b')
elog(ERROR, "unrecognized token: \"%.*s\"", length, token);
for (;;)
{
int val;
char *endptr;
token = pg_strtok(&length);
if (token == NULL)
elog(ERROR, "unterminated Bitmapset structure");
if (length == 1 && token[0] == ')')
break;
val = (int) strtol(token, &endptr, 10);
if (endptr != token + length)
elog(ERROR, "unrecognized integer: \"%.*s\"", length, token);
result = bms_add_member(result, val);
}
return result;
}
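/*
 * For example (an assumed dump, matching the parsing above): the token
 * sequence "(b 3 5 11)" produces a Bitmapset with members 3, 5 and 11,
 * while "(b)" produces an empty (NULL) set.
 */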
/*
* We export this function for use by extensions that define extensible nodes.
* That's somewhat historical, though, because calling nodeRead() will work.
*/
Bitmapset *
readBitmapset(void)
{
return _readBitmapset();
}
#include "readfuncs.funcs.c"
/*
* Support functions for nodes with custom_read_write attribute or
* special_read_write attribute
*/
static Const *
_readConst(void)
{
READ_LOCALS(Const);
READ_OID_FIELD(consttype);
READ_INT_FIELD(consttypmod);
READ_OID_FIELD(constcollid);
READ_INT_FIELD(constlen);
READ_BOOL_FIELD(constbyval);
READ_BOOL_FIELD(constisnull);
READ_LOCATION_FIELD(location);
token = pg_strtok(&length); /* skip :constvalue */
if (local_node->constisnull)
token = pg_strtok(&length); /* skip "<>" */
else
local_node->constvalue = readDatum(local_node->constbyval);
READ_DONE();
}
static BoolExpr *
_readBoolExpr(void)
{
READ_LOCALS(BoolExpr);
/* do-it-yourself enum representation */
token = pg_strtok(&length); /* skip :boolop */
token = pg_strtok(&length); /* get field value */
if (length == 3 && strncmp(token, "and", 3) == 0)
local_node->boolop = AND_EXPR;
else if (length == 2 && strncmp(token, "or", 2) == 0)
local_node->boolop = OR_EXPR;
else if (length == 3 && strncmp(token, "not", 3) == 0)
local_node->boolop = NOT_EXPR;
else
elog(ERROR, "unrecognized boolop \"%.*s\"", length, token);
READ_NODE_FIELD(args);
READ_LOCATION_FIELD(location);
READ_DONE();
}
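/*
 * For example (an assumed dump, mirroring the "do-it-yourself" encoding
 * above): "{BOOLEXPR :boolop and :args (...) :location -1}" is read back
 * as an AND_EXPR over the argument list.
 */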
static A_Const *
_readA_Const(void)
{
READ_LOCALS(A_Const);
token = pg_strtok(&length);
if (length == 4 && strncmp(token, "NULL", 4) == 0)
local_node->isnull = true;
else
{
union ValUnion *tmp = nodeRead(NULL, 0);
memcpy(&local_node->val, tmp, sizeof(*tmp));
}
READ_LOCATION_FIELD(location);
READ_DONE();
}
/*
* _readConstraint
*/
static Constraint *
_readConstraint(void)
{
READ_LOCALS(Constraint);
READ_STRING_FIELD(conname);
READ_BOOL_FIELD(deferrable);
READ_BOOL_FIELD(initdeferred);
READ_LOCATION_FIELD(location);
token = pg_strtok(&length); /* skip :contype */
token = pg_strtok(&length); /* get field value */
if (length == 4 && strncmp(token, "NULL", 4) == 0)
local_node->contype = CONSTR_NULL;
else if (length == 8 && strncmp(token, "NOT_NULL", 8) == 0)
local_node->contype = CONSTR_NOTNULL;
else if (length == 7 && strncmp(token, "DEFAULT", 7) == 0)
local_node->contype = CONSTR_DEFAULT;
else if (length == 8 && strncmp(token, "IDENTITY", 8) == 0)
local_node->contype = CONSTR_IDENTITY;
else if (length == 9 && strncmp(token, "GENERATED", 9) == 0)
local_node->contype = CONSTR_GENERATED;
else if (length == 5 && strncmp(token, "CHECK", 5) == 0)
local_node->contype = CONSTR_CHECK;
else if (length == 11 && strncmp(token, "PRIMARY_KEY", 11) == 0)
local_node->contype = CONSTR_PRIMARY;
else if (length == 6 && strncmp(token, "UNIQUE", 6) == 0)
local_node->contype = CONSTR_UNIQUE;
else if (length == 9 && strncmp(token, "EXCLUSION", 9) == 0)
local_node->contype = CONSTR_EXCLUSION;
else if (length == 11 && strncmp(token, "FOREIGN_KEY", 11) == 0)
local_node->contype = CONSTR_FOREIGN;
else if (length == 15 && strncmp(token, "ATTR_DEFERRABLE", 15) == 0)
local_node->contype = CONSTR_ATTR_DEFERRABLE;
else if (length == 19 && strncmp(token, "ATTR_NOT_DEFERRABLE", 19) == 0)
local_node->contype = CONSTR_ATTR_NOT_DEFERRABLE;
else if (length == 13 && strncmp(token, "ATTR_DEFERRED", 13) == 0)
local_node->contype = CONSTR_ATTR_DEFERRED;
else if (length == 14 && strncmp(token, "ATTR_IMMEDIATE", 14) == 0)
local_node->contype = CONSTR_ATTR_IMMEDIATE;
switch (local_node->contype)
{
case CONSTR_NULL:
case CONSTR_NOTNULL:
/* no extra fields */
break;
case CONSTR_DEFAULT:
READ_NODE_FIELD(raw_expr);
READ_STRING_FIELD(cooked_expr);
break;
case CONSTR_IDENTITY:
READ_NODE_FIELD(options);
READ_CHAR_FIELD(generated_when);
break;
case CONSTR_GENERATED:
READ_NODE_FIELD(raw_expr);
READ_STRING_FIELD(cooked_expr);
READ_CHAR_FIELD(generated_when);
break;
case CONSTR_CHECK:
READ_BOOL_FIELD(is_no_inherit);
READ_NODE_FIELD(raw_expr);
READ_STRING_FIELD(cooked_expr);
READ_BOOL_FIELD(skip_validation);
READ_BOOL_FIELD(initially_valid);
break;
case CONSTR_PRIMARY:
READ_NODE_FIELD(keys);
READ_NODE_FIELD(including);
READ_NODE_FIELD(options);
READ_STRING_FIELD(indexname);
READ_STRING_FIELD(indexspace);
READ_BOOL_FIELD(reset_default_tblspc);
/* access_method and where_clause not currently used */
break;
case CONSTR_UNIQUE:
READ_BOOL_FIELD(nulls_not_distinct);
READ_NODE_FIELD(keys);
READ_NODE_FIELD(including);
READ_NODE_FIELD(options);
READ_STRING_FIELD(indexname);
READ_STRING_FIELD(indexspace);
READ_BOOL_FIELD(reset_default_tblspc);
/* access_method and where_clause not currently used */
break;
case CONSTR_EXCLUSION:
READ_NODE_FIELD(exclusions);
READ_NODE_FIELD(including);
READ_NODE_FIELD(options);
READ_STRING_FIELD(indexname);
READ_STRING_FIELD(indexspace);
READ_BOOL_FIELD(reset_default_tblspc);
READ_STRING_FIELD(access_method);
READ_NODE_FIELD(where_clause);
break;
case CONSTR_FOREIGN:
READ_NODE_FIELD(pktable);
READ_NODE_FIELD(fk_attrs);
READ_NODE_FIELD(pk_attrs);
READ_CHAR_FIELD(fk_matchtype);
READ_CHAR_FIELD(fk_upd_action);
READ_CHAR_FIELD(fk_del_action);
READ_NODE_FIELD(fk_del_set_cols);
READ_NODE_FIELD(old_conpfeqop);
READ_OID_FIELD(old_pktable_oid);
READ_BOOL_FIELD(skip_validation);
READ_BOOL_FIELD(initially_valid);
break;
case CONSTR_ATTR_DEFERRABLE:
case CONSTR_ATTR_NOT_DEFERRABLE:
case CONSTR_ATTR_DEFERRED:
case CONSTR_ATTR_IMMEDIATE:
/* no extra fields */
break;
default:
elog(ERROR, "unrecognized ConstrType: %d", (int) local_node->contype);
break;
}
READ_DONE();
}
static RangeTblEntry *
_readRangeTblEntry(void)
{
READ_LOCALS(RangeTblEntry);
/* put alias + eref first to make dump more legible */
READ_NODE_FIELD(alias);
READ_NODE_FIELD(eref);
READ_ENUM_FIELD(rtekind, RTEKind);
switch (local_node->rtekind)
{
case RTE_RELATION:
READ_OID_FIELD(relid);
READ_CHAR_FIELD(relkind);
READ_INT_FIELD(rellockmode);
READ_NODE_FIELD(tablesample);
READ_UINT_FIELD(perminfoindex);
break;
case RTE_SUBQUERY:
READ_NODE_FIELD(subquery);
READ_BOOL_FIELD(security_barrier);
break;
case RTE_JOIN:
READ_ENUM_FIELD(jointype, JoinType);
READ_INT_FIELD(joinmergedcols);
READ_NODE_FIELD(joinaliasvars);
READ_NODE_FIELD(joinleftcols);
READ_NODE_FIELD(joinrightcols);
READ_NODE_FIELD(join_using_alias);
break;
case RTE_FUNCTION:
READ_NODE_FIELD(functions);
READ_BOOL_FIELD(funcordinality);
break;
case RTE_TABLEFUNC:
READ_NODE_FIELD(tablefunc);
/* The RTE must have a copy of the column type info, if any */
if (local_node->tablefunc)
{
TableFunc *tf = local_node->tablefunc;
local_node->coltypes = tf->coltypes;
local_node->coltypmods = tf->coltypmods;
local_node->colcollations = tf->colcollations;
}
break;
case RTE_VALUES:
READ_NODE_FIELD(values_lists);
READ_NODE_FIELD(coltypes);
READ_NODE_FIELD(coltypmods);
READ_NODE_FIELD(colcollations);
break;
case RTE_CTE:
READ_STRING_FIELD(ctename);
READ_UINT_FIELD(ctelevelsup);
READ_BOOL_FIELD(self_reference);
READ_NODE_FIELD(coltypes);
READ_NODE_FIELD(coltypmods);
READ_NODE_FIELD(colcollations);
break;
case RTE_NAMEDTUPLESTORE:
READ_STRING_FIELD(enrname);
READ_FLOAT_FIELD(enrtuples);
READ_OID_FIELD(relid);
READ_NODE_FIELD(coltypes);
READ_NODE_FIELD(coltypmods);
READ_NODE_FIELD(colcollations);
break;
case RTE_RESULT:
/* no extra fields */
break;
default:
elog(ERROR, "unrecognized RTE kind: %d",
(int) local_node->rtekind);
break;
}
READ_BOOL_FIELD(lateral);
READ_BOOL_FIELD(inh);
READ_BOOL_FIELD(inFromCl);
READ_NODE_FIELD(securityQuals);
READ_DONE();
}
static A_Expr *
_readA_Expr(void)
{
READ_LOCALS(A_Expr);
token = pg_strtok(&length);
if (length == 3 && strncmp(token, "ANY", 3) == 0)
{
local_node->kind = AEXPR_OP_ANY;
READ_NODE_FIELD(name);
}
else if (length == 3 && strncmp(token, "ALL", 3) == 0)
{
local_node->kind = AEXPR_OP_ALL;
READ_NODE_FIELD(name);
}
else if (length == 8 && strncmp(token, "DISTINCT", 8) == 0)
{
local_node->kind = AEXPR_DISTINCT;
READ_NODE_FIELD(name);
}
else if (length == 12 && strncmp(token, "NOT_DISTINCT", 12) == 0)
{
local_node->kind = AEXPR_NOT_DISTINCT;
READ_NODE_FIELD(name);
}
else if (length == 6 && strncmp(token, "NULLIF", 6) == 0)
{
local_node->kind = AEXPR_NULLIF;
READ_NODE_FIELD(name);
}
else if (length == 2 && strncmp(token, "IN", 2) == 0)
{
local_node->kind = AEXPR_IN;
READ_NODE_FIELD(name);
}
else if (length == 4 && strncmp(token, "LIKE", 4) == 0)
{
local_node->kind = AEXPR_LIKE;
READ_NODE_FIELD(name);
}
else if (length == 5 && strncmp(token, "ILIKE", 5) == 0)
{
local_node->kind = AEXPR_ILIKE;
READ_NODE_FIELD(name);
}
else if (length == 7 && strncmp(token, "SIMILAR", 7) == 0)
{
local_node->kind = AEXPR_SIMILAR;
READ_NODE_FIELD(name);
}
else if (length == 7 && strncmp(token, "BETWEEN", 7) == 0)
{
local_node->kind = AEXPR_BETWEEN;
READ_NODE_FIELD(name);
}
else if (length == 11 && strncmp(token, "NOT_BETWEEN", 11) == 0)
{
local_node->kind = AEXPR_NOT_BETWEEN;
READ_NODE_FIELD(name);
}
else if (length == 11 && strncmp(token, "BETWEEN_SYM", 11) == 0)
{
local_node->kind = AEXPR_BETWEEN_SYM;
READ_NODE_FIELD(name);
}
else if (length == 15 && strncmp(token, "NOT_BETWEEN_SYM", 15) == 0)
{
local_node->kind = AEXPR_NOT_BETWEEN_SYM;
READ_NODE_FIELD(name);
}
else if (length == 5 && strncmp(token, ":name", 5) == 0)
{
local_node->kind = AEXPR_OP;
local_node->name = nodeRead(NULL, 0);
}
else
elog(ERROR, "unrecognized A_Expr kind: \"%.*s\"", length, token);
READ_NODE_FIELD(lexpr);
READ_NODE_FIELD(rexpr);
READ_LOCATION_FIELD(location);
READ_DONE();
}
static ExtensibleNode *
_readExtensibleNode(void)
{
const ExtensibleNodeMethods *methods;
ExtensibleNode *local_node;
const char *extnodename;
READ_TEMP_LOCALS();
token = pg_strtok(&length); /* skip :extnodename */
token = pg_strtok(&length); /* get extnodename */
extnodename = nullable_string(token, length);
if (!extnodename)
elog(ERROR, "extnodename has to be supplied");
methods = GetExtensibleNodeMethods(extnodename, false);
local_node = (ExtensibleNode *) newNode(methods->node_size,
T_ExtensibleNode);
local_node->extnodename = extnodename;
/* deserialize the private fields */
methods->nodeRead(local_node);
READ_DONE();
}
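/*
 * A minimal sketch (not from this file) of the extension side, assuming
 * a hypothetical extensible node "MyPlanData" with one int field.  The
 * methods struct and RegisterExtensibleNodeMethods() are declared in
 * nodes/extensible.h; pg_strtok() comes from nodes/readfuncs.h:
 *
 *	static void
 *	myplandata_read(ExtensibleNode *node)
 *	{
 *		MyPlanData *local_node = (MyPlanData *) node;
 *		const char *token;
 *		int			length;
 *
 *		token = pg_strtok(&length);		skip :myfield
 *		token = pg_strtok(&length);		get field value
 *		local_node->myfield = atoi(token);
 *	}
 *
 *	static const ExtensibleNodeMethods myplandata_methods = {
 *		.extnodename = "MyPlanData",
 *		.node_size = sizeof(MyPlanData),
 *		.nodeRead = myplandata_read,
 *		(nodeCopy, nodeEqual and nodeOut are also required)
 *	};
 *
 *	RegisterExtensibleNodeMethods(&myplandata_methods);
 */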
/*
* parseNodeString
*
* Given a character string representing a node tree, parseNodeString creates
* the internal node structure.
*
* The string to be read must already have been loaded into pg_strtok().
*/
Node *
parseNodeString(void)
{
void *return_value;
READ_TEMP_LOCALS();
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
token = pg_strtok(&length);
#define MATCH(tokname, namelen) \
(length == namelen && memcmp(token, tokname, namelen) == 0)
if (false)
;
#include "readfuncs.switch.c"
else
{
elog(ERROR, "badly formatted node string \"%.32s\"...", token);
return_value = NULL; /* keep compiler quiet */
}
return (Node *) return_value;
}
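/*
 * Usage sketch: callers normally do not invoke parseNodeString() directly.
 * stringToNode() in read.c loads the string into pg_strtok() and reaches
 * here via nodeRead().  A typical round trip, assuming "expr" is some
 * node tree:
 *
 *	char *str = nodeToString(expr);		serialize (outfuncs.c)
 *	Node *copy = stringToNode(str);		reconstruct the tree
 */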
/*
* readDatum
*
* Given a string representation of a constant, recreate the appropriate
* Datum. The string representation embeds length info, but not byValue,
* so we must be told that.
*/
Datum
readDatum(bool typbyval)
{
Size length,
i;
int tokenLength;
const char *token;
Datum res;
char *s;
/*
* read the actual length of the value
*/
token = pg_strtok(&tokenLength);
length = atoui(token);
token = pg_strtok(&tokenLength); /* read the '[' */
if (token == NULL || token[0] != '[')
elog(ERROR, "expected \"[\" to start datum, but got \"%s\"; length = %zu",
token ? token : "[NULL]", length);
if (typbyval)
{
if (length > (Size) sizeof(Datum))
elog(ERROR, "byval datum but length = %zu", length);
res = (Datum) 0;
s = (char *) (&res);
for (i = 0; i < (Size) sizeof(Datum); i++)
{
token = pg_strtok(&tokenLength);
s[i] = (char) atoi(token);
}
}
else if (length <= 0)
res = (Datum) NULL;
else
{
s = (char *) palloc(length);
for (i = 0; i < length; i++)
{
token = pg_strtok(&tokenLength);
s[i] = (char) atoi(token);
}
res = PointerGetDatum(s);
}
token = pg_strtok(&tokenLength); /* read the ']' */
if (token == NULL || token[0] != ']')
elog(ERROR, "expected \"]\" to end datum, but got \"%s\"; length = %zu",
token ? token : "[NULL]", length);
return res;
}
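/*
 * For example (an assumed dump, matching the format parsed above): on a
 * 64-bit little-endian build, a pass-by-value int4 Datum holding 7 is
 * written as
 *
 *	4 [ 7 0 0 0 0 0 0 0 ]
 *
 * (the value's length, then all sizeof(Datum) bytes), while a
 * pass-by-reference cstring "ab" of length 3 appears as
 *
 *	3 [ 97 98 0 ]
 */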
/*
* common implementation for scalar-array-reading functions
*
* The data format is either "<>" for a NULL pointer (in which case numCols
* is ignored) or "(item item item)" where the number of items must equal
* numCols. The convfunc must be okay with stopping at whitespace or a
* right parenthesis, since pg_strtok won't null-terminate the token.
*/
#define READ_SCALAR_ARRAY(fnname, datatype, convfunc) \
datatype * \
fnname(int numCols) \
{ \
datatype *vals; \
READ_TEMP_LOCALS(); \
token = pg_strtok(&length); \
if (token == NULL) \
elog(ERROR, "incomplete scalar array"); \
if (length == 0) \
return NULL; /* it was "<>", so return NULL pointer */ \
if (length != 1 || token[0] != '(') \
elog(ERROR, "unrecognized token: \"%.*s\"", length, token); \
vals = (datatype *) palloc(numCols * sizeof(datatype)); \
for (int i = 0; i < numCols; i++) \
{ \
token = pg_strtok(&length); \
if (token == NULL || token[0] == ')') \
elog(ERROR, "incomplete scalar array"); \
vals[i] = convfunc(token); \
} \
token = pg_strtok(&length); \
if (token == NULL || length != 1 || token[0] != ')') \
elog(ERROR, "incomplete scalar array"); \
return vals; \
}
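/*
 * For example (an assumed dump, matching the format above): given the
 * tokens "(1 3 2)", readAttrNumberCols(3) returns a palloc'd array of
 * three AttrNumbers {1, 3, 2}, while "<>" makes it return NULL.
 */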
/*
* Note: these functions are exported in nodes.h for possible use by
* extensions, so don't mess too much with their names or API.
*/
READ_SCALAR_ARRAY(readAttrNumberCols, int16, atoi)
READ_SCALAR_ARRAY(readOidCols, Oid, atooid)
/* outfuncs.c has writeIndexCols, but we don't yet need that here */
/* READ_SCALAR_ARRAY(readIndexCols, Index, atoui) */
READ_SCALAR_ARRAY(readIntCols, int, atoi)
READ_SCALAR_ARRAY(readBoolCols, bool, strtobool)