postgresql/src/backend/nodes/outfuncs.c

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

901 lines
23 KiB
C
Raw Normal View History

/*-------------------------------------------------------------------------
*
* outfuncs.c
* Output functions for Postgres tree nodes.
*
* Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
*
* IDENTIFICATION
2010-09-20 22:08:53 +02:00
* src/backend/nodes/outfuncs.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include <ctype.h>
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
#include "access/attnum.h"
#include "common/shortest_dec.h"
#include "lib/stringinfo.h"
#include "miscadmin.h"
Automatically generate node support functions Add a script to automatically generate the node support functions (copy, equal, out, and read, as well as the node tags enum) from the struct definitions. For each of the four node support files, it creates two include files, e.g., copyfuncs.funcs.c and copyfuncs.switch.c, to include in the main file. All the scaffolding of the main file stays in place. I have tried to mostly make the coverage of the output match what is currently there. For example, one could now do out/read coverage of utility statement nodes, but I have manually excluded those for now. The reason is mainly that it's easier to diff the before and after, and adding a bunch of stuff like this might require a separate analysis and review. Subtyping (TidScan -> Scan) is supported. For the hard cases, you can just write a manual function and exclude generating one. For the not so hard cases, there is a way of annotating struct fields to get special behaviors. For example, pg_node_attr(equal_ignore) has the field ignored in equal functions. (In this patch, I have only ifdef'ed out the code to could be removed, mainly so that it won't constantly have merge conflicts. It will be deleted in a separate patch. All the code comments that are worth keeping from those sections have already been moved to the header files where the structs are defined.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce%40enterprisedb.com
2022-07-09 08:52:19 +02:00
#include "nodes/bitmapset.h"
#include "nodes/nodes.h"
#include "nodes/pg_list.h"
1999-07-16 07:00:38 +02:00
#include "utils/datum.h"
static void outChar(StringInfo str, char c);
static void outDouble(StringInfo str, double d);
/*
* Macros to simplify output of different kinds of fields. Use these
* wherever possible to reduce the chance for silly typos. Note that these
* hard-wire conventions about the names of the local variables in an Out
* routine.
*/
/* Write the label for the node type */
#define WRITE_NODE_TYPE(nodelabel) \
appendStringInfoString(str, nodelabel)
/* Write an integer field (anything written as ":fldname %d") */
#define WRITE_INT_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname)
/* Write an unsigned integer field (anything written as ":fldname %u") */
#define WRITE_UINT_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %u", node->fldname)
/* Write an unsigned integer field (anything written with UINT64_FORMAT) */
#define WRITE_UINT64_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " " UINT64_FORMAT, \
node->fldname)
/* Write an OID field (don't hard-wire assumption that OID is same as uint) */
#define WRITE_OID_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %u", node->fldname)
/* Write a long-integer field */
#define WRITE_LONG_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %ld", node->fldname)
/* Write a char field (ie, one ascii character) */
#define WRITE_CHAR_FIELD(fldname) \
(appendStringInfo(str, " :" CppAsString(fldname) " "), \
outChar(str, node->fldname))
/* Write an enumerated-type field as an integer code */
#define WRITE_ENUM_FIELD(fldname, enumtype) \
appendStringInfo(str, " :" CppAsString(fldname) " %d", \
(int) node->fldname)
/* Write a float field (actually, they're double) */
#define WRITE_FLOAT_FIELD(fldname) \
(appendStringInfo(str, " :" CppAsString(fldname) " "), \
outDouble(str, node->fldname))
/* Write a boolean field */
#define WRITE_BOOL_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %s", \
booltostr(node->fldname))
/* Write a character-string (possibly NULL) field */
#define WRITE_STRING_FIELD(fldname) \
(appendStringInfoString(str, " :" CppAsString(fldname) " "), \
outToken(str, node->fldname))
/* Write a parse location field (actually same as INT case) */
#define WRITE_LOCATION_FIELD(fldname) \
appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname)
/* Write a Node field */
#define WRITE_NODE_FIELD(fldname) \
(appendStringInfoString(str, " :" CppAsString(fldname) " "), \
outNode(str, node->fldname))
/* Write a bitmapset field */
#define WRITE_BITMAPSET_FIELD(fldname) \
(appendStringInfoString(str, " :" CppAsString(fldname) " "), \
outBitmapset(str, node->fldname))
/* Write a variable-length array (not a List) of Node pointers */
#define WRITE_NODE_ARRAY(fldname, len) \
(appendStringInfoString(str, " :" CppAsString(fldname) " "), \
writeNodeArray(str, (const Node * const *) node->fldname, len))
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
/* Write a variable-length array of AttrNumber */
#define WRITE_ATTRNUMBER_ARRAY(fldname, len) \
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
(appendStringInfoString(str, " :" CppAsString(fldname) " "), \
writeAttrNumberCols(str, node->fldname, len))
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
/* Write a variable-length array of Oid */
#define WRITE_OID_ARRAY(fldname, len) \
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
(appendStringInfoString(str, " :" CppAsString(fldname) " "), \
writeOidCols(str, node->fldname, len))
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
/* Write a variable-length array of Index */
#define WRITE_INDEX_ARRAY(fldname, len) \
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
(appendStringInfoString(str, " :" CppAsString(fldname) " "), \
writeIndexCols(str, node->fldname, len))
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
/* Write a variable-length array of int */
#define WRITE_INT_ARRAY(fldname, len) \
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
(appendStringInfoString(str, " :" CppAsString(fldname) " "), \
writeIntCols(str, node->fldname, len))
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
/* Write a variable-length array of bool */
#define WRITE_BOOL_ARRAY(fldname, len) \
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
(appendStringInfoString(str, " :" CppAsString(fldname) " "), \
writeBoolCols(str, node->fldname, len))
#define booltostr(x) ((x) ? "true" : "false")
/*
* outToken
* Convert an ordinary string (eg, an identifier) into a form that
* will be decoded back to a plain token by read.c's functions.
*
* If a null string pointer is given, it is encoded as '<>'.
* An empty string is encoded as '""'. To avoid ambiguity, input
* strings beginning with '<' or '"' receive a leading backslash.
*/
void
outToken(StringInfo str, const char *s)
{
if (s == NULL)
{
appendStringInfoString(str, "<>");
return;
}
if (*s == '\0')
{
appendStringInfoString(str, "\"\"");
return;
}
/*
* Look for characters or patterns that are treated specially by read.c
* (either in pg_strtok() or in nodeRead()), and therefore need a
* protective backslash.
*/
/* These characters only need to be quoted at the start of the string */
if (*s == '<' ||
*s == '"' ||
isdigit((unsigned char) *s) ||
((*s == '+' || *s == '-') &&
(isdigit((unsigned char) s[1]) || s[1] == '.')))
appendStringInfoChar(str, '\\');
while (*s)
{
/* These chars must be backslashed anywhere in the string */
if (*s == ' ' || *s == '\n' || *s == '\t' ||
*s == '(' || *s == ')' || *s == '{' || *s == '}' ||
*s == '\\')
appendStringInfoChar(str, '\\');
appendStringInfoChar(str, *s++);
}
}
/*
* Convert one char. Goes through outToken() so that special characters are
* escaped.
*/
static void
outChar(StringInfo str, char c)
{
char in[2];
/* Traditionally, we've represented \0 as <>, so keep doing that */
if (c == '\0')
{
appendStringInfoString(str, "<>");
return;
}
in[0] = c;
in[1] = '\0';
outToken(str, in);
}
/*
* Convert a double value, attempting to ensure the value is preserved exactly.
*/
static void
outDouble(StringInfo str, double d)
{
char buf[DOUBLE_SHORTEST_DECIMAL_LEN];
double_to_shortest_decimal_buf(d, buf);
appendStringInfoString(str, buf);
}
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
/*
* common implementation for scalar-array-writing functions
*
* The data format is either "<>" for a NULL pointer or "(item item item)".
* fmtstr must include a leading space, and the rest of it must produce
* something that will be seen as a single simple token by pg_strtok().
* convfunc can be empty, or the name of a conversion macro or function.
*/
#define WRITE_SCALAR_ARRAY(fnname, datatype, fmtstr, convfunc) \
static void \
fnname(StringInfo str, const datatype *arr, int len) \
{ \
if (arr != NULL) \
{ \
appendStringInfoChar(str, '('); \
for (int i = 0; i < len; i++) \
appendStringInfo(str, fmtstr, convfunc(arr[i])); \
appendStringInfoChar(str, ')'); \
} \
else \
appendStringInfoString(str, "<>"); \
}
WRITE_SCALAR_ARRAY(writeAttrNumberCols, AttrNumber, " %d",)
WRITE_SCALAR_ARRAY(writeOidCols, Oid, " %u",)
WRITE_SCALAR_ARRAY(writeIndexCols, Index, " %u",)
WRITE_SCALAR_ARRAY(writeIntCols, int, " %d",)
WRITE_SCALAR_ARRAY(writeBoolCols, bool, " %s", booltostr)
/*
* Print an array (not a List) of Node pointers.
*
* The decoration is identical to that of scalar arrays, but we can't
* quite use appendStringInfo() in the loop.
*/
static void
writeNodeArray(StringInfo str, const Node *const *arr, int len)
{
if (arr != NULL)
{
appendStringInfoChar(str, '(');
for (int i = 0; i < len; i++)
{
appendStringInfoChar(str, ' ');
outNode(str, arr[i]);
}
appendStringInfoChar(str, ')');
}
else
appendStringInfoString(str, "<>");
}
Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us
2022-07-20 19:04:33 +02:00
/*
* Print a List.
*/
static void
_outList(StringInfo str, const List *node)
{
const ListCell *lc;
appendStringInfoChar(str, '(');
if (IsA(node, IntList))
appendStringInfoChar(str, 'i');
else if (IsA(node, OidList))
appendStringInfoChar(str, 'o');
else if (IsA(node, XidList))
appendStringInfoChar(str, 'x');
foreach(lc, node)
{
/*
* For the sake of backward compatibility, we emit a slightly
* different whitespace format for lists of nodes vs. other types of
* lists. XXX: is this necessary?
*/
if (IsA(node, List))
{
outNode(str, lfirst(lc));
Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 19:41:58 +02:00
if (lnext(node, lc))
appendStringInfoChar(str, ' ');
}
else if (IsA(node, IntList))
appendStringInfo(str, " %d", lfirst_int(lc));
else if (IsA(node, OidList))
appendStringInfo(str, " %u", lfirst_oid(lc));
else if (IsA(node, XidList))
appendStringInfo(str, " %u", lfirst_xid(lc));
else
elog(ERROR, "unrecognized list node type: %d",
(int) node->type);
}
appendStringInfoChar(str, ')');
}
/*
* outBitmapset -
* converts a bitmap set of integers
*
* Note: the output format is "(b int int ...)", similar to an integer List.
*
* We export this function for use by extensions that define extensible nodes.
* That's somewhat historical, though, because calling outNode() will work.
*/
void
outBitmapset(StringInfo str, const Bitmapset *bms)
{
int x;
appendStringInfoChar(str, '(');
appendStringInfoChar(str, 'b');
x = -1;
while ((x = bms_next_member(bms, x)) >= 0)
appendStringInfo(str, " %d", x);
appendStringInfoChar(str, ')');
}
/*
* Print the value of a Datum given its type.
*/
void
outDatum(StringInfo str, Datum value, int typlen, bool typbyval)
{
Size length,
i;
char *s;
length = datumGetSize(value, typbyval, typlen);
1999-01-21 17:08:55 +01:00
if (typbyval)
{
s = (char *) (&value);
appendStringInfo(str, "%u [ ", (unsigned int) length);
for (i = 0; i < (Size) sizeof(Datum); i++)
appendStringInfo(str, "%d ", (int) (s[i]));
appendStringInfoChar(str, ']');
}
else
{
s = (char *) DatumGetPointer(value);
if (!PointerIsValid(s))
appendStringInfoString(str, "0 [ ]");
else
{
appendStringInfo(str, "%u [ ", (unsigned int) length);
for (i = 0; i < length; i++)
appendStringInfo(str, "%d ", (int) (s[i]));
appendStringInfoChar(str, ']');
}
}
}
Automatically generate node support functions Add a script to automatically generate the node support functions (copy, equal, out, and read, as well as the node tags enum) from the struct definitions. For each of the four node support files, it creates two include files, e.g., copyfuncs.funcs.c and copyfuncs.switch.c, to include in the main file. All the scaffolding of the main file stays in place. I have tried to mostly make the coverage of the output match what is currently there. For example, one could now do out/read coverage of utility statement nodes, but I have manually excluded those for now. The reason is mainly that it's easier to diff the before and after, and adding a bunch of stuff like this might require a separate analysis and review. Subtyping (TidScan -> Scan) is supported. For the hard cases, you can just write a manual function and exclude generating one. For the not so hard cases, there is a way of annotating struct fields to get special behaviors. For example, pg_node_attr(equal_ignore) has the field ignored in equal functions. (In this patch, I have only ifdef'ed out the code to could be removed, mainly so that it won't constantly have merge conflicts. It will be deleted in a separate patch. All the code comments that are worth keeping from those sections have already been moved to the header files where the structs are defined.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce%40enterprisedb.com
2022-07-09 08:52:19 +02:00
#include "outfuncs.funcs.c"
/*
* Support functions for nodes with custom_read_write attribute or
* special_read_write attribute
*/
static void
_outConst(StringInfo str, const Const *node)
{
WRITE_NODE_TYPE("CONST");
WRITE_OID_FIELD(consttype);
WRITE_INT_FIELD(consttypmod);
WRITE_OID_FIELD(constcollid);
WRITE_INT_FIELD(constlen);
WRITE_BOOL_FIELD(constbyval);
WRITE_BOOL_FIELD(constisnull);
WRITE_LOCATION_FIELD(location);
appendStringInfoString(str, " :constvalue ");
if (node->constisnull)
appendStringInfoString(str, "<>");
else
outDatum(str, node->constvalue, node->constlen, node->constbyval);
}
static void
_outBoolExpr(StringInfo str, const BoolExpr *node)
{
char *opstr = NULL;
WRITE_NODE_TYPE("BOOLEXPR");
/* do-it-yourself enum representation */
switch (node->boolop)
{
case AND_EXPR:
opstr = "and";
break;
case OR_EXPR:
opstr = "or";
break;
case NOT_EXPR:
opstr = "not";
break;
}
appendStringInfoString(str, " :boolop ");
outToken(str, opstr);
WRITE_NODE_FIELD(args);
WRITE_LOCATION_FIELD(location);
}
static void
_outForeignKeyOptInfo(StringInfo str, const ForeignKeyOptInfo *node)
{
int i;
WRITE_NODE_TYPE("FOREIGNKEYOPTINFO");
Move targetlist SRF handling from expression evaluation to new executor node. Evaluation of set returning functions (SRFs_ in the targetlist (like SELECT generate_series(1,5)) so far was done in the expression evaluation (i.e. ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code. This meant that most executor nodes performing projection, and most expression evaluation functions, had to deal with the possibility that an evaluated expression could return a set of return values. That's bad because it leads to repeated code in a lot of places. It also, and that's my (Andres's) motivation, made it a lot harder to implement a more efficient way of doing expression evaluation. To fix this, introduce a new executor node (ProjectSet) that can evaluate targetlists containing one or more SRFs. To avoid the complexity of the old way of handling nested expressions returning sets (e.g. having to pass up ExprDoneCond, and dealing with arguments to functions returning sets etc.), those SRFs can only be at the top level of the node's targetlist. The planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is only necessary in ProjectSet nodes and that SRFs are only present at the top level of the node's targetlist. If there are nested SRFs the planner creates multiple stacked ProjectSet nodes. The ProjectSet nodes always get input from an underlying node. We also discussed and prototyped evaluating targetlist SRFs using ROWS FROM(), but that turned out to be more complicated than we'd hoped. While moving SRF evaluation to ProjectSet would allow to retain the old "least common multiple" behavior when multiple SRFs are present in one targetlist (i.e. continue returning rows until all SRFs are at the end of their input at the same time), we decided to instead only return rows till all SRFs are exhausted, returning NULL for already exhausted ones. We deemed the previous behavior to be too confusing, unexpected and actually not particularly useful. As a side effect, the previously prohibited case of multiple set returning arguments to a function, is now allowed. Not because it's particularly desirable, but because it ends up working and there seems to be no argument for adding code to prohibit it. Currently the behavior for COALESCE and CASE containing SRFs has changed, returning multiple rows from the expression, even when the SRF containing "arm" of the expression is not evaluated. That's because the SRFs are evaluated in a separate ProjectSet node. As that's quite confusing, we're likely to instead prohibit SRFs in those places. But that's still being discussed, and the code would reside in places not touched here, so that's a task for later. There's a lot of, now superfluous, code dealing with set return expressions around. But as the changes to get rid of those are verbose largely boring, it seems better for readability to keep the cleanup as a separate commit. Author: Tom Lane and Andres Freund Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
2017-01-18 21:46:50 +01:00
WRITE_UINT_FIELD(con_relid);
WRITE_UINT_FIELD(ref_relid);
WRITE_INT_FIELD(nkeys);
WRITE_ATTRNUMBER_ARRAY(conkey, node->nkeys);
WRITE_ATTRNUMBER_ARRAY(confkey, node->nkeys);
WRITE_OID_ARRAY(conpfeqop, node->nkeys);
WRITE_INT_FIELD(nmatched_ec);
WRITE_INT_FIELD(nconst_ec);
WRITE_INT_FIELD(nmatched_rcols);
WRITE_INT_FIELD(nmatched_ri);
/* for compactness, just print the number of matches per column: */
appendStringInfoString(str, " :eclass");
for (i = 0; i < node->nkeys; i++)
appendStringInfo(str, " %d", (node->eclass[i] != NULL));
appendStringInfoString(str, " :rinfos");
for (i = 0; i < node->nkeys; i++)
appendStringInfo(str, " %d", list_length(node->rinfos[i]));
Move targetlist SRF handling from expression evaluation to new executor node. Evaluation of set returning functions (SRFs_ in the targetlist (like SELECT generate_series(1,5)) so far was done in the expression evaluation (i.e. ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code. This meant that most executor nodes performing projection, and most expression evaluation functions, had to deal with the possibility that an evaluated expression could return a set of return values. That's bad because it leads to repeated code in a lot of places. It also, and that's my (Andres's) motivation, made it a lot harder to implement a more efficient way of doing expression evaluation. To fix this, introduce a new executor node (ProjectSet) that can evaluate targetlists containing one or more SRFs. To avoid the complexity of the old way of handling nested expressions returning sets (e.g. having to pass up ExprDoneCond, and dealing with arguments to functions returning sets etc.), those SRFs can only be at the top level of the node's targetlist. The planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is only necessary in ProjectSet nodes and that SRFs are only present at the top level of the node's targetlist. If there are nested SRFs the planner creates multiple stacked ProjectSet nodes. The ProjectSet nodes always get input from an underlying node. We also discussed and prototyped evaluating targetlist SRFs using ROWS FROM(), but that turned out to be more complicated than we'd hoped. While moving SRF evaluation to ProjectSet would allow to retain the old "least common multiple" behavior when multiple SRFs are present in one targetlist (i.e. continue returning rows until all SRFs are at the end of their input at the same time), we decided to instead only return rows till all SRFs are exhausted, returning NULL for already exhausted ones. We deemed the previous behavior to be too confusing, unexpected and actually not particularly useful. As a side effect, the previously prohibited case of multiple set returning arguments to a function, is now allowed. Not because it's particularly desirable, but because it ends up working and there seems to be no argument for adding code to prohibit it. Currently the behavior for COALESCE and CASE containing SRFs has changed, returning multiple rows from the expression, even when the SRF containing "arm" of the expression is not evaluated. That's because the SRFs are evaluated in a separate ProjectSet node. As that's quite confusing, we're likely to instead prohibit SRFs in those places. But that's still being discussed, and the code would reside in places not touched here, so that's a task for later. There's a lot of, now superfluous, code dealing with set return expressions around. But as the changes to get rid of those are verbose largely boring, it seems better for readability to keep the cleanup as a separate commit. Author: Tom Lane and Andres Freund Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
2017-01-18 21:46:50 +01:00
}
static void
_outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
{
/*
* To simplify reading, we just chase up to the topmost merged EC and
* print that, without bothering to show the merge-ees separately.
*/
while (node->ec_merged)
node = node->ec_merged;
WRITE_NODE_TYPE("EQUIVALENCECLASS");
WRITE_NODE_FIELD(ec_opfamilies);
WRITE_OID_FIELD(ec_collation);
WRITE_NODE_FIELD(ec_members);
WRITE_NODE_FIELD(ec_sources);
WRITE_NODE_FIELD(ec_derives);
WRITE_BITMAPSET_FIELD(ec_relids);
WRITE_BOOL_FIELD(ec_has_const);
WRITE_BOOL_FIELD(ec_has_volatile);
WRITE_BOOL_FIELD(ec_broken);
WRITE_UINT_FIELD(ec_sortref);
WRITE_UINT_FIELD(ec_min_security);
WRITE_UINT_FIELD(ec_max_security);
}
static void
_outExtensibleNode(StringInfo str, const ExtensibleNode *node)
{
const ExtensibleNodeMethods *methods;
methods = GetExtensibleNodeMethods(node->extnodename, false);
WRITE_NODE_TYPE("EXTENSIBLENODE");
WRITE_STRING_FIELD(extnodename);
/* serialize the private fields */
methods->nodeOut(str, node);
}
static void
_outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
{
WRITE_NODE_TYPE("RANGETBLENTRY");
/* put alias + eref first to make dump more legible */
WRITE_NODE_FIELD(alias);
WRITE_NODE_FIELD(eref);
WRITE_ENUM_FIELD(rtekind, RTEKind);
switch (node->rtekind)
{
case RTE_RELATION:
WRITE_OID_FIELD(relid);
WRITE_CHAR_FIELD(relkind);
WRITE_INT_FIELD(rellockmode);
WRITE_NODE_FIELD(tablesample);
Rework query relation permission checking Currently, information about the permissions to be checked on relations mentioned in a query is stored in their range table entries. So the executor must scan the entire range table looking for relations that need to have permissions checked. This can make the permission checking part of the executor initialization needlessly expensive when many inheritance children are present in the range range. While the permissions need not be checked on the individual child relations, the executor still must visit every range table entry to filter them out. This commit moves the permission checking information out of the range table entries into a new plan node called RTEPermissionInfo. Every top-level (inheritance "root") RTE_RELATION entry in the range table gets one and a list of those is maintained alongside the range table. This new list is initialized by the parser when initializing the range table. The rewriter can add more entries to it as rules/views are expanded. Finally, the planner combines the lists of the individual subqueries into one flat list that is passed to the executor for checking. To make it quick to find the RTEPermissionInfo entry belonging to a given relation, RangeTblEntry gets a new Index field 'perminfoindex' that stores the corresponding RTEPermissionInfo's index in the query's list of the latter. ExecutorCheckPerms_hook has gained another List * argument; the signature is now: typedef bool (*ExecutorCheckPerms_hook_type) (List *rangeTable, List *rtePermInfos, bool ereport_on_violation); The first argument is no longer used by any in-core uses of the hook, but we leave it in place because there may be other implementations that do. Implementations should likely scan the rtePermInfos list to determine which operations to allow or deny. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail.gmail.com
2022-12-06 16:09:24 +01:00
WRITE_UINT_FIELD(perminfoindex);
break;
case RTE_SUBQUERY:
WRITE_NODE_FIELD(subquery);
WRITE_BOOL_FIELD(security_barrier);
Get rid of the "new" and "old" entries in a view's rangetable. The rule system needs "old" and/or "new" pseudo-RTEs in rule actions that are ON INSERT/UPDATE/DELETE. Historically it's put such entries into the ON SELECT rules of views as well, but those are really quite vestigial. The only thing we've used them for is to carry the view's relid forward to AcquireExecutorLocks (so that we can re-lock the view to verify it hasn't changed before re-using a plan) and to carry its relid and permissions data forward to execution-time permissions checks. What we can do instead of that is to retain these fields of the RTE_RELATION RTE for the view even after we convert it to an RTE_SUBQUERY RTE. This requires a tiny amount of extra complication in the planner and AcquireExecutorLocks, but on the other hand we can get rid of the logic that moves that data from one place to another. The principal immediate benefit of doing this, aside from a small saving in the pg_rewrite data for views, is that these pseudo-RTEs no longer trigger ruleutils.c's heuristic about qualifying variable names when the rangetable's length is more than 1. That results in quite a number of small simplifications in regression test outputs, which are all to the good IMO. Bump catversion because we need to dump a few more fields of RTE_SUBQUERY RTEs. While those will always be zeroes anyway in stored rules (because we'd never populate them until query rewrite) they are useful for debugging, and it seems like we'd better make sure to transmit such RTEs accurately in plans sent to parallel workers. I don't think the executor actually examines these fields after startup, but someday it might. This is a second attempt at committing 1b4d280ea. The difference from the first time is that now we can add some filtering rules to AdjustUpgrade.pm to allow cross-version upgrade testing to pass despite all the cosmetic changes in CREATE VIEW outputs. Amit Langote (filtering rules by me) Discussion: https://postgr.es/m/CA+HiwqEf7gPN4Hn+LoZ4tP2q_Qt7n3vw7-6fJKOf92tSEnX6Gg@mail.gmail.com Discussion: https://postgr.es/m/891521.1673657296@sss.pgh.pa.us
2023-01-18 19:23:57 +01:00
/* we re-use these RELATION fields, too: */
WRITE_OID_FIELD(relid);
WRITE_INT_FIELD(rellockmode);
WRITE_UINT_FIELD(perminfoindex);
break;
case RTE_JOIN:
WRITE_ENUM_FIELD(jointype, JoinType);
WRITE_INT_FIELD(joinmergedcols);
WRITE_NODE_FIELD(joinaliasvars);
WRITE_NODE_FIELD(joinleftcols);
WRITE_NODE_FIELD(joinrightcols);
WRITE_NODE_FIELD(join_using_alias);
break;
case RTE_FUNCTION:
WRITE_NODE_FIELD(functions);
WRITE_BOOL_FIELD(funcordinality);
break;
case RTE_TABLEFUNC:
WRITE_NODE_FIELD(tablefunc);
break;
case RTE_VALUES:
WRITE_NODE_FIELD(values_lists);
WRITE_NODE_FIELD(coltypes);
WRITE_NODE_FIELD(coltypmods);
WRITE_NODE_FIELD(colcollations);
break;
case RTE_CTE:
WRITE_STRING_FIELD(ctename);
WRITE_UINT_FIELD(ctelevelsup);
WRITE_BOOL_FIELD(self_reference);
WRITE_NODE_FIELD(coltypes);
WRITE_NODE_FIELD(coltypmods);
WRITE_NODE_FIELD(colcollations);
break;
case RTE_NAMEDTUPLESTORE:
WRITE_STRING_FIELD(enrname);
WRITE_FLOAT_FIELD(enrtuples);
WRITE_NODE_FIELD(coltypes);
WRITE_NODE_FIELD(coltypmods);
WRITE_NODE_FIELD(colcollations);
Get rid of the "new" and "old" entries in a view's rangetable. The rule system needs "old" and/or "new" pseudo-RTEs in rule actions that are ON INSERT/UPDATE/DELETE. Historically it's put such entries into the ON SELECT rules of views as well, but those are really quite vestigial. The only thing we've used them for is to carry the view's relid forward to AcquireExecutorLocks (so that we can re-lock the view to verify it hasn't changed before re-using a plan) and to carry its relid and permissions data forward to execution-time permissions checks. What we can do instead of that is to retain these fields of the RTE_RELATION RTE for the view even after we convert it to an RTE_SUBQUERY RTE. This requires a tiny amount of extra complication in the planner and AcquireExecutorLocks, but on the other hand we can get rid of the logic that moves that data from one place to another. The principal immediate benefit of doing this, aside from a small saving in the pg_rewrite data for views, is that these pseudo-RTEs no longer trigger ruleutils.c's heuristic about qualifying variable names when the rangetable's length is more than 1. That results in quite a number of small simplifications in regression test outputs, which are all to the good IMO. Bump catversion because we need to dump a few more fields of RTE_SUBQUERY RTEs. While those will always be zeroes anyway in stored rules (because we'd never populate them until query rewrite) they are useful for debugging, and it seems like we'd better make sure to transmit such RTEs accurately in plans sent to parallel workers. I don't think the executor actually examines these fields after startup, but someday it might. This is a second attempt at committing 1b4d280ea. The difference from the first time is that now we can add some filtering rules to AdjustUpgrade.pm to allow cross-version upgrade testing to pass despite all the cosmetic changes in CREATE VIEW outputs. Amit Langote (filtering rules by me) Discussion: https://postgr.es/m/CA+HiwqEf7gPN4Hn+LoZ4tP2q_Qt7n3vw7-6fJKOf92tSEnX6Gg@mail.gmail.com Discussion: https://postgr.es/m/891521.1673657296@sss.pgh.pa.us
2023-01-18 19:23:57 +01:00
/* we re-use these RELATION fields, too: */
WRITE_OID_FIELD(relid);
break;
case RTE_RESULT:
/* no extra fields */
break;
default:
elog(ERROR, "unrecognized RTE kind: %d", (int) node->rtekind);
break;
}
WRITE_BOOL_FIELD(lateral);
WRITE_BOOL_FIELD(inh);
WRITE_BOOL_FIELD(inFromCl);
WRITE_NODE_FIELD(securityQuals);
}
static void
_outA_Expr(StringInfo str, const A_Expr *node)
{
WRITE_NODE_TYPE("A_EXPR");
switch (node->kind)
{
case AEXPR_OP:
WRITE_NODE_FIELD(name);
break;
case AEXPR_OP_ANY:
appendStringInfoString(str, " ANY");
WRITE_NODE_FIELD(name);
break;
case AEXPR_OP_ALL:
appendStringInfoString(str, " ALL");
WRITE_NODE_FIELD(name);
break;
case AEXPR_DISTINCT:
appendStringInfoString(str, " DISTINCT");
WRITE_NODE_FIELD(name);
break;
case AEXPR_NOT_DISTINCT:
appendStringInfoString(str, " NOT_DISTINCT");
WRITE_NODE_FIELD(name);
break;
case AEXPR_NULLIF:
appendStringInfoString(str, " NULLIF");
WRITE_NODE_FIELD(name);
break;
case AEXPR_IN:
appendStringInfoString(str, " IN");
WRITE_NODE_FIELD(name);
break;
case AEXPR_LIKE:
appendStringInfoString(str, " LIKE");
WRITE_NODE_FIELD(name);
break;
case AEXPR_ILIKE:
appendStringInfoString(str, " ILIKE");
WRITE_NODE_FIELD(name);
break;
case AEXPR_SIMILAR:
appendStringInfoString(str, " SIMILAR");
WRITE_NODE_FIELD(name);
break;
case AEXPR_BETWEEN:
appendStringInfoString(str, " BETWEEN");
WRITE_NODE_FIELD(name);
break;
case AEXPR_NOT_BETWEEN:
appendStringInfoString(str, " NOT_BETWEEN");
WRITE_NODE_FIELD(name);
break;
case AEXPR_BETWEEN_SYM:
appendStringInfoString(str, " BETWEEN_SYM");
WRITE_NODE_FIELD(name);
break;
case AEXPR_NOT_BETWEEN_SYM:
appendStringInfoString(str, " NOT_BETWEEN_SYM");
WRITE_NODE_FIELD(name);
break;
default:
elog(ERROR, "unrecognized A_Expr_Kind: %d", (int) node->kind);
break;
}
WRITE_NODE_FIELD(lexpr);
WRITE_NODE_FIELD(rexpr);
WRITE_LOCATION_FIELD(location);
}
static void
_outInteger(StringInfo str, const Integer *node)
{
appendStringInfo(str, "%d", node->ival);
}
Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
static void
_outFloat(StringInfo str, const Float *node)
Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
{
/*
* We assume the value is a valid numeric literal and so does not need
* quoting.
*/
appendStringInfoString(str, node->fval);
Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.
2015-07-25 20:39:00 +02:00
}
static void
_outBoolean(StringInfo str, const Boolean *node)
{
appendStringInfoString(str, node->boolval ? "true" : "false");
}
static void
_outString(StringInfo str, const String *node)
{
/*
* We use outToken to provide escaping of the string's content, but we
* don't want it to convert an empty string to '""', because we're putting
* double quotes around the string already.
*/
appendStringInfoChar(str, '"');
if (node->sval[0] != '\0')
outToken(str, node->sval);
appendStringInfoChar(str, '"');
}
static void
_outBitString(StringInfo str, const BitString *node)
{
/* internal representation already has leading 'b' */
appendStringInfoString(str, node->bsval);
}
static void
_outA_Const(StringInfo str, const A_Const *node)
{
WRITE_NODE_TYPE("A_CONST");
if (node->isnull)
appendStringInfoString(str, " NULL");
else
{
appendStringInfoString(str, " :val ");
outNode(str, &node->val);
}
WRITE_LOCATION_FIELD(location);
}
static void
_outConstraint(StringInfo str, const Constraint *node)
{
WRITE_NODE_TYPE("CONSTRAINT");
WRITE_STRING_FIELD(conname);
WRITE_BOOL_FIELD(deferrable);
WRITE_BOOL_FIELD(initdeferred);
WRITE_LOCATION_FIELD(location);
appendStringInfoString(str, " :contype ");
switch (node->contype)
{
case CONSTR_NULL:
appendStringInfoString(str, "NULL");
break;
case CONSTR_NOTNULL:
appendStringInfoString(str, "NOT_NULL");
break;
case CONSTR_DEFAULT:
appendStringInfoString(str, "DEFAULT");
WRITE_NODE_FIELD(raw_expr);
WRITE_STRING_FIELD(cooked_expr);
break;
case CONSTR_IDENTITY:
appendStringInfoString(str, "IDENTITY");
WRITE_NODE_FIELD(options);
WRITE_CHAR_FIELD(generated_when);
break;
case CONSTR_GENERATED:
appendStringInfoString(str, "GENERATED");
WRITE_NODE_FIELD(raw_expr);
WRITE_STRING_FIELD(cooked_expr);
WRITE_CHAR_FIELD(generated_when);
break;
case CONSTR_CHECK:
appendStringInfoString(str, "CHECK");
WRITE_BOOL_FIELD(is_no_inherit);
WRITE_NODE_FIELD(raw_expr);
WRITE_STRING_FIELD(cooked_expr);
WRITE_BOOL_FIELD(skip_validation);
WRITE_BOOL_FIELD(initially_valid);
break;
case CONSTR_PRIMARY:
appendStringInfoString(str, "PRIMARY_KEY");
WRITE_NODE_FIELD(keys);
WRITE_NODE_FIELD(including);
WRITE_NODE_FIELD(options);
WRITE_STRING_FIELD(indexname);
WRITE_STRING_FIELD(indexspace);
WRITE_BOOL_FIELD(reset_default_tblspc);
/* access_method and where_clause not currently used */
break;
case CONSTR_UNIQUE:
appendStringInfoString(str, "UNIQUE");
WRITE_BOOL_FIELD(nulls_not_distinct);
WRITE_NODE_FIELD(keys);
WRITE_NODE_FIELD(including);
WRITE_NODE_FIELD(options);
WRITE_STRING_FIELD(indexname);
WRITE_STRING_FIELD(indexspace);
WRITE_BOOL_FIELD(reset_default_tblspc);
/* access_method and where_clause not currently used */
break;
case CONSTR_EXCLUSION:
appendStringInfoString(str, "EXCLUSION");
WRITE_NODE_FIELD(exclusions);
WRITE_NODE_FIELD(including);
WRITE_NODE_FIELD(options);
WRITE_STRING_FIELD(indexname);
WRITE_STRING_FIELD(indexspace);
WRITE_BOOL_FIELD(reset_default_tblspc);
WRITE_STRING_FIELD(access_method);
WRITE_NODE_FIELD(where_clause);
break;
case CONSTR_FOREIGN:
appendStringInfoString(str, "FOREIGN_KEY");
WRITE_NODE_FIELD(pktable);
WRITE_NODE_FIELD(fk_attrs);
WRITE_NODE_FIELD(pk_attrs);
WRITE_CHAR_FIELD(fk_matchtype);
WRITE_CHAR_FIELD(fk_upd_action);
WRITE_CHAR_FIELD(fk_del_action);
WRITE_NODE_FIELD(fk_del_set_cols);
WRITE_NODE_FIELD(old_conpfeqop);
WRITE_OID_FIELD(old_pktable_oid);
WRITE_BOOL_FIELD(skip_validation);
WRITE_BOOL_FIELD(initially_valid);
break;
case CONSTR_ATTR_DEFERRABLE:
appendStringInfoString(str, "ATTR_DEFERRABLE");
break;
case CONSTR_ATTR_NOT_DEFERRABLE:
appendStringInfoString(str, "ATTR_NOT_DEFERRABLE");
break;
case CONSTR_ATTR_DEFERRED:
appendStringInfoString(str, "ATTR_DEFERRED");
break;
case CONSTR_ATTR_IMMEDIATE:
appendStringInfoString(str, "ATTR_IMMEDIATE");
break;
default:
elog(ERROR, "unrecognized ConstrType: %d", (int) node->contype);
break;
}
}
/*
* outNode -
* converts a Node into ascii string and append it to 'str'
*/
void
outNode(StringInfo str, const void *obj)
{
/* Guard against stack overflow due to overly complex expressions */
check_stack_depth();
if (obj == NULL)
appendStringInfoString(str, "<>");
else if (IsA(obj, List) || IsA(obj, IntList) || IsA(obj, OidList) ||
IsA(obj, XidList))
_outList(str, obj);
/* nodeRead does not want to see { } around these! */
else if (IsA(obj, Integer))
_outInteger(str, (Integer *) obj);
else if (IsA(obj, Float))
_outFloat(str, (Float *) obj);
else if (IsA(obj, Boolean))
_outBoolean(str, (Boolean *) obj);
else if (IsA(obj, String))
_outString(str, (String *) obj);
else if (IsA(obj, BitString))
_outBitString(str, (BitString *) obj);
else if (IsA(obj, Bitmapset))
outBitmapset(str, (Bitmapset *) obj);
else
{
appendStringInfoChar(str, '{');
switch (nodeTag(obj))
{
Automatically generate node support functions Add a script to automatically generate the node support functions (copy, equal, out, and read, as well as the node tags enum) from the struct definitions. For each of the four node support files, it creates two include files, e.g., copyfuncs.funcs.c and copyfuncs.switch.c, to include in the main file. All the scaffolding of the main file stays in place. I have tried to mostly make the coverage of the output match what is currently there. For example, one could now do out/read coverage of utility statement nodes, but I have manually excluded those for now. The reason is mainly that it's easier to diff the before and after, and adding a bunch of stuff like this might require a separate analysis and review. Subtyping (TidScan -> Scan) is supported. For the hard cases, you can just write a manual function and exclude generating one. For the not so hard cases, there is a way of annotating struct fields to get special behaviors. For example, pg_node_attr(equal_ignore) has the field ignored in equal functions. (In this patch, I have only ifdef'ed out the code to could be removed, mainly so that it won't constantly have merge conflicts. It will be deleted in a separate patch. All the code comments that are worth keeping from those sections have already been moved to the header files where the structs are defined.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce%40enterprisedb.com
2022-07-09 08:52:19 +02:00
#include "outfuncs.switch.c"
default:
2003-08-04 02:43:34 +02:00
/*
* This should be an ERROR, but it's too useful to be able to
* dump structures that outNode only understands part of.
*/
elog(WARNING, "could not dump unrecognized node type: %d",
(int) nodeTag(obj));
break;
}
appendStringInfoChar(str, '}');
}
}
/*
* nodeToString -
* returns the ascii representation of the Node as a palloc'd string
*/
char *
nodeToString(const void *obj)
{
StringInfoData str;
/* see stringinfo.h for an explanation of this maneuver */
initStringInfo(&str);
outNode(&str, obj);
return str.data;
}
/*
* bmsToString -
* returns the ascii representation of the Bitmapset as a palloc'd string
*/
char *
bmsToString(const Bitmapset *bms)
{
StringInfoData str;
/* see stringinfo.h for an explanation of this maneuver */
initStringInfo(&str);
outBitmapset(&str, bms);
return str.data;
}