postgresql/contrib/postgres_fdw/deparse.c

/*-------------------------------------------------------------------------
*
* deparse.c
* Query deparser for postgres_fdw
*
* This file includes functions that examine query WHERE clauses to see
* whether they're safe to send to the remote server for execution, as
* well as functions to construct the query text to be sent. The latter
* functionality is annoyingly duplicative of ruleutils.c, but there are
* enough special considerations that it seems best to keep this separate.
* One saving grace is that we only need deparse logic for node types that
* we consider safe to send.
*
* We assume that the remote session's search_path is exactly "pg_catalog",
* and thus we need to schema-qualify all and only names outside pg_catalog.
*
* We do not consider that it is ever safe to send COLLATE expressions to
* the remote server: it might not have the same collation names we do.
* (Later we might consider it safe to send COLLATE "C", but even that would
* fail on old remote servers.) An expression is considered safe to send
* only if all operator/function input collations used in it are traceable to
* Var(s) of the foreign table. That implies that if the remote server gets
* a different answer than we do, the foreign table's columns are not marked
* with collations that match the remote table's columns, which we can
* consider to be user error.
*
* Portions Copyright (c) 2012-2024, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/postgres_fdw/deparse.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "access/htup_details.h"
#include "access/sysattr.h"
#include "access/table.h"
#include "catalog/pg_aggregate.h"
#include "catalog/pg_authid.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_operator.h"
#include "catalog/pg_opfamily.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_ts_config.h"
#include "catalog/pg_ts_dict.h"
#include "catalog/pg_type.h"
#include "commands/defrem.h"
#include "commands/tablecmds.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "nodes/plannodes.h"
#include "optimizer/optimizer.h"
#include "optimizer/prep.h"
#include "optimizer/tlist.h"
#include "parser/parsetree.h"
#include "postgres_fdw.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "utils/syscache.h"
#include "utils/typcache.h"
/*
* Global context for foreign_expr_walker's search of an expression tree.
*/
typedef struct foreign_glob_cxt
{
PlannerInfo *root; /* global planner state */
RelOptInfo *foreignrel; /* the foreign relation we are planning for */
Relids relids; /* relids of base relations in the underlying
* scan */
} foreign_glob_cxt;
/*
* Local (per-tree-level) context for foreign_expr_walker's search.
* This is concerned with identifying collations used in the expression.
*/
typedef enum
{
FDW_COLLATE_NONE, /* expression is of a noncollatable type, or
* it has default collation that is not
* traceable to a foreign Var */
FDW_COLLATE_SAFE, /* collation derives from a foreign Var */
FDW_COLLATE_UNSAFE, /* collation is non-default and derives from
* something other than a foreign Var */
} FDWCollateState;
typedef struct foreign_loc_cxt
{
Oid collation; /* OID of current collation, if any */
FDWCollateState state; /* state of current collation choice */
} foreign_loc_cxt;
/*
* Context for deparseExpr
*/
typedef struct deparse_expr_cxt
{
PlannerInfo *root; /* global planner state */
RelOptInfo *foreignrel; /* the foreign relation we are planning for */
RelOptInfo *scanrel; /* the underlying scan relation. Same as
* foreignrel, when that represents a join or
* a base relation. */
StringInfo buf; /* output buffer to append to */
List **params_list; /* exprs that will become remote Params */
} deparse_expr_cxt;
#define REL_ALIAS_PREFIX "r"
/* Handy macro to add relation name qualification */
#define ADD_REL_QUALIFIER(buf, varno) \
appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno))
#define SUBQUERY_REL_ALIAS_PREFIX "s"
#define SUBQUERY_COL_ALIAS_PREFIX "c"
/*
* Functions to determine whether an expression can be evaluated safely on
* remote server.
*/
static bool foreign_expr_walker(Node *node,
foreign_glob_cxt *glob_cxt,
foreign_loc_cxt *outer_cxt,
foreign_loc_cxt *case_arg_cxt);
static char *deparse_type_name(Oid type_oid, int32 typemod);
/*
* Functions to construct string representation of a node tree.
*/
static void deparseTargetList(StringInfo buf,
RangeTblEntry *rte,
Index rtindex,
Relation rel,
bool is_returning,
Bitmapset *attrs_used,
bool qualify_col,
List **retrieved_attrs);
static void deparseExplicitTargetList(List *tlist,
bool is_returning,
List **retrieved_attrs,
deparse_expr_cxt *context);
static void deparseSubqueryTargetList(deparse_expr_cxt *context);
static void deparseReturningList(StringInfo buf, RangeTblEntry *rte,
Index rtindex, Relation rel,
bool trig_after_row,
List *withCheckOptionList,
List *returningList,
List **retrieved_attrs);
static void deparseColumnRef(StringInfo buf, int varno, int varattno,
RangeTblEntry *rte, bool qualify_col);
static void deparseRelation(StringInfo buf, Relation rel);
static void deparseExpr(Expr *node, deparse_expr_cxt *context);
static void deparseVar(Var *node, deparse_expr_cxt *context);
static void deparseConst(Const *node, deparse_expr_cxt *context, int showtype);
static void deparseParam(Param *node, deparse_expr_cxt *context);
static void deparseSubscriptingRef(SubscriptingRef *node, deparse_expr_cxt *context);
static void deparseFuncExpr(FuncExpr *node, deparse_expr_cxt *context);
static void deparseOpExpr(OpExpr *node, deparse_expr_cxt *context);
static bool isPlainForeignVar(Expr *node, deparse_expr_cxt *context);
static void deparseOperatorName(StringInfo buf, Form_pg_operator opform);
static void deparseDistinctExpr(DistinctExpr *node, deparse_expr_cxt *context);
static void deparseScalarArrayOpExpr(ScalarArrayOpExpr *node,
deparse_expr_cxt *context);
static void deparseRelabelType(RelabelType *node, deparse_expr_cxt *context);
static void deparseBoolExpr(BoolExpr *node, deparse_expr_cxt *context);
static void deparseNullTest(NullTest *node, deparse_expr_cxt *context);
static void deparseCaseExpr(CaseExpr *node, deparse_expr_cxt *context);
static void deparseArrayExpr(ArrayExpr *node, deparse_expr_cxt *context);
static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
deparse_expr_cxt *context);
static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
deparse_expr_cxt *context);
static void deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs,
deparse_expr_cxt *context);
static void deparseLockingClause(deparse_expr_cxt *context);
static void appendOrderByClause(List *pathkeys, bool has_final_sort,
deparse_expr_cxt *context);
static void appendLimitClause(deparse_expr_cxt *context);
static void appendConditions(List *exprs, deparse_expr_cxt *context);
static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
RelOptInfo *foreignrel, bool use_alias,
Index ignore_rel, List **ignore_conds,
List **additional_conds,
List **params_list);
static void appendWhereClause(List *exprs, List *additional_conds,
deparse_expr_cxt *context);
static void deparseFromExpr(List *quals, deparse_expr_cxt *context);
static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
RelOptInfo *foreignrel, bool make_subquery,
Index ignore_rel, List **ignore_conds,
List **additional_conds, List **params_list);
static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
static void appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
deparse_expr_cxt *context);
static void appendAggOrderBy(List *orderList, List *targetList,
deparse_expr_cxt *context);
static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno,
deparse_expr_cxt *context);
/*
* Helper functions
*/
static bool is_subquery_var(Var *node, RelOptInfo *foreignrel,
int *relno, int *colno);
static void get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel,
int *relno, int *colno);
/*
* Examine each qual clause in input_conds, and classify them into two groups,
* which are returned as two lists:
* - remote_conds contains expressions that can be evaluated remotely
* - local_conds contains expressions that can't be evaluated remotely
*/
void
classifyConditions(PlannerInfo *root,
RelOptInfo *baserel,
List *input_conds,
List **remote_conds,
List **local_conds)
{
ListCell *lc;
*remote_conds = NIL;
*local_conds = NIL;
foreach(lc, input_conds)
{
RestrictInfo *ri = lfirst_node(RestrictInfo, lc);
if (is_foreign_expr(root, baserel, ri->clause))
*remote_conds = lappend(*remote_conds, ri);
else
*local_conds = lappend(*local_conds, ri);
}
}
/*
* Returns true if given expr is safe to evaluate on the foreign server.
*/
bool
is_foreign_expr(PlannerInfo *root,
RelOptInfo *baserel,
Expr *expr)
{
foreign_glob_cxt glob_cxt;
foreign_loc_cxt loc_cxt;
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) (baserel->fdw_private);
/*
* Check that the expression consists of nodes that are safe to execute
* remotely.
*/
glob_cxt.root = root;
glob_cxt.foreignrel = baserel;
/*
* For an upper relation, use relids from its underneath scan relation,
* because the upperrel's own relids currently aren't set to anything
* meaningful by the core code. For other relations, use their own relids.
*/
if (IS_UPPER_REL(baserel))
glob_cxt.relids = fpinfo->outerrel->relids;
else
glob_cxt.relids = baserel->relids;
loc_cxt.collation = InvalidOid;
loc_cxt.state = FDW_COLLATE_NONE;
if (!foreign_expr_walker((Node *) expr, &glob_cxt, &loc_cxt, NULL))
return false;
/*
* If the expression has a valid collation that does not arise from a
* foreign var, the expression can not be sent over.
*/
if (loc_cxt.state == FDW_COLLATE_UNSAFE)
return false;
/*
* An expression which includes any mutable functions can't be sent over
* because its result is not stable. For example, sending now() to the remote
* side could cause confusion from clock offsets. Future versions might
* be able to make this choice with more granularity. (We check this last
* because it requires a lot of expensive catalog lookups.)
*/
if (contain_mutable_functions((Node *) expr))
return false;
/* OK to evaluate on the remote server */
return true;
}
/*
* Check if expression is safe to execute remotely, and return true if so.
*
* In addition, *outer_cxt is updated with collation information.
*
* case_arg_cxt is NULL if this subexpression is not inside a CASE-with-arg.
* Otherwise, it points to the collation info derived from the arg expression,
* which must be consulted by any CaseTestExpr.
*
* We must check that the expression contains only node types we can deparse,
* that all types/functions/operators are safe to send (they are "shippable"),
* and that all collations used in the expression derive from Vars of the
* foreign table. Because of the latter, the logic is pretty close to
* assign_collations_walker() in parse_collate.c, though we can assume here
* that the given expression is valid. Note function mutability is not
* currently considered here.
*/
static bool
foreign_expr_walker(Node *node,
foreign_glob_cxt *glob_cxt,
foreign_loc_cxt *outer_cxt,
foreign_loc_cxt *case_arg_cxt)
{
bool check_type = true;
PgFdwRelationInfo *fpinfo;
foreign_loc_cxt inner_cxt;
Oid collation;
FDWCollateState state;
/* Need do nothing for empty subexpressions */
if (node == NULL)
return true;
/* May need server info from baserel's fdw_private struct */
fpinfo = (PgFdwRelationInfo *) (glob_cxt->foreignrel->fdw_private);
/* Set up inner_cxt for possible recursion to child nodes */
inner_cxt.collation = InvalidOid;
inner_cxt.state = FDW_COLLATE_NONE;
switch (nodeTag(node))
{
case T_Var:
{
Var *var = (Var *) node;
/*
* If the Var is from the foreign table, we consider its
* collation (if any) safe to use. If it is from another
* table, we treat its collation the same way as we would a
* Param's collation, ie it's not safe for it to have a
* non-default collation.
*/
if (bms_is_member(var->varno, glob_cxt->relids) &&
var->varlevelsup == 0)
{
/* Var belongs to foreign table */
/*
* System columns other than ctid should not be sent to
* the remote, since we don't make any effort to ensure
* that local and remote values match (tableoid, in
* particular, almost certainly doesn't match).
*/
if (var->varattno < 0 &&
var->varattno != SelfItemPointerAttributeNumber)
return false;
/* Else check the collation */
collation = var->varcollid;
state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE;
}
else
{
/* Var belongs to some other table */
collation = var->varcollid;
if (collation == InvalidOid ||
collation == DEFAULT_COLLATION_OID)
{
/*
* It's noncollatable, or it's safe to combine with a
* collatable foreign Var, so set state to NONE.
*/
state = FDW_COLLATE_NONE;
}
else
{
/*
* Do not fail right away, since the Var might appear
* in a collation-insensitive context.
*/
state = FDW_COLLATE_UNSAFE;
}
}
}
break;
case T_Const:
{
Const *c = (Const *) node;
/*
* Constants of regproc and related types can't be shipped
* unless the referenced object is shippable. But NULL's ok.
* (See also the related code in dependency.c.)
*/
if (!c->constisnull)
{
switch (c->consttype)
{
case REGPROCOID:
case REGPROCEDUREOID:
if (!is_shippable(DatumGetObjectId(c->constvalue),
ProcedureRelationId, fpinfo))
return false;
break;
case REGOPEROID:
case REGOPERATOROID:
if (!is_shippable(DatumGetObjectId(c->constvalue),
OperatorRelationId, fpinfo))
return false;
break;
case REGCLASSOID:
if (!is_shippable(DatumGetObjectId(c->constvalue),
RelationRelationId, fpinfo))
return false;
break;
case REGTYPEOID:
if (!is_shippable(DatumGetObjectId(c->constvalue),
TypeRelationId, fpinfo))
return false;
break;
case REGCOLLATIONOID:
if (!is_shippable(DatumGetObjectId(c->constvalue),
CollationRelationId, fpinfo))
return false;
break;
case REGCONFIGOID:
/*
* For text search objects only, we weaken the
* normal shippability criterion to allow all OIDs
* below FirstNormalObjectId. Without this, none
* of the initdb-installed TS configurations would
* be shippable, which would be quite annoying.
*/
if (DatumGetObjectId(c->constvalue) >= FirstNormalObjectId &&
!is_shippable(DatumGetObjectId(c->constvalue),
TSConfigRelationId, fpinfo))
return false;
break;
case REGDICTIONARYOID:
if (DatumGetObjectId(c->constvalue) >= FirstNormalObjectId &&
!is_shippable(DatumGetObjectId(c->constvalue),
TSDictionaryRelationId, fpinfo))
return false;
break;
case REGNAMESPACEOID:
if (!is_shippable(DatumGetObjectId(c->constvalue),
NamespaceRelationId, fpinfo))
return false;
break;
case REGROLEOID:
if (!is_shippable(DatumGetObjectId(c->constvalue),
AuthIdRelationId, fpinfo))
return false;
break;
}
}
/*
* If the constant has nondefault collation, either it's of a
* non-builtin type, or it reflects folding of a CollateExpr.
* It's unsafe to send to the remote unless it's used in a
* non-collation-sensitive context.
*/
collation = c->constcollid;
if (collation == InvalidOid ||
collation == DEFAULT_COLLATION_OID)
state = FDW_COLLATE_NONE;
else
state = FDW_COLLATE_UNSAFE;
}
break;
case T_Param:
{
Param *p = (Param *) node;
/*
* If it's a MULTIEXPR Param, punt. We can't tell from here
* whether the referenced sublink/subplan contains any remote
* Vars; if it does, handling that is too complicated to
* consider supporting at present. Fortunately, MULTIEXPR
* Params are not reduced to plain PARAM_EXEC until the end of
* planning, so we can easily detect this case. (Normal
* PARAM_EXEC Params are safe to ship because their values
* come from somewhere else in the plan tree; but a MULTIEXPR
* references a sub-select elsewhere in the same targetlist,
* so we'd be on the hook to evaluate it somehow if we wanted
* to handle such cases as direct foreign updates.)
*/
if (p->paramkind == PARAM_MULTIEXPR)
return false;
/*
* Collation rule is same as for Consts and non-foreign Vars.
*/
collation = p->paramcollid;
if (collation == InvalidOid ||
collation == DEFAULT_COLLATION_OID)
state = FDW_COLLATE_NONE;
else
state = FDW_COLLATE_UNSAFE;
}
break;
case T_SubscriptingRef:
{
SubscriptingRef *sr = (SubscriptingRef *) node;
/* Assignment should not be in restrictions. */
if (sr->refassgnexpr != NULL)
return false;
/*
* Recurse into the remaining subexpressions. The container
* subscripts will not affect collation of the SubscriptingRef
* result, so do those first and reset inner_cxt afterwards.
*/
if (!foreign_expr_walker((Node *) sr->refupperindexpr,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
inner_cxt.collation = InvalidOid;
inner_cxt.state = FDW_COLLATE_NONE;
if (!foreign_expr_walker((Node *) sr->reflowerindexpr,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
inner_cxt.collation = InvalidOid;
inner_cxt.state = FDW_COLLATE_NONE;
if (!foreign_expr_walker((Node *) sr->refexpr,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
/*
* Container subscripting typically yields same collation as
* refexpr's, but in case it doesn't, use same logic as for
* function nodes.
*/
collation = sr->refcollid;
if (collation == InvalidOid)
state = FDW_COLLATE_NONE;
else if (inner_cxt.state == FDW_COLLATE_SAFE &&
collation == inner_cxt.collation)
state = FDW_COLLATE_SAFE;
else if (collation == DEFAULT_COLLATION_OID)
state = FDW_COLLATE_NONE;
else
state = FDW_COLLATE_UNSAFE;
}
break;
case T_FuncExpr:
{
FuncExpr *fe = (FuncExpr *) node;
/*
* If function used by the expression is not shippable, it
* can't be sent to remote because it might have incompatible
* semantics on remote side.
*/
if (!is_shippable(fe->funcid, ProcedureRelationId, fpinfo))
return false;
/*
* Recurse to input subexpressions.
*/
if (!foreign_expr_walker((Node *) fe->args,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
/*
* If function's input collation is not derived from a foreign
* Var, it can't be sent to remote.
*/
if (fe->inputcollid == InvalidOid)
/* OK, inputs are all noncollatable */ ;
else if (inner_cxt.state != FDW_COLLATE_SAFE ||
fe->inputcollid != inner_cxt.collation)
return false;
/*
* Detect whether node is introducing a collation not derived
* from a foreign Var. (If so, we just mark it unsafe for now
* rather than immediately returning false, since the parent
* node might not care.)
*/
collation = fe->funccollid;
if (collation == InvalidOid)
state = FDW_COLLATE_NONE;
else if (inner_cxt.state == FDW_COLLATE_SAFE &&
collation == inner_cxt.collation)
state = FDW_COLLATE_SAFE;
else if (collation == DEFAULT_COLLATION_OID)
state = FDW_COLLATE_NONE;
else
state = FDW_COLLATE_UNSAFE;
}
break;
case T_OpExpr:
case T_DistinctExpr: /* struct-equivalent to OpExpr */
{
OpExpr *oe = (OpExpr *) node;
/*
* Similarly, only shippable operators can be sent to remote.
* (If the operator is shippable, we assume its underlying
* function is too.)
*/
if (!is_shippable(oe->opno, OperatorRelationId, fpinfo))
return false;
/*
* Recurse to input subexpressions.
*/
if (!foreign_expr_walker((Node *) oe->args,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
/*
* If operator's input collation is not derived from a foreign
* Var, it can't be sent to remote.
*/
if (oe->inputcollid == InvalidOid)
/* OK, inputs are all noncollatable */ ;
else if (inner_cxt.state != FDW_COLLATE_SAFE ||
oe->inputcollid != inner_cxt.collation)
return false;
/* Result-collation handling is same as for functions */
collation = oe->opcollid;
if (collation == InvalidOid)
state = FDW_COLLATE_NONE;
else if (inner_cxt.state == FDW_COLLATE_SAFE &&
collation == inner_cxt.collation)
state = FDW_COLLATE_SAFE;
else if (collation == DEFAULT_COLLATION_OID)
state = FDW_COLLATE_NONE;
else
state = FDW_COLLATE_UNSAFE;
}
break;
case T_ScalarArrayOpExpr:
{
ScalarArrayOpExpr *oe = (ScalarArrayOpExpr *) node;
/*
* Again, only shippable operators can be sent to remote.
*/
if (!is_shippable(oe->opno, OperatorRelationId, fpinfo))
return false;
/*
* Recurse to input subexpressions.
*/
if (!foreign_expr_walker((Node *) oe->args,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
/*
* If operator's input collation is not derived from a foreign
* Var, it can't be sent to remote.
*/
if (oe->inputcollid == InvalidOid)
/* OK, inputs are all noncollatable */ ;
else if (inner_cxt.state != FDW_COLLATE_SAFE ||
oe->inputcollid != inner_cxt.collation)
return false;
/* Output is always boolean and so noncollatable. */
collation = InvalidOid;
state = FDW_COLLATE_NONE;
}
break;
case T_RelabelType:
{
RelabelType *r = (RelabelType *) node;
/*
* Recurse to input subexpression.
*/
if (!foreign_expr_walker((Node *) r->arg,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
/*
* RelabelType must not introduce a collation not derived from
* an input foreign Var (same logic as for a real function).
*/
collation = r->resultcollid;
if (collation == InvalidOid)
state = FDW_COLLATE_NONE;
else if (inner_cxt.state == FDW_COLLATE_SAFE &&
collation == inner_cxt.collation)
state = FDW_COLLATE_SAFE;
else if (collation == DEFAULT_COLLATION_OID)
state = FDW_COLLATE_NONE;
else
state = FDW_COLLATE_UNSAFE;
}
break;
case T_BoolExpr:
{
BoolExpr *b = (BoolExpr *) node;
/*
* Recurse to input subexpressions.
*/
if (!foreign_expr_walker((Node *) b->args,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
/* Output is always boolean and so noncollatable. */
collation = InvalidOid;
state = FDW_COLLATE_NONE;
}
break;
case T_NullTest:
{
NullTest *nt = (NullTest *) node;
/*
* Recurse to input subexpressions.
*/
if (!foreign_expr_walker((Node *) nt->arg,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
/* Output is always boolean and so noncollatable. */
collation = InvalidOid;
state = FDW_COLLATE_NONE;
}
break;
case T_CaseExpr:
{
CaseExpr *ce = (CaseExpr *) node;
foreign_loc_cxt arg_cxt;
foreign_loc_cxt tmp_cxt;
ListCell *lc;
/*
* Recurse to CASE's arg expression, if any. Its collation
* has to be saved aside for use while examining CaseTestExprs
* within the WHEN expressions.
*/
arg_cxt.collation = InvalidOid;
arg_cxt.state = FDW_COLLATE_NONE;
if (ce->arg)
{
if (!foreign_expr_walker((Node *) ce->arg,
glob_cxt, &arg_cxt, case_arg_cxt))
return false;
}
/* Examine the CaseWhen subexpressions. */
foreach(lc, ce->args)
{
CaseWhen *cw = lfirst_node(CaseWhen, lc);
if (ce->arg)
{
/*
* In a CASE-with-arg, the parser should have produced
* WHEN clauses of the form "CaseTestExpr = RHS",
* possibly with an implicit coercion inserted above
* the CaseTestExpr. However in an expression that's
* been through the optimizer, the WHEN clause could
* be almost anything (since the equality operator
* could have been expanded into an inline function).
* In such cases forbid pushdown, because
* deparseCaseExpr can't handle it.
*/
Node *whenExpr = (Node *) cw->expr;
List *opArgs;
if (!IsA(whenExpr, OpExpr))
return false;
opArgs = ((OpExpr *) whenExpr)->args;
if (list_length(opArgs) != 2 ||
!IsA(strip_implicit_coercions(linitial(opArgs)),
CaseTestExpr))
return false;
}
/*
* Recurse to WHEN expression, passing down the arg info.
* Its collation doesn't affect the result (really, it
* should be boolean and thus not have a collation).
*/
tmp_cxt.collation = InvalidOid;
tmp_cxt.state = FDW_COLLATE_NONE;
if (!foreign_expr_walker((Node *) cw->expr,
glob_cxt, &tmp_cxt, &arg_cxt))
return false;
/* Recurse to THEN expression. */
if (!foreign_expr_walker((Node *) cw->result,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
}
/* Recurse to ELSE expression. */
if (!foreign_expr_walker((Node *) ce->defresult,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
/*
* Detect whether node is introducing a collation not derived
* from a foreign Var. (If so, we just mark it unsafe for now
* rather than immediately returning false, since the parent
* node might not care.) This is the same as for function
* nodes, except that the input collation is derived from only
* the THEN and ELSE subexpressions.
*/
collation = ce->casecollid;
if (collation == InvalidOid)
state = FDW_COLLATE_NONE;
else if (inner_cxt.state == FDW_COLLATE_SAFE &&
collation == inner_cxt.collation)
state = FDW_COLLATE_SAFE;
else if (collation == DEFAULT_COLLATION_OID)
state = FDW_COLLATE_NONE;
else
state = FDW_COLLATE_UNSAFE;
}
break;
case T_CaseTestExpr:
{
CaseTestExpr *c = (CaseTestExpr *) node;
/* Punt if we seem not to be inside a CASE arg WHEN. */
if (!case_arg_cxt)
return false;
/*
* Otherwise, any nondefault collation attached to the
* CaseTestExpr node must be derived from foreign Var(s) in
* the CASE arg.
*/
collation = c->collation;
if (collation == InvalidOid)
state = FDW_COLLATE_NONE;
else if (case_arg_cxt->state == FDW_COLLATE_SAFE &&
collation == case_arg_cxt->collation)
state = FDW_COLLATE_SAFE;
else if (collation == DEFAULT_COLLATION_OID)
state = FDW_COLLATE_NONE;
else
state = FDW_COLLATE_UNSAFE;
}
break;
case T_ArrayExpr:
{
ArrayExpr *a = (ArrayExpr *) node;
/*
* Recurse to input subexpressions.
*/
if (!foreign_expr_walker((Node *) a->elements,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
/*
* ArrayExpr must not introduce a collation not derived from
* an input foreign Var (same logic as for a function).
*/
collation = a->array_collid;
if (collation == InvalidOid)
state = FDW_COLLATE_NONE;
else if (inner_cxt.state == FDW_COLLATE_SAFE &&
collation == inner_cxt.collation)
state = FDW_COLLATE_SAFE;
else if (collation == DEFAULT_COLLATION_OID)
state = FDW_COLLATE_NONE;
else
state = FDW_COLLATE_UNSAFE;
}
break;
case T_List:
{
List *l = (List *) node;
ListCell *lc;
/*
* Recurse to component subexpressions.
*/
foreach(lc, l)
{
if (!foreign_expr_walker((Node *) lfirst(lc),
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
}
/*
* When processing a list, collation state just bubbles up
* from the list elements.
*/
collation = inner_cxt.collation;
state = inner_cxt.state;
/* Don't apply exprType() to the list. */
check_type = false;
}
break;
case T_Aggref:
{
Aggref *agg = (Aggref *) node;
ListCell *lc;
/* Not safe to push down when not in grouping context */
if (!IS_UPPER_REL(glob_cxt->foreignrel))
return false;
/* Only non-split aggregates are pushable. */
if (agg->aggsplit != AGGSPLIT_SIMPLE)
return false;
/* As usual, it must be shippable. */
if (!is_shippable(agg->aggfnoid, ProcedureRelationId, fpinfo))
return false;
/*
* Recurse to input args. aggdirectargs, aggorder and
* aggdistinct are all present in args, so no need to check
* their shippability explicitly.
*/
foreach(lc, agg->args)
{
Node *n = (Node *) lfirst(lc);
/* If TargetEntry, extract the expression from it */
if (IsA(n, TargetEntry))
{
TargetEntry *tle = (TargetEntry *) n;
n = (Node *) tle->expr;
}
if (!foreign_expr_walker(n,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
}
/*
* For aggorder elements, check whether the sort operator, if
* specified, is shippable or not.
*/
if (agg->aggorder)
{
foreach(lc, agg->aggorder)
{
SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
Oid sortcoltype;
TypeCacheEntry *typentry;
TargetEntry *tle;
tle = get_sortgroupref_tle(srt->tleSortGroupRef,
agg->args);
sortcoltype = exprType((Node *) tle->expr);
typentry = lookup_type_cache(sortcoltype,
TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
/* Check shippability of non-default sort operator. */
if (srt->sortop != typentry->lt_opr &&
srt->sortop != typentry->gt_opr &&
!is_shippable(srt->sortop, OperatorRelationId,
fpinfo))
return false;
}
}
/* Check aggregate filter */
if (!foreign_expr_walker((Node *) agg->aggfilter,
glob_cxt, &inner_cxt, case_arg_cxt))
return false;
/*
* If aggregate's input collation is not derived from a
* foreign Var, it can't be sent to remote.
*/
if (agg->inputcollid == InvalidOid)
/* OK, inputs are all noncollatable */ ;
else if (inner_cxt.state != FDW_COLLATE_SAFE ||
agg->inputcollid != inner_cxt.collation)
return false;
/*
* Detect whether node is introducing a collation not derived
* from a foreign Var. (If so, we just mark it unsafe for now
* rather than immediately returning false, since the parent
* node might not care.)
*/
collation = agg->aggcollid;
if (collation == InvalidOid)
state = FDW_COLLATE_NONE;
else if (inner_cxt.state == FDW_COLLATE_SAFE &&
collation == inner_cxt.collation)
state = FDW_COLLATE_SAFE;
else if (collation == DEFAULT_COLLATION_OID)
state = FDW_COLLATE_NONE;
else
state = FDW_COLLATE_UNSAFE;
}
break;
default:
/*
* If it's anything else, assume it's unsafe. This list can be
* expanded later, but don't forget to add deparse support below.
*/
return false;
}
/*
* If result type of given expression is not shippable, it can't be sent
* to remote because it might have incompatible semantics on remote side.
*/
if (check_type && !is_shippable(exprType(node), TypeRelationId, fpinfo))
return false;
/*
* Now, merge my collation information into my parent's state.
*/
if (state > outer_cxt->state)
{
/* Override previous parent state */
outer_cxt->collation = collation;
outer_cxt->state = state;
}
else if (state == outer_cxt->state)
{
/* Merge, or detect error if there's a collation conflict */
switch (state)
{
case FDW_COLLATE_NONE:
/* Nothing + nothing is still nothing */
break;
case FDW_COLLATE_SAFE:
if (collation != outer_cxt->collation)
{
/*
* Non-default collation always beats default.
*/
if (outer_cxt->collation == DEFAULT_COLLATION_OID)
{
/* Override previous parent state */
outer_cxt->collation = collation;
}
else if (collation != DEFAULT_COLLATION_OID)
{
/*
* Conflict; show state as indeterminate. We don't
* want to "return false" right away, since parent
* node might not care about collation.
*/
outer_cxt->state = FDW_COLLATE_UNSAFE;
}
}
break;
case FDW_COLLATE_UNSAFE:
/* We're still conflicted ... */
break;
}
}
/* It looks OK */
return true;
}
/*
* Returns true if given expr is something we'd have to send the value of
* to the foreign server.
*
* This should return true when the expression is a shippable node that
* deparseExpr would add to context->params_list. Note that we don't care
* if the expression *contains* such a node, only whether one appears at top
* level. We need this to detect cases where setrefs.c would recognize a
* false match between an fdw_exprs item (which came from the params_list)
* and an entry in fdw_scan_tlist (which we're considering putting the given
* expression into).
*/
bool
is_foreign_param(PlannerInfo *root,
RelOptInfo *baserel,
Expr *expr)
{
if (expr == NULL)
return false;
switch (nodeTag(expr))
{
case T_Var:
{
/* It would have to be sent unless it's a foreign Var */
Var *var = (Var *) expr;
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) (baserel->fdw_private);
Relids relids;
if (IS_UPPER_REL(baserel))
relids = fpinfo->outerrel->relids;
else
relids = baserel->relids;
if (bms_is_member(var->varno, relids) && var->varlevelsup == 0)
return false; /* foreign Var, so not a param */
else
return true; /* it'd have to be a param */
break;
}
case T_Param:
/* Params always have to be sent to the foreign server */
return true;
default:
break;
}
return false;
}
/*
* Returns true if it's safe to push down the sort expression described by
* 'pathkey' to the foreign server.
*/
bool
is_foreign_pathkey(PlannerInfo *root,
RelOptInfo *baserel,
PathKey *pathkey)
{
EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
/*
* is_foreign_expr would detect volatile expressions as well, but checking
* ec_has_volatile here saves some cycles.
*/
if (pathkey_ec->ec_has_volatile)
return false;
/* can't push down the sort if the pathkey's opfamily is not shippable */
if (!is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo))
return false;
/* can push if a suitable EC member exists */
return (find_em_for_rel(root, pathkey_ec, baserel) != NULL);
}
/*
* Convert type OID + typmod info into a type name we can ship to the remote
* server. Someplace else had better have verified that this type name is
* expected to be known on the remote end.
*
* This is almost just format_type_with_typemod(), except that if left to its
* own devices, that function will make schema-qualification decisions based
* on the local search_path, which is wrong. We must schema-qualify all
* type names that are not in pg_catalog. We assume here that built-in types
* are all in pg_catalog and need not be qualified; otherwise, qualify.
*/
static char *
deparse_type_name(Oid type_oid, int32 typemod)
{
bits16 flags = FORMAT_TYPE_TYPEMOD_GIVEN;
if (!is_builtin(type_oid))
flags |= FORMAT_TYPE_FORCE_QUALIFY;
return format_type_extended(type_oid, typemod, flags);
}
/*
* Build the targetlist for given relation to be deparsed as SELECT clause.
*
* The output targetlist contains the columns that need to be fetched from the
* foreign server for the given relation. If foreignrel is an upper relation,
* then the output targetlist can also contain expressions to be evaluated on
* foreign server.
*/
List *
build_tlist_to_deparse(RelOptInfo *foreignrel)
{
List *tlist = NIL;
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
ListCell *lc;
/*
* For an upper relation, we have already built the target list while
* checking shippability, so just return that.
*/
if (IS_UPPER_REL(foreignrel))
return fpinfo->grouped_tlist;
/*
* We require columns specified in foreignrel->reltarget->exprs and those
* required for evaluating the local conditions.
*/
tlist = add_to_flat_tlist(tlist,
pull_var_clause((Node *) foreignrel->reltarget->exprs,
PVC_RECURSE_PLACEHOLDERS));
foreach(lc, fpinfo->local_conds)
{
RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc);
tlist = add_to_flat_tlist(tlist,
pull_var_clause((Node *) rinfo->clause,
PVC_RECURSE_PLACEHOLDERS));
}
return tlist;
}
/*
* Deparse SELECT statement for given relation into buf.
*
* tlist contains the list of desired columns to be fetched from foreign server.
* For a base relation fpinfo->attrs_used is used to construct SELECT clause,
* hence the tlist is ignored for a base relation.
*
* remote_conds is the list of conditions to be deparsed into the WHERE clause
* (or, in the case of upper relations, into the HAVING clause).
*
* If params_list is not NULL, it receives a list of Params and other-relation
* Vars used in the clauses; these values must be transmitted to the remote
* server as parameter values.
*
* If params_list is NULL, we're generating the query for EXPLAIN purposes,
* so Params and other-relation Vars should be replaced by dummy values.
*
* pathkeys is the list of pathkeys to order the result by.
*
* is_subquery is the flag to indicate whether to deparse the specified
* relation as a subquery.
*
* List of columns selected is returned in retrieved_attrs.
*/
void
deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel,
List *tlist, List *remote_conds, List *pathkeys,
bool has_final_sort, bool has_limit, bool is_subquery,
List **retrieved_attrs, List **params_list)
{
deparse_expr_cxt context;
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private;
List *quals;
/*
* We handle relations for foreign tables, joins between those and upper
* relations.
*/
Assert(IS_JOIN_REL(rel) || IS_SIMPLE_REL(rel) || IS_UPPER_REL(rel));
/* Fill portions of context common to upper, join and base relation */
context.buf = buf;
context.root = root;
context.foreignrel = rel;
context.scanrel = IS_UPPER_REL(rel) ? fpinfo->outerrel : rel;
context.params_list = params_list;
/* Construct SELECT clause */
deparseSelectSql(tlist, is_subquery, retrieved_attrs, &context);
/*
* For upper relations, the WHERE clause is built from the remote
* conditions of the underlying scan relation; otherwise, we can use the
* supplied list of remote conditions directly.
*/
if (IS_UPPER_REL(rel))
{
PgFdwRelationInfo *ofpinfo;
ofpinfo = (PgFdwRelationInfo *) fpinfo->outerrel->fdw_private;
quals = ofpinfo->remote_conds;
}
else
quals = remote_conds;
/* Construct FROM and WHERE clauses */
deparseFromExpr(quals, &context);
if (IS_UPPER_REL(rel))
{
/* Append GROUP BY clause */
appendGroupByClause(tlist, &context);
/* Append HAVING clause */
if (remote_conds)
{
appendStringInfoString(buf, " HAVING ");
appendConditions(remote_conds, &context);
}
}
/* Add ORDER BY clause if we found any useful pathkeys */
if (pathkeys)
appendOrderByClause(pathkeys, has_final_sort, &context);
/* Add LIMIT clause if necessary */
if (has_limit)
appendLimitClause(&context);
/* Add any necessary FOR UPDATE/SHARE. */
deparseLockingClause(&context);
}
/*
* Construct a simple SELECT statement that retrieves desired columns
* of the specified foreign table, and append it to "buf". The output
* contains just "SELECT ... ".
*
* We also create an integer List of the columns being retrieved, which is
* returned to *retrieved_attrs, unless we deparse the specified relation
* as a subquery.
*
* tlist is the list of desired columns. is_subquery is the flag to
* indicate whether to deparse the specified relation as a subquery.
* Read prologue of deparseSelectStmtForRel() for details.
*/
static void
deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs,
deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
RelOptInfo *foreignrel = context->foreignrel;
PlannerInfo *root = context->root;
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
/*
* Construct SELECT list
*/
appendStringInfoString(buf, "SELECT ");
if (is_subquery)
{
/*
* For a relation that is deparsed as a subquery, emit expressions
* specified in the relation's reltarget. Note that since this is for
* the subquery, no need to care about *retrieved_attrs.
*/
deparseSubqueryTargetList(context);
}
else if (IS_JOIN_REL(foreignrel) || IS_UPPER_REL(foreignrel))
{
/*
* For a join or upper relation the input tlist gives the list of
* columns required to be fetched from the foreign server.
*/
deparseExplicitTargetList(tlist, false, retrieved_attrs, context);
}
else
{
/*
* For a base relation fpinfo->attrs_used gives the list of columns
* required to be fetched from the foreign server.
*/
RangeTblEntry *rte = planner_rt_fetch(foreignrel->relid, root);
/*
* Core code already has some lock on each rel being planned, so we
* can use NoLock here.
*/
Relation rel = table_open(rte->relid, NoLock);
deparseTargetList(buf, rte, foreignrel->relid, rel, false,
fpinfo->attrs_used, false, retrieved_attrs);
table_close(rel, NoLock);
}
}
/*
* Construct a FROM clause and, if needed, a WHERE clause, and append those to
* "buf".
*
* quals is the list of clauses to be included in the WHERE clause.
* (These may or may not include RestrictInfo decoration.)
*/
static void
deparseFromExpr(List *quals, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
RelOptInfo *scanrel = context->scanrel;
List *additional_conds = NIL;
/* For upper relations, scanrel must be either a joinrel or a baserel */
Assert(!IS_UPPER_REL(context->foreignrel) ||
IS_JOIN_REL(scanrel) || IS_SIMPLE_REL(scanrel));
/* Construct FROM clause */
appendStringInfoString(buf, " FROM ");
deparseFromExprForRel(buf, context->root, scanrel,
(bms_membership(scanrel->relids) == BMS_MULTIPLE),
(Index) 0, NULL, &additional_conds,
context->params_list);
appendWhereClause(quals, additional_conds, context);
if (additional_conds != NIL)
list_free_deep(additional_conds);
}
/*
* Emit a target list that retrieves the columns specified in attrs_used.
* This is used for both SELECT and RETURNING targetlists; the is_returning
* parameter is true only for a RETURNING targetlist.
*
* The tlist text is appended to buf, and we also create an integer List
* of the columns being retrieved, which is returned to *retrieved_attrs.
*
* If qualify_col is true, add relation alias before the column name.
*/
static void
deparseTargetList(StringInfo buf,
RangeTblEntry *rte,
Index rtindex,
Relation rel,
bool is_returning,
Bitmapset *attrs_used,
bool qualify_col,
List **retrieved_attrs)
{
TupleDesc tupdesc = RelationGetDescr(rel);
bool have_wholerow;
bool first;
int i;
*retrieved_attrs = NIL;
/* If there's a whole-row reference, we'll need all the columns. */
have_wholerow = bms_is_member(0 - FirstLowInvalidHeapAttributeNumber,
attrs_used);
first = true;
for (i = 1; i <= tupdesc->natts; i++)
{
Form_pg_attribute attr = TupleDescAttr(tupdesc, i - 1);
/* Ignore dropped attributes. */
if (attr->attisdropped)
continue;
if (have_wholerow ||
bms_is_member(i - FirstLowInvalidHeapAttributeNumber,
attrs_used))
{
if (!first)
appendStringInfoString(buf, ", ");
else if (is_returning)
appendStringInfoString(buf, " RETURNING ");
first = false;
deparseColumnRef(buf, rtindex, i, rte, qualify_col);
*retrieved_attrs = lappend_int(*retrieved_attrs, i);
}
}
/*
* Add ctid if needed. We currently don't support retrieving any other
* system columns.
*/
if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber,
attrs_used))
{
if (!first)
appendStringInfoString(buf, ", ");
else if (is_returning)
appendStringInfoString(buf, " RETURNING ");
first = false;
if (qualify_col)
ADD_REL_QUALIFIER(buf, rtindex);
appendStringInfoString(buf, "ctid");
*retrieved_attrs = lappend_int(*retrieved_attrs,
SelfItemPointerAttributeNumber);
}
/* Don't generate bad syntax if no undropped columns */
if (first && !is_returning)
appendStringInfoString(buf, "NULL");
}
/*
* Deparse the appropriate locking clause (FOR UPDATE or FOR SHARE) for a
* given relation (context->scanrel).
*/
static void
deparseLockingClause(deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
PlannerInfo *root = context->root;
RelOptInfo *rel = context->scanrel;
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private;
int relid = -1;
while ((relid = bms_next_member(rel->relids, relid)) >= 0)
{
/*
* Ignore relation if it appears in a lower subquery. Locking clause
* for such a relation is included in the subquery if necessary.
*/
if (bms_is_member(relid, fpinfo->lower_subquery_rels))
continue;
/*
* Add FOR UPDATE/SHARE if appropriate. We apply locking during the
* initial row fetch, rather than later on as is done for local
* tables. The extra roundtrips involved in trying to duplicate the
* local semantics exactly don't seem worthwhile (see also comments
* for RowMarkType).
*
* Note: because we actually run the query as a cursor, this assumes
* that DECLARE CURSOR ... FOR UPDATE is supported, which it isn't
* before 8.3.
*/
if (bms_is_member(relid, root->all_result_relids) &&
(root->parse->commandType == CMD_UPDATE ||
root->parse->commandType == CMD_DELETE))
{
/* Relation is UPDATE/DELETE target, so use FOR UPDATE */
appendStringInfoString(buf, " FOR UPDATE");
/* Add the relation alias if we are here for a join relation */
if (IS_JOIN_REL(rel))
appendStringInfo(buf, " OF %s%d", REL_ALIAS_PREFIX, relid);
}
else
{
PlanRowMark *rc = get_plan_rowmark(root->rowMarks, relid);
if (rc)
{
/*
* Relation is specified as a FOR UPDATE/SHARE target, so
* handle that. (But we could also see LCS_NONE, meaning this
* isn't a target relation after all.)
*
* For now, just ignore any [NO] KEY specification, since (a)
* it's not clear what that means for a remote table that we
* don't have complete information about, and (b) it wouldn't
* work anyway on older remote servers. Likewise, we don't
* worry about NOWAIT.
*/
switch (rc->strength)
{
case LCS_NONE:
/* No locking needed */
break;
case LCS_FORKEYSHARE:
case LCS_FORSHARE:
appendStringInfoString(buf, " FOR SHARE");
break;
case LCS_FORNOKEYUPDATE:
case LCS_FORUPDATE:
appendStringInfoString(buf, " FOR UPDATE");
break;
}
/* Add the relation alias if we are here for a join relation */
if (bms_membership(rel->relids) == BMS_MULTIPLE &&
rc->strength != LCS_NONE)
appendStringInfo(buf, " OF %s%d", REL_ALIAS_PREFIX, relid);
}
}
}
}
/*
* Deparse conditions from the provided list and append them to buf.
*
* The conditions in the list are assumed to be ANDed. This function is used to
* deparse WHERE clauses, JOIN .. ON clauses and HAVING clauses.
*
* Depending on the caller, the list elements might be either RestrictInfos
* or bare clauses.
*/
static void
appendConditions(List *exprs, deparse_expr_cxt *context)
{
int nestlevel;
ListCell *lc;
bool is_first = true;
StringInfo buf = context->buf;
/* Make sure any constants in the exprs are printed portably */
nestlevel = set_transmission_modes();
foreach(lc, exprs)
{
Expr *expr = (Expr *) lfirst(lc);
/* Extract clause from RestrictInfo, if required */
if (IsA(expr, RestrictInfo))
expr = ((RestrictInfo *) expr)->clause;
/* Connect expressions with "AND" and parenthesize each condition. */
if (!is_first)
appendStringInfoString(buf, " AND ");
appendStringInfoChar(buf, '(');
deparseExpr(expr, context);
appendStringInfoChar(buf, ')');
is_first = false;
}
reset_transmission_modes(nestlevel);
}
/*
* Append WHERE clause, containing conditions from exprs and additional_conds,
* to context->buf.
*/
static void
appendWhereClause(List *exprs, List *additional_conds, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
bool need_and = false;
ListCell *lc;
if (exprs != NIL || additional_conds != NIL)
appendStringInfoString(buf, " WHERE ");
/*
* If there are some filters, append them.
*/
if (exprs != NIL)
{
appendConditions(exprs, context);
need_and = true;
}
/*
* If there are some EXISTS conditions, coming from SEMI-JOINS, append
* them.
*/
foreach(lc, additional_conds)
{
if (need_and)
appendStringInfoString(buf, " AND ");
appendStringInfoString(buf, (char *) lfirst(lc));
need_and = true;
}
}
/* Output join name for given join type */
const char *
get_jointype_name(JoinType jointype)
{
switch (jointype)
{
case JOIN_INNER:
return "INNER";
case JOIN_LEFT:
return "LEFT";
case JOIN_RIGHT:
return "RIGHT";
case JOIN_FULL:
return "FULL";
case JOIN_SEMI:
return "SEMI";
default:
/* Shouldn't come here, but protect from buggy code. */
elog(ERROR, "unsupported join type %d", jointype);
}
/* Keep compiler happy */
return NULL;
}
/*
* Deparse given targetlist and append it to context->buf.
*
* tlist is a list of TargetEntry nodes, which in turn contain Var nodes.
*
* *retrieved_attrs is set to a list of consecutive integers starting from 1,
* with the same number of entries as tlist.
*
* This is used for both SELECT and RETURNING targetlists; the is_returning
* parameter is true only for a RETURNING targetlist.
*/
static void
deparseExplicitTargetList(List *tlist,
bool is_returning,
List **retrieved_attrs,
deparse_expr_cxt *context)
{
ListCell *lc;
StringInfo buf = context->buf;
int i = 0;
*retrieved_attrs = NIL;
foreach(lc, tlist)
{
TargetEntry *tle = lfirst_node(TargetEntry, lc);
if (i > 0)
appendStringInfoString(buf, ", ");
else if (is_returning)
appendStringInfoString(buf, " RETURNING ");
deparseExpr((Expr *) tle->expr, context);
*retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
i++;
}
if (i == 0 && !is_returning)
appendStringInfoString(buf, "NULL");
}
/*
* Emit expressions specified in the given relation's reltarget.
*
* This is used for deparsing the given relation as a subquery.
*/
static void
deparseSubqueryTargetList(deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
RelOptInfo *foreignrel = context->foreignrel;
bool first;
ListCell *lc;
/* Should only be called in these cases. */
Assert(IS_SIMPLE_REL(foreignrel) || IS_JOIN_REL(foreignrel));
first = true;
foreach(lc, foreignrel->reltarget->exprs)
{
Node *node = (Node *) lfirst(lc);
if (!first)
appendStringInfoString(buf, ", ");
first = false;
deparseExpr((Expr *) node, context);
}
/* Don't generate bad syntax if no expressions */
if (first)
appendStringInfoString(buf, "NULL");
}
/*
* Construct FROM clause for given relation
*
* For a join relation, this function constructs ... JOIN ... ON ... For a
* base relation, it just appends the schema-qualified table name, with the
* appropriate alias if so requested.
*
* 'ignore_rel' is either zero or the RT index of a target relation. In the
* latter case the function constructs FROM clause of UPDATE or USING clause
* of DELETE; it deparses the join relation as if the relation never contained
* the target relation, and creates a List of conditions to be deparsed into
* the top-level WHERE clause, which is returned to *ignore_conds.
*
* 'additional_conds' is a pointer to a list of strings to be appended to
* the WHERE clause, coming from lower-level SEMI-JOINs.
*/
static void
deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
bool use_alias, Index ignore_rel, List **ignore_conds,
List **additional_conds, List **params_list)
{
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
if (IS_JOIN_REL(foreignrel))
{
StringInfoData join_sql_o;
StringInfoData join_sql_i;
RelOptInfo *outerrel = fpinfo->outerrel;
RelOptInfo *innerrel = fpinfo->innerrel;
bool outerrel_is_target = false;
bool innerrel_is_target = false;
List *additional_conds_i = NIL;
List *additional_conds_o = NIL;
if (ignore_rel > 0 && bms_is_member(ignore_rel, foreignrel->relids))
{
/*
* If this is an inner join, move its joinclauses to *ignore_conds
* and empty fpinfo->joinclauses, so that they are deparsed into the
* WHERE clause. Note that since the target relation can never be
* within the nullable side of an outer join, those could safely
* be pulled up into the WHERE clause (see foreign_join_ok()).
* Note also that since the target relation is only inner-joined
* to any other relation in the query, all conditions in the join
* tree mentioning the target relation could be deparsed into the
* WHERE clause by doing this recursively.
*/
if (fpinfo->jointype == JOIN_INNER)
{
*ignore_conds = list_concat(*ignore_conds,
fpinfo->joinclauses);
fpinfo->joinclauses = NIL;
}
/*
* Check if either of the input relations is the target relation.
*/
if (outerrel->relid == ignore_rel)
outerrel_is_target = true;
else if (innerrel->relid == ignore_rel)
innerrel_is_target = true;
}
/* Deparse outer relation if not the target relation. */
if (!outerrel_is_target)
{
initStringInfo(&join_sql_o);
deparseRangeTblRef(&join_sql_o, root, outerrel,
fpinfo->make_outerrel_subquery,
ignore_rel, ignore_conds, &additional_conds_o,
params_list);
/*
* If inner relation is the target relation, skip deparsing it.
* Note that since the join of the target relation with any other
* relation in the query is an inner join and can never be within
* the nullable side of an outer join, the join could be
* interchanged with higher-level joins (cf. identity 1 on outer
* join reordering shown in src/backend/optimizer/README), which
* means it's safe to skip the target-relation deparsing here.
*/
if (innerrel_is_target)
{
Assert(fpinfo->jointype == JOIN_INNER);
Assert(fpinfo->joinclauses == NIL);
appendBinaryStringInfo(buf, join_sql_o.data, join_sql_o.len);
/* Pass EXISTS conditions to upper level */
if (additional_conds_o != NIL)
{
Assert(*additional_conds == NIL);
*additional_conds = additional_conds_o;
}
return;
}
}
/* Deparse inner relation if not the target relation. */
if (!innerrel_is_target)
{
initStringInfo(&join_sql_i);
deparseRangeTblRef(&join_sql_i, root, innerrel,
fpinfo->make_innerrel_subquery,
ignore_rel, ignore_conds, &additional_conds_i,
params_list);
/*
* A SEMI-JOIN is deparsed as an EXISTS subquery. It references
* outer and inner relations, so it should be evaluated as the
* condition in the upper-level WHERE clause. We deparse the
* condition and pass it to upper level callers as an
* additional_conds list. Upper level callers are responsible for
* inserting conditions from the list where appropriate.
*/
if (fpinfo->jointype == JOIN_SEMI)
{
deparse_expr_cxt context;
StringInfoData str;
/* Construct deparsed condition from this SEMI-JOIN */
initStringInfo(&str);
appendStringInfo(&str, "EXISTS (SELECT NULL FROM %s",
join_sql_i.data);
context.buf = &str;
context.foreignrel = foreignrel;
context.scanrel = foreignrel;
context.root = root;
context.params_list = params_list;
/*
* Append SEMI-JOIN clauses and EXISTS conditions from lower
* levels to the current EXISTS subquery
*/
appendWhereClause(fpinfo->joinclauses, additional_conds_i, &context);
/*
* EXISTS conditions, coming from lower join levels, have just
* been processed.
*/
if (additional_conds_i != NIL)
{
list_free_deep(additional_conds_i);
additional_conds_i = NIL;
}
/* Close parentheses for EXISTS subquery */
appendStringInfoChar(&str, ')');
*additional_conds = lappend(*additional_conds, str.data);
}
/*
* If outer relation is the target relation, skip deparsing it.
* See the above note about safety.
*/
if (outerrel_is_target)
{
Assert(fpinfo->jointype == JOIN_INNER);
Assert(fpinfo->joinclauses == NIL);
appendBinaryStringInfo(buf, join_sql_i.data, join_sql_i.len);
/* Pass EXISTS conditions to the upper call */
if (additional_conds_i != NIL)
{
Assert(*additional_conds == NIL);
*additional_conds = additional_conds_i;
}
return;
}
}
/* Neither of the relations is the target relation. */
Assert(!outerrel_is_target && !innerrel_is_target);
/*
* For a semijoin, the FROM clause is deparsed as just the outer
* relation. The inner relation and join clauses are converted to an
* EXISTS condition and passed to the upper level.
*/
if (fpinfo->jointype == JOIN_SEMI)
{
appendBinaryStringInfo(buf, join_sql_o.data, join_sql_o.len);
}
else
{
/*
* For a join relation, the FROM clause entry is deparsed as
*
* ((outer relation) <join type> (inner relation) ON
* (joinclauses))
*/
appendStringInfo(buf, "(%s %s JOIN %s ON ", join_sql_o.data,
get_jointype_name(fpinfo->jointype), join_sql_i.data);
/* Append join clause; (TRUE) if no join clause */
if (fpinfo->joinclauses)
{
deparse_expr_cxt context;
context.buf = buf;
context.foreignrel = foreignrel;
context.scanrel = foreignrel;
context.root = root;
context.params_list = params_list;
appendStringInfoChar(buf, '(');
appendConditions(fpinfo->joinclauses, &context);
appendStringInfoChar(buf, ')');
}
else
appendStringInfoString(buf, "(TRUE)");
/* End the FROM clause entry. */
appendStringInfoChar(buf, ')');
}
/*
* Build the additional_conds list to be passed to the caller by
* appending the conditions coming from the outer and inner rels to
* those added at the current level.
*/
if (additional_conds_o != NIL)
{
*additional_conds = list_concat(*additional_conds,
additional_conds_o);
list_free(additional_conds_o);
}
if (additional_conds_i != NIL)
{
*additional_conds = list_concat(*additional_conds,
additional_conds_i);
list_free(additional_conds_i);
}
}
else
{
RangeTblEntry *rte = planner_rt_fetch(foreignrel->relid, root);
/*
* Core code already has some lock on each rel being planned, so we
* can use NoLock here.
*/
Relation rel = table_open(rte->relid, NoLock);
deparseRelation(buf, rel);
/*
* Add a unique alias to avoid any conflict in relation names due to
* pulled up subqueries in the query being built for a pushed down
* join.
*/
if (use_alias)
appendStringInfo(buf, " %s%d", REL_ALIAS_PREFIX, foreignrel->relid);
table_close(rel, NoLock);
}
}
/*
* Append FROM clause entry for the given relation into buf.
* Conditions from lower-level SEMI-JOINs are appended to additional_conds
* and should be added to upper level WHERE clause.
*/
static void
deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
bool make_subquery, Index ignore_rel, List **ignore_conds,
List **additional_conds, List **params_list)
{
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
/* Should only be called in these cases. */
Assert(IS_SIMPLE_REL(foreignrel) || IS_JOIN_REL(foreignrel));
Assert(fpinfo->local_conds == NIL);
/* If make_subquery is true, deparse the relation as a subquery. */
if (make_subquery)
{
List *retrieved_attrs;
int ncols;
/*
* The given relation shouldn't contain the target relation, because
* this should only happen for input relations for a full join, and
* such relations can never contain an UPDATE/DELETE target.
*/
Assert(ignore_rel == 0 ||
!bms_is_member(ignore_rel, foreignrel->relids));
/* Deparse the subquery representing the relation. */
appendStringInfoChar(buf, '(');
deparseSelectStmtForRel(buf, root, foreignrel, NIL,
fpinfo->remote_conds, NIL,
false, false, true,
&retrieved_attrs, params_list);
appendStringInfoChar(buf, ')');
/* Append the relation alias. */
appendStringInfo(buf, " %s%d", SUBQUERY_REL_ALIAS_PREFIX,
fpinfo->relation_index);
/*
* Append the column aliases if needed. Note that the subquery emits
* expressions specified in the relation's reltarget (see
* deparseSubqueryTargetList).
*/
ncols = list_length(foreignrel->reltarget->exprs);
if (ncols > 0)
{
int i;
appendStringInfoChar(buf, '(');
for (i = 1; i <= ncols; i++)
{
if (i > 1)
appendStringInfoString(buf, ", ");
appendStringInfo(buf, "%s%d", SUBQUERY_COL_ALIAS_PREFIX, i);
}
appendStringInfoChar(buf, ')');
}
}
else
deparseFromExprForRel(buf, root, foreignrel, true, ignore_rel,
ignore_conds, additional_conds,
params_list);
}
/*
* deparse remote INSERT statement
*
* The statement text is appended to buf, and we also create an integer List
* of the columns being retrieved by WITH CHECK OPTION or RETURNING (if any),
* which is returned to *retrieved_attrs.
*
* This also stores the end position of the VALUES clause in *values_end_len,
* so that we can rebuild an INSERT for a batch of rows later.
*/
void
deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
Index rtindex, Relation rel,
List *targetAttrs, bool doNothing,
List *withCheckOptionList, List *returningList,
List **retrieved_attrs, int *values_end_len)
{
TupleDesc tupdesc = RelationGetDescr(rel);
AttrNumber pindex;
bool first;
ListCell *lc;
appendStringInfoString(buf, "INSERT INTO ");
deparseRelation(buf, rel);
if (targetAttrs)
{
appendStringInfoChar(buf, '(');
first = true;
foreach(lc, targetAttrs)
{
int attnum = lfirst_int(lc);
if (!first)
appendStringInfoString(buf, ", ");
first = false;
deparseColumnRef(buf, rtindex, attnum, rte, false);
}
appendStringInfoString(buf, ") VALUES (");
pindex = 1;
first = true;
foreach(lc, targetAttrs)
{
int attnum = lfirst_int(lc);
Form_pg_attribute attr = TupleDescAttr(tupdesc, attnum - 1);
if (!first)
appendStringInfoString(buf, ", ");
first = false;
if (attr->attgenerated)
appendStringInfoString(buf, "DEFAULT");
else
{
appendStringInfo(buf, "$%d", pindex);
pindex++;
}
}
appendStringInfoChar(buf, ')');
}
else
appendStringInfoString(buf, " DEFAULT VALUES");
*values_end_len = buf->len;
if (doNothing)
appendStringInfoString(buf, " ON CONFLICT DO NOTHING");
deparseReturningList(buf, rte, rtindex, rel,
rel->trigdesc && rel->trigdesc->trig_insert_after_row,
withCheckOptionList, returningList, retrieved_attrs);
}
/*
* rebuild remote INSERT statement
*
* Given num_rows, the number of rows to add beyond the first, builds an
* INSERT statement with the right number of parameters.
*/
void
rebuildInsertSql(StringInfo buf, Relation rel,
char *orig_query, List *target_attrs,
int values_end_len, int num_params,
int num_rows)
{
TupleDesc tupdesc = RelationGetDescr(rel);
int i;
int pindex;
bool first;
ListCell *lc;
/* Make sure the values_end_len is sensible */
Assert((values_end_len > 0) && (values_end_len <= strlen(orig_query)));
/* Copy up to the end of the first record from the original query */
appendBinaryStringInfo(buf, orig_query, values_end_len);
/*
* Add records to VALUES clause (we already have parameters for the first
* row, so start at the right offset).
*/
pindex = num_params + 1;
for (i = 0; i < num_rows; i++)
{
appendStringInfoString(buf, ", (");
first = true;
foreach(lc, target_attrs)
{
int attnum = lfirst_int(lc);
Form_pg_attribute attr = TupleDescAttr(tupdesc, attnum - 1);
if (!first)
appendStringInfoString(buf, ", ");
first = false;
if (attr->attgenerated)
appendStringInfoString(buf, "DEFAULT");
else
{
appendStringInfo(buf, "$%d", pindex);
pindex++;
}
}
appendStringInfoChar(buf, ')');
}
/* Copy stuff after VALUES clause from the original query */
appendStringInfoString(buf, orig_query + values_end_len);
}
/*
* deparse remote UPDATE statement
*
* The statement text is appended to buf, and we also create an integer List
* of the columns being retrieved by WITH CHECK OPTION or RETURNING (if any),
* which is returned to *retrieved_attrs.
*/
void
deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
Index rtindex, Relation rel,
List *targetAttrs,
List *withCheckOptionList, List *returningList,
List **retrieved_attrs)
{
TupleDesc tupdesc = RelationGetDescr(rel);
AttrNumber pindex;
bool first;
ListCell *lc;
appendStringInfoString(buf, "UPDATE ");
deparseRelation(buf, rel);
appendStringInfoString(buf, " SET ");
pindex = 2; /* ctid is always the first param */
first = true;
foreach(lc, targetAttrs)
{
int attnum = lfirst_int(lc);
Form_pg_attribute attr = TupleDescAttr(tupdesc, attnum - 1);
if (!first)
appendStringInfoString(buf, ", ");
first = false;
deparseColumnRef(buf, rtindex, attnum, rte, false);
if (attr->attgenerated)
appendStringInfoString(buf, " = DEFAULT");
else
{
appendStringInfo(buf, " = $%d", pindex);
pindex++;
}
}
appendStringInfoString(buf, " WHERE ctid = $1");
deparseReturningList(buf, rte, rtindex, rel,
rel->trigdesc && rel->trigdesc->trig_update_after_row,
withCheckOptionList, returningList, retrieved_attrs);
}
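/*
 * Illustration only, not part of postgres_fdw: a minimal standalone sketch
 * of the placeholder numbering used by deparseUpdateSql() above.  $1 is
 * reserved for ctid, ordinary target columns take $2, $3, ... in order, and
 * a generated column is set to DEFAULT without consuming a parameter number.
 * The names sketch_appendf and sketch_update_sql are hypothetical, and the
 * relation and column names are assumed to be already quoted and qualified
 * (the real code delegates that to deparseRelation and deparseColumnRef).
 */

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append printf-style text to a NUL-terminated buffer */
static void
sketch_appendf(char *out, size_t outsz, const char *fmt, ...)
{
	size_t		used = strlen(out);
	va_list		ap;

	if (used + 1 >= outsz)
		return;					/* buffer full: drop further output */
	va_start(ap, fmt);
	vsnprintf(out + used, outsz - used, fmt, ap);
	va_end(ap);
}

/* Hypothetical stand-in for the SET-list loop in deparseUpdateSql() */
static void
sketch_update_sql(char *out, size_t outsz, const char *relname,
				  const char *const cols[], const bool generated[],
				  int ncols)
{
	int			pindex = 2;		/* ctid is always the first param */
	int			i;

	assert(outsz > 0);
	out[0] = '\0';
	sketch_appendf(out, outsz, "UPDATE %s SET ", relname);
	for (i = 0; i < ncols; i++)
	{
		if (i > 0)
			sketch_appendf(out, outsz, ", ");
		if (generated[i])
			sketch_appendf(out, outsz, "%s = DEFAULT", cols[i]);
		else
			sketch_appendf(out, outsz, "%s = $%d", cols[i], pindex++);
	}
	sketch_appendf(out, outsz, " WHERE ctid = $1");
}
```

/*
 * For columns a, b, c where only b is generated, the sketch numbers a as $2
 * and c as $3, so the parameter list sent at execution time would carry
 * ctid first and then one value per non-generated target column.
 */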
/*
* deparse remote UPDATE statement
*
* 'buf' is the output buffer to append the statement to
* 'rtindex' is the RT index of the associated target relation
* 'rel' is the relation descriptor for the target relation
* 'foreignrel' is the RelOptInfo for the target relation or the join relation
* containing all base relations in the query
* 'targetlist' is the tlist of the underlying foreign-scan plan node
* (note that this only contains new-value expressions and junk attrs)
* 'targetAttrs' is the target columns of the UPDATE
* 'remote_conds' is the qual clauses that must be evaluated remotely
* '*params_list' is an output list of exprs that will become remote Params
* 'returningList' is the RETURNING targetlist
* '*retrieved_attrs' is an output list of integers of columns being retrieved
* by RETURNING (if any)
*/
void
deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
RelOptInfo *foreignrel,
List *targetlist,
List *targetAttrs,
List *remote_conds,
List **params_list,
List *returningList,
List **retrieved_attrs)
{
deparse_expr_cxt context;
int nestlevel;
bool first;
RangeTblEntry *rte = planner_rt_fetch(rtindex, root);
ListCell *lc,
*lc2;
List *additional_conds = NIL;
/* Set up context struct for recursion */
context.root = root;
context.foreignrel = foreignrel;
context.scanrel = foreignrel;
context.buf = buf;
context.params_list = params_list;
appendStringInfoString(buf, "UPDATE ");
deparseRelation(buf, rel);
if (foreignrel->reloptkind == RELOPT_JOINREL)
appendStringInfo(buf, " %s%d", REL_ALIAS_PREFIX, rtindex);
appendStringInfoString(buf, " SET ");
/* Make sure any constants in the exprs are printed portably */
nestlevel = set_transmission_modes();
first = true;
forboth(lc, targetlist, lc2, targetAttrs)
{
TargetEntry *tle = lfirst_node(TargetEntry, lc);
int attnum = lfirst_int(lc2);
/* update's new-value expressions shouldn't be resjunk */
Assert(!tle->resjunk);
if (!first)
appendStringInfoString(buf, ", ");
first = false;
deparseColumnRef(buf, rtindex, attnum, rte, false);
appendStringInfoString(buf, " = ");
deparseExpr((Expr *) tle->expr, &context);
}
reset_transmission_modes(nestlevel);
if (foreignrel->reloptkind == RELOPT_JOINREL)
{
List *ignore_conds = NIL;
appendStringInfoString(buf, " FROM ");
deparseFromExprForRel(buf, root, foreignrel, true, rtindex,
&ignore_conds, &additional_conds, params_list);
remote_conds = list_concat(remote_conds, ignore_conds);
}
appendWhereClause(remote_conds, additional_conds, &context);
if (additional_conds != NIL)
list_free_deep(additional_conds);
if (foreignrel->reloptkind == RELOPT_JOINREL)
deparseExplicitTargetList(returningList, true, retrieved_attrs,
&context);
else
deparseReturningList(buf, rte, rtindex, rel, false,
NIL, returningList, retrieved_attrs);
}
/*
* deparse remote DELETE statement
*
* The statement text is appended to buf, and we also create an integer List
* of the columns being retrieved by RETURNING (if any), which is returned
* to *retrieved_attrs.
*/
void
deparseDeleteSql(StringInfo buf, RangeTblEntry *rte,
Index rtindex, Relation rel,
List *returningList,
List **retrieved_attrs)
{
appendStringInfoString(buf, "DELETE FROM ");
deparseRelation(buf, rel);
appendStringInfoString(buf, " WHERE ctid = $1");
deparseReturningList(buf, rte, rtindex, rel,
rel->trigdesc && rel->trigdesc->trig_delete_after_row,
NIL, returningList, retrieved_attrs);
}
/*
* deparse remote DELETE statement
*
* 'buf' is the output buffer to append the statement to
* 'rtindex' is the RT index of the associated target relation
* 'rel' is the relation descriptor for the target relation
* 'foreignrel' is the RelOptInfo for the target relation or the join relation
* containing all base relations in the query
* 'remote_conds' is the qual clauses that must be evaluated remotely
* '*params_list' is an output list of exprs that will become remote Params
* 'returningList' is the RETURNING targetlist
* '*retrieved_attrs' is an output list of integers of columns being retrieved
* by RETURNING (if any)
*/
void
deparseDirectDeleteSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
RelOptInfo *foreignrel,
List *remote_conds,
List **params_list,
List *returningList,
List **retrieved_attrs)
{
deparse_expr_cxt context;
List *additional_conds = NIL;
/* Set up context struct for recursion */
context.root = root;
context.foreignrel = foreignrel;
context.scanrel = foreignrel;
context.buf = buf;
context.params_list = params_list;
appendStringInfoString(buf, "DELETE FROM ");
deparseRelation(buf, rel);
if (foreignrel->reloptkind == RELOPT_JOINREL)
appendStringInfo(buf, " %s%d", REL_ALIAS_PREFIX, rtindex);
if (foreignrel->reloptkind == RELOPT_JOINREL)
{
List *ignore_conds = NIL;
appendStringInfoString(buf, " USING ");
deparseFromExprForRel(buf, root, foreignrel, true, rtindex,
&ignore_conds, &additional_conds, params_list);
remote_conds = list_concat(remote_conds, ignore_conds);
}
appendWhereClause(remote_conds, additional_conds, &context);
if (additional_conds != NIL)
list_free_deep(additional_conds);
if (foreignrel->reloptkind == RELOPT_JOINREL)
deparseExplicitTargetList(returningList, true, retrieved_attrs,
&context);
else
deparseReturningList(buf, planner_rt_fetch(rtindex, root),
rtindex, rel, false,
NIL, returningList, retrieved_attrs);
}
/*
* Add a RETURNING clause, if needed, to an INSERT/UPDATE/DELETE.
*/
static void
deparseReturningList(StringInfo buf, RangeTblEntry *rte,
Index rtindex, Relation rel,
bool trig_after_row,
List *withCheckOptionList,
List *returningList,
List **retrieved_attrs)
{
Bitmapset *attrs_used = NULL;
if (trig_after_row)
{
/* whole-row reference acquires all non-system columns */
attrs_used =
bms_make_singleton(0 - FirstLowInvalidHeapAttributeNumber);
}
if (withCheckOptionList != NIL)
{
/*
* We need the attrs, non-system and system, mentioned in the local
* query's WITH CHECK OPTION list.
*
* Note: we do this to ensure that WCO constraints will be evaluated
* on the data actually inserted/updated on the remote side, which
* might differ from the data supplied by the core code, for example
* as a result of remote triggers.
*/
pull_varattnos((Node *) withCheckOptionList, rtindex,
&attrs_used);
}
if (returningList != NIL)
{
/*
* We need the attrs, non-system and system, mentioned in the local
* query's RETURNING list.
*/
pull_varattnos((Node *) returningList, rtindex,
&attrs_used);
}
if (attrs_used != NULL)
deparseTargetList(buf, rte, rtindex, rel, true, attrs_used, false,
retrieved_attrs);
else
*retrieved_attrs = NIL;
}
/*
* Construct SELECT statement to acquire size in blocks of given relation.
*
* Note: we use local definition of block size, not remote definition.
* This is perhaps debatable.
*
* Note: pg_relation_size() exists in 8.1 and later.
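*
* Illustration only (the relation name is hypothetical): for a foreign table
* whose remote name deparses as public.t1, and assuming a local BLCKSZ of
* 8192, the text appended to buf would be
*   SELECT pg_catalog.pg_relation_size('public.t1'::pg_catalog.regclass) / 8192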
*/
void
deparseAnalyzeSizeSql(StringInfo buf, Relation rel)
{
StringInfoData relname;
/* We'll need the remote relation name as a literal. */
initStringInfo(&relname);
deparseRelation(&relname, rel);
appendStringInfoString(buf, "SELECT pg_catalog.pg_relation_size(");
deparseStringLiteral(buf, relname.data);
appendStringInfo(buf, "::pg_catalog.regclass) / %d", BLCKSZ);
}
/*
* Construct SELECT statement to acquire the number of rows and the relkind of
* a relation.
*
* Note: we just return the remote server's reltuples value, which might
* be off a good deal, but it doesn't seem worth working harder. See
* comments in postgresAcquireSampleRowsFunc.
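*
* Illustration only (the relation name is hypothetical): for a foreign table
* whose remote name deparses as public.t1, the text appended to buf would be
*   SELECT reltuples, relkind FROM pg_catalog.pg_class
*     WHERE oid = 'public.t1'::pg_catalog.regclass
* (emitted on a single line; wrapped here for readability).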
*/
void
deparseAnalyzeInfoSql(StringInfo buf, Relation rel)
{
StringInfoData relname;
/* We'll need the remote relation name as a literal. */
initStringInfo(&relname);
deparseRelation(&relname, rel);
appendStringInfoString(buf, "SELECT reltuples, relkind FROM pg_catalog.pg_class WHERE oid = ");
deparseStringLiteral(buf, relname.data);
appendStringInfoString(buf, "::pg_catalog.regclass");
}
/*
* Construct SELECT statement to acquire sample rows of given relation.
*
* SELECT command is appended to buf, and list of columns retrieved
* is returned to *retrieved_attrs.
*
* We only support sampling methods we can decide based on server version.
* Allowing custom TSM modules (like tsm_system_rows) might be useful, but it
* would require detecting which extensions are installed, to allow automatic
* fall-back. Moreover, the methods may use different parameters like number
* of rows (and not sampling rate). So we leave this for future improvements.
*
* Using random() to sample rows on the remote server has the advantage that
* this works on all PostgreSQL versions (unlike TABLESAMPLE), and that it
* does the sampling on the remote side (without transferring everything and
* then discarding most rows).
*
* The disadvantage is that we still have to read all rows and evaluate the
* random(), while TABLESAMPLE (at least with the "system" method) may skip.
* It's not that different from the "bernoulli" method, though.
*
* We could also do "ORDER BY random() LIMIT x", which would always pick
* the expected number of rows, but it requires sorting so it may be much
* more expensive (particularly on large tables, which is what the
* remote sampling is meant to improve).
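*
* Illustration only (table and column names are hypothetical): for a remote
* table public.t1 with columns c1 and c2 and sample_frac = 0.1, the query
* built here would look like one of the following, depending on
* sample_method:
*   ANALYZE_SAMPLE_OFF:
*     SELECT c1, c2 FROM public.t1
*   ANALYZE_SAMPLE_RANDOM:
*     SELECT c1, c2 FROM public.t1 WHERE pg_catalog.random() < 0.100000
*   ANALYZE_SAMPLE_SYSTEM:
*     SELECT c1, c2 FROM public.t1 TABLESAMPLE SYSTEM(10.000000)
*   ANALYZE_SAMPLE_BERNOULLI:
*     SELECT c1, c2 FROM public.t1 TABLESAMPLE BERNOULLI(10.000000)
* Note that the TABLESAMPLE argument is a percentage, hence the
* multiplication of sample_frac by 100.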
*/
void
deparseAnalyzeSql(StringInfo buf, Relation rel,
PgFdwSamplingMethod sample_method, double sample_frac,
List **retrieved_attrs)
{
Oid relid = RelationGetRelid(rel);
TupleDesc tupdesc = RelationGetDescr(rel);
int i;
char *colname;
List *options;
ListCell *lc;
bool first = true;
*retrieved_attrs = NIL;
appendStringInfoString(buf, "SELECT ");
for (i = 0; i < tupdesc->natts; i++)
{
/* Ignore dropped columns. */
if (TupleDescAttr(tupdesc, i)->attisdropped)
continue;
if (!first)
appendStringInfoString(buf, ", ");
first = false;
/* Use attribute name or column_name option. */
colname = NameStr(TupleDescAttr(tupdesc, i)->attname);
options = GetForeignColumnOptions(relid, i + 1);
foreach(lc, options)
{
DefElem *def = (DefElem *) lfirst(lc);
if (strcmp(def->defname, "column_name") == 0)
{
colname = defGetString(def);
break;
}
}
appendStringInfoString(buf, quote_identifier(colname));
*retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
}
/* Don't generate bad syntax for zero-column relation. */
if (first)
appendStringInfoString(buf, "NULL");
/*
* Construct FROM clause, and perhaps WHERE clause too, depending on the
* selected sampling method.
*/
appendStringInfoString(buf, " FROM ");
deparseRelation(buf, rel);
switch (sample_method)
{
case ANALYZE_SAMPLE_OFF:
/* nothing to do here */
break;
case ANALYZE_SAMPLE_RANDOM:
appendStringInfo(buf, " WHERE pg_catalog.random() < %f", sample_frac);
break;
case ANALYZE_SAMPLE_SYSTEM:
appendStringInfo(buf, " TABLESAMPLE SYSTEM(%f)", (100.0 * sample_frac));
break;
case ANALYZE_SAMPLE_BERNOULLI:
appendStringInfo(buf, " TABLESAMPLE BERNOULLI(%f)", (100.0 * sample_frac));
break;
case ANALYZE_SAMPLE_AUTO:
/* should have been resolved into actual method */
elog(ERROR, "unexpected sampling method");
break;
}
}
/*
* Construct a simple "TRUNCATE rel" statement
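*
* Illustration only (relation names are hypothetical): for two relations
* whose remote names deparse as public.t1 and public.t2, with
* restart_seqs = false and behavior = DROP_RESTRICT, the result would be
*   TRUNCATE public.t1, public.t2 CONTINUE IDENTITY RESTRICT
* while restart_seqs = true and behavior = DROP_CASCADE on public.t1 alone
* would give
*   TRUNCATE public.t1 RESTART IDENTITY CASCADE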
*/
void
deparseTruncateSql(StringInfo buf,
List *rels,
DropBehavior behavior,
bool restart_seqs)
{
ListCell *cell;
appendStringInfoString(buf, "TRUNCATE ");
foreach(cell, rels)
{
Relation rel = lfirst(cell);
if (cell != list_head(rels))
appendStringInfoString(buf, ", ");
deparseRelation(buf, rel);
}
appendStringInfo(buf, " %s IDENTITY",
restart_seqs ? "RESTART" : "CONTINUE");
if (behavior == DROP_RESTRICT)
appendStringInfoString(buf, " RESTRICT");
else if (behavior == DROP_CASCADE)
appendStringInfoString(buf, " CASCADE");
}
/*
* Construct name to use for given column, and emit it into buf.
* If it has a column_name FDW option, use that instead of attribute name.
*
* If qualify_col is true, qualify column name with the alias of relation.
*/
static void
deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
bool qualify_col)
{
/* We support fetching the remote side's CTID. */
if (varattno == SelfItemPointerAttributeNumber)
{
if (qualify_col)
ADD_REL_QUALIFIER(buf, varno);
appendStringInfoString(buf, "ctid");
}
else if (varattno < 0)
{
/*
* All other system attributes are fetched as 0, except for table OID,
* which is fetched as the local table OID. However, we must be
* careful; the table could be beneath an outer join, in which case it
* must go to NULL whenever the rest of the row does.
*/
Oid fetchval = 0;
if (varattno == TableOidAttributeNumber)
fetchval = rte->relid;
if (qualify_col)
{
appendStringInfoString(buf, "CASE WHEN (");
ADD_REL_QUALIFIER(buf, varno);
appendStringInfo(buf, "*)::text IS NOT NULL THEN %u END", fetchval);
}
else
appendStringInfo(buf, "%u", fetchval);
}
else if (varattno == 0)
{
/* Whole row reference */
Relation rel;
Bitmapset *attrs_used;
/* Required only to be passed down to deparseTargetList(). */
List *retrieved_attrs;
/*
* The lock on the relation will be held by upper callers, so it's
* fine to open it with no lock here.
*/
rel = table_open(rte->relid, NoLock);
/*
* The local name of the foreign table cannot be recognized by the
* foreign server, and the table it references on the foreign server
* might have different column ordering or different columns than those
* declared locally. Hence we have to deparse a whole-row reference as
* ROW(columns referenced locally). Construct this by deparsing a
* "whole row" attribute.
*/
attrs_used = bms_add_member(NULL,
0 - FirstLowInvalidHeapAttributeNumber);
/*
* In case the whole-row reference is under an outer join then it has
* to go NULL whenever the rest of the row goes NULL. Deparsing a join
* query would always involve multiple relations, thus qualify_col
* would be true.
*/
if (qualify_col)
{
appendStringInfoString(buf, "CASE WHEN (");
ADD_REL_QUALIFIER(buf, varno);
appendStringInfoString(buf, "*)::text IS NOT NULL THEN ");
}
appendStringInfoString(buf, "ROW(");
deparseTargetList(buf, rte, varno, rel, false, attrs_used, qualify_col,
&retrieved_attrs);
appendStringInfoChar(buf, ')');
/* Complete the CASE WHEN statement started above. */
if (qualify_col)
appendStringInfoString(buf, " END");
table_close(rel, NoLock);
bms_free(attrs_used);
}
else
{
char *colname = NULL;
List *options;
ListCell *lc;
/* varno must not be any of OUTER_VAR, INNER_VAR and INDEX_VAR. */
Assert(!IS_SPECIAL_VARNO(varno));
/*
* If it's a column of a foreign table, and it has the column_name FDW
* option, use that value.
*/
options = GetForeignColumnOptions(rte->relid, varattno);
foreach(lc, options)
{
DefElem *def = (DefElem *) lfirst(lc);
if (strcmp(def->defname, "column_name") == 0)
{
colname = defGetString(def);
break;
}
}
/*
* If it's a column of a regular table or it doesn't have column_name
* FDW option, use attribute name.
*/
if (colname == NULL)
colname = get_attname(rte->relid, varattno, false);
if (qualify_col)
ADD_REL_QUALIFIER(buf, varno);
appendStringInfoString(buf, quote_identifier(colname));
}
}
/*
* Append remote name of specified foreign table to buf.
* Use value of table_name FDW option (if any) instead of relation's name.
* Similarly, schema_name FDW option overrides schema name.
*/
static void
deparseRelation(StringInfo buf, Relation rel)
{
ForeignTable *table;
const char *nspname = NULL;
const char *relname = NULL;
ListCell *lc;
/* obtain additional catalog information. */
table = GetForeignTable(RelationGetRelid(rel));
/*
* Use value of FDW options if any, instead of the name of object itself.
*/
foreach(lc, table->options)
{
DefElem *def = (DefElem *) lfirst(lc);
if (strcmp(def->defname, "schema_name") == 0)
nspname = defGetString(def);
else if (strcmp(def->defname, "table_name") == 0)
relname = defGetString(def);
}
/*
* Note: we could skip printing the schema name if it's pg_catalog, but
* that doesn't seem worth the trouble.
*/
if (nspname == NULL)
nspname = get_namespace_name(RelationGetNamespace(rel));
if (relname == NULL)
relname = RelationGetRelationName(rel);
appendStringInfo(buf, "%s.%s",
quote_identifier(nspname), quote_identifier(relname));
}
/*
* Append a SQL string literal representing "val" to buf.
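*
* Illustration only (input values are hypothetical): an embedded single
* quote is doubled, so the four characters it's are emitted as
*   'it''s'
* and an input containing a backslash gets the E prefix with the backslash
* doubled, so the three characters a\b are emitted as
*   E'a\\b'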
*/
void
deparseStringLiteral(StringInfo buf, const char *val)
{
const char *valptr;
/*
* Rather than making assumptions about the remote server's value of
* standard_conforming_strings, always use E'foo' syntax if there are any
* backslashes. This will fail on remote servers before 8.1, but those
* are long out of support.
*/
if (strchr(val, '\\') != NULL)
appendStringInfoChar(buf, ESCAPE_STRING_SYNTAX);
appendStringInfoChar(buf, '\'');
for (valptr = val; *valptr; valptr++)
{
char ch = *valptr;
if (SQL_STR_DOUBLE(ch, true))
appendStringInfoChar(buf, ch);
appendStringInfoChar(buf, ch);
}
appendStringInfoChar(buf, '\'');
}
/*
* Deparse given expression into context->buf.
*
* This function must support all the same node types that foreign_expr_walker
* accepts.
*
* Note: unlike ruleutils.c, we just use a simple hard-wired parenthesization
* scheme: anything more complex than a Var, Const, function call or cast
* should be self-parenthesized.
*/
static void
deparseExpr(Expr *node, deparse_expr_cxt *context)
{
if (node == NULL)
return;
switch (nodeTag(node))
{
case T_Var:
deparseVar((Var *) node, context);
break;
case T_Const:
deparseConst((Const *) node, context, 0);
break;
case T_Param:
deparseParam((Param *) node, context);
break;
case T_SubscriptingRef:
deparseSubscriptingRef((SubscriptingRef *) node, context);
break;
case T_FuncExpr:
deparseFuncExpr((FuncExpr *) node, context);
break;
case T_OpExpr:
deparseOpExpr((OpExpr *) node, context);
break;
case T_DistinctExpr:
deparseDistinctExpr((DistinctExpr *) node, context);
break;
case T_ScalarArrayOpExpr:
deparseScalarArrayOpExpr((ScalarArrayOpExpr *) node, context);
break;
case T_RelabelType:
deparseRelabelType((RelabelType *) node, context);
break;
case T_BoolExpr:
deparseBoolExpr((BoolExpr *) node, context);
break;
case T_NullTest:
deparseNullTest((NullTest *) node, context);
break;
case T_CaseExpr:
deparseCaseExpr((CaseExpr *) node, context);
break;
case T_ArrayExpr:
deparseArrayExpr((ArrayExpr *) node, context);
break;
case T_Aggref:
deparseAggref((Aggref *) node, context);
break;
default:
elog(ERROR, "unsupported expression type for deparse: %d",
(int) nodeTag(node));
break;
}
}
/*
* Deparse given Var node into context->buf.
*
* If the Var belongs to the foreign relation, just print its remote name.
* Otherwise, it's effectively a Param (and will in fact be a Param at
* run time). Handle it the same way we handle plain Params --- see
* deparseParam for comments.
*/
static void
deparseVar(Var *node, deparse_expr_cxt *context)
{
Relids relids = context->scanrel->relids;
int relno;
int colno;
/* Qualify columns when multiple relations are involved. */
bool qualify_col = (bms_membership(relids) == BMS_MULTIPLE);
/*
* If the Var belongs to the foreign relation that is deparsed as a
* subquery, use the relation and column alias to the Var provided by the
* subquery, instead of the remote name.
*/
if (is_subquery_var(node, context->scanrel, &relno, &colno))
{
appendStringInfo(context->buf, "%s%d.%s%d",
SUBQUERY_REL_ALIAS_PREFIX, relno,
SUBQUERY_COL_ALIAS_PREFIX, colno);
return;
}
if (bms_is_member(node->varno, relids) && node->varlevelsup == 0)
deparseColumnRef(context->buf, node->varno, node->varattno,
planner_rt_fetch(node->varno, context->root),
qualify_col);
else
{
/* Treat like a Param */
if (context->params_list)
{
int pindex = 0;
ListCell *lc;
/* find its index in params_list */
foreach(lc, *context->params_list)
{
pindex++;
if (equal(node, (Node *) lfirst(lc)))
break;
}
if (lc == NULL)
{
/* not in list, so add it */
pindex++;
*context->params_list = lappend(*context->params_list, node);
}
printRemoteParam(pindex, node->vartype, node->vartypmod, context);
}
else
{
printRemotePlaceholder(node->vartype, node->vartypmod, context);
}
}
}
/*
* Deparse given constant value into context->buf.
*
* This function has to be kept in sync with ruleutils.c's get_const_expr.
*
* As in that function, showtype can be -1 to never show "::typename"
* decoration, +1 to always show it, or 0 to show it only if the constant
* wouldn't be assumed to be the right type by default.
*
* In addition, this code allows showtype to be -2 to indicate that we should
* not show "::typename" decoration if the constant is printed as an untyped
* literal or NULL (while in other cases, behaving as for showtype == 0).
*/
static void
deparseConst(Const *node, deparse_expr_cxt *context, int showtype)
{
StringInfo buf = context->buf;
Oid typoutput;
bool typIsVarlena;
char *extval;
bool isfloat = false;
bool isstring = false;
bool needlabel;
if (node->constisnull)
{
appendStringInfoString(buf, "NULL");
if (showtype >= 0)
appendStringInfo(buf, "::%s",
deparse_type_name(node->consttype,
node->consttypmod));
return;
}
getTypeOutputInfo(node->consttype,
&typoutput, &typIsVarlena);
extval = OidOutputFunctionCall(typoutput, node->constvalue);
switch (node->consttype)
{
case INT2OID:
case INT4OID:
case INT8OID:
case OIDOID:
case FLOAT4OID:
case FLOAT8OID:
case NUMERICOID:
{
/*
* No need to quote unless it's a special value such as 'NaN'.
* See comments in get_const_expr().
*/
if (strspn(extval, "0123456789+-eE.") == strlen(extval))
{
if (extval[0] == '+' || extval[0] == '-')
appendStringInfo(buf, "(%s)", extval);
else
appendStringInfoString(buf, extval);
if (strcspn(extval, "eE.") != strlen(extval))
isfloat = true; /* it looks like a float */
}
else
appendStringInfo(buf, "'%s'", extval);
}
break;
case BITOID:
case VARBITOID:
appendStringInfo(buf, "B'%s'", extval);
break;
case BOOLOID:
if (strcmp(extval, "t") == 0)
appendStringInfoString(buf, "true");
else
appendStringInfoString(buf, "false");
break;
default:
deparseStringLiteral(buf, extval);
isstring = true;
break;
}
pfree(extval);
if (showtype == -1)
return; /* never print type label */
/*
* For showtype == 0, append ::typename unless the constant will be
* implicitly typed as the right type when it is read in.
*
* XXX this code has to be kept in sync with the behavior of the parser,
* especially make_const.
*/
switch (node->consttype)
{
case BOOLOID:
case INT4OID:
case UNKNOWNOID:
needlabel = false;
break;
case NUMERICOID:
needlabel = !isfloat || (node->consttypmod >= 0);
break;
default:
if (showtype == -2)
{
/* label unless we printed it as an untyped string */
needlabel = !isstring;
}
else
needlabel = true;
break;
}
if (needlabel || showtype > 0)
appendStringInfo(buf, "::%s",
deparse_type_name(node->consttype,
node->consttypmod));
}
/*
* Deparse given Param node.
*
* If we're generating the query "for real", add the Param to
* context->params_list if it's not already present, and then use its index
* in that list as the remote parameter number. During EXPLAIN, there's
* no need to identify a parameter number.
*/
static void
deparseParam(Param *node, deparse_expr_cxt *context)
{
if (context->params_list)
{
int pindex = 0;
ListCell *lc;
/* find its index in params_list */
foreach(lc, *context->params_list)
{
pindex++;
if (equal(node, (Node *) lfirst(lc)))
break;
}
if (lc == NULL)
{
/* not in list, so add it */
pindex++;
*context->params_list = lappend(*context->params_list, node);
}
printRemoteParam(pindex, node->paramtype, node->paramtypmod, context);
}
else
{
printRemotePlaceholder(node->paramtype, node->paramtypmod, context);
}
}
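/*
 * Illustrative sketch only, not part of postgres_fdw: the find-or-append
 * numbering that deparseParam applies to context->params_list, restated over
 * a plain array so that it compiles standalone.  SketchParamList,
 * SKETCH_MAX_PARAMS and sketch_param_index are hypothetical names, and
 * integer ids stand in for Param nodes that the real code compares with
 * equal().  The block is kept under "#if 0" so it does not affect this file.
 */
#if 0
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed capacity; the real List grows on demand. */
#define SKETCH_MAX_PARAMS 16

typedef struct SketchParamList
{
	int			ids[SKETCH_MAX_PARAMS];	/* stand-ins for Param nodes */
	size_t		len;					/* number of entries in use */
} SketchParamList;

/*
 * Return the 1-based remote parameter number for "id", appending it to the
 * list if it is not already present.  Returns 0 if the list is full.
 */
static int
sketch_param_index(SketchParamList *list, int id)
{
	size_t		i;

	assert(list != NULL && list->len <= SKETCH_MAX_PARAMS);

	/* An existing entry keeps the number it was first given. */
	for (i = 0; i < list->len; i++)
	{
		if (list->ids[i] == id)
			return (int) i + 1;
	}

	/* Not found: the new entry's number is the new list length. */
	if (list->len == SKETCH_MAX_PARAMS)
		return 0;
	list->ids[list->len++] = id;
	return (int) list->len;
}
```
#endif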
/*
* Deparse a container subscript expression.
*/
static void
deparseSubscriptingRef(SubscriptingRef *node, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
ListCell *lowlist_item;
ListCell *uplist_item;
/* Always parenthesize the expression. */
appendStringInfoChar(buf, '(');
/*
* Deparse referenced array expression first. If that expression includes
* a cast, we have to parenthesize to prevent the array subscript from
* being taken as typename decoration. We can avoid that in the typical
* case of subscripting a Var, but otherwise do it.
*/
if (IsA(node->refexpr, Var))
deparseExpr(node->refexpr, context);
else
{
appendStringInfoChar(buf, '(');
deparseExpr(node->refexpr, context);
appendStringInfoChar(buf, ')');
}
/* Deparse subscript expressions. */
lowlist_item = list_head(node->reflowerindexpr); /* could be NULL */
foreach(uplist_item, node->refupperindexpr)
{
appendStringInfoChar(buf, '[');
if (lowlist_item)
{
deparseExpr(lfirst(lowlist_item), context);
appendStringInfoChar(buf, ':');
lowlist_item = lnext(node->reflowerindexpr, lowlist_item);
}
deparseExpr(lfirst(uplist_item), context);
appendStringInfoChar(buf, ']');
}
appendStringInfoChar(buf, ')');
}
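/*
 * Illustrative sketch only, not part of postgres_fdw: the text shape that
 * deparseSubscriptingRef produces, rebuilt with snprintf over plain strings.
 * SketchSubscript and sketch_subscript are hypothetical names.  As a
 * simplification, each subscript carries its own has_lower flag instead of
 * walking a separate lower-bound list, and bounds are plain integers rather
 * than arbitrary expressions.  Kept under "#if 0" so it does not affect this
 * file.
 */
#if 0
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct SketchSubscript
{
	int			has_lower;		/* nonzero for a slice "[lower:upper]" */
	int			lower;
	int			upper;
} SketchSubscript;

/*
 * Write "(base[...]...)" into out, wrapping base in its own parentheses
 * unless it is a plain column reference.  Returns 0 on success, or -1 if the
 * result does not fit in outlen bytes.
 */
static int
sketch_subscript(char *out, size_t outlen, const char *base, int base_is_var,
				 const SketchSubscript *subs, size_t nsubs)
{
	size_t		used;
	size_t		i;
	int			n;

	assert(out != NULL && base != NULL);

	/* The whole expression is always parenthesized. */
	if (base_is_var)
		n = snprintf(out, outlen, "(%s", base);
	else
		n = snprintf(out, outlen, "((%s)", base);
	if (n < 0 || (size_t) n >= outlen)
		return -1;
	used = (size_t) n;

	for (i = 0; i < nsubs; i++)
	{
		if (subs[i].has_lower)
			n = snprintf(out + used, outlen - used, "[%d:%d]",
						 subs[i].lower, subs[i].upper);
		else
			n = snprintf(out + used, outlen - used, "[%d]", subs[i].upper);
		if (n < 0 || (size_t) n >= outlen - used)
			return -1;
		used += (size_t) n;
	}

	n = snprintf(out + used, outlen - used, ")");
	if (n < 0 || (size_t) n >= outlen - used)
		return -1;
	return 0;
}
```
#endif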
/*
* Deparse a function call.
*/
static void
deparseFuncExpr(FuncExpr *node, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
bool use_variadic;
bool first;
ListCell *arg;
/*
* If the function call came from an implicit coercion, then just show the
* first argument.
*/
if (node->funcformat == COERCE_IMPLICIT_CAST)
{
deparseExpr((Expr *) linitial(node->args), context);
return;
}
/*
* If the function call came from a cast, then show the first argument
* plus an explicit cast operation.
*/
if (node->funcformat == COERCE_EXPLICIT_CAST)
{
Oid rettype = node->funcresulttype;
int32 coercedTypmod;
/* Get the typmod if this is a length-coercion function */
(void) exprIsLengthCoercion((Node *) node, &coercedTypmod);
deparseExpr((Expr *) linitial(node->args), context);
appendStringInfo(buf, "::%s",
deparse_type_name(rettype, coercedTypmod));
return;
}
/* Check whether we need to print VARIADIC (cf. ruleutils.c) */
use_variadic = node->funcvariadic;
/*
* Normal function: display as proname(args).
*/
appendFunctionName(node->funcid, context);
appendStringInfoChar(buf, '(');
/* ... and all the arguments */
first = true;
foreach(arg, node->args)
{
if (!first)
appendStringInfoString(buf, ", ");
if (use_variadic && lnext(node->args, arg) == NULL)
appendStringInfoString(buf, "VARIADIC ");
deparseExpr((Expr *) lfirst(arg), context);
first = false;
}
appendStringInfoChar(buf, ')');
}
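/*
 * Illustrative sketch only, not part of postgres_fdw: how the argument loop
 * in deparseFuncExpr places separators and the VARIADIC keyword, restated
 * over plain strings.  sketch_append and sketch_func_call are hypothetical
 * names; arguments are assumed to be already-deparsed text.  Kept under
 * "#if 0" so it does not affect this file.
 */
#if 0
```c
#include <assert.h>
#include <string.h>

/* Append s to the NUL-terminated string in out; -1 if it would not fit. */
static int
sketch_append(char *out, size_t outlen, const char *s)
{
	size_t		used = strlen(out);
	size_t		add = strlen(s);

	if (used + add + 1 > outlen)
		return -1;
	memcpy(out + used, s, add + 1);		/* copies the terminating NUL too */
	return 0;
}

/*
 * Write "fname(arg1, arg2, ...)" into out.  If variadic is nonzero, the
 * last argument is prefixed with "VARIADIC ".  Returns 0 on success, or -1
 * if the result does not fit.
 */
static int
sketch_func_call(char *out, size_t outlen, const char *fname,
				 const char *const *args, size_t nargs, int variadic)
{
	size_t		i;

	assert(out != NULL && fname != NULL);
	if (outlen == 0)
		return -1;
	out[0] = '\0';

	if (sketch_append(out, outlen, fname) != 0 ||
		sketch_append(out, outlen, "(") != 0)
		return -1;

	for (i = 0; i < nargs; i++)
	{
		/* Separator goes before every argument except the first. */
		if (i > 0 && sketch_append(out, outlen, ", ") != 0)
			return -1;
		/* Only the final argument can be the variadic array. */
		if (variadic && i == nargs - 1 &&
			sketch_append(out, outlen, "VARIADIC ") != 0)
			return -1;
		if (sketch_append(out, outlen, args[i]) != 0)
			return -1;
	}
	return sketch_append(out, outlen, ")");
}
```
#endif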
/*
 * Deparse given operator expression. To avoid operator-precedence
 * problems, we always parenthesize the whole expression.
 */
static void
deparseOpExpr(OpExpr *node, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
HeapTuple tuple;
Form_pg_operator form;
Expr *right;
bool canSuppressRightConstCast = false;
char oprkind;
/* Retrieve information about the operator from system catalog. */
tuple = SearchSysCache1(OPEROID, ObjectIdGetDatum(node->opno));
if (!HeapTupleIsValid(tuple))
elog(ERROR, "cache lookup failed for operator %u", node->opno);
form = (Form_pg_operator) GETSTRUCT(tuple);
oprkind = form->oprkind;
/* Sanity check. */
Assert((oprkind == 'l' && list_length(node->args) == 1) ||
(oprkind == 'b' && list_length(node->args) == 2));
right = llast(node->args);
/* Always parenthesize the expression. */
appendStringInfoChar(buf, '(');
/* Deparse left operand, if any. */
if (oprkind == 'b')
{
Expr *left = linitial(node->args);
Oid leftType = exprType((Node *) left);
Oid rightType = exprType((Node *) right);
bool canSuppressLeftConstCast = false;
/*
* When considering a binary operator, if one operand is a Const that
* can be printed as a bare string literal or NULL (i.e., it will look
* like type UNKNOWN to the remote parser), the Const normally
* receives an explicit cast to the operator's input type. However,
* in Const-to-Var comparisons where both operands are of the same
* type, we prefer to suppress the explicit cast, leaving the Const's
* type resolution up to the remote parser. The remote's resolution
* heuristic will assume that an unknown input type being compared to
* a known input type is of that known type as well.
*
* This hack allows some cases to succeed where a remote column is
* declared with a different type in the local (foreign) table. By
* emitting "foreigncol = 'foo'" not "foreigncol = 'foo'::text" or the
* like, we allow the remote parser to pick an "=" operator that's
* compatible with whatever type the remote column really is, such as
* an enum.
*
* We allow cast suppression to happen only when the other operand is
* a plain foreign Var. Although the remote's unknown-type heuristic
* would apply to other cases just as well, we would be taking a
* bigger risk that the inferred type is something unexpected. With
* this restriction, if anything goes wrong it's the user's fault for
* not declaring the local column with the same type as the remote
* column.
*/
if (leftType == rightType)
{
if (IsA(left, Const))
canSuppressLeftConstCast = isPlainForeignVar(right, context);
else if (IsA(right, Const))
canSuppressRightConstCast = isPlainForeignVar(left, context);
}
if (canSuppressLeftConstCast)
deparseConst((Const *) left, context, -2);
else
deparseExpr(left, context);
appendStringInfoChar(buf, ' ');
}
/* Deparse operator name. */
deparseOperatorName(buf, form);
/* Deparse right operand. */
appendStringInfoChar(buf, ' ');
if (canSuppressRightConstCast)
deparseConst((Const *) right, context, -2);
else
deparseExpr(right, context);
appendStringInfoChar(buf, ')');
ReleaseSysCache(tuple);
}
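/*
 * Illustrative sketch only, not part of postgres_fdw: the decision that
 * deparseOpExpr makes about which operand, if any, may have its cast
 * suppressed.  All Sketch* names are hypothetical, and a simple "kind" tag
 * stands in for the IsA(..., Const) and isPlainForeignVar() tests.  The
 * further requirement that the constant print as a string literal or NULL
 * is enforced separately, inside deparseConst, and is not modeled here.
 * Kept under "#if 0" so it does not affect this file.
 */
#if 0
```c
#include <assert.h>
#include <stddef.h>

typedef enum SketchKind
{
	SKETCH_CONST,				/* a constant */
	SKETCH_FOREIGN_VAR,			/* a plain column of the foreign relation */
	SKETCH_OTHER				/* anything else */
} SketchKind;

typedef struct SketchOperand
{
	SketchKind	kind;
	unsigned int type_oid;		/* stand-in for exprType() */
} SketchOperand;

typedef enum SketchSuppress
{
	SKETCH_SUPPRESS_NONE,
	SKETCH_SUPPRESS_LEFT,
	SKETCH_SUPPRESS_RIGHT
} SketchSuppress;

/*
 * Decide which side of a binary operator may omit its cast: both inputs
 * must have the same local type, one side must be a constant, and the
 * other side must be a plain foreign Var.
 */
static SketchSuppress
sketch_cast_to_suppress(const SketchOperand *left,
						const SketchOperand *right)
{
	assert(left != NULL && right != NULL);

	if (left->type_oid != right->type_oid)
		return SKETCH_SUPPRESS_NONE;

	/* As in the original, the left operand is examined first. */
	if (left->kind == SKETCH_CONST)
		return (right->kind == SKETCH_FOREIGN_VAR) ?
			SKETCH_SUPPRESS_LEFT : SKETCH_SUPPRESS_NONE;
	if (right->kind == SKETCH_CONST)
		return (left->kind == SKETCH_FOREIGN_VAR) ?
			SKETCH_SUPPRESS_RIGHT : SKETCH_SUPPRESS_NONE;

	return SKETCH_SUPPRESS_NONE;
}
```
#endif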
/*
* Will "node" deparse as a plain foreign Var?
*/
static bool
isPlainForeignVar(Expr *node, deparse_expr_cxt *context)
{
/*
* We allow the foreign Var to have an implicit RelabelType, mainly so
* that this'll work with varchar columns. Note that deparseRelabelType
* will not print such a cast, so we're not breaking the restriction that
* the expression print as a plain Var. We won't risk it for an implicit
* cast that requires a function, nor for non-implicit RelabelType; such
* cases seem too likely to involve semantics changes compared to what
* would happen on the remote side.
*/
if (IsA(node, RelabelType) &&
((RelabelType *) node)->relabelformat == COERCE_IMPLICIT_CAST)
node = ((RelabelType *) node)->arg;
if (IsA(node, Var))
{
/*
* The Var must be one that'll deparse as a foreign column reference
* (cf. deparseVar).
*/
Var *var = (Var *) node;
Relids relids = context->scanrel->relids;
if (bms_is_member(var->varno, relids) && var->varlevelsup == 0)
return true;
}
return false;
}
/*
* Print the name of an operator.
*/
static void
deparseOperatorName(StringInfo buf, Form_pg_operator opform)
{
char *opname;
/* opname is not a SQL identifier, so we should not quote it. */
opname = NameStr(opform->oprname);
/* Print schema name only if it's not pg_catalog */
if (opform->oprnamespace != PG_CATALOG_NAMESPACE)
{
const char *opnspname;
opnspname = get_namespace_name(opform->oprnamespace);
/* Print fully qualified operator name. */
appendStringInfo(buf, "OPERATOR(%s.%s)",
quote_identifier(opnspname), opname);
}
else
{
/* Just print operator name. */
appendStringInfoString(buf, opname);
}
}
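/*
 * Illustrative sketch, not part of postgres_fdw: the schema-qualification
 * rule above, reduced to plain strings.  The helper name and the use of a
 * namespace *name* are assumptions for the example; the real code compares
 * the namespace OID against PG_CATALOG_NAMESPACE and passes the schema name
 * through quote_identifier().
 */
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write an operator's remote spelling into dst.
 * Assumes nspname needs no identifier quoting.
 */
static void
opsketch_operator_name(char *dst, size_t dstlen,
                       const char *nspname, const char *opname)
{
    assert(dstlen > 0);

    if (strcmp(nspname, "pg_catalog") == 0)
        snprintf(dst, dstlen, "%s", opname);    /* bare name is enough */
    else
        snprintf(dst, dstlen, "OPERATOR(%s.%s)", nspname, opname);
}
```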
/*
* Deparse IS DISTINCT FROM.
*/
static void
deparseDistinctExpr(DistinctExpr *node, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
Assert(list_length(node->args) == 2);
appendStringInfoChar(buf, '(');
deparseExpr(linitial(node->args), context);
appendStringInfoString(buf, " IS DISTINCT FROM ");
deparseExpr(lsecond(node->args), context);
appendStringInfoChar(buf, ')');
}
/*
* Deparse the given ScalarArrayOpExpr. To avoid operator-precedence
* problems, we always parenthesize the whole expression and the array operand.
*/
static void
deparseScalarArrayOpExpr(ScalarArrayOpExpr *node, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
HeapTuple tuple;
Form_pg_operator form;
Expr *arg1;
Expr *arg2;
/* Retrieve information about the operator from system catalog. */
tuple = SearchSysCache1(OPEROID, ObjectIdGetDatum(node->opno));
if (!HeapTupleIsValid(tuple))
elog(ERROR, "cache lookup failed for operator %u", node->opno);
form = (Form_pg_operator) GETSTRUCT(tuple);
/* Sanity check. */
Assert(list_length(node->args) == 2);
/* Always parenthesize the expression. */
appendStringInfoChar(buf, '(');
/* Deparse left operand. */
arg1 = linitial(node->args);
deparseExpr(arg1, context);
appendStringInfoChar(buf, ' ');
/* Deparse operator name plus decoration. */
deparseOperatorName(buf, form);
appendStringInfo(buf, " %s (", node->useOr ? "ANY" : "ALL");
/* Deparse right operand. */
arg2 = lsecond(node->args);
deparseExpr(arg2, context);
appendStringInfoChar(buf, ')');
/* Close the parenthesis opened around the whole expression. */
appendStringInfoChar(buf, ')');
ReleaseSysCache(tuple);
}
/*
* Deparse a RelabelType (binary-compatible cast) node.
*/
static void
deparseRelabelType(RelabelType *node, deparse_expr_cxt *context)
{
deparseExpr(node->arg, context);
if (node->relabelformat != COERCE_IMPLICIT_CAST)
appendStringInfo(context->buf, "::%s",
deparse_type_name(node->resulttype,
node->resulttypmod));
}
/*
* Deparse a BoolExpr node.
*/
static void
deparseBoolExpr(BoolExpr *node, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
const char *op = NULL; /* keep compiler quiet */
bool first;
ListCell *lc;
switch (node->boolop)
{
case AND_EXPR:
op = "AND";
break;
case OR_EXPR:
op = "OR";
break;
case NOT_EXPR:
appendStringInfoString(buf, "(NOT ");
deparseExpr(linitial(node->args), context);
appendStringInfoChar(buf, ')');
return;
}
appendStringInfoChar(buf, '(');
first = true;
foreach(lc, node->args)
{
if (!first)
appendStringInfo(buf, " %s ", op);
deparseExpr((Expr *) lfirst(lc), context);
first = false;
}
appendStringInfoChar(buf, ')');
}
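/*
 * Illustrative sketch, not part of postgres_fdw: the AND/OR branch above
 * joins already-deparsed operands with the operator keyword and wraps the
 * result in one pair of parentheses.  The names below are hypothetical and
 * a fixed-size char buffer stands in for StringInfo.
 */
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append s to the NUL-terminated string in dst (capacity dstlen). */
static void
boolsketch_append(char *dst, size_t dstlen, const char *s)
{
    size_t      len = strlen(dst);

    assert(len < dstlen);
    snprintf(dst + len, dstlen - len, "%s", s);
}

/* Produce "(arg0 OP arg1 OP ...)" from nargs operand strings. */
static void
boolsketch_join(char *dst, size_t dstlen, const char *op,
                const char *const *args, int nargs)
{
    int         i;

    assert(dstlen > 0);
    dst[0] = '\0';

    boolsketch_append(dst, dstlen, "(");
    for (i = 0; i < nargs; i++)
    {
        if (i > 0)
        {
            /* separator goes between operands, never before the first */
            boolsketch_append(dst, dstlen, " ");
            boolsketch_append(dst, dstlen, op);
            boolsketch_append(dst, dstlen, " ");
        }
        boolsketch_append(dst, dstlen, args[i]);
    }
    boolsketch_append(dst, dstlen, ")");
}
```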
/*
* Deparse IS [NOT] NULL expression.
*/
static void
deparseNullTest(NullTest *node, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
appendStringInfoChar(buf, '(');
deparseExpr(node->arg, context);
/*
* For scalar inputs, we prefer to print as IS [NOT] NULL, which is
* shorter and traditional. If it's a rowtype input but we're applying a
* scalar test, must print IS [NOT] DISTINCT FROM NULL to be semantically
* correct.
*/
if (node->argisrow || !type_is_rowtype(exprType((Node *) node->arg)))
{
if (node->nulltesttype == IS_NULL)
appendStringInfoString(buf, " IS NULL)");
else
appendStringInfoString(buf, " IS NOT NULL)");
}
else
{
if (node->nulltesttype == IS_NULL)
appendStringInfoString(buf, " IS NOT DISTINCT FROM NULL)");
else
appendStringInfoString(buf, " IS DISTINCT FROM NULL)");
}
}
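/*
 * Illustrative sketch, not part of postgres_fdw: the choice between the two
 * spellings of a null test, with the node fields replaced by plain booleans.
 * The function name and parameters are hypothetical; arg_is_rowtype stands
 * in for type_is_rowtype(exprType(arg)).
 */
```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static void
ntsketch_null_test(char *dst, size_t dstlen, const char *arg,
                   bool is_null, bool argisrow, bool arg_is_rowtype)
{
    const char *suffix;

    assert(dstlen > 0 && strlen(arg) > 0);

    if (argisrow || !arg_is_rowtype)
    {
        /* field-by-field row test, or an ordinary scalar test */
        suffix = is_null ? " IS NULL" : " IS NOT NULL";
    }
    else
    {
        /* scalar test applied to a rowtype value */
        suffix = is_null ? " IS NOT DISTINCT FROM NULL"
                         : " IS DISTINCT FROM NULL";
    }

    snprintf(dst, dstlen, "(%s%s)", arg, suffix);
}
```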
/*
* Deparse CASE expression
*/
static void
deparseCaseExpr(CaseExpr *node, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
ListCell *lc;
appendStringInfoString(buf, "(CASE");
/* If this is a CASE arg WHEN then emit the arg expression */
if (node->arg != NULL)
{
appendStringInfoChar(buf, ' ');
deparseExpr(node->arg, context);
}
/* Add each condition/result of the CASE clause */
foreach(lc, node->args)
{
CaseWhen *whenclause = (CaseWhen *) lfirst(lc);
/* WHEN */
appendStringInfoString(buf, " WHEN ");
if (node->arg == NULL) /* CASE WHEN */
deparseExpr(whenclause->expr, context);
else /* CASE arg WHEN */
{
/* Ignore the CaseTestExpr and equality operator. */
deparseExpr(lsecond(castNode(OpExpr, whenclause->expr)->args),
context);
}
/* THEN */
appendStringInfoString(buf, " THEN ");
deparseExpr(whenclause->result, context);
}
/* add ELSE if present */
if (node->defresult != NULL)
{
appendStringInfoString(buf, " ELSE ");
deparseExpr(node->defresult, context);
}
/* append END */
appendStringInfoString(buf, " END)");
}
/*
* Deparse ARRAY[...] construct.
*/
static void
deparseArrayExpr(ArrayExpr *node, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
bool first = true;
ListCell *lc;
appendStringInfoString(buf, "ARRAY[");
foreach(lc, node->elements)
{
if (!first)
appendStringInfoString(buf, ", ");
deparseExpr(lfirst(lc), context);
first = false;
}
appendStringInfoChar(buf, ']');
/* If the array is empty, we need an explicit cast to the array type. */
if (node->elements == NIL)
appendStringInfo(buf, "::%s",
deparse_type_name(node->array_typeid, -1));
}
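/*
 * Illustrative sketch, not part of postgres_fdw: why the empty-array case
 * above needs a cast.  A bare "ARRAY[]" gives the remote parser no element
 * to infer a type from, so the type name is appended.  Names are
 * hypothetical; array_typename stands in for the deparse_type_name() result.
 */
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append s to the NUL-terminated string in dst (capacity dstlen). */
static void
arrsketch_append(char *dst, size_t dstlen, const char *s)
{
    size_t      len = strlen(dst);

    assert(len < dstlen);
    snprintf(dst + len, dstlen - len, "%s", s);
}

static void
arrsketch_array(char *dst, size_t dstlen, const char *const *elems,
                int nelems, const char *array_typename)
{
    int         i;

    assert(dstlen > 0);
    dst[0] = '\0';

    arrsketch_append(dst, dstlen, "ARRAY[");
    for (i = 0; i < nelems; i++)
    {
        if (i > 0)
            arrsketch_append(dst, dstlen, ", ");
        arrsketch_append(dst, dstlen, elems[i]);
    }
    arrsketch_append(dst, dstlen, "]");

    /* Only the empty array gets an explicit cast. */
    if (nelems == 0)
    {
        arrsketch_append(dst, dstlen, "::");
        arrsketch_append(dst, dstlen, array_typename);
    }
}
```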
/*
* Deparse an Aggref node.
*/
static void
deparseAggref(Aggref *node, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
bool use_variadic;
/* Only basic, non-split aggregation accepted. */
Assert(node->aggsplit == AGGSPLIT_SIMPLE);
/* Check if need to print VARIADIC (cf. ruleutils.c) */
use_variadic = node->aggvariadic;
/* Find aggregate name from aggfnoid which is a pg_proc entry */
appendFunctionName(node->aggfnoid, context);
appendStringInfoChar(buf, '(');
/* Add DISTINCT */
appendStringInfoString(buf, (node->aggdistinct != NIL) ? "DISTINCT " : "");
if (AGGKIND_IS_ORDERED_SET(node->aggkind))
{
/* Add WITHIN GROUP (ORDER BY ..) */
ListCell *arg;
bool first = true;
Assert(!node->aggvariadic);
Assert(node->aggorder != NIL);
foreach(arg, node->aggdirectargs)
{
if (!first)
appendStringInfoString(buf, ", ");
first = false;
deparseExpr((Expr *) lfirst(arg), context);
}
appendStringInfoString(buf, ") WITHIN GROUP (ORDER BY ");
appendAggOrderBy(node->aggorder, node->args, context);
}
else
{
/* aggstar can be set only in zero-argument aggregates */
if (node->aggstar)
appendStringInfoChar(buf, '*');
else
{
ListCell *arg;
bool first = true;
/* Add all the arguments */
foreach(arg, node->args)
{
TargetEntry *tle = (TargetEntry *) lfirst(arg);
Node *n = (Node *) tle->expr;
if (tle->resjunk)
continue;
if (!first)
appendStringInfoString(buf, ", ");
first = false;
/* Add VARIADIC */
if (use_variadic && lnext(node->args, arg) == NULL)
appendStringInfoString(buf, "VARIADIC ");
deparseExpr((Expr *) n, context);
}
}
/* Add ORDER BY */
if (node->aggorder != NIL)
{
appendStringInfoString(buf, " ORDER BY ");
appendAggOrderBy(node->aggorder, node->args, context);
}
}
/* Add FILTER (WHERE ..) */
if (node->aggfilter != NULL)
{
appendStringInfoString(buf, ") FILTER (WHERE ");
deparseExpr((Expr *) node->aggfilter, context);
}
appendStringInfoChar(buf, ')');
}
/*
* Append ORDER BY within aggregate function.
*/
static void
appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
ListCell *lc;
bool first = true;
foreach(lc, orderList)
{
SortGroupClause *srt = (SortGroupClause *) lfirst(lc);
Node *sortexpr;
if (!first)
appendStringInfoString(buf, ", ");
first = false;
/* Deparse the sort expression proper. */
sortexpr = deparseSortGroupClause(srt->tleSortGroupRef, targetList,
false, context);
/* Add decoration as needed. */
appendOrderBySuffix(srt->sortop, exprType(sortexpr), srt->nulls_first,
context);
}
}
/*
* Append the ASC, DESC, USING <OPERATOR> and NULLS FIRST / NULLS LAST parts
* of an ORDER BY clause.
*/
static void
appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
TypeCacheEntry *typentry;
/* See whether operator is default < or > for sort expr's datatype. */
typentry = lookup_type_cache(sortcoltype,
TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
if (sortop == typentry->lt_opr)
appendStringInfoString(buf, " ASC");
else if (sortop == typentry->gt_opr)
appendStringInfoString(buf, " DESC");
else
{
HeapTuple opertup;
Form_pg_operator operform;
appendStringInfoString(buf, " USING ");
/* Append operator name. */
opertup = SearchSysCache1(OPEROID, ObjectIdGetDatum(sortop));
if (!HeapTupleIsValid(opertup))
elog(ERROR, "cache lookup failed for operator %u", sortop);
operform = (Form_pg_operator) GETSTRUCT(opertup);
deparseOperatorName(buf, operform);
ReleaseSysCache(opertup);
}
if (nulls_first)
appendStringInfoString(buf, " NULLS FIRST");
else
appendStringInfoString(buf, " NULLS LAST");
}
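/*
 * Illustrative sketch, not part of postgres_fdw: the suffix decision above,
 * with the type cache entry replaced by a small struct.  All names are
 * hypothetical, the operator numbers in any caller are arbitrary labels,
 * and sortop_name stands in for what deparseOperatorName() would print.
 */
```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef struct
{
    unsigned    lt_opr;         /* default "<" operator of the sort type */
    unsigned    gt_opr;         /* default ">" operator of the sort type */
} OrdSketchTypeOps;

/* Append s to the NUL-terminated string in dst (capacity dstlen). */
static void
ordsketch_append(char *dst, size_t dstlen, const char *s)
{
    size_t      len = strlen(dst);

    assert(len < dstlen);
    snprintf(dst + len, dstlen - len, "%s", s);
}

static void
ordsketch_suffix(char *dst, size_t dstlen, unsigned sortop,
                 const OrdSketchTypeOps *ops, const char *sortop_name,
                 bool nulls_first)
{
    assert(dstlen > 0);
    dst[0] = '\0';

    if (sortop == ops->lt_opr)
        ordsketch_append(dst, dstlen, " ASC");
    else if (sortop == ops->gt_opr)
        ordsketch_append(dst, dstlen, " DESC");
    else
    {
        /* non-default ordering operator: spell it out */
        ordsketch_append(dst, dstlen, " USING ");
        ordsketch_append(dst, dstlen, sortop_name);
    }

    /* NULLS placement is always printed explicitly. */
    ordsketch_append(dst, dstlen,
                     nulls_first ? " NULLS FIRST" : " NULLS LAST");
}
```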
/*
* Print the representation of a parameter to be sent to the remote side.
*
* Note: we always label the Param's type explicitly rather than relying on
* transmitting a numeric type OID in PQsendQueryParams(). This allows us to
* avoid assuming that types have the same OIDs on the remote side as they
* do locally --- they need only have the same names.
*/
static void
printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
char *ptypename = deparse_type_name(paramtype, paramtypmod);
appendStringInfo(buf, "$%d::%s", paramindex, ptypename);
}
/*
* Print the representation of a placeholder for a parameter that will be
* sent to the remote side at execution time.
*
* This is used when we're just trying to EXPLAIN the remote query.
* We don't have the actual value of the runtime parameter yet, and we don't
* want the remote planner to generate a plan that depends on such a value
* anyway. Thus, we can't do something simple like "$1::paramtype".
* Instead, we emit "((SELECT null::paramtype)::paramtype)".
* In all extant versions of Postgres, the planner will see that as an unknown
* constant value, which is what we want. This might need adjustment if we
* ever make the planner flatten scalar subqueries. Note: the reason for the
* apparently useless outer cast is to ensure that the representation as a
* whole will be parsed as an a_expr and not a select_with_parens; the latter
* would do the wrong thing in the context "x = ANY(...)".
*/
static void
printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
char *ptypename = deparse_type_name(paramtype, paramtypmod);
appendStringInfo(buf, "((SELECT null::%s)::%s)", ptypename, ptypename);
}
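/*
 * Illustrative sketch, not part of postgres_fdw: the two textual forms
 * produced by printRemoteParam and printRemotePlaceholder, side by side.
 * The function name and the explain_only flag are hypothetical; ptypename
 * stands in for the deparse_type_name() result.
 */
```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static void
paramsketch_print(char *dst, size_t dstlen, int paramindex,
                  const char *ptypename, bool explain_only)
{
    assert(dstlen > 0 && strlen(ptypename) > 0);

    if (explain_only)
    {
        /*
         * No value is available yet: emit a typed scalar subquery.  The
         * outer cast keeps the whole thing an ordinary expression rather
         * than a bare parenthesized SELECT.
         */
        snprintf(dst, dstlen, "((SELECT null::%s)::%s)",
                 ptypename, ptypename);
    }
    else
    {
        /* Label the parameter by type name, not by type OID. */
        snprintf(dst, dstlen, "$%d::%s", paramindex, ptypename);
    }
}
```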
/*
* Deparse GROUP BY clause.
*/
static void
appendGroupByClause(List *tlist, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
Query *query = context->root->parse;
ListCell *lc;
bool first = true;
/* Nothing to be done, if there's no GROUP BY clause in the query. */
if (!query->groupClause)
return;
appendStringInfoString(buf, " GROUP BY ");
/*
* Queries with grouping sets are not pushed down, so we don't expect
* grouping sets here.
*/
Assert(!query->groupingSets);
/*
* We intentionally print query->groupClause not processed_groupClause,
* leaving it to the remote planner to get rid of any redundant GROUP BY
* items again. This is necessary in case processed_groupClause reduced
* to empty, and in any case the redundancy situation on the remote might
* be different than what we think here.
*/
foreach(lc, query->groupClause)
{
SortGroupClause *grp = (SortGroupClause *) lfirst(lc);
if (!first)
appendStringInfoString(buf, ", ");
first = false;
deparseSortGroupClause(grp->tleSortGroupRef, tlist, true, context);
}
}
/*
* Deparse ORDER BY clause defined by the given pathkeys.
*
* The clause should use Vars from context->scanrel if !has_final_sort,
* or from context->foreignrel's targetlist if has_final_sort.
*
* We find a suitable pathkey expression (some earlier step
* should have verified that there is one) and deparse it.
*/
static void
appendOrderByClause(List *pathkeys, bool has_final_sort,
deparse_expr_cxt *context)
{
ListCell *lcell;
int nestlevel;
StringInfo buf = context->buf;
bool gotone = false;
/* Make sure any constants in the exprs are printed portably */
nestlevel = set_transmission_modes();
foreach(lcell, pathkeys)
{
PathKey *pathkey = lfirst(lcell);
EquivalenceMember *em;
Expr *em_expr;
Oid oprid;
if (has_final_sort)
{
/*
* By construction, context->foreignrel is the input relation to
* the final sort.
*/
em = find_em_for_rel_target(context->root,
pathkey->pk_eclass,
context->foreignrel);
}
else
em = find_em_for_rel(context->root,
pathkey->pk_eclass,
context->scanrel);
/*
* We don't expect any error here; it would mean that shippability
* wasn't verified earlier. For the same reason, we don't recheck
* shippability of the sort operator.
*/
if (em == NULL)
elog(ERROR, "could not find pathkey item to sort");
em_expr = em->em_expr;
/*
* If the member is a Const expression then we needn't add it to the
* ORDER BY clause. This can happen in UNION ALL queries where the
* union child targetlist has a Const. Adding these would be
* wasteful, but also, for INT columns, an integer literal would be
* seen as an ordinal column position rather than a value to sort by.
* deparseConst() does have code to handle this, but it seems less
* effort on all accounts just to skip these for ORDER BY clauses.
*/
if (IsA(em_expr, Const))
continue;
if (!gotone)
{
appendStringInfoString(buf, " ORDER BY ");
gotone = true;
}
else
appendStringInfoString(buf, ", ");
/*
* Look up the operator corresponding to the strategy in the opfamily.
* The datatype used by the opfamily is not necessarily the same as
* the expression type (for array types for example).
*/
oprid = get_opfamily_member(pathkey->pk_opfamily,
em->em_datatype,
em->em_datatype,
pathkey->pk_strategy);
if (!OidIsValid(oprid))
elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
pathkey->pk_strategy, em->em_datatype, em->em_datatype,
pathkey->pk_opfamily);
deparseExpr(em_expr, context);
/*
* Here we need to use the expression's actual type to discover
* whether the desired operator will be the default or not.
*/
appendOrderBySuffix(oprid, exprType((Node *) em_expr),
pathkey->pk_nulls_first, context);
}
reset_transmission_modes(nestlevel);
}
/*
* Deparse LIMIT/OFFSET clause.
*/
static void
appendLimitClause(deparse_expr_cxt *context)
{
PlannerInfo *root = context->root;
StringInfo buf = context->buf;
int nestlevel;
/* Make sure any constants in the exprs are printed portably */
nestlevel = set_transmission_modes();
if (root->parse->limitCount)
{
appendStringInfoString(buf, " LIMIT ");
deparseExpr((Expr *) root->parse->limitCount, context);
}
if (root->parse->limitOffset)
{
appendStringInfoString(buf, " OFFSET ");
deparseExpr((Expr *) root->parse->limitOffset, context);
}
reset_transmission_modes(nestlevel);
}
/*
* appendFunctionName
* Deparses function name from given function oid.
*/
static void
appendFunctionName(Oid funcid, deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
HeapTuple proctup;
Form_pg_proc procform;
const char *proname;
proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(proctup))
elog(ERROR, "cache lookup failed for function %u", funcid);
procform = (Form_pg_proc) GETSTRUCT(proctup);
/* Print schema name only if it's not pg_catalog */
if (procform->pronamespace != PG_CATALOG_NAMESPACE)
{
const char *schemaname;
schemaname = get_namespace_name(procform->pronamespace);
appendStringInfo(buf, "%s.", quote_identifier(schemaname));
}
/* Always print the function name */
proname = NameStr(procform->proname);
appendStringInfoString(buf, quote_identifier(proname));
ReleaseSysCache(proctup);
}
/*
* Appends a sort or group clause.
*
* Like get_rule_sortgroupclause(), returns the expression tree, so caller
* need not find it again.
*/
static Node *
deparseSortGroupClause(Index ref, List *tlist, bool force_colno,
deparse_expr_cxt *context)
{
StringInfo buf = context->buf;
TargetEntry *tle;
Expr *expr;
tle = get_sortgroupref_tle(ref, tlist);
expr = tle->expr;
if (force_colno)
{
/* Use column-number form when requested by caller. */
Assert(!tle->resjunk);
appendStringInfo(buf, "%d", tle->resno);
}
else if (expr && IsA(expr, Const))
{
/*
* Force a typecast here so that we don't emit something like "GROUP
* BY 2", which will be misconstrued as a column position rather than
* a constant.
*/
deparseConst((Const *) expr, context, 1);
}
else if (!expr || IsA(expr, Var))
deparseExpr(expr, context);
else
{
/* Always parenthesize the expression. */
appendStringInfoChar(buf, '(');
deparseExpr(expr, context);
appendStringInfoChar(buf, ')');
}
return (Node *) expr;
}
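/*
 * Illustrative sketch, not part of postgres_fdw: the four output shapes
 * chosen above, with the expression tree replaced by a kind tag and its
 * already-deparsed text.  All names are hypothetical, and the "text::type"
 * form for constants is an assumption about what the forced typecast
 * looks like.
 */
```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef enum
{
    SGSKETCH_VAR,               /* plain column reference */
    SGSKETCH_CONST,             /* constant */
    SGSKETCH_OTHER              /* any other expression */
} SgSketchKind;

static void
sgsketch_clause(char *dst, size_t dstlen, bool force_colno, int resno,
                SgSketchKind kind, const char *text, const char *typname)
{
    assert(dstlen > 0 && strlen(text) > 0);

    if (force_colno)
        snprintf(dst, dstlen, "%d", resno);     /* output column position */
    else if (kind == SGSKETCH_CONST)
        snprintf(dst, dstlen, "%s::%s", text, typname); /* not a position */
    else if (kind == SGSKETCH_VAR)
        snprintf(dst, dstlen, "%s", text);
    else
        snprintf(dst, dstlen, "(%s)", text);    /* always parenthesized */
}
```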
/*
* Returns true if the given Var is deparsed as a subquery output column.
* In that case, *relno and *colno are set to the relation alias ID and
* column alias ID under which the subquery exposes the Var.
*/
static bool
is_subquery_var(Var *node, RelOptInfo *foreignrel, int *relno, int *colno)
{
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
RelOptInfo *outerrel = fpinfo->outerrel;
RelOptInfo *innerrel = fpinfo->innerrel;
/* Should only be called in these cases. */
Assert(IS_SIMPLE_REL(foreignrel) || IS_JOIN_REL(foreignrel));
/*
* If the given relation isn't a join relation, it doesn't have any lower
* subqueries, so the Var isn't a subquery output column.
*/
if (!IS_JOIN_REL(foreignrel))
return false;
/*
* If the Var doesn't belong to any lower subqueries, it isn't a subquery
* output column.
*/
if (!bms_is_member(node->varno, fpinfo->lower_subquery_rels))
return false;
if (bms_is_member(node->varno, outerrel->relids))
{
/*
* If outer relation is deparsed as a subquery, the Var is an output
* column of the subquery; get the IDs for the relation/column alias.
*/
if (fpinfo->make_outerrel_subquery)
{
get_relation_column_alias_ids(node, outerrel, relno, colno);
return true;
}
/* Otherwise, recurse into the outer relation. */
return is_subquery_var(node, outerrel, relno, colno);
}
else
{
Assert(bms_is_member(node->varno, innerrel->relids));
/*
* If inner relation is deparsed as a subquery, the Var is an output
* column of the subquery; get the IDs for the relation/column alias.
*/
if (fpinfo->make_innerrel_subquery)
{
get_relation_column_alias_ids(node, innerrel, relno, colno);
return true;
}
/* Otherwise, recurse into the inner relation. */
return is_subquery_var(node, innerrel, relno, colno);
}
}
/*
* Get the relation alias ID and column alias ID for the given Var, which
* must belong to the given relation; they are returned in *relno and *colno.
*/
static void
get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel,
int *relno, int *colno)
{
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
int i;
ListCell *lc;
/* Get the relation alias ID */
*relno = fpinfo->relation_index;
/* Get the column alias ID */
i = 1;
foreach(lc, foreignrel->reltarget->exprs)
{
Var *tlvar = (Var *) lfirst(lc);
/*
* Match reltarget entries only on varno/varattno. Ideally there
* would be some cross-check on varnullingrels, but it's unclear what
* to do exactly; we don't have enough context to know what that value
* should be.
*/
if (IsA(tlvar, Var) &&
tlvar->varno == node->varno &&
tlvar->varattno == node->varattno)
{
*colno = i;
return;
}
i++;
}
/* Shouldn't get here */
elog(ERROR, "unexpected expression in subquery output");
}
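/*
 * Illustrative sketch, not part of postgres_fdw: the column alias ID above
 * is simply the 1-based position of the first target entry that matches on
 * varno/varattno.  The struct and function names are hypothetical, and the
 * sketch returns 0 on a miss where the real code raises an error.
 */
```c
#include <assert.h>
#include <stddef.h>

typedef struct
{
    int         varno;          /* range table index of the relation */
    int         varattno;       /* attribute number within it */
} AliasSketchVar;

/* Return the 1-based position of node in target[], or 0 if absent. */
static int
aliassketch_colno(const AliasSketchVar *target, int ntarget,
                  const AliasSketchVar *node)
{
    int         i;

    assert(node != NULL);
    assert(target != NULL || ntarget == 0);

    for (i = 0; i < ntarget; i++)
    {
        if (target[i].varno == node->varno &&
            target[i].varattno == node->varattno)
            return i + 1;
    }
    return 0;
}
```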